In eukaryotic cells, a key regulatory step in gene expression is the recognition of the core promoter by the TFIID complex. TFIID is composed of the TATA-box binding protein (TBP) and TBP-associated factors (TAFs) (Goodrich and Tjian, 2010; Hochheimer and Tjian, 2003; Naar et al., 2001). To understand gene regulation at the core promoter, I have focused on one component of the core promoter recognition complex, TAF3, which was originally identified as a subunit of the TFIID complex in HeLa cells (Gangloff et al., 2001). It was later found that, while other TFIID subunits are destroyed during myogenesis, TAF3 is selectively retained in myotubes in a specialized complex with TBP-related factor 3 (TRF3) (Deato and Tjian, 2007), and that the sub-nuclear localization of TAF3 serves as another potential mechanism to influence transcription during myogenesis (Yao et al., 2011). Intriguingly, TAF3 recognizes trimethylated histone H3 lysine 4 (H3K4me3) (Vermeulen et al., 2007), a mark associated both with actively transcribed genes and with silent developmental genes that are poised for activation upon embryonic stem (ES) cell differentiation (Mikkelsen et al., 2007).
In our recent studies, we unexpectedly found that TAF3 is highly enriched in ES cells. More importantly, during in vitro embryoid body (EB) formation, which mimics early embryogenesis, TAF3 protein levels were selectively reduced, whereas the levels of other canonical TAFs and TBP remained stable. Functional and gene expression analyses revealed that high levels of TAF3 are dispensable for ES cell self-renewal but are required for endoderm lineage differentiation and for preventing premature specification of neuroectoderm and mesoderm. Genome-wide binding studies by ChIP-seq showed that, unlike other canonical TFIID subunits such as TBP and TAF1, which bind only promoters, TAF3 targets both promoters and distal enhancer-like sites in ES cells.
Importantly, TAF3 sites were further clustered into four distinct classes based on their co-localization with other factors. Briefly, Class 1 regions are promoter regions enriched for TAF3 together with TFIID, Pol II and H3K4me3. Their proximity (< 1 kb) to the transcription start site (TSS) of known genes confirms that they are predominantly TFIID-bound promoters. By contrast, Class 2 regions have low levels of TFIID and H3K4me3 but are enriched for Oct4/Nanog/Sox2 and Mediator components. Class 3 regions are specifically enriched for TAF3, CTCF and cohesin subunits. Finally, Class 4 regions are not enriched for any of the factors we considered besides TAF3. Regions from the last three classes are generally far (tens of kb) from the TSS.
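The four-class scheme described above can be illustrated with a minimal rule-based sketch. This is not the clustering procedure used in the study, which is not detailed here; it only restates the class definitions as code. The `Region` type, its boolean enrichment fields, and the order in which the rules are checked are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Enrichment calls for one TAF3-bound region (hypothetical fields)."""
    tfiid_pol2_h3k4me3: bool = False   # TFIID, Pol II and H3K4me3 signal
    oct4_nanog_sox2_med: bool = False  # Oct4/Nanog/Sox2 and Mediator signal
    ctcf_cohesin: bool = False         # CTCF and cohesin subunit signal


def classify(region: Region) -> int:
    """Assign a TAF3-bound region to Class 1-4.

    The priority order (promoter marks first, then pluripotency factors,
    then CTCF/cohesin) is an assumption, not taken from the study.
    """
    if region.tfiid_pol2_h3k4me3:
        return 1  # TFIID-bound promoter
    if region.oct4_nanog_sox2_med:
        return 2  # distal site with pluripotency factors and Mediator
    if region.ctcf_cohesin:
        return 3  # distal site with CTCF and cohesin
    return 4      # TAF3 only


if __name__ == "__main__":
    example = Region(ctcf_cohesin=True)
    print(classify(example))
```

In practice, the enrichment calls would come from thresholded ChIP-seq signal, and the thresholds themselves would be a further choice not specified in this summary.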
Further studies revealed that genes down-regulated upon TAF3 depletion are selectively associated with regions enriched for TAF3, CTCF and cohesin (Class 3) and with TAF3-only regions (Class 4), while no such association was seen for genes up-regulated upon TAF3 depletion. Thus, it is likely that Class 3 and Class 4 regions are required for the efficient expression of genes in their vicinity and that depletion of TAF3 interferes with this function. Indeed, our parallel biochemical experiments confirmed a direct protein-protein interaction between TAF3 and CTCF. This novel interaction was further shown to mediate DNA looping between promoter-distal sites and core promoters to regulate proper transcriptional activation. Notably, we also found that Class 2 binding regions are enriched around genes up-regulated upon TAF3 depletion, consistent with a mechanism of transcriptional repression by TAF3 in association with Oct4/Nanog/Sox2. Together, our data support a model in which TAF3 orchestrates a complex long-distance chromatin interaction network that safeguards the finely balanced transcriptional programs underlying ES cell pluripotency.