Gene-embedded transposable elements (TEs) significantly influence RNA processing. A variety of RNA-binding proteins (RBPs) exert post-transcriptional regulation by binding to TEs. Transcriptome-wide identification of RBP binding sites can be accomplished by UV crosslinking and immunoprecipitation followed by sequencing (CLIP-seq). However, the technical demands of CLIP and the repetitive nature of TEs present challenges to the large-scale investigation of the interplay between RBPs and TEs. Addressing these challenges requires specialized computational approaches.
In the first part of the dissertation, we present a dedicated RBP-centric computational framework for the systematic study of RBP-TE interactions. In this framework, we use both multi-mapped and uniquely mapped reads to recover RBP binding sites on transposable elements. By applying this framework to a unified resource of 223 eCLIP-seq datasets from ENCODE, we observed extensive binding of a wide range of RBPs to three major TE families: L1, L2 and Alu. For most RBPs, the frequency of their binding motif in the TE families with which they interact is higher than the average frequency of that motif over all TE families. Furthermore, we investigated the functional effects of RBP-TE interactions on TE exonization, the process by which intronic TEs are incorporated into mature RNAs. This process usually has undesirable consequences, so mechanisms exist to repress it (e.g., MATR3 represses the exonization of antisense L1 elements and HNRNPC represses the exonization of antisense Alu elements). We identified two novel repressors of TE exonization: HNRNPM for antisense L1 and XRCC6 (Ku70) for antisense Alu. XRCC6 (Ku70) was previously known as a DNA-binding protein engaged in the DNA repair pathway. We found that XRCC6 selectively represses a set of antisense Alu exons and that XRCC6 binding is strengthened in the close vicinity of the 3’ splice sites (3’SS) of these exons. More intriguingly, our analysis showed that XRCC6 can provide additional repression for Alu exons that have a relatively short continuous U-tract immediately upstream of the 3’SS, on which the effects of the global Alu repressor HNRNPC are compromised.
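The motif-frequency comparison described above can be illustrated with a minimal sketch. It is not the dissertation's actual pipeline. The function names, the toy sequences, the per-kilobase normalization, and the choice to take the "average over all TE families" as the unweighted mean of the per-family frequencies are all assumptions made for illustration.

```python
from statistics import mean


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def motif_frequency(seqs, motif):
    """Motif occurrences per kilobase across a list of sequences."""
    total_len = sum(len(s) for s in seqs)
    if total_len == 0:
        return 0.0
    hits = sum(count_overlapping(s.upper(), motif) for s in seqs)
    return 1000.0 * hits / total_len


def relative_motif_frequency(family_seqs, motif):
    """Ratio of each family's motif frequency to the mean over all families.

    Assumption: the cross-family average is the unweighted mean of the
    per-family frequencies. A ratio above 1 means the motif is more
    frequent in that family than on average.
    """
    freqs = {fam: motif_frequency(seqs, motif) for fam, seqs in family_seqs.items()}
    avg = mean(freqs.values())
    return {fam: (f / avg if avg > 0 else float("nan")) for fam, f in freqs.items()}


# Hypothetical toy sequences (DNA alphabet), not real TE consensus sequences.
toy_families = {
    "Alu": ["TTTTTGG"],
    "L1": ["ACGTACG"],
    "L2": ["TTTTACG"],
}
toy_ratios = relative_motif_frequency(toy_families, "TTTT")
print(toy_ratios)
```

In practice the per-family sequences would come from annotated TE copies rather than toy strings, and a length-weighted or pooled average may be a more appropriate baseline, depending on how the comparison is defined.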
In the second chapter, we further examine the functional implications of RBP-TE interactions for other post-transcriptional events, including RNA editing and RNA stability. By integrating RBP binding with differential RNA editing, we found that ILF3 can suppress RNA editing at sites in inverted-repeat Alu elements. In addition, we showed that UPF1, a core factor of the nonsense-mediated mRNA decay pathway, can promote RNA decay by binding to Alu elements in 3’UTRs.
Taken together, our analyses improve the understanding of RBP-TE interplay and illustrate the functional implications of these interactions in post-transcriptional regulation.