More than a decade has passed since the human genome was sequenced, yet our understanding of its functional elements remains incomplete, especially in the non-coding part of the genome. This gap makes interpreting the function of genetic variants a daunting challenge. Here I explored multiple approaches to deciphering the function of genetic variants by leveraging knowledge of transcriptional regulation and three-dimensional genome organization.
First, we developed SNP-SELEX, a high-throughput method to assess the effect of SNPs on transcription factor (TF) binding. I demonstrated the superior performance of SNP-SELEX over previous delta PWM models and applied the SNP-SELEX results to identify putative causal variants for type 2 diabetes. Furthermore, I employed the deltaSVM algorithm to develop models that predict the effect of any non-coding variant on TF binding. These models not only outperform delta PWM models in vitro and in vivo but can also help identify novel master regulators for complex traits and diseases.
Next, I co-led a study investigating the effect of genetic variants on three-dimensional (3D) chromatin conformation. I identified thousands of regions across the genome where 3D chromatin conformation varies between individuals and found that these variations often accompany changes in other genome functions. Moreover, I found that DNA sequence variation can influence 3D chromatin conformation, and I mapped hundreds of quantitative trait loci (QTLs) associated with 3D chromatin features, some of which confer disease risk.
Finally, I analyzed Hi-C data from human embryonic stem cells differentiated into beta cell progenitors to characterize changes in chromatin organization during differentiation. I identified chromatin loops that change dynamically across differentiation stages and found that these loops are associated with transcriptional regulation. Further, I showed that chromatin loops form interaction hubs that are related to the establishment of stage-specific transcriptional programs.