- Wainberg, Michael;
- Sinnott-Armstrong, Nasa;
- Mancuso, Nicholas;
- Barbeira, Alvaro N;
- Knowles, David A;
- Golan, David;
- Ermel, Raili;
- Ruusalepp, Arno;
- Quertermous, Thomas;
- Hao, Ke;
- Björkegren, Johan LM;
- Im, Hae Kyung;
- Pasaniuc, Bogdan;
- Rivas, Manuel A;
- Kundaje, Anshul
Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and gene expression datasets to identify gene-trait associations. In this Perspective, we explore properties of TWAS as a potential approach to prioritize causal genes at GWAS loci, by using simulations and case studies of literature-curated candidate causal genes for schizophrenia, low-density-lipoprotein cholesterol and Crohn's disease. We explore risk loci where TWAS accurately prioritizes the likely causal gene as well as loci where TWAS prioritizes multiple genes, some likely to be non-causal, owing to sharing of expression quantitative trait loci (eQTL). TWAS is especially prone to spurious prioritization with expression data from non-trait-related tissues or cell types, owing to substantial cross-cell-type variation in expression levels and eQTL strengths. Nonetheless, TWAS prioritizes candidate causal genes more accurately than simple baselines. We suggest best practices for causal-gene prioritization with TWAS and discuss future opportunities for improvement. Our results showcase the strengths and limitations of using eQTL datasets to determine causal genes at GWAS loci.