Genome-wide association studies (GWAS) have identified thousands of regions in the genome containing risk variants for complex traits. Because genetic variants are correlated with one another, computational methods are needed to distinguish causal from non-causal variants in these implicated regions. This dissertation presents three statistical methods that aim to improve our detection of causal variants at risk regions and ultimately better our understanding of the genetic basis of complex disease.
The first method aims to fine-map genetic regions impacting multiple correlated traits at once, employing the Multivariate Normal (MVN) distribution to jointly model association statistics at a risk region.
The second method performs hierarchical fine-mapping on risk regions that show evidence for a SNP impacting gene expression through an epigenetic feature, such as a histone modification. It uses both the MVN and the Matrix-variate Normal distributions to jointly model effects from SNP to epigenetic mark to gene expression.
The third method builds on existing summary statistics imputation methods by integrating functional annotation data to improve prediction of associations at untyped SNPs.