Article
Peer Reviewed

Predictive analyses of regulatory sequences with EUGENe.

UC San Diego Previously Published Works (2023)

Deep learning has become a popular tool to study cis-regulatory function. Yet efforts to design software for deep-learning analyses in regulatory genomics that are findable, accessible, interoperable and reusable (FAIR) have fallen short of fully meeting these criteria. Here we present elucidating the utility of genomic elements with neural nets (EUGENe), a FAIR toolkit for the analysis of genomic sequences with deep learning. EUGENe consists of a set of modules and subpackages for executing the key functionality of a genomics deep learning workflow: (1) extracting, transforming and loading sequence data from many common file formats; (2) instantiating, initializing and training diverse model architectures; and (3) evaluating and interpreting model behavior. We designed EUGENe as a simple, flexible and extensible interface for streamlining and customizing end-to-end deep-learning sequence analyses, and illustrate these principles through application of the toolkit to three predictive modeling tasks. We hope that EUGENe represents a springboard towards a collaborative ecosystem for deep-learning applications in genomics research.
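
As a rough illustration of the first workflow step (transforming sequence data into a model-ready form), the sketch below one-hot encodes a DNA string. It is not EUGENe's API: the function name `one_hot_encode`, the A/C/G/T column order, and the choice to map unknown bases to all-zero rows are assumptions made for this example.

```python
# Illustrative sketch only; this is NOT EUGENe's API.
# The name one_hot_encode and the A/C/G/T column order are assumptions.
ALPHABET = "ACGT"


def one_hot_encode(seq: str) -> list[list[int]]:
    """Return a len(seq) x 4 matrix of 0/1 values.

    Bases outside ALPHABET (for example N) become all-zero rows.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    for row in one_hot_encode("ACGTN"):
        print(row)
```

A matrix of this shape is the usual input to convolutional or recurrent sequence models; a real toolkit would typically return a tensor rather than nested lists.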

Article
Peer Reviewed

GRIEVOUS: your command-line general for resolving cross-dataset genotype inconsistencies

UC San Diego Previously Published Works (2024)

Summary

Harmonizing variant indexing and allele assignments across datasets is crucial for data integrity in cross-dataset studies such as multi-cohort genome-wide association studies, meta-analyses, and the development, validation, and application of polygenic risk scores. Ensuring this indexing and allele consistency is a laborious, time-consuming, and error-prone process requiring a certain degree of computational proficiency. Here, we introduce GRIEVOUS, a command-line tool for cross-dataset variant homogenization. By means of an internal database and a custom indexing methodology, GRIEVOUS identifies, formats, and aligns all biallelic single nucleotide polymorphisms (SNPs) across all summary statistic and genotype files of interest. Upon completion of dataset harmonization, GRIEVOUS can also be used to extract the maximal set of biallelic SNPs common to all datasets.
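
To make the allele-alignment problem concrete, the following sketch aligns the effect size of a single biallelic SNP to a reference allele pair. It is a simplified illustration and not GRIEVOUS's internal method: the function name `harmonize`, its arguments, and the decision to drop strand-ambiguous (palindromic) SNPs are all assumptions made here.

```python
# Illustrative sketch only; this is NOT how GRIEVOUS is implemented.
# The function name, arguments, and drop rules are assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def harmonize(ref, alt, effect_allele, other_allele, beta):
    """Align a SNP effect size so that it refers to the reference ALT allele.

    Returns the aligned beta, or None when the alleles cannot be
    reconciled with the reference pair.
    """
    ref, alt = ref.upper(), alt.upper()
    ea, oa = effect_allele.upper(), other_allele.upper()

    # Only single-nucleotide alleles are handled in this sketch.
    if any(a not in COMPLEMENT for a in (ref, alt, ea, oa)) or ref == alt:
        return None

    # Palindromic pairs (A/T, C/G) are strand-ambiguous; assumed dropped here.
    if COMPLEMENT[ref] == alt:
        return None

    # Try the alleles as given, then on the opposite strand.
    candidates = [(ea, oa), (COMPLEMENT[ea], COMPLEMENT[oa])]
    for e, o in candidates:
        if (e, o) == (alt, ref):
            return beta       # already aligned
        if (e, o) == (ref, alt):
            return -beta      # alleles swapped, so flip the effect sign
    return None               # alleles do not match the reference


if __name__ == "__main__":
    print(harmonize("A", "G", "A", "G", 0.2))
```

Running the same check over every variant in every dataset, and keeping only variants that align in all of them, yields a common SNP set of the kind the summary describes.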

Availability and implementation

GRIEVOUS and all supporting documentation and tutorials can be found at https://github.com/jvtalwar/GRIEVOUS. It is freely and publicly available under the MIT license and can be installed via pip.

Article
Peer Reviewed

Autoimmune HLA Alleles and Neoepitope Presentation Predict Post-Allogenic Transplant Relapse

UC San Diego Previously Published Works (2023)

Introduction

Allogeneic hematopoietic stem cell transplantation (allo-HSCT) can cure patients with high-risk myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). However, many patients relapse or develop debilitating graft-versus-host disease. Transplant restores T-cell reactivity against tumor cells, implicating patient human leukocyte antigen (HLA)-dependent antigen presentation via the major histocompatibility complex as a determinant of response. We sought to identify characteristics of the HLA genotype that influence response in allo-HSCT patients.

Methods

We collected HLA genotypes and panel-based somatic mutation profiles for 55 patients with AML or MDS who had available data and were treated at the University of California San Diego Moores Cancer Center between May 2012 and January 2019. We evaluated characteristics of the HLA genotype relative to relapse-free time and overall survival (OS) post-allo-HSCT using univariable and multivariable regression.

Results

In multivariable regression, the presence of an autoimmune allele was significantly associated with relapse-free time (hazard ratio [HR], 0.25; p = 0.01) and OS (HR, 0.16; p < 0.005). The better potential of the donor HLA type to present peptides harboring driver mutations trended toward better relapse-free survival (HR, 0.45; p = 0.07) and significantly correlated with longer OS (HR, 0.33; p = 0.01), though only a minority of cases had an HLA mismatch.

Conclusion

In this single-institution retrospective study of patients receiving allo-HSCT for relapsed AML/MDS, characteristics of an individual's HLA genotype (presence of an autoimmune allele and potential of the donor HLA to better present peptides representing driver mutations) were significantly associated with better outcomes. These findings suggest that HLA type may guide the optimal application of allo-HSCT and merit evaluation in larger cohorts. ClinicalTrials.gov Identifier: NCT02478931.

Article

PRState: Incorporating Genetic Ancestry in Prostate Cancer Risk scores for African American Men

UC San Diego Previously Published Works (2022)

Prostate cancer (PrCa) is one of the most genetically driven solid cancers, with heritability estimates as high as 57%. African American men are at an increased risk of PrCa; however, current risk prediction models are based on European ancestry groups and may not be broadly applicable. In this study, we define an African ancestry group of 4,533 individuals to develop an African ancestry-specific PrCa polygenic risk score (PRState). We identified risk loci on chromosomes 3, 8, and 11 in the African ancestry group GWAS and constructed a polygenic risk score (PRS) from 10 African ancestry-specific PrCa risk SNPs, achieving an AUC of 0.61 [0.60-0.63], and 0.65 [0.64-0.67] when combined with age and family history. Performance dropped significantly when using ancestry-mismatched PRS models but remained comparable when using trans-ancestry models. Importantly, we validated the PRState score in the Million Veteran Program, demonstrating improved prediction of PrCa and metastatic PrCa in African American individuals. This study underscores the need for inclusion of individuals of African ancestry in gene variant discovery to optimize PRS.

Article
Peer Reviewed

PRState: Incorporating genetic ancestry in prostate cancer risk scores for men of African ancestry

UC San Diego Previously Published Works (2022)

Background

Prostate cancer (PrCa) is one of the most genetically driven solid cancers with heritability estimates as high as 57%. Men of African ancestry are at an increased risk of PrCa; however, current polygenic risk score (PRS) models are based on European ancestry groups and may not be broadly applicable. The objective of this study was to construct an African ancestry-specific PrCa PRS (PRState) and evaluate its performance.

Methods

An African ancestry group of 4,533 individuals in the ELLIPSE consortium was used for discovery of African ancestry-specific PrCa SNPs. PRState was constructed as a weighted sum of genotypes and effect sizes from a genome-wide association study (GWAS) of PrCa in the African ancestry group. Performance was evaluated using ROC-AUC analysis.
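
The weighted-sum construction and ROC-AUC evaluation described above can be sketched generically as follows. This is a minimal illustration and not the PRState code: the function names, the dosage coding (0, 1, or 2 copies of the effect allele), and the toy values are assumptions, and the AUC is computed by direct pairwise comparison rather than with any particular library.

```python
# Illustrative sketch only; NOT the PRState implementation.
# Function names, dosage coding, and the toy data are assumptions.


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, 2) and per-SNP effect sizes."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    weights = [0.10, 0.20, 0.30]            # hypothetical GWAS effect sizes
    genotypes = [[0, 1, 2], [2, 0, 0], [1, 1, 1], [0, 0, 0]]
    labels = [1, 0, 1, 0]                   # 1 = case, 0 = control
    scores = [polygenic_score(g, weights) for g in genotypes]
    print(roc_auc(scores, labels))
```

The pairwise form is quadratic in sample size, so a rank-based implementation would be preferable for cohorts of thousands of individuals.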

Results

We identified African ancestry-specific PrCa risk loci on chromosomes 3, 8, and 11 and constructed a polygenic risk score (PRS) from 10 African ancestry-specific PrCa risk SNPs, achieving an AUC of 0.61 [0.60-0.63], and 0.65 [0.64-0.67] when combined with age and family history. Performance dropped significantly when using ancestry-mismatched PRS models but remained comparable when using trans-ancestry models. Importantly, we validated the PRState score in the Million Veteran Program (MVP), demonstrating improved prediction of PrCa and metastatic PrCa in individuals of African ancestry.

Conclusions

The African ancestry-specific PRState improves PrCa prediction in African ancestry groups in the ELLIPSE consortium and MVP. This study underscores the need for inclusion of individuals of African ancestry in gene variant discovery to optimize PRSs and identifies African ancestry-specific variants for use in future studies.

Article
Peer Reviewed

Autoimmune alleles at the major histocompatibility locus modify melanoma susceptibility.

UC San Diego Previously Published Works (2023)

Autoimmunity and cancer represent two different aspects of immune dysfunction. Autoimmunity is characterized by breakdowns in immune self-tolerance, while impaired immune surveillance can allow for tumorigenesis. The class I major histocompatibility complex (MHC-I), which displays derivatives of the cellular peptidome for immune surveillance by CD8+ T cells, serves as a common genetic link between these conditions. As melanoma-specific CD8+ T cells have been shown to target melanocyte-specific peptide antigens more often than melanoma-specific antigens, we investigated whether vitiligo- and psoriasis-predisposing MHC-I alleles conferred a melanoma-protective effect. In individuals with cutaneous melanoma from both The Cancer Genome Atlas (n = 451) and an independent validation set (n = 586), MHC-I autoimmune-allele carrier status was significantly associated with a later age of melanoma diagnosis. Furthermore, MHC-I autoimmune-allele carriers were significantly associated with decreased risk of developing melanoma in the Million Veteran Program (OR = 0.962, p = 0.024). Existing melanoma polygenic risk scores (PRSs) did not predict autoimmune-allele carrier status, suggesting these alleles provide orthogonal risk-relevant information. Mechanisms of autoimmune protection were neither associated with improved melanoma-driver mutation association nor improved gene-level conserved antigen presentation relative to common alleles. However, autoimmune alleles showed higher affinity relative to common alleles for particular windows of melanocyte-conserved antigens and loss of heterozygosity of autoimmune alleles caused the greatest reduction in presentation for several conserved antigens across individuals with loss of HLA alleles. Overall, this study presents evidence that MHC-I autoimmune-risk alleles modulate melanoma risk unaccounted for by current PRSs.

Article
Peer Reviewed

Germline modifiers of the tumor immune microenvironment implicate drivers of cancer risk and immunotherapy response

UC San Diego Previously Published Works (2023)

With the continued promise of immunotherapy for treating cancer, understanding how host genetics contributes to the tumor immune microenvironment (TIME) is essential to tailoring cancer screening and treatment strategies. Here, we study 1084 eQTLs affecting the TIME found through analysis of The Cancer Genome Atlas and literature curation. These TIME eQTLs are enriched in areas of active transcription, and associate with gene expression in specific immune cell subsets, such as macrophages and dendritic cells. Polygenic score models built with TIME eQTLs reproducibly stratify cancer risk, survival and immune checkpoint blockade (ICB) response across independent cohorts. To assess whether an eQTL-informed approach could reveal potential cancer immunotherapy targets, we inhibit CTSS, a gene implicated by cancer risk and ICB response-associated polygenic models; CTSS inhibition results in slowed tumor growth and extended survival in vivo. These results validate the potential of integrating germline variation and TIME characteristics for uncovering potential targets for immunotherapy.
