Genome-Wide Analysis of the Impact of Diet on Liver and Intestines in Mouse and SNPs in Human HNF4a Binding Sites
- Martinez Lomeli, Jose
- Advisor(s): Sladek, Frances M
Abstract
Next-generation sequencing technology has been extremely useful for genome-wide analysis in many different organisms. For example, transcriptomic analysis based on RNA-seq can identify all changes in gene expression in a given tissue in a single experiment. Other types of experiments, such as ChIP-seq, allow mapping of transcription factor binding sites across the genome. Much of the work in this dissertation focuses on the transcription factor hepatocyte nuclear factor 4a (HNF4a), which is a member of the nuclear receptor superfamily of ligand-dependent transcription factors and is abundantly expressed in the liver and intestines. HNF4a has been linked to several diseases including diabetes, liver and colon cancer, inflammatory bowel disease and others. The expression of HNF4a is driven by two promoters (P1 and P2), which result in the expression of 12 different isoforms. Additionally, HNF4a has as its endogenous ligand an essential fatty acid (linoleic acid, LA) that must be obtained from the diet, and HNF4a is known to play a role in gluconeogenesis during fasting. Therefore, we examine the effects of fasted vs fed conditions, as well as of high fat diets with different amounts of LA, on gene expression in the liver and intestines, respectively. Finally, HNF4a target genes have been extensively studied, and mutations in HNF4a binding sites in regulatory regions of select target genes associated with disease have been identified. However, to date, there has been no systematic approach to identify variants in HNF4a binding sites of target genes on a genome-wide scale.
In Chapter 1 of this dissertation, we provide an introduction to the biological and research problems addressed in the remainder of the dissertation and review the relevant literature. In Chapter 2, we analyze transcriptomic (RNA-seq) data from four different tissues across the intestinal tract of male mice fed three different high fat diets (HFDs) with different amounts of LA.
We found that different portions of the intestines have unique gene profiles, especially in terms of nuclear receptor signaling, xenobiotic and drug metabolism, and intestinal epithelial barrier function. We found that different types of HFDs impact gene expression in the intestinal tract in different ways, including genes linked to diseases such as inflammatory bowel disease and colon cancer. A network analysis revealed that mice fed the HFDs had altered expression of intestinal genes involved in crucial body functions such as the immune system and the intestinal microbiome, which could facilitate infection by bacteria and viruses such as SARS-CoV-2. To our knowledge, this is the most comprehensive dataset including multiple diets and multiple parts of the intestines and should provide an excellent resource for the scientific community for some time to come. In Chapter 3, we examined the impact of fasting and HNF4a isoforms on liver gene expression and alternative splicing by comparing and contrasting two RNA-seq datasets from male and female mice. We also compared wild type mice (which express predominantly P1-HNF4a isoforms) to exon swap mice (a7HMZ), which express only the P2 isoforms of HNF4a that are typically not expressed in the adult liver. We found that a 12-hour fast has a significant effect on alternative splicing and that the P2-HNF4a isoforms seem to play a role. Interestingly, there was a different pattern of alternative splicing in female livers. In Chapter 4, we analyze the impact of single nucleotide polymorphisms (SNPs) on HNF4a binding sites using publicly available datasets of HNF4a ChIP-seq (chromatin immunoprecipitation followed by sequencing) from human liver and the liver cancer cell line HepG2, which expresses HNF4a and exhibits a hepatocyte phenotype. For this purpose, we trained a support vector machine (SVM) to predict binding affinity scores based on DNA sequences known to bind HNF4a from protein binding microarray (PBM) data.
This model allowed us to identify 10 putative affinity-altering SNPs (aaSNPs) in the human liver HNF4a ChIP-seq and six in the HepG2 ChIP-seq that impact the binding of HNF4a to the chromatin. Additionally, some of these identified SNPs were found in genes related to cancer, suggesting a potential role in personalized medicine. These results present a proof of concept for detecting disrupted binding affinity between HNF4a and some of its target genes in a genome-wide analysis. Finally, in Chapter 5, we provide an example of how transcriptomic data could be used to generate new biological hypotheses and test them. Additionally, we provide future directions for Chapter 3 and Chapter 4.