eScholarship
Open Access Publications from the University of California

Open Access Policy Deposits

This series is automatically populated with publications deposited by UC San Diego Department of Computer Science & Engineering researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.

Macrophages on the run: Exercise balances macrophage polarization for improved health

(2024)

Objective

Exercise plays a crucial role in maintaining and improving human health. However, the precise molecular mechanisms that govern the body's response to exercise, as compared with periods of inactivity, remain elusive. Current evidence suggests that exercise exerts a seemingly dual influence on macrophage polarization states, inducing both pro-immune-response M1 activation and cell-repair-focused M2 activation. To reconcile this apparent paradox, we performed a comprehensive meta-analysis of 75 diverse published exercise and immobilization datasets (7000+ samples), encompassing various exercise modalities, sampling techniques, and species.

Methods

Seventy-five exercise and immobilization expression datasets were identified and processed for analysis. The data were analyzed using Boolean relationships, an approach that uses binary gene expression relationships to increase the signal-to-noise ratio, allowing comparison across such a diverse set of datasets. We used a Boolean relationship-aided macrophage gene model [1] to model the macrophage polarization state in pre- and post-exercise samples for both immediate exercise and long-term training.
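As a rough illustration of what a Boolean relationship between two genes can look like, the sketch below binarizes expression values and reports an implication whenever one of the four high/low quadrants is nearly empty. The fixed threshold, the `max_fraction` sparsity cutoff, the function names, and the toy values are all assumptions made for this example; the abstract does not describe the thresholding or statistics used by the cited model [1].

```python
from typing import Dict, List, Tuple


def binarize(values: List[float], threshold: float) -> List[bool]:
    """Mark each sample as high (True) or low (False) for one gene."""
    return [v > threshold for v in values]


def quadrant_counts(a_high: List[bool], b_high: List[bool]) -> Dict[Tuple[bool, bool], int]:
    """Count samples in each (gene A state, gene B state) quadrant."""
    counts = {(False, False): 0, (False, True): 0, (True, False): 0, (True, True): 0}
    for a, b in zip(a_high, b_high):
        counts[(a, b)] += 1
    return counts


def implications(a_high: List[bool], b_high: List[bool], max_fraction: float = 0.05) -> List[str]:
    """List the implications supported by a sparse quadrant.

    A quadrant counts as sparse when it holds at most max_fraction of all
    samples (a hypothetical cutoff chosen for illustration).
    """
    counts = quadrant_counts(a_high, b_high)
    total = sum(counts.values())
    # The sparse quadrant on the left rules out that combination,
    # which yields the implication on the right.
    names = {
        (True, False): "A high => B high",
        (True, True): "A high => B low",
        (False, False): "A low => B high",
        (False, True): "A low => B low",
    }
    return [names[q] for q in names if counts[q] <= max_fraction * total]


# Toy expression values for two hypothetical genes across six samples.
gene_a = [1.0, 2.0, 8.0, 9.0, 1.0, 9.0]
gene_b = [1.0, 9.0, 8.0, 9.0, 2.0, 9.0]
found = implications(binarize(gene_a, 5.0), binarize(gene_b, 5.0))
print(found)
```

Because each gene is reduced to a high or low call before comparison, a relationship of this kind does not depend on the absolute expression scale of any one dataset, which is one plausible reason such an approach suits pooling heterogeneous studies.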

Results

Our modeling uncovered a key temporal dynamic: exercise triggers an immediate M1 surge, while long-term training transitions to sustained M2 activation. These patterns were consistent across species (human vs mouse), sampling methods (blood vs muscle biopsy), and exercise types (resistance vs endurance), and were routinely statistically significant. Immobilization had the opposite effect of exercise, triggering an immediate M2 activation. Individual characteristics such as gender, exercise intensity, and age affected the degree of polarization without changing the overall patterns. To model macrophages within the specific context of muscle tissue, we identified a focused gene set signature of muscle-resident macrophage polarization, allowing precise measurement of macrophage activity in response to exercise within the muscle.

Conclusions

These consistent patterns across all 75 examined studies suggest that the long-term health benefits of exercise stem from its ability to orchestrate a balanced and temporally regulated interplay between pro-immune response (M1) and reparative (M2) macrophage activity. They likewise suggest that an imbalance between pro-immune and cell repair responses could facilitate disease development. Our findings shed light on the intricate molecular choreography behind exercise-induced health benefits, with particular insight into its effect on macrophages within the muscle.

Artificial intelligence in food and nutrition evidence: The challenges and opportunities.

(2024)

Science-informed decisions are best guided by the objective synthesis of the totality of evidence around a particular question and by assessing its trustworthiness through systematic processes. However, major barriers and challenges limit science-informed food and nutrition policy, practice, and guidance. First, evidence is insufficient, primarily because of the cost of generating high-quality data and the complexity of the diet-disease relationship. Furthermore, the sheer number of systematic reviews needed across the entire agriculture and food value chain, and the cost and time required to conduct them, can delay the translation of science to policy. Artificial intelligence offers the opportunity to (i) better understand the complex etiology of diet-related chronic diseases, (ii) bring more precision to our understanding of the variation among individuals in the diet-chronic disease relationship, (iii) provide new types of computed data related to the efficacy and effectiveness of nutrition/food interventions in health promotion, and (iv) automate the generation of systematic reviews that support timely decisions. These advances include the acquisition and synthesis of heterogeneous and multimodal datasets. This perspective summarizes a meeting convened at the National Academies of Sciences, Engineering, and Medicine. The purpose of the meeting was to examine the current state and future potential of artificial intelligence in generating new types of computed data as well as automating the generation of systematic reviews to support evidence-based food and nutrition policy, practice, and guidance.

Artificial intelligence-generated feedback on social signals in patient-provider communication: technical performance, feedback usability, and impact.

(2024)

OBJECTIVES: Implicit bias perpetuates health care inequities and manifests in patient-provider interactions, particularly in nonverbal social cues like dominance. We investigated the use of artificial intelligence (AI) for automated communication assessment and feedback during primary care visits to raise clinician awareness of bias in patient interactions.

MATERIALS AND METHODS: We (1) assessed the technical performance of our AI models by building a machine-learning pipeline that automatically detects social signals in patient-provider interactions from 145 primary care visits; (2) engaged 24 clinicians to design usable AI-generated communication feedback for their workflow; and (3) evaluated the impact of our AI-based approach in a prospective cohort of 108 primary care visits.

RESULTS: Findings demonstrate the feasibility of AI models to identify social signals, such as dominance, warmth, engagement, and interactivity, in nonverbal patient-provider communication. Although engaged clinicians preferred feedback delivered in personalized dashboards, they found nonverbal cues difficult to interpret, motivating social signals as an alternative feedback mechanism. Impact evaluation demonstrated fairness in all AI models, with better generalizability of provider dominance, provider engagement, and patient warmth. Stronger clinician implicit race bias was associated with less provider dominance and warmth. Although clinicians expressed overall interest in our AI approach, they recommended improvements to enhance acceptability, feasibility, and implementation in telehealth and medical education contexts.

DISCUSSION AND CONCLUSION: Findings demonstrate promise for AI-driven communication assessment and feedback systems focused on social signals. Future work should improve the performance of this approach, personalize models, contextualize feedback, and investigate system implementation in educational workflows. This work exemplifies a systematic, multistage approach for evaluating AI tools designed to raise clinician awareness of implicit bias and promote patient-centered, equitable health care interactions.

HDBind: encoding of molecular structure with hyperdimensional binary representations.

(2024)

Traditional methods for identifying hit molecules from a large collection of potential drug-like candidates rely on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between the drug and its protein target. These approaches have a significant limitation in that they require exceptional computing capabilities for even relatively small collections of molecules. Increasingly large and complex state-of-the-art deep learning approaches have gained popularity with the promise to improve the productivity of drug design, which is notorious for its numerous failures. However, as deep learning models increase in size and complexity, their acceleration at the hardware level becomes more challenging. Hyperdimensional Computing (HDC) has recently gained attention in the computer hardware community due to its algorithmic simplicity relative to deep learning approaches. The HDC learning paradigm, which represents data with high-dimensional binary vectors, allows the use of low-precision binary vector arithmetic to create models of the data that can be learned without the gradient-based optimization required in many conventional machine learning and deep learning methods. This algorithmic simplicity allows for acceleration in hardware that has been previously demonstrated in a range of application areas (computer vision, bioinformatics, mass spectrometry, remote sensing, edge devices, etc.). To the best of our knowledge, our work is the first to consider HDC for the task of fast and efficient screening of modern drug-like compound libraries. We also propose the first HDC graph-based encoding methods for molecular data, demonstrating consistent and substantial improvement over previous work. We compare our approaches to alternative approaches on the well-studied MoleculeNet dataset and the recently proposed LIT-PCBA dataset derived from high-quality PubChem assays. We demonstrate our methods on multiple target hardware platforms, including Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), showing at least an order of magnitude improvement in energy efficiency versus even our smallest neural network baseline model with a single hidden layer. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools. We make our code publicly available at https://github.com/LLNL/hdbind .
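To make the idea of binary hypervector arithmetic concrete, here is a minimal sketch of a generic HDC encoder. It is not HDBind's graph-based encoding: it treats a SMILES string as a character sequence, binds each character to its position with XOR, and bundles the results with a majority vote. The dimension, the function names, and the three example molecules are assumptions chosen for illustration.

```python
import random
from typing import Dict, List

DIM = 2048  # hypervector dimension (an illustrative choice, not from the source)


def random_hv(rng: random.Random) -> List[int]:
    """Draw a random binary hypervector."""
    return [rng.randint(0, 1) for _ in range(DIM)]


def bind(a: List[int], b: List[int]) -> List[int]:
    """Bind two hypervectors with element-wise XOR."""
    return [x ^ y for x, y in zip(a, b)]


def bundle(hvs: List[List[int]]) -> List[int]:
    """Bundle hypervectors with a per-dimension majority vote (ties go to 1)."""
    n = len(hvs)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*hvs)]


def similarity(a: List[int], b: List[int]) -> float:
    """Fraction of matching bits (1 minus the normalized Hamming distance)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def encode(smiles: str, symbols: Dict[str, List[int]], positions: List[List[int]]) -> List[int]:
    """Bind each character to its position, then bundle into one hypervector."""
    return bundle([bind(symbols[ch], positions[i]) for i, ch in enumerate(smiles)])


rng = random.Random(0)  # fixed seed so the sketch is reproducible
molecules = ["CCO", "CCN", "c1ccccc1"]  # ethanol, ethylamine, benzene
symbols = {ch: random_hv(rng) for ch in sorted(set("".join(molecules)))}
positions = [random_hv(rng) for _ in range(max(len(m) for m in molecules))]
encoded = {m: encode(m, symbols, positions) for m in molecules}

print(similarity(encoded["CCO"], encoded["CCN"]))
print(similarity(encoded["CCO"], encoded["c1ccccc1"]))
```

Every step uses only XOR, counting, and comparison on bits, with no gradients involved. In this toy setup, the two strings that share a prefix should score noticeably higher than the unrelated pair, whose similarity should sit near the 0.5 expected for independent random hypervectors.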

Y and mitochondrial chromosomes in the heterogeneous stock rat population

(2024)

Genome-wide association studies typically evaluate the autosomes and sometimes the X Chromosome, but seldom consider the Y or mitochondrial (MT) Chromosomes. We genotyped the Y and MT Chromosomes in heterogeneous stock (HS) rats (Rattus norvegicus), an outbred population created from 8 inbred strains. We identified 8 distinct Y and 4 distinct MT Chromosomes among the 8 founders. However, only 2 types of each nonrecombinant chromosome were observed in our modern HS rat population (generations 81-97). Despite the relatively large sample size, there were virtually no significant associations for behavioral, physiological, metabolome, or microbiome traits after correcting for multiple comparisons. However, both Y and MT Chromosomes were strongly associated with the expression of a few genes located on those chromosomes, which provided a positive control. Our results suggest that within modern HS rats there are no Y or MT Chromosome differences that strongly influence behavioral or physiological traits. These results do not address other ancestral Y and MT Chromosomes that do not appear in modern HS rats, nor do they address effects that may exist in other rat populations or in other species.

Clearing the plate: a strategic approach to mitigate well-to-well contamination in large-scale microbiome studies.

(2024)

Large-scale studies are essential to answer questions about complex microbial communities that can be extremely dynamic across hosts, environments, and time points. However, managing acquisition, processing, and analysis of large numbers of samples poses many challenges, with cross-contamination being the biggest obstacle. Contamination complicates analysis and results in sample loss, leading to higher costs and constraints on mixed sample type study designs. While many researchers opt for 96-well plates for their workflows, these plates present a significant issue: the shared seal and weak separation between wells lead to well-to-well contamination. To address this concern, we propose an innovative high-throughput approach, termed the Matrix method, which employs barcoded Matrix Tubes for sample acquisition. This method is complemented by a paired nucleic acid and metabolite extraction, utilizing 95% (vol/vol) ethanol both to stabilize microbial communities and as a solvent for extracting metabolites. Comparative analysis between conventional 96-well plate extractions and the Matrix method, measuring 16S rRNA gene levels via quantitative polymerase chain reaction, demonstrates a notable decrease in well-to-well contamination with the Matrix method. Metagenomics, 16S rRNA gene amplicon sequencing (16S), and untargeted metabolomics analysis via liquid chromatography-tandem mass spectrometry (LC-MS/MS) confirmed that the Matrix method recovers reproducible microbial and metabolite compositions that can distinguish between subjects. This advancement is critical for large-scale study design as it minimizes well-to-well contamination and technical variation, shortens processing times, and integrates with automated infrastructure for enhancing sample randomization and metadata generation.

IMPORTANCE: Understanding dynamic microbial communities typically requires large-scale studies. However, handling large numbers of samples introduces many challenges, with cross-contamination being a major issue. It not only complicates analysis but also leads to sample loss and increased costs and restricts diverse study designs. The prevalent use of 96-well plates for nucleic acid and metabolite extractions exacerbates this problem because their wells have little separation and are connected by a single plate seal. To address this, we propose a new strategy using barcoded Matrix Tubes, showing a significant reduction in cross-contamination compared to conventional plate-based approaches. Additionally, this method facilitates the extraction of both nucleic acids and metabolites from a single tubed sample, eliminating the need to collect separate aliquots for each extraction. This innovation improves large-scale study design by shortening processing times, simplifying analysis, facilitating metadata curation, and producing more reliable results.

Differential gut microbiota composition in β-Thalassemia patients and its correlation with iron overload.

(2024)

Recent research highlights the significant impact of the gut microbiota on health and disease. Thalassemia, a hereditary blood disorder, requires regular blood transfusions, leading to an accumulation of iron in the body. Such changes could potentially alter the intestinal microbiota, thereby increasing the susceptibility of thalassemic patients to infection. In this study, we analyzed the fecal microbiota of 70 non-transfusion-dependent (NTDT) β-thalassemia/HbE patients and 30 healthy controls. Our findings indicate that iron chelation intervention had no detectable effect on the microbiome profile of thalassemic patients. However, the cross-sectional analysis revealed that bacterial communities in patients were significantly less diverse than those of healthy subjects and had a distinct community structure. Using reference frames, we were also able to demonstrate that bacterial taxa that are known to produce short-chain fatty acids, from the genera Alistipes, Coprococcus, and Oscillospira, and those from the family Ruminococcaceae, were less prevalent in the patients. In contrast, bacterial taxa associated with an unhealthy gut, including the genus Clostridium and those from the families Fusobacteriaceae, Enterobacteriaceae, and Peptostreptococcaceae, were more prevalent in patients and were found to be correlated with higher levels of ferritin. Collectively, these changes in the microbiota could be regarded as markers of raised ferritin levels, and therefore, awareness should be exercised as they could interfere, albeit indirectly, with the treatment of the co-morbidities of thalassemia.

CoRAL accurately resolves extrachromosomal DNA genome structures with long-read sequencing.

(2024)

Extrachromosomal DNA (ecDNA) is a central mechanism for focal oncogene amplification in cancer, occurring in ∼15% of early-stage cancers and ∼30% of late-stage cancers. ecDNAs drive tumor formation, evolution, and drug resistance by dynamically modulating oncogene copy number and rewiring gene-regulatory networks. Elucidating the genomic architecture of ecDNA amplifications is critical for understanding tumor pathology and developing more effective therapies. Paired-end short-read (Illumina) sequencing and mapping have been utilized to represent ecDNA amplifications using a breakpoint graph, in which the inferred architecture of ecDNA is encoded as a cycle in the graph. Traversals of breakpoint graphs have been used to successfully predict ecDNA presence in cancer samples. However, short-read technologies are intrinsically limited in the identification of breakpoints, the phasing of complex rearrangements and internal duplications, and the deconvolution of cell-to-cell heterogeneity of ecDNA structures. Long-read technologies, such as those from Oxford Nanopore Technologies, have the potential to improve inference, as the longer reads are better at mapping structural variants and are more likely to span rearranged or duplicated regions. Here, we propose Complete Reconstruction of Amplifications with Long reads (CoRAL) for reconstructing ecDNA architectures using long-read data. CoRAL reconstructs likely cyclic architectures using quadratic programming that simultaneously optimizes parsimony of reconstruction, explained copy number, and consistency of long-read mapping. CoRAL substantially improves reconstructions in extensive simulations and 10 data sets from previously characterized cell lines compared with previous short- and long-read-based tools. As long-read usage becomes widespread, we anticipate that CoRAL will be a valuable tool for profiling the landscape and evolution of focal amplifications in tumors.
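The idea that an ecDNA architecture is encoded as a cycle in a breakpoint graph can be sketched with a toy check. The code below is not CoRAL's quadratic-programming reconstruction; it only tests whether a proposed circular walk over oriented segments is supported by breakpoint edges at every junction, and sums the segment lengths. The segment names, lengths, edges, and function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Node = Tuple[str, str]  # (segment name, "L" or "R" end of the segment)
Step = Tuple[str, str]  # (segment name, "+" or "-" traversal orientation)


def entry_node(step: Step) -> Node:
    """End of the segment where a traversal in this orientation starts."""
    seg, strand = step
    return (seg, "L" if strand == "+" else "R")


def exit_node(step: Step) -> Node:
    """End of the segment where a traversal in this orientation finishes."""
    seg, strand = step
    return (seg, "R" if strand == "+" else "L")


def is_valid_cycle(walk: List[Step], edges: Set[FrozenSet[Node]]) -> bool:
    """True if every junction, including last-to-first, is a breakpoint edge."""
    if not walk:
        return False
    for i, step in enumerate(walk):
        nxt = walk[(i + 1) % len(walk)]  # wrap around to close the circle
        if frozenset((exit_node(step), entry_node(nxt))) not in edges:
            return False
    return True


def cycle_length(walk: List[Step], seg_lengths: Dict[str, int]) -> int:
    """Total length of the circular molecule implied by the walk."""
    return sum(seg_lengths[seg] for seg, _ in walk)


# Hypothetical toy graph: three amplified segments and three breakpoint edges.
seg_lengths = {"A": 1000, "B": 500, "C": 2000}
edges = {
    frozenset((("A", "R"), ("B", "L"))),  # A continues into B
    frozenset((("B", "R"), ("C", "R"))),  # B joins the right end of C (inversion)
    frozenset((("C", "L"), ("A", "L"))),  # left end of C closes back onto A
}

candidate = [("A", "+"), ("B", "+"), ("C", "-")]
print(is_valid_cycle(candidate, edges), cycle_length(candidate, seg_lengths))
```

A real reconstruction must also choose among many possible cycles and account for copy number and read support, which is where an optimization formulation such as the one the abstract describes comes in.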