eScholarship
Open Access Publications from the University of California

UC Berkeley

UC Berkeley Electronic Theses and Dissertations

Statistical and Machine Learning Methods for Understanding Spatial Biology and Cellular Communication

No data is associated with this publication.
Abstract

Recent advances in spatial transcriptomics (ST) and single-cell RNA sequencing (scRNA-seq) technologies have transformed our understanding of cellular heterogeneity and spatial organization within tissues. This dissertation presents computational methods and applications for analyzing these data modalities to investigate the tumor microenvironment, cell-cell interactions, and immune cell migration across tissues. Our first work uses XYZeq data to analyze the transcriptome at single-cell resolution while preserving spatial context. This includes applying non-negative matrix factorization to identify spatially variable gene modules and performing trajectory inference to examine genes that are differentially expressed with respect to cell proximity in the tumor microenvironment. Next, we present ggPair (gene-gene Pair), a new computational method for predicting ligand-receptor pairs from ST data. Our approach diverges from traditional methods by employing a convolutional autoencoder for unsupervised feature extraction, followed by a Siamese neural network that predicts gene-gene interactions, thereby enabling the discovery of previously unrecognized interacting genes. Finally, focusing on T cell differentiation and migration, we describe our analysis of the transcriptional signatures that distinguish immune cell populations in an autoimmune disease model, using scRNA-seq data and Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) knock-out data. This analysis distinguishes the transcriptional activity of T regulatory cells (Tregs) from that of T effector cells, providing insights into immune cell dynamics and migration in disease states. In summary, we provide a set of computational strategies for dissecting complex biological data, extending our understanding of cellular heterogeneity, spatial organization, and cellular communication.

Main Content

This item is under embargo until September 27, 2026.