Magee, Lucas James

Extracting Graph Structure from Data via Topological Methods with Applications to Neuroscience

2024

Advisor(s): Wang, Yusu

Abstract

Datasets are often noisy, high-dimensional, and complex, but they frequently contain intrinsic structures that can aid both in understanding the data and in enabling downstream applications. Graphs, and trees in particular, are one such class of structures. This dissertation develops efficient methodologies using geometric and topological ideas to extract graph-like structures from both low- and high-dimensional datasets, with applications in neuroscience.

The first direction focuses on extracting tree structures from 2D and 3D imaging data. Specifically, we aim to extract neuronal tree morphologies from mouse brain imaging datasets. We employ discrete Morse (DM) graph reconstruction to improve neuronal process segmentation and single neuron skeletonization. Additionally, we have published both 2D and 3D full brain skeletonization frameworks on GitHub.

In the second direction, we explore decomposing full brain neuronal process skeletonizations into the individual neurons that make up the graph. This involves decomposing a graph with node density into a sum of monotone trees, which model individual neurons. We demonstrate that this decomposition problem and several related problems are NP-complete, establish approximation bounds, and present approximation algorithms for solving them.
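
To make the notion of a monotone tree concrete, the following is a minimal sketch, not the dissertation's algorithm. It assumes the common definition that a tree with a density value on each node is monotone when some root exists from which the density never increases along any path away from that root. The function name `is_monotone_tree` and the adjacency-dictionary input format are hypothetical choices made for illustration, and the input is assumed to already be a tree.

```python
from collections import deque


def is_monotone_tree(adj, density):
    """Check whether a node-weighted tree is monotone (assumed definition).

    adj:     dict mapping each node to a list of its neighbours (assumed to be a tree)
    density: dict mapping each node to a non-negative number
    """
    # If any valid root exists, a node of maximum density is also a valid root,
    # so it is enough to test a single candidate.
    root = max(density, key=density.get)
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in seen:
                continue
            # Moving away from the root, density must not increase.
            if density[v] > density[u]:
                return False
            seen.add(v)
            queue.append(v)
    # Every node must be reachable, otherwise the input is not connected.
    return len(seen) == len(density)


# Path graph a - b - c used in both examples below.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
single_peak = {"a": 1.0, "b": 3.0, "c": 2.0}   # one peak at b
two_peaks = {"a": 3.0, "b": 1.0, "c": 2.0}     # valley at b separates two peaks
```

Under this assumed definition, `is_monotone_tree(path, single_peak)` holds while `is_monotone_tree(path, two_peaks)` does not, which illustrates why a density with several peaks needs to be split into more than one monotone tree.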

In the third direction, we extend our approach to handle high-dimensional, noisy point cloud datasets (PCDs). This requires us to view the DM algorithm from a filtration perspective instead of a density perspective. We propose a generalized algorithm that guarantees lex-optimal cycles in output graphs and combine this with the sparse weighted Rips filtration for efficient and effective graph extraction from PCDs.

In the final direction, we combine the generalized algorithm with a filtration defined with respect to the Jaccard index to develop a DM graph reconstruction algorithm for scRNA-seq datasets. The output graphs are then used to analyze gene expression gradients between cell types, develop cell type taxonomies, and quantify changes in gene expression over the course of Alzheimer’s disease progression.
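
As a rough illustration of how a Jaccard-based filtration might order edges, here is a minimal sketch. It is not the dissertation's construction: it assumes each cell is represented by the set of genes detected in it, and that edges between cells enter the filtration in order of increasing Jaccard distance (one minus the Jaccard index). The names `jaccard_distance` and `edge_filtration`, and the toy cell and gene labels, are hypothetical.

```python
def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b| for two collections of gene names."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets are treated as identical (a convention chosen here).
        return 0.0
    return 1.0 - len(a & b) / len(union)


def edge_filtration(cells):
    """List all cell pairs as (distance, u, v), sorted by increasing distance.

    cells: dict mapping a cell name to the set of genes detected in that cell
           (an assumed representation, not taken from the source).
    """
    names = sorted(cells)
    edges = [
        (jaccard_distance(cells[u], cells[v]), u, v)
        for i, u in enumerate(names)
        for v in names[i + 1:]
    ]
    edges.sort()
    return edges


# Toy data: c1 and c3 express the same genes, c2 shares one gene with each.
toy_cells = {
    "c1": {"g1", "g2"},
    "c2": {"g2", "g3"},
    "c3": {"g1", "g2"},
}
toy_edges = edge_filtration(toy_cells)
```

In this toy example the pair of cells with identical gene sets enters the filtration first, at distance zero, followed by the two pairs that share one of three genes.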


UC San Diego