eScholarship
Open Access Publications from the University of California

UC Berkeley Electronic Theses and Dissertations

Data-Driven Synthesis Science: Text Mining the Literature to Rationalize Inorganic Synthesis Pathways and Outcomes

(2024)

The rate of discovery of novel materials has accelerated in recent decades, and the slow step in realizing these materials has long been synthesis design. Improving our understanding of the sequence of phases formed during reaction, from precursors to target, and how that sequence is affected by the synthesis conditions would give researchers a means to predict the conditions required to reach new desired outcomes. Because of the vast dimensionality of the synthesis design space, such a prediction task is well suited to data-driven methods: combining machine learning with real experiments and computational modeling to both generate and test hypotheses that rationalize synthesis pathways. Driving hypothesis generation from data requires a substantial source of historical syntheses; in this thesis, we leverage the availability of synthesis procedures and characterized outcomes in the scientific literature. Text mining of the scientific literature, using natural language processing as well as manual methods, has been employed extensively in materials science over the past decade, with applications ranging from systematic literature review to generating datasets of material properties to studies in synthesis science. Yet within text mining for synthesis science, little attention has been paid to distinguishing syntheses by their outcome (e.g., final phase purity, particle morphology). The availability of such experimental outcome information, and its subsequent analysis, would give researchers useful data for developing models that hypothesize the effects of specific synthesis features on desired outcomes. The available synthesis literature offers an ample source of such recipe-outcome-paired data.

Progress in automatic text mining of the scientific literature is steady, particularly with the recent rise of generative large language models. However, pitfalls remain in reliably extracting materials synthesis recipes and linking them to outcomes, making manual curation of such datasets more attractive in some cases. This thesis highlights both (1) advances in automatic methods for acquiring inorganic synthesis procedures and outcomes from the literature and (2) data-driven insights into synthesis science gleaned from synthesis datasets extracted manually from the literature in combination with direct experiment and first-principles computation. For (1), we developed robust named entity recognition models for the extraction of synthesis procedure graphs as well as morphological outcomes for nanoparticle synthesis. To demonstrate these methods, we constructed a large-scale text-mined dataset of gold nanoparticle synthesis recipes, which are plentiful in the literature and thus represent a rich source for data-driven synthesis design. Importantly, we also extract their morphological outcomes; the inclusion of both input synthesis conditions and the corresponding outputs makes this dataset valuable for data-driven synthesis science efforts. For (2), we focus on the extraction of phase purity outcomes for oxide materials. Impurity phases, in the form of remnant precursors or intermediate phases, offer clues to the synthesis pathway traversed in a reaction. BiFeO3 is an important multiferroic material that is frequently synthesized in the literature and has a strong tendency to form alongside competing impurity phases. We therefore pivot our focus in chemistry space to BiFeO3 for this second task. For this system, we demonstrate how text-mined datasets of recipes and outcomes can inform real experiments and computational modeling that rationalize synthesis pathways.
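As a rough illustration of the kind of extraction pipeline that task (1) involves, the sketch below runs a token-classification (NER) model over a synthesis sentence and groups the predicted entities into a recipe-like record. It is a minimal sketch assuming a fine-tuned model exists; the model name and entity labels are hypothetical, not the thesis's actual model.

```python
# Minimal NER-based recipe-extraction sketch (illustrative only).
# "my-org/synthesis-ner" is a hypothetical fine-tuned model name.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="my-org/synthesis-ner",    # hypothetical model
    aggregation_strategy="simple",   # merge word pieces into entity spans
)

sentence = ("HAuCl4 was reduced with sodium citrate at 100 C "
            "to yield gold nanospheres of ~15 nm diameter.")

# Group predicted spans by entity type (e.g., PRECURSOR, CONDITION, MORPHOLOGY).
recipe = {}
for ent in ner(sentence):
    recipe.setdefault(ent["entity_group"], []).append(ent["word"])
print(recipe)
```

Records like this, paired across thousands of papers, form the kind of recipe-outcome dataset described above.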

In this thesis, we endeavor to improve the role of text mining, both automated and manual, in data-driven synthesis science. Our focus on extracting synthesis outcomes in addition to their corresponding procedures helps advance this subfield of materials science and paves the way for future efforts to accelerate both our understanding of synthesis mechanisms for existing materials and the discovery of new compounds.

Object-Centric Perception for Real-World Robotics

(2024)

Deep learning has resulted in incredible progress in many applications of artificial intelligence. However, these techniques often fall short when applied to robotics, due to their inability to reason about the ambiguity that often arises in the real world. Much of this ambiguity stems from the real world's long-tail visual diversity, in particular the huge variety of objects that robots must interact with. Such shortcomings are only exacerbated by the strict requirements for autonomous, high-throughput operation that deployed systems must meet, as well as the cost and difficulty of obtaining the large-scale training datasets that modern deep learning methods require.

In this thesis, we explore two primary avenues of addressing these challenges. First, we introduce models that can better express uncertainty in challenging or ambiguous situations, across a variety of 2D and 3D perception tasks. Real-world robots can incorporate these models to reason explicitly about ambiguity, in flexible ways depending on their specific tasks. Second, we extend the capabilities of neural renderers to develop a sim2real2sim method that can drastically reduce the amount of data needed to train such models. From only a handful of in-the-wild examples, our method learns to generate synthetic scenes, targeted to specific real objects and environments, that can be used to train downstream perception models for a variety of tasks.
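One common way to let a perception model "express uncertainty" is to train a small ensemble and treat prediction disagreement as an uncertainty signal; the sketch below shows that generic scheme for illustration, not necessarily the method used in this thesis.

```python
# Illustrative ensemble-disagreement uncertainty (not the thesis's method).
import numpy as np

def ensemble_uncertainty(models, x):
    """Mean prediction and per-class spread across an ensemble.

    `models` is any list of callables returning class-probability vectors.
    A large spread flags ambiguous inputs that a robot may want to treat
    cautiously (e.g., slow down, ask for help, or gather another view).
    """
    probs = np.stack([m(x) for m in models])   # (n_models, n_classes)
    return probs.mean(axis=0), probs.std(axis=0)
```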

Application of Engineered Cell Culture Models to Study Glioma Stem Cell Motility

(2024)

In the past two decades, glioma stem cells (GSCs) have been increasingly implicated in driving tumor initiation, resistance, recurrence, growth, and invasion in glioblastoma (GBM). Despite in vivo evidence identifying GSCs at invasive regions of GBM tissue, GSC invasion, and by extension motility, has been less well appreciated. Furthermore, GBM tumor invasion is informed by a multitude of biochemical factors, such as cytokines and growth factors, and biophysical factors, such as matrix and stroma geometry and mechanics. GBM tumors invade slowly through the hyaluronic acid (HA)-rich parenchyma and then rapidly along microvascular tracks of varying geometries. How biophysical and biochemical factors together contribute to driving invasion of GBM tumor cells such as GSCs within these regions is not well understood. Progress in understanding this combinatorial effect is limited by a lack of physiologically representative cell culture models that enable systematic investigation of the regulators of invasion.

In this dissertation, we first applied an HA-based hydrogel model of the brain parenchyma to study transforming growth factor beta (TGF-β)-induced invasion of GSCs. We demonstrate that, in response to TGF-β, GSCs differ in their ability to invade HA in a way that can be predicted from TGF-β receptor 2 expression and SMAD2 phosphorylation. Additionally, we found an association between TGF-β responsiveness and GSC subtype. Interestingly, TGF-β-stimulated GSC invasion exhibited a strong dependence on the presence of RGD peptides. Next, we deployed protein micropattern lines with vessel-like geometries to understand the emergent migration behavior of GSCs in a confined environment. We tested multiple GSC lines and found that vessel-like geometries enhanced both migration speed and persistence in GSCs. However, no individual GSC line demonstrated both enhanced migration speed and persistence, suggesting that vessel-like geometric confinement differentially influences the migration dynamics of GSCs.

Designing Machine Learning-Enhanced Tools and Physics-based Techniques for Force Field and Electrostatic Models

(2024)

In recent years, the landscape of molecular science has been profoundly transformed by the integration of data-driven methodologies alongside traditional deterministic and stochastic approaches. Historically, the study of molecular behavior and interactions relied heavily on deterministic algorithms, which follow a fixed sequence of computational steps to simulate molecular dynamics, and stochastic simulations, which incorporate randomness to explore various molecular states and pathways. These methods were complemented by physical models grounded in the established principles of chemistry and physics, forming the backbone of theoretical molecular science. However, these conventional approaches often faced limitations in scalability, computational cost, and generalizability for complex systems. The improvements in computational hardware, coupled with the accumulation of vast amounts of molecular data, have enabled the development of models that can surpass traditional methods in both accuracy and efficiency, leveraging both physics-based and machine learning (ML) approaches. This dissertation focuses on the development of new models utilizing more accessible data, provides guidelines for computational data generation, and explores the synergy between data acquisition strategies and data-driven models. These studies demonstrate that by carefully designing data acquisition strategies and integrating data-driven models with physics-based approaches, it is possible to enhance the predictive capabilities of computational methods in chemistry, particularly in force field development and electrostatic modeling. Through a series of studies, this work illustrates the potential of combining the strengths of both traditional and modern computational techniques to achieve more accurate and efficient predictions in molecular science.

The accurate prediction of electrostatic interactions is a critical aspect of understanding molecular behavior. The electrostatic potential (ESP) is a property of great research interest for understanding and predicting the electrostatic charge distributions that drive interactions between molecules. However, traditional approaches often rely on detailed quantum mechanical calculations, which can be computationally expensive. In Chapter 2, I introduce a coarse-grained electron model (C-GeM), whose parameters are fitted to high-quality, computationally generated density functional theory (DFT) data, and which offers a balance between accuracy and computational efficiency. Extensive validation against high-level quantum mechanical calculations demonstrates that C-GeM can reliably predict electrostatic potentials and interaction energies in proteins. The model's implementation in large-scale molecular simulations shows significant reductions in computational cost, making it a viable tool for studying complex biological systems.
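The quantity such a model approximates can be stated compactly: for effective charges q_i at positions r_i, the electrostatic potential at a probe point r is V(r) = Σ_i q_i / |r − r_i| (atomic units). The sketch below evaluates this toy Coulomb sum; it illustrates the target quantity only and is not the actual C-GeM implementation.

```python
# Electrostatic potential of point charges (atomic units); a toy Coulomb
# sum illustrating the quantity a model like C-GeM approximates.
import numpy as np

def esp(probe, positions, charges):
    """V(r) = sum_i q_i / |r - r_i| at a single probe point."""
    dist = np.linalg.norm(positions - probe, axis=1)
    return np.sum(charges / dist)

# Toy example: a +1/-1 pair probed midway between them -> 0 by symmetry.
pos = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
q = np.array([1.0, -1.0])
print(esp(np.array([1.0, 0.0, 0.0]), pos, q))
```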

The generation of reference data for deep learning models poses significant challenges for reactive systems, especially for combustion reactions, due to the extreme conditions that produce radical species and alternative spin states. In Chapter 3, intrinsic reaction coordinate (IRC) calculations are extended with ab initio molecular dynamics (MD) simulations and normal mode displacement calculations to comprehensively map the potential energy surface (PES) for 19 reaction channels involved in hydrogen combustion. This extensive dataset comprises approximately 290,000 potential energies and 1,260,000 nuclear force vectors, evaluated using a high-quality range-separated hybrid density functional, ωB97X-V. The dataset includes detailed information on transition state configurations as well as geometries along the reactive pathway that links reactants to products, providing a robust reference for training deep learning models aimed at studying hydrogen combustion reactions. This benchmark dataset not only serves as a valuable resource for understanding the intricate mechanistic pathways of hydrogen combustion but also provides a paradigm for building datasets that facilitate the development and validation of machine learning models for reactive chemistry.
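As a sketch of one of the sampling strategies named above, normal-mode displacement generates off-minimum geometries by perturbing an equilibrium structure along its vibrational eigenvectors; each displaced geometry is then evaluated with DFT for energies and forces. The sketch assumes the modes are precomputed, and the amplitudes and fake modes are toy values.

```python
# Normal-mode displacement sampling sketch: perturb an equilibrium geometry
# along (precomputed) vibrational modes to sample the PES near a stationary
# point. Amplitudes and the fake modes below are illustrative only.
import numpy as np

def displace(eq_geom, modes, rng, max_amp=0.3):
    """eq_geom: (n_atoms, 3); modes: (n_modes, n_atoms, 3)."""
    amps = rng.uniform(-max_amp, max_amp, size=len(modes))
    return eq_geom + np.tensordot(amps, modes, axes=1)

rng = np.random.default_rng(0)
eq = np.zeros((3, 3))                  # toy 3-atom geometry
modes = rng.normal(size=(2, 3, 3))     # 2 fake, unnormalized modes
sample = displace(eq, modes, rng)      # one off-minimum geometry for DFT
```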

Building on the extensive benchmark dataset for hydrogen combustion detailed in Chapter 3, an initial machine learning model is trained to predict energies and forces for the hydrogen combustion reactive system using NewtonNet, a physics-inspired equivariant message-passing neural network (MPNN). This reactive gas-phase chemistry network is particularly challenging due to the need for comprehensive potential energy surfaces that accurately represent a wide range of molecular configurations. Traditional approaches often rely on chemical intuition to select training data, which can result in incomplete PESs in an ML setting. To address this challenge, I employ an active learning strategy that systematically explores diverse energy landscapes using metadynamics simulations and continuously adds unseen data for retraining, helping to create an ML model that avoids unforeseen high-energy or unphysical configurations. By integrating metadynamics, the active learning process converges the PES more rapidly and also allows a hybrid of ML and ab initio molecular dynamics (MD) that makes rare calls to external ab initio sources when discrepancies are detected by the query-by-committee models. This hybrid ML-physics approach reduces computational costs by two orders of magnitude and eliminates the need for excessive ML retraining. The enhanced model accurately predicts free energy changes and transition state mechanisms for several hydrogen combustion reaction channels, demonstrating the efficacy of combining advanced data acquisition strategies with robust ML techniques to achieve high precision and efficiency in molecular simulations.
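The hybrid loop described here can be summarized in a few lines: a committee of models predicts forces at every MD step, and only when the committee disagrees with itself beyond a threshold is the expensive ab initio call made and the configuration queued for retraining. The sketch below is schematic; the names and threshold are illustrative, not taken from the thesis.

```python
# Schematic query-by-committee force evaluation for hybrid ML/ab initio MD.
import numpy as np

def committee_forces(geom, committee, ab_initio, train_set, tol=0.05):
    """Forces for one MD step, falling back to ab initio when the ML
    committee disagrees with itself by more than `tol` (illustrative units)."""
    preds = np.stack([m(geom) for m in committee])  # (n_models, n_atoms, 3)
    if preds.std(axis=0).max() > tol:     # committee is unsure here
        forces = ab_initio(geom)          # rare, expensive ab initio call
        train_set.append((geom, forces))  # queue the point for retraining
        return forces
    return preds.mean(axis=0)             # cheap ML forces
```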

To summarize, this dissertation underscores the potential of combining data-driven models with physics-based approaches to overcome the limitations of traditional computational methods in molecular science. Through the development of the coarse-grained electron model (C-GeM), the creation of a comprehensive benchmark dataset for hydrogen combustion, and the implementation of an active learning workflow for reactive force field development, this work provides insights into developing new computational tools and leveraging them to better understand molecular interactions and reactivity.

Empowering Large Language Models with Efficient and Automated Systems

(2024)

Large Language Models (LLMs) have shown remarkable capabilities in a variety of tasks, including chatting, programming, and searching. However, the high costs of LLMs are preventing these models from being deployed for the vast majority of applications. In this dissertation, we focus on building efficient and automated systems to reduce costs and democratize access to large language models.

We first introduce systems to optimize computational efficiency and reduce the engineering overhead of distributed LLM training. We develop TeraPipe, which pipelines LLM training along a new dimension (the tokens within a training sequence), and Alpa, the world's first compiler capable of automatically distributing arbitrary neural networks with all existing parallelization methods.
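TeraPipe's new dimension works because a causal language model's activations for earlier tokens do not depend on later ones, so a long training sequence can be cut into slices that flow through pipeline stages in a wavefront. The toy scheduler below only illustrates that wavefront; the real system overlaps these steps across devices.

```python
# Toy wavefront schedule for token-dimension pipelining (TeraPipe's idea):
# stage s can process token slice k at time t = s + k, since causal masking
# means earlier slices never depend on later ones.
def token_pipeline_schedule(n_stages, n_slices):
    """Yield (time_step, stage, slice) triples of the pipelined schedule."""
    for t in range(n_stages + n_slices - 1):
        for s in range(n_stages):
            k = t - s                     # slice index seen by stage s
            if 0 <= k < n_slices:
                yield t, s, k

for t, stage, sl in token_pipeline_schedule(n_stages=3, n_slices=4):
    print(f"t={t}: stage {stage} processes token slice {sl}")
```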

While training is typically a one-time cost, deploying and serving an LLM requires running inference continuously, and this cost is the top blocker for the real-world deployment of LLMs. We improve serving scalability with AlpaServe, which exploits model parallelism, and we increase memory utilization and LLM inference throughput with a new attention algorithm, PagedAttention, and an end-to-end serving system, vLLM.
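The core data structure behind PagedAttention is borrowed from virtual memory: the KV cache is stored in fixed-size physical blocks, and each sequence keeps a table of block indices, so memory is allocated on demand and freed at block granularity. Below is a minimal sketch of that bookkeeping, not vLLM's actual implementation.

```python
# Minimal PagedAttention-style KV-cache bookkeeping sketch (not vLLM's code).
BLOCK_SIZE = 16  # tokens per physical KV block

class PagedKVCache:
    def __init__(self, n_blocks=1024):
        self.free = list(range(n_blocks))   # pool of physical block ids
        self.table = {}                     # seq_id -> list of block ids
        self.length = {}                    # seq_id -> tokens cached so far

    def append_token(self, seq_id):
        """Reserve cache space for one new token, allocating blocks lazily."""
        n = self.length.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:             # current block full (or none yet)
            self.table.setdefault(seq_id, []).append(self.free.pop())
        self.length[seq_id] = n + 1

    def release(self, seq_id):
        """Return a finished sequence's blocks to the pool."""
        self.free.extend(self.table.pop(seq_id, []))
        self.length.pop(seq_id, None)
```

Because no sequence pre-reserves a maximum-length region, fragmentation drops and many more sequences fit in memory at once, which is where the throughput gain comes from.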

Overall, these systems provide comprehensive solutions that significantly improve both training and inference efficiency for large language models. Together, these systems lower the high costs associated with large language models, democratizing their deployment across various real-world applications.

Super-Resolution Microscopy and Single-Molecule Diffusivity Mapping: Applications in Cell Biology and Biophysics

(2024)

Fluorescence microscopy has allowed for decades of elegant and compelling biological discovery. However, the diffraction limit of light caps the spatial resolution of fluorescence microscopy at 250-300 nanometers. As the size of an average protein molecule is ~3 nm, diffraction-limited spatial resolution is often more than an order of magnitude larger than what is required to resolve nanoscale cellular structures and processes. With the advent of single-molecule localization microscopy (SMLM) and super-resolution microscopy (SRM), the spatial resolution of fluorescence microscopy has been improved more than 10-fold relative to conventional methods. Further, the fundamental principles of SMLM have engendered multiple other experimental methods that simultaneously probe a second informational domain in addition to precise spatial information. For example, much of the work in this writing utilizes a functional SRM method known as single-molecule diffusivity mapping (SMdM), which is overviewed in Chapter 1.3. We use SMdM to show that, in the mammalian cell, the assembly and disassembly of the vimentin cytoskeleton are highly sensitive to the protein's net charge state. Starting with the intriguing observation that the vimentin cytoskeleton fully disassembles under hypotonic stress yet reassembles within seconds upon osmotic pressure recovery, we pinpoint ionic strength as the underlying driving factor. By further modulating the pH and expressing vimentin constructs with differently charged linkers, we converge on a model in which the vimentin cytoskeleton is destabilized by Coulomb repulsion when its mass-accumulated negative charges are less screened or otherwise intensified. Additionally, we identify a key molecular player, DELE1, in relaying mitochondrial stress to the cytosol and triggering the integrated stress response. Then, using SMdM, we corroborate this finding and show that the intraorganellar diffusivity of both DELE1 and cytochrome c implies the presence of unique electrostatic interactions in the mitochondrial intermembrane space. Together, these studies represent some of the first applications of SMdM to study native cellular proteins rather than exogenous tracer proteins.
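At its core, SMdM infers a local diffusion coefficient from many short single-molecule displacements: in two dimensions, the mean squared displacement over a frame interval Δt gives D ≈ ⟨Δr²⟩ / (4Δt). Below is a minimal estimator sketch on simulated Brownian steps, not the analysis code used in this work.

```python
# Minimal 2D diffusivity estimate from single-molecule displacements,
# in the spirit of SMdM: D ~ <dr^2> / (4 * dt). Illustrative only.
import numpy as np

def estimate_D(displacements, dt):
    """displacements: (n, 2) per-interval (dx, dy) in um; dt in seconds."""
    msd = np.mean(np.sum(displacements**2, axis=1))
    return msd / (4.0 * dt)                # um^2/s

rng = np.random.default_rng(1)
true_D, dt = 10.0, 1e-3                    # um^2/s, s
steps = rng.normal(0.0, np.sqrt(2 * true_D * dt), size=(5000, 2))
print(estimate_D(steps, dt))               # ~10, recovering true_D
```

Binning such displacements spatially is what turns this pointwise estimate into the diffusivity maps used throughout these studies.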

Perceptual Alignment for Human-Centered Design Computing: Quantifying Similarity and Semantic Representations

(2024)

During early-stage design processes, designers must navigate significant uncertainty and make sense of abstract, multi-dimensional goals (e.g., function, aesthetics, ergonomics), eventually synthesizing them into design outcomes. Data-driven design is a paradigm that aims to leverage data and computational methods to support decision making, allowing designers to surpass cognitive limits (e.g., idea fixation). However, concepts fundamental to decision making during early-stage design (e.g., ‘What are similar design ideas?’ and ‘Will the design reflect dependability?’) are ill-defined, cognitively complex, and not well-represented by computation. Therefore, a key challenge is to align computational representations with how humans perceive and process information, enabling designers to accurately express their intent. To address this challenge, my dissertation research explores behavioral studies and computational techniques to understand and quantify representations (both cognitive and reflected within design artifacts) of these complex concepts throughout the design process. First, I demonstrate how function can be quantifiably compared across engineered systems and products, and how human perceptions of similarity align. Then, I show how intangible semantic prompts (e.g., dependable, versatile, comfortable) can be tangibly reflected in designs, by humans and through human-in-the-loop computation. The insights derived from this work contribute to human-centered computing for early-stage design, enabling designers to more easily and effectively design innovative products.

Free resolutions, linkage, and representation theory

(2024)

Across two papers, published in 1989 and 2018, Weyman unearthed a fascinating connection between commutative algebra and representation theory in his study of generic free resolutions of length three. This thesis is devoted to analyzing this connection further. In the first half, we show that certain Kazhdan-Lusztig varieties provide generic examples of ideals in the linkage class of a complete intersection. For those of embedding codimension three, we also compute the free resolutions of their coordinate rings. We later show that these specialize to resolutions of all grade three licci ideals.

In the second half, we develop the machinery of higher structure maps originating from Weyman's generic ring. Using the free resolutions constructed previously, we disprove Hochster's conjecture on finite generation of generic rings. The two perspectives converge in the final chapter of the thesis, in which we develop an ADE correspondence to completely classify grade three perfect ideals with small type and deviation.

Until the Music Stopped: The Second Concert in European Inter-State Relations, 1878-1908

(2024)

The causes of the First World War remain a central preoccupation for international relations scholars. Some find them in the actions of particular aggressors, others in the logic of zero-sum competition between bipolar alliance blocs. Still others describe how an ever more mechanical European state system became increasingly inflexible until it seized up and exploded. I turn this perennial query on its head, asking not why war erupted in 1914, but how Europe’s political class was able to avoid wars during the thirty-three years of pan-Great Power peace stretching from the Berlin Congress (1878) to the Italo-Turkish War (1911).

I argue that the Berlin Congress founded an international regime, which, like its predecessor founded at the Congress of Vienna (1814-15), was framed as a “Concert of Europe,” and was predicated not on a balance of power but on normative principles of international relations, chief among them being the inviolability of member state territorial integrity and sovereignty. Like its Vienna predecessor, this Second Concert recognized subordinate principles, namely minority protections, national self-determination, and human rights. By surveying reportage on the Concert during the regime-challenging crises these subordinate, and state-challenging, principles instigated, I show how the Concert-loyalty of the regime’s member states led to affirmations of the supremacy of the territorial state, thereby preserving both the Concert regime and general peace. This era of tranquility ended with the Bosnian Crisis (1908-09), which saw the first violation of Concert principles by one of its members since the regime’s founding, resulting in the Concert’s dissolution. Europe, plunged back into an international state of nature in which power alone ruled, experienced rapidly escalating violence that culminated in general war.

Scalable and Efficient Systems for Large Deep Learning Models

(2024)

Recent advancements in machine learning have primarily been driven by large-scale deep learning models, particularly large language models. The large scale and new capabilities of these models present challenges in designing infrastructure systems to support their entire lifecycle, from training and serving to evaluation. To meet the high computational and memory requirements of these models, while fully utilizing and accurately evaluating their capabilities, we need to redesign many system components, such as compilers, distributed computing platforms, programming systems, and evaluation methods.

In this dissertation, we introduce a suite of systems designed and built to support large models, covering training, serving, and evaluation phases. First, we discuss Alpa, a system for large-scale model-parallel training, which automatically generates distributed execution plans integrating both inter- and intra-operator parallelism. Moving on to serving, we introduce Ansor, a compiler that produces high-performance implementations of tensor programs for various hardware backends. We also explore SGLang, a system for deploying large language models that includes both a flexible front-end programming interface and an optimized back-end runtime for fast inference. Lastly, in the evaluation phase, we detail our efforts in model evaluation, which include Chatbot Arena, a crowdsourced live benchmark platform, and LLM-as-a-Judge, an automated evaluation pipeline. These tools collectively form a full-stack system for the continuous improvement of large models.
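As an illustration of the LLM-as-a-Judge pattern, the sketch below prompts a strong model to compare two answers; `call_llm` is a stand-in for whatever chat-completion API is used, and the prompt wording is illustrative rather than the paper's exact template.

```python
# Minimal LLM-as-a-Judge sketch: ask a strong model to pick the better
# answer. `call_llm` is a placeholder for any chat-completion API.
JUDGE_PROMPT = """You are an impartial judge. Given a user question and two
answers, decide which answer is better. Reply with exactly "A" or "B".

Question: {question}

Answer A: {answer_a}

Answer B: {answer_b}
"""

def judge(question, answer_a, answer_b, call_llm):
    verdict = call_llm(JUDGE_PROMPT.format(
        question=question, answer_a=answer_a, answer_b=answer_b))
    return verdict.strip()[:1]  # "A" or "B"
```

In practice, each pair is judged twice with the answer order swapped to control for position bias, a failure mode analyzed in the LLM-as-a-Judge work.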