Scholarly Works (7 results)

Article
Peer Reviewed

Experimental Inferential Structure Determination of Ensembles for Intrinsically Disordered Proteins

UC Berkeley Previously Published Works (2016)

We develop a Bayesian approach to determine the most probable structural ensemble model from candidate structures for intrinsically disordered proteins (IDPs) that takes full advantage of NMR chemical shifts and J-coupling data, their known errors and variances, and the quality of the theoretical back-calculation from structure to experimental observables. Our approach differs from previous formulations in the optimization of experimental and back-calculation nuisance parameters that are treated as random variables with known distributions, as opposed to structural or ensemble weight optimization or use of a reference ensemble. The resulting experimental inferential structure determination (EISD) method is size extensive with O(N) scaling, with N = number of structures, that allows for the rapid ranking of large ensemble data comprising tens of thousands of conformations. We apply the EISD approach on singular folded proteins and a corresponding set of ∼25 000 misfolded states to illustrate the problems that can arise using Boltzmann weighted priors. We then apply the EISD method to rank IDP ensembles most consistent with the NMR data and show that the primary error for ranking or creating good IDP ensembles resides in the poor back-calculation from structure to simulated experimental observable. We show that a reduction by a factor of 3 in the uncertainty of the back-calculation error can improve the discrimination among qualitatively different IDP ensembles for the amyloid-beta peptide.
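As a rough illustration of why the back-calculation uncertainty matters for ranking, the sketch below scores two synthetic ensembles with a simple Gaussian likelihood in which experimental and back-calculation variances add. This is not the EISD implementation: the function name, the made-up observables, and the error values are all assumptions chosen for illustration. The ensemble average is a single pass over the N structures, which is the O(N) cost mentioned in the abstract.

```python
import numpy as np

def ensemble_log_likelihood(d_exp, back_calc, sigma_exp, sigma_back):
    """Gaussian log-likelihood of experimental data given an ensemble.

    back_calc has shape (N structures, M observables); the ensemble
    prediction is the plain average over structures, an O(N) operation.
    Experimental and back-calculation errors are assumed independent,
    so their variances add (an illustrative assumption).
    """
    d_back = np.mean(back_calc, axis=0)
    var = sigma_exp ** 2 + sigma_back ** 2
    resid = np.asarray(d_exp) - d_back
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - resid ** 2 / (2 * var)))

# Hypothetical data: four made-up observables and two synthetic ensembles.
rng = np.random.default_rng(0)
d_exp = np.array([4.2, 4.5, 8.1, 7.9])
ens_good = d_exp + rng.normal(0.0, 0.05, size=(1000, 4))        # centered on the data
ens_poor = d_exp + 0.3 + rng.normal(0.0, 0.05, size=(1000, 4))  # systematically shifted

def gap(sigma_back, sigma_exp=0.1):
    """Log-likelihood advantage of the good ensemble over the poor one."""
    return (ensemble_log_likelihood(d_exp, ens_good, sigma_exp, sigma_back)
            - ensemble_log_likelihood(d_exp, ens_poor, sigma_exp, sigma_back))

gap_full = gap(0.6)     # assumed back-calculation error
gap_reduced = gap(0.2)  # the same error reduced by a factor of 3
print(gap_full, gap_reduced)
```

In this toy setting the gap between the two ensembles widens when the assumed back-calculation error shrinks, which mirrors the qualitative point of the abstract without reproducing its numbers.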

Article
Peer Reviewed

Family of Oxygen–Oxygen Radial Distribution Functions for Water

UC Berkeley Previously Published Works (2015)

In a typical X-ray diffraction experiment, the elastically scattered intensity, I(Q), is the experimental observable. I(Q) contains contributions from both intramolecular and intermolecular correlations embodied in the scattering factors, H_OO(Q) and H_OH(Q), with negligible contributions from H_HH(Q). Thus, to accurately define the oxygen-oxygen radial distribution function, g_OO(r), a model of the electron density is required to accurately weigh the H_OO(Q) component relative to the intramolecular and oxygen-hydrogen correlations from the total intensity observable. In this work, we carefully define the electron density model and its underlying assumptions and more explicitly utilize two restraints on the allowable g_OO(r) functions, which must conform to both very low experimental errors at high Q and the need to satisfy the isothermal compressibility at low Q. Although highly restrained by these conditions, the underdetermined nature of the problem is such that we present a family of g_OO(r) values that provide equally good agreement with the high-Q intensity and compressibility restraints and with physically correct behavior at small r.
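The low-Q restraint can be made concrete with the standard compressibility relation for a one-component fluid, S(Q→0) = ρ k_B T κ_T, where ρ is the number density and κ_T the isothermal compressibility. The sketch below evaluates it for liquid water near 25 °C. The density and compressibility are approximate textbook values supplied here as assumptions, not numbers taken from the article, and the variable names are illustrative.

```python
# Zero-Q limit of the structure factor from the compressibility relation
# S(0) = rho * k_B * T * kappa_T (standard result for a one-component fluid).

K_B = 1.380649e-23        # Boltzmann constant, J/K
N_A = 6.02214076e23       # Avogadro constant, 1/mol

# Approximate properties of liquid water near 25 C (assumed values).
temperature = 298.15      # K
mass_density = 997.0      # kg/m^3
molar_mass = 0.018015     # kg/mol
kappa_t = 4.52e-10        # isothermal compressibility, 1/Pa

number_density = mass_density / molar_mass * N_A  # molecules per m^3
s_zero = number_density * K_B * temperature * kappa_t

print(f"S(Q -> 0) is approximately {s_zero:.3f}")
```

Any candidate g_OO(r) whose transform does not approach a value of this size at low Q would violate the compressibility restraint described in the abstract.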

Article
Peer Reviewed

On the sparsity of fitness functions and implications for learning

UC Berkeley Previously Published Works (2022)

Fitness functions map biological sequences to a scalar property of interest. Accurate estimation of these functions yields biological insight and sets the foundation for model-based sequence design. However, the fitness datasets available to learn these functions are typically small relative to the large combinatorial space of sequences; characterizing how much data are needed for accurate estimation remains an open problem. There is a growing body of evidence demonstrating that empirical fitness functions display substantial sparsity when represented in terms of epistatic interactions. Moreover, the theory of Compressed Sensing provides scaling laws for the number of samples required to exactly recover a sparse function. Motivated by these results, we develop a framework to study the sparsity of fitness functions sampled from a generalization of the NK model, a widely used random field model of fitness functions. In particular, we present results that allow us to test the effect of the Generalized NK (GNK) model's interpretable parameters (sequence length, alphabet size, and assumed interactions between sequence positions) on the sparsity of fitness functions sampled from the model and, consequently, the number of measurements required to exactly recover these functions. We validate our framework by demonstrating that GNK models with parameters set according to structural considerations can be used to accurately approximate the number of samples required to recover two empirical protein fitness functions and an RNA fitness function. In addition, we show that these GNK models identify important higher-order epistatic interactions in the empirical fitness functions using only structural information.
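The sparsity idea can be illustrated on a toy binary NK-style landscape. The sketch below is not the authors' GNK code: the cyclic adjacent-neighborhood scheme, the binary alphabet, and the function names are assumptions made for illustration. It builds a random landscape for L = 8 and K = 3, takes its Walsh-Hadamard transform (the epistatic representation for binary sequences), and counts the nonzero coefficients. Only interactions among positions that share a neighborhood can be nonzero, so the count is bounded by the number of such position subsets.

```python
import itertools
import numpy as np

def nk_fitness(L, K, rng):
    """Fitness of all 2**L binary sequences under an NK-style model with
    cyclic adjacent neighborhoods {i, ..., i+K-1} (an assumed scheme)."""
    seqs = np.array(list(itertools.product([0, 1], repeat=L)))
    fitness = np.zeros(len(seqs))
    for i in range(L):
        hood = [(i + j) % L for j in range(K)]
        table = rng.normal(size=2 ** K)            # random sub-fitness per local state
        idx = seqs[:, hood] @ (2 ** np.arange(K))  # local state encoded as an integer
        fitness += table[idx]
    return fitness

def walsh_hadamard(f):
    """Fast Walsh-Hadamard transform, normalized by the vector length."""
    a = np.asarray(f, dtype=float).copy()
    n = len(a)
    h = 1
    while h < n:
        blocks = a.reshape(-1, 2, h)               # pair entries h apart
        a = np.stack([blocks[:, 0] + blocks[:, 1],
                      blocks[:, 0] - blocks[:, 1]], axis=1).reshape(-1)
        h *= 2
    return a / n

def allowed_subsets(L, K):
    """Position subsets contained in at least one neighborhood."""
    subsets = set()
    for i in range(L):
        hood = [(i + j) % L for j in range(K)]
        for r in range(K + 1):
            subsets.update(frozenset(c) for c in itertools.combinations(hood, r))
    return subsets

L, K = 8, 3
coeffs = walsh_hadamard(nk_fitness(L, K, np.random.default_rng(0)))
n_nonzero = int(np.sum(np.abs(coeffs) > 1e-9))
n_allowed = len(allowed_subsets(L, K))
print(n_nonzero, "nonzero coefficients; bound from neighborhoods:", n_allowed,
      "of", 2 ** L)
```

The bound depends only on the neighborhood structure, which is the sense in which interpretable model parameters control sparsity and, through compressed-sensing scaling laws, the number of measurements needed for recovery.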

Article
Peer Reviewed

Role of Hydrophilicity and Length of Diblock Arms for Determining Star Polymer Physical Properties

UC Berkeley Previously Published Works (2015)

We present a molecular simulation study of star polymers consisting of 16 diblock copolymer arms bound to a small adamantane core by varying both arm length and the outer hydrophilic block when attached to the same hydrophobic block of poly-δ-valerolactone. Here we consider two biocompatible star polymers in which the hydrophilic block is composed of polyethylene glycol (PEG) or polymethyloxazoline (POXA) in addition to a polycarbonate-based polymer with a pendant hydrophilic group (PC1). We find that the different hydrophilic blocks of the star polymers show qualitatively different trends in their interactions with aqueous solvent, orientational time correlation functions, and orientational correlation between pairs of monomers of their polymeric arms in solution, in which we find that the PEG polymers are more thermosensitive compared with the POXA and PC1 star polymers over the physiological temperature range we have investigated.

Article
Peer Reviewed

PB‐AM: An open‐source, fully analytical linear Poisson‐Boltzmann solver

UC Berkeley Previously Published Works (2017)

We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized PB equation, for molecules represented as non-overlapping spherical cavities. The PB-AM software package includes the generation of output files appropriate for visualization using visual molecular dynamics, a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, the ability to specify docking criteria, and two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators, and students who are more familiar with the APBS framework. © 2016 Wiley Periodicals, Inc.

Article
Peer Reviewed

Finding Our Way in the Dark Proteome

UC Berkeley Previously Published Works (2016)

The traditional structure-function paradigm has provided significant insights for well-folded proteins in which structures can be easily and rapidly revealed by X-ray crystallography beamlines. However, approximately one-third of the human proteome is comprised of intrinsically disordered proteins and regions (IDPs/IDRs) that do not adopt a dominant well-folded structure, and therefore remain "unseen" by traditional structural biology methods. This Perspective considers the challenges raised by the "Dark Proteome", in which determining the diverse conformational substates of IDPs in their free states, in encounter complexes of bound states, and in complexes retaining significant disorder requires an unprecedented level of integration of multiple and complementary solution-based experiments that are analyzed with state-of-the-art molecular simulation, Bayesian probabilistic models, and high-throughput computation. We envision how these diverse experimental and computational tools can work together through formation of a "computational beamline" that will allow key functional features to be identified in IDP structural ensembles.

Article
Peer Reviewed

Improvements to the APBS biomolecular solvation software suite

UC San Diego Previously Published Works (2018)

The Adaptive Poisson-Boltzmann Solver (APBS) software was developed to solve the equations of continuum electrostatics for large biomolecular assemblages that have provided impact in the study of a broad range of chemical, biological, and biomedical applications. APBS addresses the three key technology challenges for understanding solvation and electrostatics in biomedical applications: accurate and efficient models for biomolecular solvation and electrostatics, robust and scalable software for applying those theories to biomolecular systems, and mechanisms for sharing and analyzing biomolecular electrostatics data in the scientific community. To address new research applications and advancing computational capabilities, we have continually updated APBS and its suite of accompanying software since its release in 2001. In this article, we discuss the models and capabilities that have recently been implemented within the APBS software package including a Poisson-Boltzmann analytical and a semi-analytical solver, an optimized boundary element solver, a geometry-based geometric flow solvation model, a graph theory-based algorithm for determining pK_a values, and an improved web-based visualization tool for viewing electrostatics.