eScholarship
Open Access Publications from the University of California

Machine learning-led semi-automated medium optimization reveals salt as key for flaviolin production in Pseudomonas putida.

(2025)

Although synthetic biology can produce valuable chemicals in a renewable manner, its progress is still hindered by a lack of predictive capabilities. Media optimization is a critical and often overlooked process that is essential to obtain the titers, rates and yields needed for commercial viability. Here, we present a molecule- and host-agnostic active learning process for media optimization that is enabled by a fast and highly repeatable semi-automated pipeline. Its application yielded 60% and 70% increases in titer, and a 350% increase in process yield, in three different campaigns for flaviolin production in Pseudomonas putida KT2440. Explainable Artificial Intelligence techniques pinpointed that, surprisingly, common salt (NaCl) is the most important component influencing production. The optimal salt concentration is very high, comparable to seawater and close to the limits that P. putida can tolerate. The availability of fast Design-Build-Test-Learn (DBTL) cycles allowed us to show that performance improvements for active learning are rarely monotonic. This work illustrates how machine learning and automation can change the paradigm of current synthetic biology research to make it more effective and informative, and suggests a cost-effective and underexploited strategy to facilitate the high titers, rates and yields essential for commercial viability.
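As a reading aid for the percentages above, the short sketch below converts a percent increase over baseline into a fold change. The campaign labels and the helper name `fold_change` are hypothetical and do not come from the paper; only the three percentages are taken from the abstract.

```python
def fold_change(percent_increase: float) -> float:
    """Convert a percent increase over baseline into a fold change.

    A 100% increase doubles the baseline, so it maps to 2.0.
    """
    return 1.0 + percent_increase / 100.0


# Percent increases quoted in the abstract; the dictionary keys are
# hypothetical labels chosen for this illustration only.
reported_increases = {
    "titer (first campaign)": 60.0,
    "titer (second campaign)": 70.0,
    "process yield": 350.0,
}

fold_changes = {name: fold_change(pct) for name, pct in reported_increases.items()}

for name, fc in fold_changes.items():
    print(f"{name}: {fc:.2f}x baseline")
```

Read this way, a 350% increase means the optimized value is 4.5 times the baseline, not 3.5 times.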

A novel approach for large-scale wind energy potential assessment

(2025)

Increasing wind energy generation is central to grid decarbonization, yet methods to estimate wind energy potential are not standardized, leading to inconsistencies and even skewed results. This study aims to improve the fidelity of wind energy potential estimates through an approach that integrates geospatial analysis and machine learning (i.e., Gaussian process regression). We demonstrate this approach to assess the spatial distribution of wind energy capacity potential in the Contiguous United States (CONUS). We find that the capacity-based power density ranges from 1.70 MW/km² (25th percentile) to 3.88 MW/km² (75th percentile) for existing wind farms in the CONUS. The value is lower in agricultural areas (2.73 ± 0.02 MW/km², mean ± 95% confidence interval) and higher in other land cover types (3.30 ± 0.03 MW/km²). Notably, advancements in turbine manufacturing could reduce power density in areas with lower wind speeds by adopting low specific-power turbines, but improve power density in areas with higher wind speeds (>8.35 m/s at 120 m above the ground), highlighting opportunities for repowering existing wind farms. Wind energy potential is shaped by wind resource quality and is regionally characterized by land cover and physical conditions, revealing significant capacity potential in the Great Plains and Upper Texas. The results indicate that areas previously identified as hot spots using existing approaches (e.g., the west of the Rocky Mountains) may have a limited capacity potential due to low wind resource quality. Improvements in methodology and capacity potential estimates in this study could serve as a new basis for future energy systems analysis and planning.
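To make the capacity-based power density metric concrete, here is a minimal sketch that divides installed capacity by land area and summarizes the result with quartiles. The farm values are invented for illustration and the function name `power_density` is an assumption; this is not the study's geospatial or Gaussian process workflow.

```python
from statistics import quantiles


def power_density(capacity_mw: float, area_km2: float) -> float:
    """Capacity-based power density in MW per square kilometre."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return capacity_mw / area_km2


# Hypothetical wind farms as (installed capacity in MW, footprint in km^2).
# These numbers are made up and are not data from the study.
farms = [(150.0, 60.0), (200.0, 50.0), (90.0, 45.0), (300.0, 100.0), (120.0, 80.0)]

densities = [power_density(capacity, area) for capacity, area in farms]

# quantiles(..., n=4) returns the three quartile cut points.
q1, median, q3 = quantiles(densities, n=4)
print(f"25th percentile: {q1:.2f} MW/km^2")
print(f"median:          {median:.2f} MW/km^2")
print(f"75th percentile: {q3:.2f} MW/km^2")
```

The interquartile range quoted in the abstract (1.70 to 3.88 MW/km²) is a summary of this kind, computed over existing wind farms rather than toy values.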

The tier system: a host development framework for bioengineering

(2025)

Development of microorganisms into mature bioproduction host strains has typically been a slow and circuitous process, wherein multiple groups apply disparate approaches with minimal coordination over decades. To help organize and streamline host development efforts, we introduce the Tier System for Host Development, a conceptual model and guide for developing microbial hosts that can ultimately lead to a systematic, standardized, less expensive, and more rapid workflow. The Tier System is made up of three Tiers, each consisting of a unique set of strain development Targets, including experimental tools, strain properties, experimental information, and process models. By introducing the Tier System, we hope to improve host development activities through standardization and systematization pertaining to nontraditional chassis organisms.

In planta production of the nylon precursor beta-ketoadipate

(2025)

Beta-ketoadipate (βKA) is an intermediate of the βKA pathway involved in the degradation of aromatic compounds in several bacteria and fungi. Beta-ketoadipate also represents a promising chemical for the manufacturing of performance-advantaged nylons. We established a strategy for the in planta synthesis of βKA via manipulation of the shikimate pathway and the expression of bacterial enzymes from the βKA pathway. Using Nicotiana benthamiana as a transient expression system, we demonstrated the efficient conversion of protocatechuate (PCA) to βKA when plastid-targeted, bacterially derived PCA 3,4-dioxygenase (PcaHG) and 3-carboxy-cis,cis-muconate cycloisomerase (PcaB) were co-expressed with 3-deoxy-D-arabinoheptulosonate 7-phosphate synthase (AroG) and 3-dehydroshikimate dehydratase (QsuB). This metabolic pathway was reconstituted in Arabidopsis by introducing a construct (pAtβKA) with stacked pcaG, pcaH, and pcaB genes into a PCA-overproducing genetic background that expresses AroG and QsuB (referred to as QsuB-2). The resulting QsuB-2 x pAtβKA stable lines displayed βKA titers as high as 0.25% on a dry weight basis in stems, along with a drastic reduction in lignin content and improvement of biomass saccharification efficiency compared to wild-type controls, and without any significant reduction in biomass yields. Using biomass sorghum as a potential crop for large-scale βKA production, techno-economic analysis indicated that βKA accumulated at titers of 0.25% and 4% on a dry weight basis could be competitively priced in the range of $2.04-34.49/kg and $0.47-2.12/kg, respectively, depending on the selling price of the residual biomass recovered after βKA extraction. This study lays the foundation for a more environmentally friendly synthesis of βKA using plants as production hosts.

Microbial Pathways for Cost-Effective Low-Carbon Renewable Indigoidine

(2025)

Indigoidine is a bioadvantaged platform molecule with diverse applications, including use as a textile dye, biotransistor, biosolar cell, biosensor, and food coloring. There are multiple microbial hosts and carbon sources that can be used and optimized for its production, yet there is limited guidance on which options have the greatest commercial potential. Here, we consider five different host microbes and combine genome-scale metabolic models with techno-economic and lifecycle assessment models. Pseudomonas putida currently outperforms synthetic indigo production and other indigoidine-producing hosts, using glucose, xylose, and lignin-derived aromatics to produce indigoidine at a minimum selling price of $2.9/kg and a greenhouse gas (GHG) footprint of 3.5 kg CO2e/kg. Optimizing pathways (achieving 90% of the theoretical indigoidine yield from sugars and aromatics) can reduce costs 6-7-fold and GHG emissions 3-10-fold. From a cost perspective, microbes that co-utilize aromatics are advantageous, while selecting hosts that coproduce other value-added molecules can reduce GHG emissions. System-wide improvements and the use of a low-cost, low-carbon nitrogen source are crucial for commercial viability in all cases.

Prenol production in a microbial host via the “Repass” Pathways

(2025)

Prenol and isoprenol are promising advanced biofuels and serve as biosynthetic precursors for pharmaceuticals, fragrances, and other industrially relevant compounds. Despite engineering improvements that circumvent intermediate cytotoxicity and lower energy barriers, achieving high-titer 'mevalonate (MVA)-derived' prenol has remained elusive. Difficulty in selective prenol production stems from the necessary isomerization of isopentenyl diphosphate (IPP) to dimethylallyl diphosphate (DMAPP) as well as the intrinsic toxicity of these diphosphate precursors. Here, the expression of specific isopentenyl monophosphate kinases with model-guided enzyme substitution of diphosphate isomerases and phosphatases enabled selective cycling of monophosphates and diphosphates, dramatically improving prenol titers and selectivity in Escherichia coli. Pairing this approach with the canonical MVA pathway resulted in 300 mg/L prenol at a 30:1 ratio with isoprenol. Further pairing with the "IPP-Bypass" pathway resulted in 526 mg/L prenol at a 72:1 ratio with isoprenol, the highest and purest MVA-derived prenol titer to date. Additionally, modifying this "IPP-Repass" pathway for DMAPP production and coexpressing the prenyltransferase acPT1 yielded 48.3 mg/L of the potential therapeutic precursor drupanin from p-coumarate. These novel repass pathways establish a unique strategy for tuning diphosphate precursors to drive isoprenoid biosynthesis and prenylation reactions.
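The reported prenol-to-isoprenol ratios can be restated as a prenol fraction of the combined alcohols, as in the minimal sketch below. It assumes the ratios are expressed on the same concentration basis as the titers (mg/L), which the abstract does not state explicitly; the function names are hypothetical.

```python
def prenol_fraction(ratio_to_isoprenol: float) -> float:
    """Fraction of prenol in a prenol + isoprenol mixture, given an R:1 ratio."""
    return ratio_to_isoprenol / (ratio_to_isoprenol + 1.0)


def implied_isoprenol_titer(prenol_mg_per_l: float, ratio_to_isoprenol: float) -> float:
    """Isoprenol titer implied by a prenol titer and an R:1 ratio.

    Assumes the ratio uses the same mg/L basis as the titer.
    """
    return prenol_mg_per_l / ratio_to_isoprenol


# (pathway pairing, prenol titer in mg/L, prenol:isoprenol ratio) from the abstract.
results = [
    ("canonical MVA pairing", 300.0, 30.0),
    ("IPP-Bypass pairing", 526.0, 72.0),
]

for label, titer, ratio in results:
    fraction = prenol_fraction(ratio)
    isoprenol = implied_isoprenol_titer(titer, ratio)
    print(f"{label}: {fraction:.1%} prenol, about {isoprenol:.1f} mg/L isoprenol")
```

Under that assumption, the 30:1 and 72:1 ratios correspond to roughly 96.8% and 98.6% prenol, respectively.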

Cellular morphometric biomarkers and large language model predict prognosis and treatment response in neuroblastoma patients: A retrospective and double-blind prospective single arm clinical study

(2025)

Background

The heterogeneity of neuroblastoma (NB) leads to variation in response to treatment and outcomes. The aim of the current study is to discover AI-empowered cellular morphometric biomarkers (CMBs), to establish the corresponding CMB risk score (CMBRS), CMB risk group (CMBRG), large language model driven CMB risk score (CMB-LLM-RS), and large language model driven CMB risk group (CMB-LLM-RG), and to investigate and validate their prognostic and predictive power in NB.

Methods

In this study, the retrospective cohort enrolled 84 primary NBs between 1/2020 and 12/2021, followed up through 11/22/2024; the prospective cohort enrolled 67 primary NBs between 1/2022 and 7/2023, followed up through 11/22/2024.

Results

We identified 9 CMBs from a retrospective NB cohort, enabling the CMBRS, CMBRG, CMB-LLM-RS, and CMB-LLM-RG. Both CMBRG and CMB-LLM-RG are significantly associated with prognosis (p < 0.0001) and treatment response (p < 0.0001). Furthermore, we double-blindly validated the predictive power of CMBRG and CMB-LLM-RG in a prospective NB cohort, which confirms their potential value in real clinical settings. Importantly, CMBRG provides clinical value independent of the International Neuroblastoma Risk Group (INRG) classification system in both retrospective and prospective NB cohorts (p < 0.05); and the combination of CMBRG and INRG significantly increases prognostic and predictive performance for NB patients.

Conclusions

These findings suggest that CMBRG and CMB-LLM-RG have prognostic and predictive value for NB and warrant evaluation in larger multicenter cohorts.

Enzymatic cleavage of model lignin dimers depends on pH, enzyme, and bond type

(2025)

Lignin is composed of phenylpropanoid monomers linked by ether and carbon-carbon bonds to form a complex heterogeneous structure. Bond-specific studies of lignin-modifying enzymes (LMEs; e.g., laccases and peroxidases) are limited by the polymerization of model lignin substrates and repolymerization of cleavage products. Here we present a high-throughput platform to screen LME activities on four tagged model lignin compounds that represent the β-O-4', β-β', 5-5', and 4-O-5' linkages in lignin. We utilized nanostructure-initiator mass spectrometry (NIMS) and model lignin compounds with tags containing perfluorinated and cationic moieties, which effectively limit polymerization and condensation of the substrates and their degradation products. Sub-microliter sample droplets were printed on the NIMS chip with a novel robotics method. This rapid platform enabled characterization of LMEs across a range of pH 3-10 and relative quantification of modified (typically oxidized), cleaved, and polymerized products. All tested enzymes oxidized the four substrates and cleaved the β-O-4' and β-β' substrates to monomeric products. We discovered that the active pH range depended on both the substrate and the enzyme type. This has important applications for biomass conversion to biofuels and bioproducts, where the relative percentages of different bond types in lignin vary depending on feedstock and chemical pretreatment methods.