eScholarship
Open Access Publications from the University of California

The landscape of unfolding with machine learning

(2025)

Recent innovations in machine learning enable data unfolding without binning and with correlations across many dimensions. We describe a set of known, upgraded, and new methods for ML-based unfolding. The performance of these approaches is evaluated on the same two datasets. We find that all techniques are capable of accurately reproducing the particle-level spectra across complex observables. Given that these approaches are conceptually diverse, they offer an exciting toolkit for a new class of measurements that can probe the Standard Model with an unprecedented level of detail and may enable sensitivity to new phenomena.
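
As a concrete illustration of the idea, the sketch below shows one common ingredient of ML-based unfolding, classifier-based reweighting in the spirit of OmniFold, on a toy one-dimensional Gaussian example. The distributions, smearing model, and network settings are illustrative assumptions, not the datasets or methods evaluated in the paper.

```python
# Minimal sketch of classifier-based reweighting for unbinned unfolding
# (one OmniFold-style iteration). Toy 1D example with made-up distributions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 50_000

# "Simulation": particle-level truth and its detector-level smearing.
sim_truth = rng.normal(0.0, 1.0, n)
sim_reco = sim_truth + rng.normal(0.0, 0.5, n)

# "Data": a shifted truth spectrum passed through the same detector model.
data_truth = rng.normal(0.3, 1.1, n)
data_reco = data_truth + rng.normal(0.0, 0.5, n)

# Step 1: classify simulation vs. data at detector level.
X = np.concatenate([sim_reco, data_reco]).reshape(-1, 1)
y = np.concatenate([np.zeros(n), np.ones(n)])
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=50).fit(X, y)

# Likelihood-ratio trick: w(x) = p_data(x) / p_sim(x) = f / (1 - f).
p = np.clip(clf.predict_proba(sim_reco.reshape(-1, 1))[:, 1], 1e-6, 1 - 1e-6)
weights = p / (1.0 - p)

# Pulling the detector-level weights back to the paired truth events gives a
# first-iteration estimate of the unfolded (particle-level) spectrum.
print("unfolded mean :", np.average(sim_truth, weights=weights))
print("target mean   :", data_truth.mean())
```

The likelihood-ratio trick turns the classifier output into per-event weights, which are then carried back, unbinned, to the paired particle-level events.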


Streaming Large-Scale Microscopy Data to a Supercomputing Facility

(2025)

Data management is a critical component of modern experimental workflows. As data generation rates increase, transferring data from acquisition servers to processing servers via conventional file-based methods is becoming increasingly impractical. The 4D Camera at the National Center for Electron Microscopy generates data at a nominal rate of 480 Gbit s⁻¹ (87,000 frames s⁻¹), producing a 700 GB dataset in 15 s. To address the challenges associated with storing and processing such quantities of data, we developed a streaming workflow that utilizes a high-speed network to connect the 4D Camera's data acquisition system to supercomputing nodes at the National Energy Research Scientific Computing Center, bypassing intermediate file storage entirely. In this work, we demonstrate the effectiveness of our streaming pipeline in a production setting through an hour-long experiment that generated over 10 TB of raw data, yielding high-quality datasets suitable for advanced analyses. Additionally, we compare the efficacy of this streaming workflow against the conventional file-transfer workflow by conducting a postmortem analysis on historical data from experiments performed by real users. Our findings show that the streaming workflow significantly improves data turnaround time, enables real-time decision-making, and minimizes the potential for human error by eliminating manual user interactions.
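
The sketch below illustrates the streaming pattern in miniature: an acquisition process pushes raw frames over the network and a processing process reduces them in memory, with no intermediate files. It uses ZeroMQ PUSH/PULL as a stand-in transport; the endpoint, frame shape, and reduction step are placeholder assumptions, not the production NCEM-to-NERSC pipeline.

```python
# Illustrative sketch of a file-free streaming workflow: frames travel from
# the acquisition side to the processing side over a socket and are reduced
# as they arrive. ZeroMQ is used here only as a convenient stand-in transport.
import numpy as np
import zmq

ENDPOINT = "tcp://127.0.0.1:5555"   # hypothetical endpoint

def acquisition(n_frames=100, shape=(576, 576)):
    """Producer: push raw frames as they are read off the detector."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PUSH)
    sock.bind(ENDPOINT)
    for _ in range(n_frames):
        frame = np.random.poisson(2.0, shape).astype(np.uint16)
        sock.send(frame.tobytes(), copy=False)
    sock.send(b"")                   # empty message signals end of stream

def processing(shape=(576, 576)):
    """Consumer: reduce each frame in memory as it arrives."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PULL)
    sock.connect(ENDPOINT)
    total = np.zeros(shape, dtype=np.float64)
    n = 0
    while True:
        msg = sock.recv()
        if not msg:                  # end-of-stream marker
            break
        total += np.frombuffer(msg, dtype=np.uint16).reshape(shape)
        n += 1
    return total / max(n, 1)         # e.g. a running mean image
```

In practice the producer and consumer would run as separate processes on different hosts; the end-of-stream marker lets the consumer terminate cleanly once the acquisition finishes.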

Nuclear Recoil Calibration at Sub-keV Energies in LUX and Its Impact on Dark Matter Search Sensitivity

(2025)

Dual-phase xenon time projection chamber (TPC) detectors offer heightened sensitivities for dark matter detection across a spectrum of particle masses. To extend their reach to low-mass dark matter interactions, we investigated the light and charge responses of liquid xenon (LXe) to sub-keV nuclear recoils. Using neutron events from a pulsed Adelphi Deuterium-Deuterium neutron generator, an in situ calibration was conducted on the LUX detector. We demonstrate direct measurements of light and charge yields down to 0.45 and 0.27 keV, respectively, both approaching single-quantum production, the physical limit of LXe detectors. These results hold significant implications for the future of dual-phase xenon TPCs in detecting low-mass dark matter via nuclear recoils.


A graphics processing unit accelerated sparse direct solver and preconditioner with block low rank compression

(2025)

We present the GPU implementation efforts and challenges of the sparse solver package STRUMPACK. The code is publicly available on GitHub with a permissive BSD license. STRUMPACK implements an approximate multifrontal solver, a sparse LU factorization which makes use of compression methods to accelerate time to solution and reduce memory usage. Multiple compression schemes based on rank-structured and hierarchical matrix approximations are supported, including hierarchically semi-separable, hierarchically off-diagonal butterfly, and block low rank. In this paper, we present the GPU implementation of the block low rank (BLR) compression method within a multifrontal solver. Our GPU implementation relies on highly optimized vendor libraries such as cuBLAS and cuSOLVER for NVIDIA GPUs, rocBLAS and rocSOLVER for AMD GPUs, and the Intel oneAPI Math Kernel Library (oneMKL) for Intel GPUs. Additionally, we rely on external open source libraries such as SLATE (Software for Linear Algebra Targeting Exascale), MAGMA (Matrix Algebra on GPU and Multi-core Architectures), and KBLAS (KAUST BLAS). SLATE is used as a GPU-capable ScaLAPACK replacement. From MAGMA we use variable-sized batched dense linear algebra operations such as GEMM, TRSM, and LU with partial pivoting. KBLAS provides efficient (batched) low rank matrix compression for NVIDIA GPUs using an adaptive randomized sampling scheme. The resulting sparse solver and preconditioner run on NVIDIA, AMD, and Intel GPUs. Interfaces are available from PETSc, Trilinos, and MFEM, or the solver can be used directly in user code. We report results for a range of benchmark applications, using the Perlmutter system at NERSC, Frontier at ORNL, and Aurora at ALCF. For a high-frequency wave equation on a regular mesh, using 32 Perlmutter compute nodes, the factorization phase of the exact GPU solver is about 6.5× faster than the CPU-only solver. The BLR-enabled GPU solver is about 13.8× faster than the CPU exact solver. For a collection of SuiteSparse matrices, the STRUMPACK exact factorization on a single GPU is on average 1.9× faster than NVIDIA's cuDSS solver.
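
The toy sketch below illustrates only the block low rank idea itself: an off-diagonal block with rapidly decaying singular values is replaced by a truncated factorization U·V, trading a small controlled error for large savings in memory and flops. The real solver does this on the GPU with adaptive randomized sampling inside a multifrontal LU; the kernel matrix, sizes, and tolerance here are illustrative assumptions.

```python
# Conceptual numpy sketch of block low rank (BLR) compression: a numerically
# low-rank off-diagonal block is stored as U @ V instead of densely.
import numpy as np

def compress_block(B, tol=1e-6):
    """Return (U, V) with B ~ U @ V, truncating singular values below tol * s_max."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    rank = max(int(np.sum(s > tol * s[0])), 1)
    return U[:, :rank] * s[:rank], Vt[:rank]

# A smooth kernel evaluated between two well-separated index sets gives a
# numerically low-rank block, the typical BLR compression target.
x = np.linspace(0.0, 1.0, 400)
y = np.linspace(5.0, 6.0, 400)
B = 1.0 / np.abs(x[:, None] - y[None, :])

U, V = compress_block(B, tol=1e-8)
stored = U.size + V.size
print("rank:", U.shape[1], "  storage ratio:", stored / B.size)
print("rel. error:", np.linalg.norm(B - U @ V) / np.linalg.norm(B))
```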

DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman alpha forest

(2025)

We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-α (Lyα) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over 420 000 Lyα forest spectra and their correlation with the spatial distribution of more than 700 000 quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon (r_d), we measure the expansion rate at z_eff = 2.33 with 2% precision, H(z_eff) = (239.2 ± 4.8) (147.09 Mpc / r_d) km/s/Mpc. Similarly, we present a 2.4% measurement of the transverse comoving distance to the same redshift, D_M(z_eff) = (5.84 ± 0.14) (r_d / 147.09 Mpc) Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.
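
For reference, the quoted H(z_eff) and D_M(z_eff) can be recast as the sound-horizon-scaled BAO observables D_H/r_d and D_M/r_d, since the fiducial r_d = 147.09 Mpc cancels out. The short calculation below does this recast with the numbers from the abstract; the simple error scaling is an approximation.

```python
# Recasting the quoted BAO results as r_d-scaled distances.
c = 299792.458            # speed of light, km/s

H, dH = 239.2, 4.8        # km/s/Mpc, times (147.09 Mpc / r_d)
DM, dDM = 5840.0, 140.0   # Mpc, times (r_d / 147.09 Mpc)
rd_fid = 147.09           # fiducial sound horizon, Mpc

DH_over_rd = c / H / rd_fid          # D_H = c / H(z)
print(f"D_H/r_d = {DH_over_rd:.2f} +/- {DH_over_rd * dH / H:.2f}")   # ~8.52 +/- 0.17
print(f"D_M/r_d = {DM / rd_fid:.2f} +/- {dDM / rd_fid:.2f}")         # ~39.71 +/- 0.95
```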


Evolving to Find Optimizations Humans Miss: Using Evolutionary Computation to Improve GPU Code for Bioinformatics Applications

(2024)

GPUs are used in many settings to accelerate large-scale scientific computation, including simulation, computational biology, and molecular dynamics. However, optimizing codes to run efficiently on GPUs requires developers to have both a detailed understanding of the application logic and significant knowledge of parallel programming and GPU architectures. This paper shows that an automated GPU program optimization tool, GEVO, can leverage evolutionary computation to find code edits that reduce the runtime of three important applications (multiple sequence alignment, agent-based simulation, and molecular dynamics) by 28.9%, 29%, and 17.8%, respectively. The paper presents an in-depth analysis of the discovered optimizations, revealing that (1) several of the most important optimizations involve significant epistasis, (2) the primary sources of improvement are application-specific, and (3) many of the optimizations generalize across GPU architectures. In general, the discovered optimizations are not straightforward even for a human GPU expert, showcasing the potential of automated program optimization tools to both reduce the optimization burden for human domain experts and provide new insights for GPU experts.
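
The sketch below shows the bare structure of such an evolutionary search: individuals are lists of candidate code edits, and the fitness would normally be the measured runtime of the edited, still-correct program. Here the compile/verify/time step is replaced by a made-up stand-in fitness, so it only illustrates the mutate/evaluate/select loop, not GEVO itself.

```python
# Skeleton of an evolutionary search over code edits. The edit pool, benefit
# values, and fitness function are hypothetical stand-ins for "apply edits,
# compile, verify correctness, and time the GPU kernel".
import random

EDIT_POOL = [f"edit_{i}" for i in range(40)]                    # hypothetical edits
BENEFIT = {e: random.uniform(-0.05, 0.10) for e in EDIT_POOL}   # hidden from the search

def fitness(individual):
    """Stand-in runtime in arbitrary units; lower is better."""
    speedup = sum(BENEFIT[e] for e in set(individual))
    return 1.0 / (1.0 + max(speedup, -0.5))

def mutate(individual):
    child = list(individual)
    if child and random.random() < 0.5:
        child.pop(random.randrange(len(child)))      # drop an edit
    else:
        child.append(random.choice(EDIT_POOL))       # insert a new edit
    return child

population = [[] for _ in range(32)]                 # start from the unmodified program
for generation in range(50):
    scored = sorted(population, key=fitness)
    parents = scored[:8]                             # truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(24)]

best = min(population, key=fitness)
print("best runtime:", fitness(best), "using", len(set(best)), "edits")
```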


Neural Posterior Unfolding

(2024)

Differential cross section measurements are the currency of scientific exchange in particle and nuclear physics. The key challenge for these analyses is the correction for detector distortions, known as deconvolution or unfolding. In the case of binned cross section measurements, there are many tools for regularized matrix inversion, where the response matrix maps pre-detector to post-detector observables. In this paper, we show how normalizing flows and neural posterior estimation can be used for unfolding. This approach has many potential advantages, including implicit regularization from the neural networks and fast inference from amortized training. We demonstrate this approach using simple Gaussian examples as well as a simulated jet substructure measurement at the Large Hadron Collider.
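
In the Gaussian toy setting mentioned above, the posterior that a conditional density estimator would have to learn is available in closed form, which is what makes such examples useful benchmarks. The sketch below writes down that analytic posterior for a Gaussian prior with Gaussian smearing and checks it against brute-force conditioning; the specific numbers are illustrative, not those used in the paper.

```python
# Toy Gaussian version of unfolding as posterior estimation: with a Gaussian
# particle-level prior and Gaussian detector smearing, p(truth | observed) is
# a known Gaussian, the target a conditional normalizing flow would learn.
import numpy as np

mu0, s0 = 0.0, 1.0        # particle-level (prior) spectrum: N(mu0, s0^2)
s = 0.5                   # detector resolution: y | x ~ N(x, s^2)
y_obs = 1.2               # one observed detector-level value

# Conjugate-Gaussian posterior p(x | y_obs).
post_var = 1.0 / (1.0 / s0**2 + 1.0 / s**2)
post_mean = post_var * (mu0 / s0**2 + y_obs / s**2)

# Check against brute-force conditioning on simulated (x, y) pairs.
rng = np.random.default_rng(1)
x = rng.normal(mu0, s0, 2_000_000)
y = x + rng.normal(0.0, s, x.size)
sel = np.abs(y - y_obs) < 0.02
print("analytic :", post_mean, post_var**0.5)
print("empirical:", x[sel].mean(), x[sel].std())
```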

Artificial Intelligence for the Electron Ion Collider (AI4EIC)

(2024)

The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. The workshop was beneficial not only for the EIC but also provided valuable insights for the newly established ePIC collaboration at the EIC. This paper summarizes the different activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches, and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.

Constraints on Covariant Dark-Matter–Nucleon Effective Field Theory Interactions from the First Science Run of the LUX-ZEPLIN Experiment

(2024)

The LUX-ZEPLIN (LZ) experiment is a dual-phase xenon time projection chamber operating in the Sanford Underground Research Facility in South Dakota, USA. We report results of a relativistic extension to the nonrelativistic effective field theory (NREFT), using a 5.5 t fiducial mass and 60 live days of exposure. We present constraints on couplings from covariant interactions arising from the coupling of vector and axial currents, as well as electric dipole moments of the nucleon, to the magnetic and electric dipole moments of the weakly interacting massive particle, which cannot be obtained by recasting previous results described by an NREFT. Using a profile-likelihood ratio analysis in an energy region between 0 and 270 keV_nr, we report 90% confidence level exclusion limits on the coupling strength of five interactions in both the isoscalar and isovector bases.
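
As a reminder of the statistical machinery named above, the sketch below reduces a profile-likelihood-ratio upper limit to a single-bin counting experiment with a known expected background, using the asymptotic 90% CL threshold. The real LZ analysis is unbinned and profiles many nuisance parameters; the counts and background here are made-up illustrative numbers.

```python
# Highly simplified profile-likelihood-ratio upper limit for a one-bin
# Poisson counting experiment with known background (illustrative only).
import numpy as np
from scipy.optimize import brentq

n_obs, b = 3, 4.2          # observed events, expected background (made-up)

def nll(mu):
    """Negative log likelihood (up to constants) for signal expectation mu."""
    lam = mu + b
    return lam - n_obs * np.log(lam)

mu_hat = max(n_obs - b, 0.0)               # MLE, clipped to the physical region

def q(mu):
    """Profile likelihood ratio test statistic for an upper limit."""
    return 2.0 * (nll(mu) - nll(mu_hat))

# 90% CL one-sided limit: q(mu) = 2.71 in the asymptotic approximation.
mu_up = brentq(lambda mu: q(mu) - 2.71, mu_hat + 1e-9, 50.0)
print(f"90% CL upper limit on signal events: {mu_up:.2f}")
```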