eScholarship
Open Access Publications from the University of California

The landscape of unfolding with machine learning

(2025)

Recent innovations in machine learning enable data unfolding without binning and with correlations across many dimensions. We describe a set of known, upgraded, and new methods for ML-based unfolding. The performance of these approaches is evaluated on the same two datasets. We find that all techniques are capable of accurately reproducing the particle-level spectra across complex observables. Given that these approaches are conceptually diverse, they offer an exciting toolkit for a new class of measurements that can probe the Standard Model with an unprecedented level of detail and may enable sensitivity to new phenomena.
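
As a concrete illustration of the idea, the sketch below shows one common ingredient of ML-based unfolding, classifier-based reweighting in the spirit of OmniFold, on a toy one-dimensional Gaussian example. The distributions, smearing model, and network settings are illustrative assumptions, not the datasets or methods evaluated in the paper.

```python
# Minimal sketch of classifier-based reweighting for unbinned unfolding
# (one OmniFold-style iteration). Toy 1D example with made-up distributions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 50_000

# "Simulation": particle-level truth and its detector-level smearing.
sim_truth = rng.normal(0.0, 1.0, n)
sim_reco = sim_truth + rng.normal(0.0, 0.5, n)

# "Data": a shifted truth spectrum passed through the same detector model.
data_truth = rng.normal(0.3, 1.1, n)
data_reco = data_truth + rng.normal(0.0, 0.5, n)

# Step 1: classify simulation vs. data at detector level.
X = np.concatenate([sim_reco, data_reco]).reshape(-1, 1)
y = np.concatenate([np.zeros(n), np.ones(n)])
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=50).fit(X, y)

# Likelihood-ratio trick: w(x) = p_data(x) / p_sim(x) = f / (1 - f).
p = np.clip(clf.predict_proba(sim_reco.reshape(-1, 1))[:, 1], 1e-6, 1 - 1e-6)
weights = p / (1.0 - p)

# Pulling the detector-level weights back to the paired truth events gives a
# first-iteration estimate of the unfolded (particle-level) spectrum.
print("unfolded mean :", np.average(sim_truth, weights=weights))
print("target mean   :", data_truth.mean())
```

The likelihood-ratio trick turns the classifier output into per-event weights, which are then carried back, unbinned, to the paired particle-level events.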


Streaming Large-Scale Microscopy Data to a Supercomputing Facility

(2025)

Data management is a critical component of modern experimental workflows. As data generation rates increase, transferring data from acquisition servers to processing servers via conventional file-based methods is becoming increasingly impractical. The 4D Camera at the National Center for Electron Microscopy generates data at a nominal rate of 480 Gbit s⁻¹ (87,000 frames s⁻¹), producing a 700 GB dataset in 15 s. To address the challenges associated with storing and processing such quantities of data, we developed a streaming workflow that utilizes a high-speed network to connect the 4D Camera's data acquisition system to supercomputing nodes at the National Energy Research Scientific Computing Center, bypassing intermediate file storage entirely. In this work, we demonstrate the effectiveness of our streaming pipeline in a production setting through an hour-long experiment that generated over 10 TB of raw data, yielding high-quality datasets suitable for advanced analyses. Additionally, we compare the efficacy of this streaming workflow against the conventional file-transfer workflow by conducting a postmortem analysis on historical data from experiments performed by real users. Our findings show that the streaming workflow significantly improves data turnaround time, enables real-time decision-making, and minimizes the potential for human error by eliminating manual user interactions.
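
The sketch below illustrates the streaming pattern in miniature: an acquisition process pushes raw frames over the network and a processing process reduces them in memory, with no intermediate files. It uses ZeroMQ PUSH/PULL as a stand-in transport; the endpoint, frame shape, and reduction step are placeholder assumptions, not the production NCEM-to-NERSC pipeline.

```python
# Illustrative sketch of a file-free streaming workflow: frames travel from
# the acquisition side to the processing side over a socket and are reduced
# as they arrive. ZeroMQ is used here only as a convenient stand-in transport.
import numpy as np
import zmq

ENDPOINT = "tcp://127.0.0.1:5555"   # hypothetical endpoint

def acquisition(n_frames=100, shape=(576, 576)):
    """Producer: push raw frames as they are read off the detector."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PUSH)
    sock.bind(ENDPOINT)
    for _ in range(n_frames):
        frame = np.random.poisson(2.0, shape).astype(np.uint16)
        sock.send(frame.tobytes(), copy=False)
    sock.send(b"")                   # empty message signals end of stream

def processing(shape=(576, 576)):
    """Consumer: reduce each frame in memory as it arrives."""
    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PULL)
    sock.connect(ENDPOINT)
    total = np.zeros(shape, dtype=np.float64)
    n = 0
    while True:
        msg = sock.recv()
        if not msg:                  # end-of-stream marker
            break
        total += np.frombuffer(msg, dtype=np.uint16).reshape(shape)
        n += 1
    return total / max(n, 1)         # e.g. a running mean image
```

In practice the producer and consumer would run as separate processes on different hosts; the end-of-stream marker lets the consumer terminate cleanly once the acquisition finishes.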

Nuclear Recoil Calibration at Sub-keV Energies in LUX and Its Impact on Dark Matter Search Sensitivity

(2025)

Dual-phase xenon time projection chamber (TPC) detectors offer heightened sensitivities for dark matter detection across a spectrum of particle masses. To extend their reach to low-mass dark matter interactions, we investigated the light and charge responses of liquid xenon (LXe) to sub-keV nuclear recoils. Using neutron events from a pulsed Adelphi Deuterium-Deuterium neutron generator, an in situ calibration was conducted on the LUX detector. We demonstrate direct measurements of light and charge yields down to 0.45 and 0.27 keV, respectively, both approaching single-quantum production, the physical limit of LXe detectors. These results hold significant implications for the future of dual-phase xenon TPCs in detecting low-mass dark matter via nuclear recoils.


A graphics processing unit accelerated sparse direct solver and preconditioner with block low rank compression

(2025)

We present the GPU implementation efforts and challenges of the sparse solver package STRUMPACK. The code is publicly available on GitHub with a permissive BSD license. STRUMPACK implements an approximate multifrontal solver, a sparse LU factorization which makes use of compression methods to accelerate time to solution and reduce memory usage. Multiple compression schemes based on rank-structured and hierarchical matrix approximations are supported, including hierarchically semi-separable, hierarchically off-diagonal butterfly, and block low rank. In this paper, we present the GPU implementation of the block low rank (BLR) compression method within a multifrontal solver. Our GPU implementation relies on highly optimized vendor libraries such as cuBLAS and cuSOLVER for NVIDIA GPUs, rocBLAS and rocSOLVER for AMD GPUs, and the Intel oneAPI Math Kernel Library (oneMKL) for Intel GPUs. Additionally, we rely on external open source libraries such as SLATE (Software for Linear Algebra Targeting Exascale), MAGMA (Matrix Algebra on GPU and Multi-core Architectures), and KBLAS (KAUST BLAS). SLATE is used as a GPU-capable ScaLAPACK replacement. From MAGMA we use variable-sized batched dense linear algebra operations such as GEMM, TRSM, and LU with partial pivoting. KBLAS provides efficient (batched) low rank matrix compression for NVIDIA GPUs using an adaptive randomized sampling scheme. The resulting sparse solver and preconditioner run on NVIDIA, AMD, and Intel GPUs. Interfaces are available from PETSc, Trilinos, and MFEM, or the solver can be used directly in user code. We report results for a range of benchmark applications, using the Perlmutter system at NERSC, Frontier at ORNL, and Aurora at ALCF. For a high-frequency wave equation on a regular mesh, using 32 Perlmutter compute nodes, the factorization phase of the exact GPU solver is about 6.5× faster than the CPU-only solver. The BLR-enabled GPU solver is about 13.8× faster than the CPU exact solver. For a collection of SuiteSparse matrices, the STRUMPACK exact factorization on a single GPU is on average 1.9× faster than NVIDIA's cuDSS solver.
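
The toy sketch below illustrates only the block low rank idea itself: an off-diagonal block with rapidly decaying singular values is replaced by a truncated factorization U·V, trading a small controlled error for large savings in memory and flops. The real solver does this on the GPU with adaptive randomized sampling inside a multifrontal LU; the kernel matrix, sizes, and tolerance here are illustrative assumptions.

```python
# Conceptual numpy sketch of block low rank (BLR) compression: a numerically
# low-rank off-diagonal block is stored as U @ V instead of densely.
import numpy as np

def compress_block(B, tol=1e-6):
    """Return (U, V) with B ~ U @ V, truncating singular values below tol * s_max."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    rank = max(int(np.sum(s > tol * s[0])), 1)
    return U[:, :rank] * s[:rank], Vt[:rank]

# A smooth kernel evaluated between two well-separated index sets gives a
# numerically low-rank block, the typical BLR compression target.
x = np.linspace(0.0, 1.0, 400)
y = np.linspace(5.0, 6.0, 400)
B = 1.0 / np.abs(x[:, None] - y[None, :])

U, V = compress_block(B, tol=1e-8)
stored = U.size + V.size
print("rank:", U.shape[1], "  storage ratio:", stored / B.size)
print("rel. error:", np.linalg.norm(B - U @ V) / np.linalg.norm(B))
```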

DESI 2024 IV: Baryon Acoustic Oscillations from the Lyman alpha forest

(2025)

We present the measurement of Baryon Acoustic Oscillations (BAO) from the Lyman-α (Lyα) forest of high-redshift quasars with the first-year dataset of the Dark Energy Spectroscopic Instrument (DESI). Our analysis uses over 420 000 Lyα forest spectra and their correlation with the spatial distribution of more than 700 000 quasars. An essential facet of this work is the development of a new analysis methodology on a blinded dataset. We conducted rigorous tests using synthetic data to ensure the reliability of our methodology and findings before unblinding. Additionally, we conducted multiple data splits to assess the consistency of the results and scrutinized various analysis approaches to confirm their robustness. For a given value of the sound horizon (r_d), we measure the expansion rate at z_eff = 2.33 with 2% precision, H(z_eff) = (239.2 ± 4.8) (147.09 Mpc / r_d) km/s/Mpc. Similarly, we present a 2.4% measurement of the transverse comoving distance to the same redshift, D_M(z_eff) = (5.84 ± 0.14) (r_d / 147.09 Mpc) Gpc. Together with other DESI BAO measurements at lower redshifts, these results are used in a companion paper to constrain cosmological parameters.
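
For reference, the quoted H(z_eff) and D_M(z_eff) can be recast as the sound-horizon-scaled BAO observables D_H/r_d and D_M/r_d, since the fiducial r_d = 147.09 Mpc cancels out. The short calculation below does this recast with the numbers from the abstract; the simple error scaling is an approximation.

```python
# Recasting the quoted BAO results as r_d-scaled distances.
c = 299792.458            # speed of light, km/s

H, dH = 239.2, 4.8        # km/s/Mpc, times (147.09 Mpc / r_d)
DM, dDM = 5840.0, 140.0   # Mpc, times (r_d / 147.09 Mpc)
rd_fid = 147.09           # fiducial sound horizon, Mpc

DH_over_rd = c / H / rd_fid          # D_H = c / H(z)
print(f"D_H/r_d = {DH_over_rd:.2f} +/- {DH_over_rd * dH / H:.2f}")   # ~8.52 +/- 0.17
print(f"D_M/r_d = {DM / rd_fid:.2f} +/- {dDM / rd_fid:.2f}")         # ~39.71 +/- 0.95
```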


Evolving to Find Optimizations Humans Miss: Using Evolutionary Computation to Improve GPU Code for Bioinformatics Applications

(2024)

GPUs are used in many settings to accelerate large-scale scientific computation, including simulation, computational biology, and molecular dynamics. However, optimizing codes to run efficiently on GPUs requires developers to have both a detailed understanding of the application logic and significant knowledge of parallel programming and GPU architectures. This paper shows that an automated GPU program optimization tool, GEVO, can leverage evolutionary computation to find code edits that reduce the runtime of three important applications (multiple sequence alignment, agent-based simulation, and molecular dynamics) by 28.9%, 29%, and 17.8%, respectively. The paper presents an in-depth analysis of the discovered optimizations, revealing that (1) several of the most important optimizations involve significant epistasis, (2) the primary sources of improvement are application-specific, and (3) many of the optimizations generalize across GPU architectures. In general, the discovered optimizations are not straightforward even for a human GPU expert, showcasing the potential of automated program optimization tools to both reduce the optimization burden for human domain experts and provide new insights for GPU experts.
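
The sketch below shows the bare structure of such an evolutionary search: individuals are lists of candidate code edits, and the fitness would normally be the measured runtime of the edited, still-correct program. Here the compile/verify/time step is replaced by a made-up stand-in fitness, so it only illustrates the mutate/evaluate/select loop, not GEVO itself.

```python
# Skeleton of an evolutionary search over code edits. The edit pool, benefit
# values, and fitness function are hypothetical stand-ins for "apply edits,
# compile, verify correctness, and time the GPU kernel".
import random

EDIT_POOL = [f"edit_{i}" for i in range(40)]                    # hypothetical edits
BENEFIT = {e: random.uniform(-0.05, 0.10) for e in EDIT_POOL}   # hidden from the search

def fitness(individual):
    """Stand-in runtime in arbitrary units; lower is better."""
    speedup = sum(BENEFIT[e] for e in set(individual))
    return 1.0 / (1.0 + max(speedup, -0.5))

def mutate(individual):
    child = list(individual)
    if child and random.random() < 0.5:
        child.pop(random.randrange(len(child)))      # drop an edit
    else:
        child.append(random.choice(EDIT_POOL))       # insert a new edit
    return child

population = [[] for _ in range(32)]                 # start from the unmodified program
for generation in range(50):
    scored = sorted(population, key=fitness)
    parents = scored[:8]                             # truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(24)]

best = min(population, key=fitness)
print("best runtime:", fitness(best), "using", len(set(best)), "edits")
```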


Neural Posterior Unfolding

(2024)

Differential cross section measurements are the currency of scientific exchange in particle and nuclear physics. The key challenge for these analyses is the correction for detector distortions, known as deconvolution or unfolding. In the case of binned cross section measurements, there are many tools for regularized matrix inversion, where the response matrix maps pre-detector to post-detector observables. In this paper, we show how normalizing flows and neural posterior estimation can be used for unfolding. This approach has many potential advantages, including implicit regularization from the neural networks and fast inference from amortized training. We demonstrate this approach using simple Gaussian examples as well as a simulated jet substructure measurement at the Large Hadron Collider.
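
In the Gaussian toy setting mentioned above, the posterior that a conditional density estimator would have to learn is available in closed form, which is what makes such examples useful benchmarks. The sketch below writes down that analytic posterior for a Gaussian prior with Gaussian smearing and checks it against brute-force conditioning; the specific numbers are illustrative, not those used in the paper.

```python
# Toy Gaussian version of unfolding as posterior estimation: with a Gaussian
# particle-level prior and Gaussian detector smearing, p(truth | observed) is
# a known Gaussian, the target a conditional normalizing flow would learn.
import numpy as np

mu0, s0 = 0.0, 1.0        # particle-level (prior) spectrum: N(mu0, s0^2)
s = 0.5                   # detector resolution: y | x ~ N(x, s^2)
y_obs = 1.2               # one observed detector-level value

# Conjugate-Gaussian posterior p(x | y_obs).
post_var = 1.0 / (1.0 / s0**2 + 1.0 / s**2)
post_mean = post_var * (mu0 / s0**2 + y_obs / s**2)

# Check against brute-force conditioning on simulated (x, y) pairs.
rng = np.random.default_rng(1)
x = rng.normal(mu0, s0, 2_000_000)
y = x + rng.normal(0.0, s, x.size)
sel = np.abs(y - y_obs) < 0.02
print("analytic :", post_mean, post_var**0.5)
print("empirical:", x[sel].mean(), x[sel].std())
```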

Artificial Intelligence for the Electron Ion Collider (AI4EIC)

(2024)

The Electron-Ion Collider (EIC), a state-of-the-art facility for studying the strong force, is expected to begin commissioning its first experiments in 2028. This is an opportune time for artificial intelligence (AI) to be included from the start at this facility and in all phases that lead up to the experiments. The second annual workshop organized by the AI4EIC working group, which recently took place, centered on exploring all current and prospective application areas of AI for the EIC. The workshop was beneficial not only for the EIC but also provided valuable insights for the newly established ePIC collaboration at the EIC. This paper summarizes the different activities and R&D projects covered across the sessions of the workshop and provides an overview of the goals, approaches, and strategies regarding AI/ML in the EIC community, as well as cutting-edge techniques currently studied in other experiments.

Constraints on Covariant Dark-Matter–Nucleon Effective Field Theory Interactions from the First Science Run of the LUX-ZEPLIN Experiment

(2024)

The LUX-ZEPLIN (LZ) experiment is a dual-phase xenon time projection chamber operating in the Sanford Underground Research Facility in South Dakota, USA. We report results of a relativistic extension to the nonrelativistic effective field theory (NREFT), using a 5.5 t fiducial mass and 60 live days of exposure. We present constraints on couplings from covariant interactions arising from the coupling of vector and axial currents, as well as electric dipole moments of the nucleon, to the magnetic and electric dipole moments of the weakly interacting massive particle, which cannot be obtained by recasting previous results described by an NREFT. Using a profile-likelihood ratio analysis in an energy region between 0 and 270 keV_nr, we report 90% confidence level exclusion limits on the coupling strength of five interactions in both the isoscalar and isovector bases.
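
As a reminder of the statistical machinery named above, the sketch below reduces a profile-likelihood-ratio upper limit to a single-bin counting experiment with a known expected background, using the asymptotic 90% CL threshold. The real LZ analysis is unbinned and profiles many nuisance parameters; the counts and background here are made-up illustrative numbers.

```python
# Highly simplified profile-likelihood-ratio upper limit for a one-bin
# Poisson counting experiment with known background (illustrative only).
import numpy as np
from scipy.optimize import brentq

n_obs, b = 3, 4.2          # observed events, expected background (made-up)

def nll(mu):
    """Negative log likelihood (up to constants) for signal expectation mu."""
    lam = mu + b
    return lam - n_obs * np.log(lam)

mu_hat = max(n_obs - b, 0.0)               # MLE, clipped to the physical region

def q(mu):
    """Profile likelihood ratio test statistic for an upper limit."""
    return 2.0 * (nll(mu) - nll(mu_hat))

# 90% CL one-sided limit: q(mu) = 2.71 in the asymptotic approximation.
mu_up = brentq(lambda mu: q(mu) - 2.71, mu_hat + 1e-9, 50.0)
print(f"90% CL upper limit on signal events: {mu_up:.2f}")
```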