eScholarship
Open Access Publications from the University of California

Department of Biostatistics

Open Access Policy Deposits

This series is automatically populated with publications deposited by UCLA Fielding School of Public Health Department of Biostatistics researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.


Predictors of Short-Term Outcomes after Syncope: A Systematic Review and Meta-Analysis

(2018)

Introduction: We performed a systematic review and meta-analysis to identify predictors of serious clinical outcomes after an acute-care evaluation for syncope.

Methods: We identified studies that assessed for predictors of short-term (≤30 days) serious clinical events after an emergency department (ED) visit for syncope. We performed a MEDLINE search (January 1, 1990 - July 1, 2017) and reviewed reference lists of retrieved articles. The primary outcome was the occurrence of a serious clinical event (composite of mortality, arrhythmia, ischemic or structural heart disease, major bleed, or neurovascular event) within 30 days. We estimated the sensitivity, specificity, and likelihood ratio of findings for the primary outcome. We created summary estimates of association on a variable-by-variable basis using a Bayesian random-effects model.
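The likelihood ratios used as the measures of association follow directly from each finding's sensitivity and specificity; a minimal sketch of that computation (the function name is ours, not the authors'):

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity/specificity."""
    lr_pos = sensitivity / (1 - specificity)        # LR+ = sens / (1 - spec)
    lr_neg = (1 - sensitivity) / specificity        # LR- = (1 - sens) / spec
    return lr_pos, lr_neg

# Example: a finding with 50% sensitivity and 80% specificity
lr_pos, lr_neg = likelihood_ratios(0.50, 0.80)
print(round(lr_pos, 2), round(lr_neg, 2))  # 2.5 0.62
```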

Results: We reviewed 2,773 unique articles; 17 met inclusion criteria. The clinical findings most predictive of a short-term, serious event were the following: 1) an elevated blood urea nitrogen level (positive likelihood ratio [LR+]: 2.86, 95% confidence interval [CI] [1.15, 5.42]); 2) a history of congestive heart failure (LR+: 2.65, 95% CI [1.69, 3.91]); 3) initial low blood pressure in the ED (LR+: 2.62, 95% CI [1.12, 4.9]); 4) a history of arrhythmia (LR+: 2.32, 95% CI [1.31, 3.62]); and 5) an abnormal troponin value (LR+: 2.49, 95% CI [1.36, 4.1]). Younger age was associated with lower risk (LR-: 0.44, 95% CI [0.25, 0.68]). An abnormal electrocardiogram was mildly predictive of increased risk (LR+: 1.79, 95% CI [1.14, 2.63]).

Conclusion: We identified specific risk factors that may aid clinical judgment and that should be considered in the development of future risk-prediction tools for serious clinical events after an ED visit for syncope.

  • 3 supplemental files

Estimating the Cost of Care for Emergency Department Syncope Patients: Comparison of Three Models

(2017)

Introduction: We sought to compare three hospital cost estimation models against actual hospital cost data for patients undergoing evaluation for unexplained syncope. Developing such a model would allow researchers to assess the value of novel clinical algorithms for syncope management.

Methods: Complete health services data, including disposition, testing, and length of stay (LOS), were collected on 67 adult patients (age 60 years and older) who presented to the Emergency Department (ED) with syncope at a single hospital. Patients were excluded if a serious medical condition was identified. Three hospital cost estimation models were created to estimate facility costs: V1, unadjusted Medicare payments for observation and/or hospital admission; V2, modified Medicare payment prorated by LOS in calendar days; and V3, modified Medicare payment prorated by LOS in hours. Total hospital costs included unadjusted Medicare payments for diagnostic testing and estimated facility costs. These estimates were plotted against actual cost data from the hospital finance department. Correlation and regression analyses were performed.
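The three proration rules can be sketched as below; the function and its baseline LOS values are illustrative assumptions, not the study's actual payment schedule:

```python
import math

def facility_cost(medicare_payment, los_hours, model="V3"):
    """Hypothetical sketch of the three facility-cost estimation rules.

    The baseline LOS values (2 days / 48 hours) are made-up stand-ins for
    the typical stay the Medicare payment is assumed to cover.
    """
    if model == "V1":               # unadjusted Medicare payment
        return medicare_payment
    if model == "V2":               # prorated by LOS in calendar days
        baseline_days = 2
        los_days = math.ceil(los_hours / 24)
        return medicare_payment * los_days / baseline_days
    if model == "V3":               # prorated by LOS in hours
        baseline_hours = 48
        return medicare_payment * los_hours / baseline_hours
    raise ValueError(f"unknown model: {model}")

# A 36-hour stay against a $6,000 base payment under each model
print(facility_cost(6000, 36, "V1"), facility_cost(6000, 36, "V3"))
```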

Results: Of the three models, V3 consistently outperformed the others with regard to correlation and goodness of fit. The Pearson correlation coefficient for V3 was 0.88 (95% Confidence Interval 0.81, 0.92) with an R-square value of 0.77 and a linear regression coefficient of 0.87 (95% Confidence Interval 0.76, 0.99).

Conclusion: Using basic health services data, it is possible to accurately estimate hospital costs for older adults undergoing a hospital-based evaluation for unexplained syncope. This methodology could help assess the potential economic impact of implementing novel clinical algorithms for ED syncope. 

  • 2 supplemental files

A Risk Score to Predict Short-Term Outcomes Following Emergency Department Discharge

(2018)

Introduction: The emergency department (ED) is an inherently high-risk setting. Risk scores can help practitioners understand the risk of ED patients for developing poor outcomes after discharge. Our objective was to develop two risk scores that predict either general inpatient admission or death/intensive care unit (ICU) admission within seven days of ED discharge.

Methods: We conducted a retrospective cohort study of patients age > 65 years using clinical data from a regional, integrated health system for years 2009-2010 to create risk scores to predict two outcomes, a general inpatient admission or death/ICU admission. We used logistic regression to predict the two outcomes based on age, body mass index, vital signs, Charlson comorbidity index (CCI), ED length of stay (LOS), and prior inpatient admission.
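A common way to turn such logistic-regression models into bedside risk scores is to scale the coefficients so the smallest maps to one point and round the rest; the sketch below uses made-up coefficients, not the study's estimates:

```python
# Hypothetical log-odds coefficients for illustration only
coefficients = {
    "age_85_plus": 0.45,
    "high_cci": 1.10,
    "prolonged_ed_los": 0.95,
    "prior_admission": 0.60,
}

def to_points(coefs, scale=None):
    """Scale coefficients so the smallest maps to 1 point, then round
    each to the nearest integer point value."""
    scale = scale or min(coefs.values())
    return {name: round(beta / scale) for name, beta in coefs.items()}

print(to_points(coefficients))
```

A patient's total score is then the sum of points for the risk factors present, which can be mapped back to a predicted probability band.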

Results: Of 104,025 ED visit discharges, 4,638 (4.5%) experienced a general inpatient admission and 531 (0.5%) death or ICU admission within seven days of discharge. Risk factors with the greatest point value for either outcome were a high CCI score and a prolonged ED LOS. The C-statistics were 0.68 and 0.76 for the two models.

Conclusion: Using data from an integrated health system, risk scores were successfully created for both outcomes: general inpatient admission and death/ICU admission. Patients who accrued the most points, and thus the greatest risk, presented to the ED with a high number of comorbidities and required prolonged ED evaluations.

  • 1 supplemental file

Enhancing epigenetic aging clocks in cetaceans: accurate age estimations in small endangered delphinids, killer whales, pilot whales, belugas, humpbacks, and bowhead whales.

(2025)

This study presents refined epigenetic clocks for cetaceans, building on previous research that estimated ages in several species, from bottlenose dolphins to bowhead and humpback whales, using cytosine methylation levels. We combined publicly available data (generated on the HorvathMammalMethylChip40 platform) from skin (n = 805) and blood (n = 286) samples across 13 cetacean species, aged 0 to 139 years. By combining methylation data from different sources, we enhanced our sample size, thereby strengthening the statistical validity of our clocks. We used elastic net regression with leave-one-sample-out (LOO) and leave-one-species-out (LOSO) cross-validation to produce highly accurate blood-only (Median Absolute Error [MAE] = 1.64 years, r = 0.96), skin-only (MAE = 2.32 years, r = 0.94), and blood-and-skin multi-tissue (MAE = 2.24 years, r = 0.94) clocks. In addition, the LOSO blood-and-skin (MAE = 5.6 years, repeated measures r = 0.83), skin-only (MAE = 6.22 years, repeated measures r = 0.81), and blood-only (MAE = 4.11 years, repeated measures r = 0.95) clock analyses demonstrated relatively high correlation for cetacean species not included in the current data set, providing evidence for a broader application of this model. Our results introduce a multi-species, two-tissue clock for broader applicability across cetaceans, alongside single-tissue multi-species clocks for blood and skin, which allow for more detailed aging analysis depending on the availability of samples. In addition, we developed species-specific clocks for enhanced precision, resulting in four blood-specific clocks and eight skin-specific clocks for individual species, all improving upon existing accuracy estimates for previously published species-specific clocks. By pooling methylation data from various studies, we increased our sample size, significantly enhancing the statistical power for building accurate clocks.
These new epigenetic age estimators for cetaceans provide more accurate tools for aiding in conservation efforts of endangered cetaceans.
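The leave-one-species-out validation can be illustrated with a generic group-splitting routine, independent of the elastic net model itself (the function name and labels are ours):

```python
def leave_one_species_out(species_labels):
    """Yield (held-out species, train indices, test indices) so that each
    species is evaluated by a clock fit on all the other species.

    species_labels: one label per sample, mirroring the LOSO validation
    described above.
    """
    for sp in sorted(set(species_labels)):
        test = [i for i, s in enumerate(species_labels) if s == sp]
        train = [i for i, s in enumerate(species_labels) if s != sp]
        yield sp, train, test

labels = ["orca", "orca", "beluga", "humpback", "beluga"]
for sp, train, test in leave_one_species_out(labels):
    print(sp, train, test)
```

Each split would be passed to the regression fit, so reported errors reflect prediction on a species the clock has never seen.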


Health- and Non-Health-Related Corporate Social Responsibility Statements in Top Selling Restaurant Chains in the U.S. Between 2012 and 2018: A Content Analysis.

(2025)

INTRODUCTION: The aim of this study was to understand the prevalence and content of corporate social responsibility statements in the top-selling chain restaurants between 2012 and 2018 to inform the ways restaurants can impact population health. METHODS: The study used a web scraping technique to abstract relevant text information (n=6,369 text sections that contained possible corporate social responsibility statements or thematically coded portions of the text section) from the archived web pages of the 96 top-selling chain restaurants. Content analysis was used to identify key themes in corporate social responsibility statements across restaurants and over time. All data were abstracted, and analyses were completed between November 2019 and November 2023. RESULTS: The majority of restaurants (68.8%) included a corporate social responsibility statement on their web pages between 2012 and 2018, and approximately half of the restaurants featured a health-related corporate social responsibility statement (51.0%). There were increases in corporate social responsibility statements by chain restaurants over the study period from 186 corporate social responsibility statements in 2012 to 1,218 corporate social responsibility statements in 2018, with most statements focused on philanthropy (37.1% of coded statements), community activities that were not health related (18.4% of coded statements), and sustainability initiatives (18.3% of coded statements). Only one quarter (24.4%) of these corporate social responsibility statements were health related, and many were vague in nature (only 28% of the eligible statements could be coded by theme). CONCLUSIONS: There is a need for more actionable health-focused initiatives in the corporate social responsibility statements for chain restaurants. 
Public health initiatives that engage with the restaurant industry should work to promote corporate social responsibility statements that are in line with other collective positions around improving health and reducing diet-related disease.


Objective study validity diagnostics: a framework requiring pre-specified, empirical verification to increase trust in the reliability of real-world evidence.

(2025)

OBJECTIVE: Propose a framework to empirically evaluate and report validity of findings from observational studies using pre-specified objective diagnostics, increasing trust in real-world evidence (RWE). MATERIALS AND METHODS: The framework employs objective diagnostic measures to assess the appropriateness of study designs, analytic assumptions, and threats to validity in generating reliable evidence addressing causal questions. Diagnostic evaluations should be interpreted before the unblinding of study results or, alternatively, only unblind results from analyses that pass pre-specified thresholds. We provide a conceptual overview of objective diagnostic measures and demonstrate their impact on the validity of RWE from a large-scale comparative new-user study of various antihypertensive medications. We evaluated expected absolute systematic error (EASE) before and after applying diagnostic thresholds, using a large set of negative control outcomes. RESULTS: Applying objective diagnostics reduces bias and improves evidence reliability in observational studies. Among 11,716 analyses (EASE = 0.38), 13.9% met pre-specified diagnostic thresholds, which reduced EASE to zero. Objective diagnostics provide a comprehensive and empirical set of tests that increase confidence when passed and raise doubts when failed. DISCUSSION: The increasing use of real-world data presents a scientific opportunity; however, the complexity of the evidence generation process poses challenges for understanding study validity and trusting RWE. Deploying objective diagnostics is crucial to reducing bias and improving reliability in RWE generation. Under ideal conditions, multiple study designs pass diagnostics and generate consistent results, deepening understanding of causal relationships. Open-source, standardized programs can facilitate implementation of diagnostic analyses. CONCLUSION: Objective diagnostics are a valuable addition to the RWE generation process.
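A crude analogue of the EASE diagnostic can be computed from negative-control estimates, whose true effect is null; the published method additionally fits a systematic-error distribution that separates out sampling error, which this sketch deliberately ignores:

```python
def expected_absolute_systematic_error(log_estimates):
    """Simplified EASE analogue: mean absolute log effect estimate across
    negative control outcomes, whose true log effect is 0.

    The full diagnostic models a systematic-error distribution and
    accounts for each estimate's sampling error; this sketch does not.
    """
    return sum(abs(x) for x in log_estimates) / len(log_estimates)

# Log hazard-ratio estimates for hypothetical negative control outcomes
neg_controls = [0.10, -0.05, 0.30, 0.00, -0.20]
print(round(expected_absolute_systematic_error(neg_controls), 2))  # 0.13
```

An analysis design would pass the diagnostic only if this quantity fell below a pre-specified threshold before results were unblinded.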

Categorization of 34 computational methods to detect spatially variable genes from spatially resolved transcriptomics data

(2025)

In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 34 state-of-the-art methods, classifying SVGs into three categories: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.
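Many overall-SVG tests reduce to measuring spatial autocorrelation of a gene's expression across spots; Moran's I is a classic such statistic, shown here as a generic illustration rather than any one reviewed method:

```python
def morans_i(values, weights):
    """Moran's I for expression values at n spots, given an n x n
    spatial-weight matrix (weights[i][j] > 0 when spots i, j are
    neighbours). Positive I indicates spatial clustering."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Four spots on a line; high expression clusters on one side, so this
# toy "gene" should show positive spatial autocorrelation.
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
print(morans_i([1.0, 1.2, 3.0, 3.1], w) > 0)  # True
```

Actual SVG methods differ chiefly in how they model the null distribution of such statistics, which is where the generality-specificity trade-off discussed above arises.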

Integrating dynamical modeling and phylogeographic inference to characterize global influenza circulation.

(2025)

Global seasonal influenza circulation involves a complex interplay between local factors (seasonality, demography, host immunity) and global factors (international mobility) shaping recurrent epidemic patterns. No studies so far have reconciled the two spatial levels by evaluating the coupling between national epidemics, accounting for heterogeneous coverage of epidemiological and virological data, and integrating different data sources. We propose a novel combined approach based on a dynamical model of global influenza spread (GLEAM) that integrates high-resolution demographic and mobility data, and a generalized linear model of phylogeographic diffusion that accounts for time-varying migration rates. Seasonal migration fluxes across countries simulated with GLEAM are tested as phylogeographic predictors to provide model validation and calibration based on genetic data. Seasonal fluxes obtained with a specific transmissibility peak time and recurrent travel outperformed the raw air-transportation predictor, previously considered the optimal indicator of global influenza migration. Influenza A subtypes supported an autumn-winter reproductive number as high as 2.25 and an average immunity duration of 2 years. Similar dynamics were preferred by influenza B lineages, with a lower autumn-winter reproductive number. Comparing simulated epidemic profiles against FluNet data offered comparatively limited resolution power. The multiscale approach enables model selection, yielding a novel computational framework for describing global influenza dynamics at different scales: local transmission and national epidemics vs. international coupling through mobility and imported cases. Our findings have important implications for improving preparedness against seasonal influenza epidemics. The approach can be generalized to other epidemic contexts, such as emerging disease outbreaks, to improve the flexibility and predictive power of modeling.
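The local transmission dynamics can be illustrated with a single-patch discrete-time SIRS model using the abstract's point estimates (autumn-winter reproductive number 2.25, immunity waning over roughly two years); GLEAM itself couples many such patches through mobility data, which this sketch omits, and the infectious period here is an assumed three days:

```python
def sirs_step(s, i, r, beta, gamma, omega):
    """One day of a single-patch SIRS model on population fractions:
    susceptibles are infected at rate beta*s*i, infecteds recover at
    rate gamma, and immunity wanes back to susceptible at rate omega."""
    new_inf = beta * s * i
    new_rec = gamma * i
    new_sus = omega * r
    return s - new_inf + new_sus, i + new_inf - new_rec, r + new_rec - new_sus

# Illustrative parameters: R0 = beta/gamma = 2.25 with an assumed 3-day
# infectious period; immunity wanes over ~2 years (730 days).
gamma, omega = 1 / 3, 1 / 730
beta = 2.25 * gamma
s, i, r = 0.999, 0.001, 0.0
for _ in range(200):
    s, i, r = sirs_step(s, i, r, beta, gamma, omega)
print(r > 0.5)  # most of the population has been infected after the wave
```

With R0 = 2.25 the classical final-size relation predicts roughly 85% of a fully susceptible population is infected, which the simulation reproduces up to slow waning.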


Toward optimal disease surveillance with graph-based active learning.

(2024)

Tracking the spread of emerging pathogens is critical to the design of timely and effective public health responses. Policymakers face the challenge of allocating finite resources for testing and surveillance across locations, with the goal of maximizing the information obtained about the underlying trends in prevalence and incidence. We model this decision-making process as an iterative node classification problem on an undirected and unweighted graph, in which nodes represent locations and edges represent movement of infectious agents among them. To begin, a single node is randomly selected for testing and determined to be either infected or uninfected. Test feedback is then used to update estimates of the probability of unobserved nodes being infected and to inform the selection of nodes for testing in subsequent iterations, until a fixed test budget is exhausted. Using this framework, we evaluate and compare the performance of previously developed active learning policies for node selection, including Node Entropy and Bayesian Active Learning by Disagreement. We explore the performance of these policies under different outbreak scenarios using simulated outbreaks on both synthetic and empirical networks. Further, we propose a policy that considers the distance-weighted average entropy of infection predictions among the neighbors of each candidate node. Our proposed policy outperforms existing ones in most outbreak scenarios given small test budgets, highlighting the need to consider an exploration-exploitation trade-off in policy design. Our findings could inform the design of cost-effective surveillance strategies for emerging and endemic pathogens and reduce uncertainties associated with early risk assessments in resource-constrained situations.
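The Node Entropy baseline can be sketched as testing the node whose current infection-probability estimate is most uncertain; the paper's proposed policy instead ranks candidates by the distance-weighted average entropy among their neighbors, which this simplified sketch does not implement:

```python
import math

def entropy(p):
    """Binary entropy (bits) of an infection probability."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def next_node_to_test(probabilities, tested):
    """Node Entropy policy sketch: among untested nodes, pick the one
    whose predicted infection probability is most uncertain (entropy
    is maximal at p = 0.5)."""
    candidates = [n for n in probabilities if n not in tested]
    return max(candidates, key=lambda n: entropy(probabilities[n]))

# Current estimates for five locations; C (p = 0.5) is most uncertain.
probs = {"A": 0.9, "B": 0.2, "C": 0.5, "D": 0.75, "E": 0.1}
print(next_node_to_test(probs, tested={"A"}))  # C
```

In the full framework, each test result would update the probability estimates over the graph before the next selection, repeating until the budget runs out.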


Estimation of genetic admixture proportions via haplotypes

(2024)

Estimation of ancestral admixture is essential for creating personal genealogies, studying human history, and conducting genome-wide association studies (GWAS). The following three primary methods exist for estimating admixture coefficients. The frequentist approach directly maximizes the binomial loglikelihood. The Bayesian approach adds a reasonable prior and samples the posterior distribution. Finally, the nonparametric approach decomposes the genotype matrix algebraically. Each approach scales successfully to datasets with a million individuals and a million single nucleotide polymorphisms (SNPs). Despite their variety, all current approaches assume independence between SNPs. Achieving independence requires LD (linkage disequilibrium) filtering before analysis. Unfortunately, this tactic loses valuable information and usually retains many SNPs still in LD. The present paper explores the option of explicitly incorporating haplotypes in ancestry estimation. Our program, HaploADMIXTURE, operates on adjacent SNP pairs and jointly estimates their haplotype frequencies along with admixture coefficients. This more complex strategy takes advantage of the rich information available in haplotypes and ultimately yields better admixture estimates and better clustering of real populations in curated datasets.
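In the classic per-SNP formulation that the frequentist approach maximizes, individual i's allele dosage at SNP j is Binomial(2, p_ij), where p_ij mixes ancestral allele frequencies by that individual's admixture proportions; HaploADMIXTURE's haplotype-pair extension replaces the per-SNP binomial with joint haplotype frequencies. A minimal sketch of the per-SNP loglikelihood (variable names ours):

```python
import math

def admixture_loglik(genotypes, q, f):
    """Binomial log-likelihood of the per-SNP admixture model.

    genotypes[i][j]: allele dosage (0, 1, or 2) for individual i, SNP j
    q[i][k]: admixture proportion of individual i from population k
    f[k][j]: allele frequency of SNP j in ancestral population k
    Assumes every mixture probability lies strictly in (0, 1).
    """
    ll = 0.0
    for i, row in enumerate(genotypes):
        for j, g in enumerate(row):
            p = sum(q[i][k] * f[k][j] for k in range(len(f)))
            ll += (g * math.log(p)
                   + (2 - g) * math.log(1 - p)
                   + math.log(math.comb(2, g)))  # binomial coefficient
    return ll

q = [[0.7, 0.3]]            # one individual, two ancestral populations
f = [[0.9, 0.1],            # allele frequencies in population 1 (2 SNPs)
     [0.2, 0.5]]            # allele frequencies in population 2
print(round(admixture_loglik([[2, 0]], q, f), 3))  # -1.239
```

Maximizing this function over q (and f) subject to simplex constraints recovers the admixture coefficients; LD between adjacent SNPs is exactly what this per-SNP factorization ignores.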