eScholarship
Open Access Publications from the University of California

School of Information

UC Berkeley

Open Access Policy Deposits

This series is automatically populated with publications deposited by UC Berkeley School of Information researchers in accordance with the University of California’s open access policies. For more information see Open Access Policy Deposits and the UC Publication Management System.

The Silicon Valley-Hsinchu Connection: Technical Communities and Industrial Upgrading

(2001)

Silicon Valley in California and the Hsinchu-Taipei region of Taiwan are among the most frequently cited ‘miracles’ of the information technology era. The dominant accounts of these successes treat them in isolation, focusing either on free markets, multinationals or the state. This paper argues that the dynamism of these regional economies is attributable to their increasing interdependencies. A community of US-educated Taiwanese engineers has coordinated a decentralized process of reciprocal industrial upgrading by transferring capital, skill, and know-how and by facilitating collaboration between specialist producers in the two regions. This case underscores the significance of technical communities and their institutions in diffusing ideas and organizing production at the global as well as the local level.

Abortion access barriers shared in r/abortion after Roe: a qualitative analysis of a Reddit community post-Dobbs decision leak in 2022.

(2024)

With drastic changes to abortion policy, the months following the Dobbs leak and subsequent decision in 2022 were a uniquely uncertain and difficult time for abortion access in the United States. To understand experiences of challenges to abortion access during that time, we used a hybrid inductive and deductive thematic coding approach to analyse descriptions of barriers and their impacts shared in an abortion subreddit (r/abortion). A simple random sample of 10% of posts was obtained from those shared from 02 May 2022 through 23 December 2022; comments were purposively sampled during the coding process. In this sample of submissions (n = 523 posts, 88 comments), people described structural barriers identified in past research, including state abortion bans and gestational limits, high costs, limited appointment availability, and long travel required. Posters also commonly described known social barriers, including limited social support and abortion stigma. Several impactful barriers not well-described in past research emerged inductively, including wait time for receiving mail-ordered abortion medication, low credibility of online ordering platforms, and concerns about legal risks of accessing abortion or related medical care. The most common consequences of experiencing barriers were adverse mental health outcomes, delayed access to care, and being compelled to self-manage their abortion because of access barriers. This analysis provides timely insights into the experiences and impacts of abortion access barriers in a group of people with a range of engagement with clinical abortion care, lived experiences, and points in their abortion processes, with public health implications for mental health and abortion access.

Privacy guarantees for personal mobility data in humanitarian response.

(2024)

Personal mobility data from mobile phones and other sensors are increasingly used to inform policymaking during pandemics, natural disasters, and other humanitarian crises. However, even aggregated mobility traces can reveal private information about individual movements to potentially malicious actors. This paper develops and tests an approach for releasing private mobility data, which provides formal guarantees over the privacy of the underlying subjects. Specifically, we (1) introduce an algorithm for constructing differentially private mobility matrices and derive privacy and accuracy bounds on this algorithm; (2) use real-world data from mobile phone operators in Afghanistan and Rwanda to show how this algorithm can enable the use of private mobility data in two high-stakes policy decisions: pandemic response and the distribution of humanitarian aid; and (3) discuss practical decisions that need to be made when implementing this approach, such as how to optimally balance privacy and accuracy. Taken together, these results can help enable the responsible use of private mobility data in humanitarian response.
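As a rough illustration of what a differentially private mobility matrix can look like, the sketch below adds Laplace noise to an origin-destination count matrix. This is the standard Laplace mechanism, not necessarily the algorithm introduced in the paper. The function name `private_mobility_matrix`, the parameter names, the example counts, and the assumption that one person changes the matrix by at most `sensitivity` in total are all hypothetical.

```python
import random


def private_mobility_matrix(counts, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace noise to an origin-destination count matrix.

    counts[i][j] is the number of trips from region i to region j.
    epsilon is the privacy budget (smaller means more noise).
    sensitivity is the ASSUMED maximum total change in the matrix
    caused by adding or removing one person.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon  # Laplace scale parameter b
    noisy = []
    for row in counts:
        noisy_row = []
        for count in row:
            # The difference of two i.i.d. exponentials with mean b
            # is a Laplace(0, b) sample.
            noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
            noisy_row.append(count + noise)
        noisy.append(noisy_row)
    return noisy


# Made-up counts for two regions, for illustration only.
counts = [[120, 30], [45, 200]]
noisy = private_mobility_matrix(counts, epsilon=0.5, seed=0)
```

Lowering `epsilon` strengthens the privacy guarantee at the cost of noisier counts, which is the privacy-accuracy trade-off the abstract refers to.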

Early detection of pediatric health risks using maternal and child health data.

(2024)

Machine learning (ML)-driven diagnosis systems are particularly relevant in pediatrics given the well-documented impact of early-life health conditions on later-life outcomes. Yet, early identification of diseases and their subsequent impact on length of hospital stay for this age group has so far remained uncharacterized, likely because access to relevant health data is severely limited. Thanks to a confidential data use agreement with the California Department of Health Care Access and Information, we introduce Ped-BERT: a state-of-the-art deep learning model that accurately predicts the likelihood of 100+ conditions and the length of stay in a pediatric patient's next medical visit. We link mother-specific pre- and postnatal period health information to pediatric patient hospital discharge and emergency room visits. Our data set comprises 513.9K mother-baby pairs and contains medical diagnosis codes, length of stay, as well as temporal and spatial pediatric patient characteristics, such as age and residency zip code at the time of visit. Following the popular bidirectional encoder representations from transformers (BERT) approach, we pre-train Ped-BERT via the masked language modeling objective to learn embedding features for the diagnosis codes contained in our data. We then continue to fine-tune our model to accurately predict primary diagnosis outcomes and length of stay for a pediatric patient's next visit, given the history of previous visits and, optionally, the mother's pre- and postnatal health information. We find that Ped-BERT generally outperforms contemporary and state-of-the-art classifiers when trained with minimum features. We also find that incorporating mother health attributes leads to significant improvements in model performance overall and across all patient subgroups in our data.
Our most successful Ped-BERT model configuration achieves an area under the receiver operating characteristic curve (ROC AUC) of 0.927 and an average precision score (APS) of 0.408 for the diagnosis prediction task, and a ROC AUC of 0.855 and APS of 0.815 for the length of hospital stay task. Further, we examine Ped-BERT's fairness by determining whether prediction errors are evenly distributed across various subgroups of mother-baby demographics and health characteristics, or if certain subgroups exhibit a higher susceptibility to prediction errors.

Relevance and creativity – a linear model

(2024)

Purpose: The purpose of this paper is to provide a new and useful formulation of relevance. Design/methodology/approach: This paper is formulated as a conceptual argument. It makes the case for the utility of considering relevance to be a function of use in creative processes. Findings: There are several corollaries to formulating relevance as a function of use. These include the idea that objects by themselves cannot be relevant since use assumes interaction; the affordances of objects and how they are perceived can affect what becomes relevant but are not in themselves relevant; relevance is not an essential characteristic of objects; relevance is transient; potential relevance (what might be relevant in the future) can be distinguished from what is relevant in use and from what has been relevant in the past. Originality/value: The paper shows that its new formulation of relevance brings improved conceptual and terminological clarity to the discourse about relevance in information science. It demonstrates that how relevance is articulated conceptually is important as its conceptualization can affect the ways that users are able to make use of information systems and, by extension, how information systems can facilitate or disable the co-production of creative outcomes. The paper also usefully expands investigative opportunities by suggesting relevance and creativity are interrelated.

Gait Event Detection and Travel Distance Using Waist-Worn Accelerometers across a Range of Speeds: Automated Approach.

(2024)

Estimation of temporospatial clinical features of gait (CFs), such as step count and length, step duration, step frequency, gait speed, and distance traveled, is an important component of community-based mobility evaluation using wearable accelerometers. However, accurate unsupervised computerized measurement of CFs of individuals with Duchenne muscular dystrophy (DMD) who have progressive loss of ambulatory mobility is difficult due to differences in patterns and magnitudes of acceleration across their range of attainable gait velocities. This paper proposes a novel calibration method. It aims to detect steps, estimate stride lengths, and determine travel distance. The approach involves a combination of clinical observation, machine-learning-based step detection, and regression-based stride length prediction. The method demonstrates high accuracy in children with DMD and typically developing controls (TDs) regardless of the participants' level of ability. Fifteen children with DMD and fifteen TDs underwent supervised clinical testing across a range of gait speeds using 10 m or 25 m run/walk (10 MRW, 25 MRW), 100 m run/walk (100 MRW), 6-min walk (6 MWT), and free-walk (FW) evaluations while wearing a mobile-phone-based accelerometer at the waist near the body's center of mass. Following calibration by a trained clinical evaluator, CFs were extracted from the accelerometer data using a multi-step machine-learning-based process and the results were compared to ground-truth observation data. Model predictions vs. observed values for step counts, distance traveled, and step length showed a strong correlation (Pearson's r = -0.9929 to 0.9986, p < 0.0001). The estimates demonstrated a mean (SD) percentage error of 1.49% (7.04%) for step counts, 1.18% (9.91%) for distance traveled, and 0.37% (7.52%) for step length compared to ground-truth observations for the combined 6 MWT, 100 MRW, and FW tasks.
Our study findings indicate that a single waist-worn accelerometer calibrated to an individual's stride characteristics using our methods accurately measures CFs and estimates travel distances across a common range of gait speeds in both DMD and TD peers.
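The summary statistics quoted in this abstract, mean (SD) percentage error and Pearson's r, can be computed along the following lines. This is a minimal sketch under stated assumptions: the error is taken to be signed and relative to the observed value, the SD is the sample standard deviation, and the step counts are invented for illustration. None of these details or numbers come from the study.

```python
import math
import statistics


def percentage_errors(predicted, observed):
    """Signed percentage error of each prediction relative to its observation."""
    return [100.0 * (p - o) / o for p, o in zip(predicted, observed)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Made-up step counts, for illustration only (not study data).
observed_steps = [100, 200, 400]
predicted_steps = [102, 196, 404]

errors = percentage_errors(predicted_steps, observed_steps)
mean_error = statistics.fmean(errors)  # mean percentage error
sd_error = statistics.stdev(errors)    # sample standard deviation
r = pearson_r(predicted_steps, observed_steps)
```

A small mean with a larger SD, as in the figures reported above, indicates that over- and under-estimates largely cancel on average even though individual estimates vary.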

Gait Characterization in Duchenne Muscular Dystrophy (DMD) Using a Single-Sensor Accelerometer: Classical Machine Learning and Deep Learning Approaches.

(2024)

Differences in gait patterns of children with Duchenne muscular dystrophy (DMD) and typically developing (TD) peers are visible to the eye, but quantifications of those differences outside of the gait laboratory have been elusive. In this work, we measured vertical, mediolateral, and anteroposterior acceleration using a waist-worn iPhone accelerometer during ambulation across a typical range of velocities. Fifteen TD and fifteen DMD children from 3 to 16 years of age underwent eight walking/running activities, including five 25 m walk/run speed-calibration tests at a slow walk to running speeds (SC-L1 to SC-L5), a 6-min walk test (6MWT), a 100 m fast walk/jog/run (100MRW), and a free walk (FW). For clinical anchoring purposes, participants completed a North Star Ambulatory Assessment (NSAA). We extracted temporospatial gait clinical features (CFs) and applied multiple machine learning (ML) approaches to differentiate between DMD and TD children using extracted temporospatial gait CFs and raw data. Extracted temporospatial gait CFs showed reduced step length and a greater mediolateral component of total power (TP) consistent with shorter strides and Trendelenburg-like gait commonly observed in DMD. ML approaches using temporospatial gait CFs and raw data varied in effectiveness at differentiating between DMD and TD controls at different speeds, with an accuracy of up to 100%. We demonstrate that by using ML with accelerometer data from a consumer-grade smartphone, we can capture DMD-associated gait characteristics in toddlers to teens.

Search for quantum black hole production in lepton+jet final states using proton-proton collisions at √s = 13 TeV with the ATLAS detector

(2024)

A search for quantum black holes in electron + jet and muon + jet invariant mass spectra is performed with 140 fb⁻¹ of data collected by the ATLAS detector in proton-proton collisions at √s = 13 TeV at the Large Hadron Collider. The observed invariant mass spectrum of lepton + jet pairs is consistent with Standard Model expectations. Upper limits are set at 95% confidence level on the production cross section times branching fractions for quantum black holes decaying into a lepton and a quark in a search region with invariant mass above 2.0 TeV. The resulting quantum black hole lower mass threshold limit is 9.2 TeV in the Arkani-Hamed-Dimopoulos-Dvali model, and 6.8 TeV in the Randall-Sundrum model.