Article
Peer Reviewed

The global lake area, climate, and population dataset

UC San Francisco Previously Published Works (2020)

An increasing population in conjunction with a changing climate necessitates a detailed understanding of water abundance at multiple spatial and temporal scales. Remote sensing has provided massive data volumes to track fluctuations in water quantity, yet contextualizing water abundance with other local, regional, and global trends remains challenging because it often requires large computational resources to combine multiple data sources into analytically friendly formats. To bridge this gap and facilitate future freshwater research opportunities, we harmonized existing global datasets to create the Global Lake area, Climate, and Population (GLCP) dataset. The GLCP is a compilation of lake surface area for 1.42+ million lakes and reservoirs of at least 10 ha in size from 1995 to 2015 with co-located basin-level temperature, precipitation, and population data. The GLCP was created with FAIR (findable, accessible, interoperable, reusable) data principles in mind and retains unique identifiers from parent datasets to expedite interoperability. The GLCP offers critical data for basic and applied investigations of lake surface area and water quantity at local, regional, and global scales.

Creative Commons 'BY' version 4.0 license

Article
Peer Reviewed

Toward Enhanced Reusability: A Comparative Analysis of Metadata for Machine Learning Objects and Their Characteristics in Generalist and Specialist Repositories

UC San Diego Previously Published Works (2024)

Objective: The rapidly increasing prevalence and application of machine learning (ML) across disciplines creates a pressing need to establish guidance for data curation professionals. However, we must first understand the characteristics of ML-related objects shared in generalist and specialist repositories and the extent to which repository metadata fields enable findability and reuse of ML objects. Methods: We used a combination of API queries and web scraping to retrieve metadata for ML objects in eight commonly used generalist and ML-specific data repositories. We assessed both metadata schema and characteristics of deposited ML objects, within the context of the widely adopted FAIR Principles. We also calculated summary statistics for properties of objects, including number of objects per year, dataset size, domains represented, and availability of related resources. Results: Generalist repositories excelled at providing provenance metadata, specifically unique identifiers, unambiguous citations, clear licenses, and related resources, while specialist repositories emphasized ML-specific descriptive metadata, such as number of attributes and instances and task type. In terms of object content, we noted a wide range of file formats, as well as licenses, all of which impact reusability. Conclusions: Generalist repositories will benefit from some of the practices adopted by specialists, and specialist repositories will benefit from adopting proven data curation practices of generalist repositories. A step forward for repositories will be to invest more into use of labels and persistent identifiers to improve workflow documentation, provenance, and related resource linking of ML objects, which will increase their findability, interoperability, and reusability.

Article
Peer Reviewed

The Most Disruptive Publications in Plastic Surgery

UC San Diego Previously Published Works (2021)

Creative Commons 'BY-NC-ND' version 4.0 license

Article
Peer Reviewed

Immune response to intravenous immunoglobulin in patients with Kawasaki disease and MIS-C

UC San Diego Previously Published Works (2021)

BACKGROUND: Multisystem inflammatory syndrome in children (MIS-C) is a rare but potentially severe illness that follows exposure to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Kawasaki disease (KD) shares several clinical features with MIS-C, which prompted the use of intravenous immunoglobulin (IVIG), a mainstay therapy for KD. Both diseases share a robust activation of the innate immune system, including the IL-1 signaling pathway, and IL-1 blockade has been used for the treatment of both MIS-C and KD. The mechanism of action of IVIG in these 2 diseases and the cellular source of IL-1β have not been defined. METHODS: The effects of IVIG on peripheral blood leukocyte populations from patients with MIS-C and KD were examined using flow cytometry and mass cytometry (CyTOF) and live-cell imaging. RESULTS: Circulating neutrophils were highly activated in patients with KD and MIS-C and were a major source of IL-1β. Following IVIG treatment, activated IL-1β+ neutrophils were reduced in the circulation. In vitro, IVIG was a potent activator of neutrophil cell death via PI3K and NADPH oxidase, but independently of caspase activation. CONCLUSIONS: Activated neutrophils expressing IL-1β can be targeted by IVIG, supporting its use in both KD and MIS-C to ameliorate inflammation. FUNDING: Patient Centered Outcomes Research Institute; NIH; American Asthma Foundation; American Heart Association; Novo Nordisk Foundation; NIGMS; American Academy of Allergy, Asthma and Immunology Foundation.

Article
Peer Reviewed

Skills and Knowledge for Data-Intensive Environmental Research

UC Berkeley Previously Published Works (2017)

The scale and magnitude of complex and pressing environmental issues lend urgency to the need for integrative and reproducible analysis and synthesis, facilitated by data-intensive research approaches. However, the recent pace of technological change has been such that appropriate skills to accomplish data-intensive research are lacking among environmental scientists, who more than ever need greater access to training and mentorship in computational skills. Here, we provide a roadmap for raising data competencies of current and next-generation environmental researchers by describing the concepts and skills needed for effectively engaging with the heterogeneous, distributed, and rapidly growing volumes of available data. We articulate five key skills: (1) data management and processing, (2) analysis, (3) software skills for science, (4) visualization, and (5) communication methods for collaboration and dissemination. We provide an overview of the current suite of training initiatives available to environmental scientists and models for closing the skill-transfer gap.
