- Mayo, CS;
- Feng, MU;
- Brock, KK;
- Kudner, R;
- Balter, P;
- Buchsbaum, JC;
- Caissie, A;
- Covington, E;
- Daugherty, EC;
- Dekker, AL;
- Fuller, CD;
- Hallstrom, AL;
- Hong, DS;
- Hong, JC;
- Kamran, SC;
- Katsoulakis, E;
- Kildea, J;
- Krauze, AV;
- Kruse, JJ;
- McNutt, T;
- Mierzwa, M;
- Moreno, A;
- Palta, JR;
- Popple, R;
- Purdie, TG;
- Richardson, S;
- Sharp, GC;
- Shiraishi, S;
- Tarbox, L;
- Venkatesan, AM;
- Witztum, A;
- Woods, KE;
- Yao, J;
- Farahani, K;
- Aneja, S;
- Gabriel, PE;
- Hadjiiski, L;
- Ruan, D;
- Siewerdsen, JH;
- Bratt, S;
- Casagni, M;
- Chen, S;
- Christodouleas, J;
- DiDonato, A;
- Hayman, J;
- Kapoor, R;
- Kravitz, S;
- Sebastian, S;
- Von Siebenthal, M;
- Xiao, Y
Purpose
The ongoing lack of data standardization severely undermines the potential for automated learning from the vast amount of information routinely archived in electronic health records (EHRs), Radiation Oncology Information Systems (ROIS), treatment planning systems (TPSs), and other cancer care and outcomes databases. The effort sought to create a standardized ontology for clinical data, social determinants of health (SDOH), and other radiation oncology concepts and interrelationships.
Methods and materials
The American Association of Physicists in Medicine's (AAPM's) Big Data Science Committee (BDSC) was initiated in July 2019 to explore common ground from the stakeholders' collective experience of issues that typically compromise the formation of large inter- and intra-institutional databases from EHRs. The BDSC adopted an iterative, cyclical approach to engaging stakeholders beyond its membership to optimize the integration of diverse perspectives from the community.
Results
We developed the Operational Ontology for Oncology (O3), which identified 42 key elements, 359 attributes, 144 value sets, and 155 relationships, ranked by relative importance according to clinical significance, likelihood of availability in EHRs, or the ability to modify routine clinical processes to permit aggregation. Recommendations for the best use and development of O3 are provided for four constituencies: device manufacturers, centers of clinical care, researchers, and professional societies.
Conclusions
O3 is designed to extend and interoperate with existing global infrastructure and data science standards. Implementing these recommendations will lower the barriers to aggregating information that could be used to create large, representative, findable, accessible, interoperable, and reusable (FAIR) datasets supporting the scientific objectives of grant programs. The construction of comprehensive "real world" datasets and the application of advanced analytic techniques, including artificial intelligence (AI), hold the potential to revolutionize patient management and improve outcomes by leveraging increased access to information derived from larger, more representative datasets.