- Zook, Justin M;
- Hansen, Nancy F;
- Olson, Nathan D;
- Chapman, Lesley;
- Mullikin, James C;
- Xiao, Chunlin;
- Sherry, Stephen;
- Koren, Sergey;
- Phillippy, Adam M;
- Boutros, Paul C;
- Sahraeian, Sayed Mohammad E;
- Huang, Vincent;
- Rouette, Alexandre;
- Alexander, Noah;
- Mason, Christopher E;
- Hajirasouliha, Iman;
- Ricketts, Camir;
- Lee, Joyce;
- Tearle, Rick;
- Fiddes, Ian T;
- Barrio, Alvaro Martinez;
- Wala, Jeremiah;
- Carroll, Andrew;
- Ghaffari, Noushin;
- Rodriguez, Oscar L;
- Bashir, Ali;
- Jackman, Shaun;
- Farrell, John J;
- Wenger, Aaron M;
- Alkan, Can;
- Soylev, Arda;
- Schatz, Michael C;
- Garg, Shilpa;
- Church, George;
- Marschall, Tobias;
- Chen, Ken;
- Fan, Xian;
- English, Adam C;
- Rosenfeld, Jeffrey A;
- Zhou, Weichen;
- Mills, Ryan E;
- Sage, Jay M;
- Davis, Jennifer R;
- Kaiser, Michael D;
- Oliver, John S;
- Catalano, Anthony P;
- Chaisson, Mark JP;
- Spies, Noah;
- Sedlazeck, Fritz J;
- Salit, Marc
New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and contain 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
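To illustrate how a benchmark set with benchmark regions can be used to label false negatives and false positives, the following is a minimal Python sketch. It is not the consortium's comparison method: the `SV` record, the helper names, and the matching thresholds (`max_dist`, `min_size_ratio`) are all hypothetical choices made for illustration, and matching here uses only type, position and size rather than the full variant sequence. Only calls inside the benchmark regions are scored, so any query call there without a benchmark match counts as a putative false positive, and any benchmark call there without a query match counts as a false negative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    pos: int      # start position, in the same coordinate system as the regions
    svtype: str   # "INS" or "DEL"
    length: int   # size in bp (assumed > 0)


def in_regions(sv, regions):
    """True if the SV start lies inside any (chrom, start, end) half-open interval."""
    return any(c == sv.chrom and s <= sv.pos < e for c, s, e in regions)


def matches(a, b, max_dist=1000, min_size_ratio=0.7):
    """Illustrative match rule (assumed thresholds): same chromosome and type,
    nearby start positions, and similar sizes."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return False
    if abs(a.pos - b.pos) > max_dist:
        return False
    return min(a.length, b.length) / max(a.length, b.length) >= min_size_ratio


def compare(benchmark, query, regions):
    """Label query calls as TP/FP and unmatched benchmark calls as FN,
    restricted to the benchmark regions."""
    bench = [b for b in benchmark if in_regions(b, regions)]
    calls = [q for q in query if in_regions(q, regions)]
    unmatched = list(bench)  # benchmark calls not yet matched by a query call
    tp, fp = [], []
    for q in calls:
        hit = next((b for b in unmatched if matches(q, b)), None)
        if hit is None:
            fp.append(q)           # extra call inside the regions
        else:
            unmatched.remove(hit)  # each benchmark call is matched at most once
            tp.append(q)
    return {"TP": tp, "FP": fp, "FN": unmatched}


# Toy data (invented for illustration, not taken from the benchmark).
regions = [("chr1", 0, 100_000)]
benchmark = [
    SV("chr1", 1_000, "DEL", 300),
    SV("chr1", 5_000, "INS", 120),
    SV("chr1", 200_000, "DEL", 80),   # outside the regions, not scored
]
query = [
    SV("chr1", 1_010, "DEL", 290),    # matches the first benchmark deletion
    SV("chr1", 9_000, "INS", 60),     # no benchmark match inside the regions
    SV("chr1", 300_000, "DEL", 500),  # outside the regions, not scored
]

result = compare(benchmark, query, regions)
print({label: len(svs) for label, svs in result.items()})
```

Restricting both callsets to the regions before matching is the key design choice in this sketch: outside the regions the benchmark makes no claim of completeness, so unmatched calls there are neither counted as false positives nor as false negatives.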