- Deutsch, Eric W;
- Perez-Riverol, Yasset;
- Chalkley, Robert J;
- Wilhelm, Mathias;
- Tate, Stephen;
- Sachsenberg, Timo;
- Walzer, Mathias;
- Käll, Lukas;
- Delanghe, Bernard;
- Böcker, Sebastian;
- Schymanski, Emma L;
- Wilmes, Paul;
- Dorfer, Viktoria;
- Kuster, Bernhard;
- Volders, Pieter-Jan;
- Jehmlich, Nico;
- Vissers, Johannes PC;
- Wolan, Dennis W;
- Wang, Ana Y;
- Mendoza, Luis;
- Shofstahl, Jim;
- Dowsey, Andrew W;
- Griss, Johannes;
- Salek, Reza M;
- Neumann, Steffen;
- Binz, Pierre-Alain;
- Lam, Henry;
- Vizcaíno, Juan Antonio;
- Bandeira, Nuno;
- Röst, Hannes
The 2017 Dagstuhl Seminar on Computational Proteomics provided an opportunity for a broad discussion on the current state and future directions of the generation and use of peptide tandem mass spectrometry spectral libraries. The use of spectral libraries in proteomics is growing slowly, and multiple challenges must be addressed to further increase the adoption of spectral libraries and related techniques. The primary bottlenecks are the paucity of high-quality, comprehensive libraries and the general difficulty of integrating spectral library searching into existing workflows. There are several existing spectral library formats, but none captures a satisfactory level of metadata; therefore, a logical next improvement is to design a more advanced, Proteomics Standards Initiative-approved spectral library format that can encode all of the desired metadata. The group discussed a series of metadata requirements organized into three designations of completeness or quality, tentatively dubbed bronze, silver, and gold. The metadata can be organized at four levels of granularity: the collection (library) level, the individual entry (peptide ion) level, the peak (fragment ion) level, and the peak annotation level. The group also discussed strategies for encoding mass modifications in a consistent manner and the requirement for encoding high-quality, commonly seen, but as-yet-unidentified spectra. Related topics included strategies for comparing two spectra, techniques for generating representative spectra for a library, approaches for selecting optimal signature ions for targeted workflows, and issues surrounding the merging of two or more libraries into one. We present here a review of this field and the challenges that the community must address in order to accelerate the adoption of spectral libraries in routine analysis of proteomics datasets.
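
To make the four levels of metadata granularity concrete, the following is a minimal sketch of how they might nest as a data structure. It is not the proposed format itself: every class name, field name, and the `tier` attribute key are hypothetical, and the m/z and intensity values are placeholders rather than real measurements.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PeakAnnotation:
    """Peak annotation level: one interpretation of a fragment ion."""
    label: str  # hypothetical label style, e.g. "y1"
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class Peak:
    """Peak (fragment ion) level."""
    mz: float
    intensity: float
    annotations: List[PeakAnnotation] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class LibraryEntry:
    """Individual entry (peptide ion) level."""
    peptide: str
    charge: int
    peaks: List[Peak] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class SpectralLibrary:
    """Collection (library) level."""
    name: str
    entries: List[LibraryEntry] = field(default_factory=list)
    attributes: Dict[str, str] = field(default_factory=dict)


def count_annotated_peaks(library: SpectralLibrary) -> int:
    """Count peaks that carry at least one annotation."""
    return sum(
        1
        for entry in library.entries
        for peak in entry.peaks
        if peak.annotations
    )


# Placeholder data only: the numeric values are illustrative, not measured.
library = SpectralLibrary(
    name="demo-library",
    attributes={"tier": "bronze"},  # hypothetical key for the quality designation
    entries=[
        LibraryEntry(
            peptide="PEPTIDE",
            charge=2,
            peaks=[
                Peak(mz=100.0, intensity=0.8, annotations=[PeakAnnotation("y1")]),
                Peak(mz=200.0, intensity=0.3),  # unannotated peak
            ],
        )
    ],
)

print(count_annotated_peaks(library))
```

Each level carries its own free-form `attributes` dictionary in this sketch, so that metadata can attach at the level it describes; a real format would presumably constrain these keys with a controlled vocabulary.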