THESIS
2023
1 online resource (xii, 63 pages) : illustrations (some color)
Abstract
Spectral libraries are useful resources in proteomic data analysis, but they can identify only
previously observed peptides that are present in the library. Recent advances in deep learning
allow tandem mass spectra of peptides to be predicted from their amino acid sequences,
which enables predicted spectral libraries to be compiled. Searching against such libraries
has been shown to improve sensitivity in peptide identification over conventional sequence
database searching. However, current prediction models lack support for longer peptides,
and predicted library searching has so far been demonstrated only for backbone ion-only spectrum
prediction methods. Here, we propose a deep learning-based full-spectrum prediction
method to generate predicted spectral libraries for peptide identification. We demonstrate
the superiority of using full-spectrum libraries over backbone ion-only prediction
approaches in spectral library searching. Furthermore, merging spectra from different prediction
models, as a form of ensemble learning, can produce improved spectral libraries in
terms of identification sensitivity. We also show that a hybrid library combining predicted
and experimental spectra can lead to 20% more confident identifications than experimental
library searching or sequence database searching. To carry out the target-decoy search
strategy in library searching, we experimented with generating decoys via a shuffle-and-predict
method. Results from our experiments suggest that predicted decoys can be a viable alternative
to existing library decoy generation methods.