THESIS
2015
xvii, 131 pages : illustrations (some color) ; 30 cm
Abstract
Post-translational modification (PTM) is a key step in protein biosynthesis, critical for the correct
trafficking and function of the protein. However, the high-throughput identification of the additional
biochemical functional groups involved in PTM, such as phosphate and glycans, remains a
challenge in proteomics. The search strategy based on protein sequence databases is in
widespread use, but it is time-consuming and prone to false positives because of its exponentially
enlarged search space and incomplete theoretical fragmentation model.
Due to its advantages in efficiency and sensitivity, spectral library searching is a promising
alternative to conventional sequence database searching. Our work aims to facilitate PTM
identification in the spectral library search approach. In particular, we first applied the approach
to two important and challenging PTMs, phosphorylation and glycosylation, and then extended the
method to other modifications.
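For readers unfamiliar with spectral library searching, the core operation can be sketched as follows: a query MS2 spectrum is compared against library spectra of similar precursor m/z, and the best-scoring entry is reported. This is a minimal illustration only, not the implementation used in the thesis. The bin width, the square-root intensity scaling, the precursor tolerance and the dictionary layout of a library entry are all assumptions made for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peaks into fixed-width m/z bins and return a unit-length sparse vector.

    peaks is a list of (mz, intensity) pairs. The bin width and the
    square-root scaling are illustrative assumptions.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += math.sqrt(intensity)
    norm = math.sqrt(sum(v * v for v in binned.values()))
    return {b: v / norm for b, v in binned.items()} if norm > 0 else {}


def dot_product(a, b):
    """Dot product of two sparse, unit-length spectrum vectors (0 to 1)."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def search_library(query_peaks, query_precursor_mz, library, precursor_tol=2.0):
    """Return (peptide, score) of the best library match, or None.

    Each library entry is a hypothetical dict with the keys
    "peptide", "precursor_mz" and "peaks".
    """
    query = bin_spectrum(query_peaks)
    best = None
    for entry in library:
        # Only compare against entries within the precursor m/z window.
        if abs(entry["precursor_mz"] - query_precursor_mz) > precursor_tol:
            continue
        score = dot_product(query, bin_spectrum(entry["peaks"]))
        if best is None or score > best[1]:
            best = (entry["peptide"], score)
    return best
```

Because each query is compared only with previously observed or predicted spectra inside a narrow precursor window, the candidate set stays small, which is one reason the approach can be faster than a sequence database search.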
In phosphorylated peptide identification, we built the largest collision-induced dissociation (CID)
tandem mass (MS2) spectral libraries of phosphorylated peptides in human and other model organisms
to date, using an automated platform that combines multiple state-of-the-art search
engines (e.g. X!Tandem and MSGF+) and site-localization tools (e.g. PhosphoRS and
PTMProphet) with strict quality control. Spectral library searching using these libraries significantly
outperforms existing methods for detecting phosphosites in a variety of datasets.
In glycopeptide identification, a spectral library searching method was developed to identify
intact N-linked glycopeptides from MS2 spectra, based upon an existing spectrum prediction
tool, MassAnalyzer (Zhang, Z., Anal. Chem. 2010), to account for the special fragmentation
patterns of glycopeptides. We evaluated the scoring functions, developed methods to analyze
ambiguous candidates, and clustered the predicted spectral library to reduce the search cost. A
novel query decoy strategy was further applied to estimate the false discovery rate (FDR) of
glycopeptide identifications. The spectral library searching strategy was successfully verified
on searches of standard N-linked glycoproteins.
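As background to the FDR estimate mentioned above, the sketch below shows the generic target-decoy calculation that is commonly used in proteomics. It is not the query decoy strategy developed in the thesis, whose details are not given in this abstract. The input format, a list of (score, is_decoy) pairs, and the function names are assumptions made for the example.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a score.

    psms is a list of (score, is_decoy) pairs, one per identified spectrum.
    """
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No accepted targets: everything accepted (if anything) is a decoy.
        return 1.0 if decoys else 0.0
    return min(1.0, decoys / targets)


def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is within max_fdr.

    Returns None when no observed score satisfies the limit.
    """
    best = None
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, threshold) <= max_fdr:
            best = threshold
    return best
```

The estimate rests on the assumption that false matches are equally likely to hit a decoy as a target, so the decoy count above a score approximates the number of false target matches above the same score.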
In multiple PTM identification, we extended the spectral library searching method to utilize
known modification sites in UniProtKB so that multiple PTMs can be searched simultaneously. A
predicted spectral library was built using the software MassAnalyzer, which contained all
possible tryptic modified peptides generated from the PTMs reported in the MOD_RES fields
of all reviewed proteins in UniProtKB. The search results of four human tissue samples against
the spectral library showed that it enables multiple PTM profiling, but
several challenges remain in both experimental and computational methods, such as
enrichment of multiple PTMs and prediction models for novel PTMs.
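To illustrate how modified tryptic peptides might be enumerated from annotated sites, the sketch below digests a protein sequence in silico and then lists every combination of annotated modifications falling within each peptide. It is an assumption-laden example rather than the pipeline used in the thesis: the cleavage rule (after K or R, not before P), the minimum peptide length, the bracket notation for modifications, and the site dictionary standing in for parsed MOD_RES annotations are all hypothetical choices.

```python
from itertools import combinations


def tryptic_peptides(sequence, missed_cleavages=0, min_length=6):
    """Return (start, peptide) pairs; start is the 0-based protein offset.

    Cleaves after K or R unless the next residue is P.
    """
    cut_points = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            cut_points.append(i + 1)
    cut_points.append(len(sequence))
    peptides = []
    for i in range(len(cut_points) - 1):
        last = min(i + 2 + missed_cleavages, len(cut_points))
        for j in range(i + 1, last):
            start, end = cut_points[i], cut_points[j]
            if end - start >= min_length:
                peptides.append((start, sequence[start:end]))
    return peptides


def modified_forms(start, peptide, annotated_sites):
    """List the unmodified peptide plus every combination of annotated sites.

    annotated_sites maps a 1-based protein position to a modification name,
    a hypothetical stand-in for parsed MOD_RES annotations.
    """
    local = [(pos - start - 1, name)
             for pos, name in sorted(annotated_sites.items())
             if start < pos <= start + len(peptide)]
    forms = []
    for k in range(len(local) + 1):
        for subset in combinations(local, k):
            mods = dict(subset)
            forms.append("".join(
                f"{aa}[{mods[i]}]" if i in mods else aa
                for i, aa in enumerate(peptide)))
    return forms
```

Restricting modifications to annotated sites keeps the number of peptide forms far smaller than allowing every modification on every compatible residue, although the count still doubles with each annotated site inside a peptide.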
Keywords:
Tandem Mass Spectrometry, Shotgun Proteomics, Protein Posttranslational Modification (PTM),
Phosphorylation, Glycosylation, PTMs Profiling, Spectral Library Searching, SpectraST,
UniProtKB