Integrated data management and validation platform for phosphorylated tandem mass spectrometry data.
MedLine Citation:
PMID:  20827731     Owner:  NLM     Status:  MEDLINE    
MS/MS is a widely used method for proteome-wide analysis of protein expression and PTMs. The thousands of MS/MS spectra produced from a single experiment pose a major challenge for downstream analysis. Standard programs, such as MASCOT, provide peptide assignments for many of the spectra, including identification of PTM sites, but these results are plagued by false-positive identifications. In phosphoproteomic experiments, only a single peptide assignment is typically available to support identification of each phosphorylation site, and hence minimizing false positives is critical. Thus, tedious manual validation is often required to increase confidence in the spectral assignments. We have developed phoMSVal, an open-source platform for managing MS/MS data and automatically validating identified phosphopeptides. We tested five classification algorithms with 17 extracted features to separate correct peptide assignments from incorrect ones using over 2600 manually curated spectra. The naïve Bayes algorithm was among the best classifiers with an AUC value of 97% and PPV of 97% for phosphotyrosine data. This classifier required only three features to achieve a 76% decrease in false positives as compared with MASCOT while retaining 97% of true positives. This algorithm was able to classify an independent phosphoserine/threonine data set with AUC value of 93% and PPV of 91%, demonstrating the applicability of this method for all types of phospho-MS/MS data. PhoMSVal is available at
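The abstract reports classifier quality as AUC and PPV, plus the fraction of true positives retained. As a rough illustration of how those three quantities are computed from manually curated labels and classifier scores, here is a minimal Python sketch. It is not the phoMSVal implementation: the function names, the 0.5 decision threshold, and the toy data are all assumptions made for this example.

```python
from typing import Sequence


def ppv(labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else float("nan")


def true_positive_rate(labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of correct assignments that are retained: TP / (TP + FN)."""
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen correct assignment
    scores higher than a randomly chosen incorrect one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = assignment judged correct by manual curation,
# 0 = judged incorrect. Scores stand in for a classifier's posterior output.
labels = [1, 1, 1, 1, 0, 0, 0, 1]
scores = [0.95, 0.80, 0.70, 0.40, 0.60, 0.30, 0.10, 0.90]

# Assumed decision threshold of 0.5 for accepting a peptide assignment.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(f"PPV: {ppv(labels, predicted):.2f}")
print(f"True positives retained: {true_positive_rate(labels, predicted):.2f}")
print(f"AUC: {auc(labels, scores):.3f}")
```

The pairwise AUC calculation is quadratic in the number of spectra, which is acceptable for a few thousand curated spectra; a rank-based formulation would be the usual choice for larger sets.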
Anna-Maria Lahesmaa-Korpinen; Scott M Carlson; Forest M White; Sampsa Hautaniemi
Publication Detail:
Type:  Evaluation Studies; Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.    
Journal Detail:
Title:  Proteomics     Volume:  10     ISSN:  1615-9861     ISO Abbreviation:  Proteomics     Publication Date:  2010 Oct 
Date Detail:
Created Date:  2010-09-27     Completed Date:  2011-01-24     Revised Date:  2014-09-12    
Medline Journal Info:
Nlm Unique ID:  101092707     Medline TA:  Proteomics     Country:  Germany    
Other Details:
Languages:  eng     Pagination:  3515-24     Citation Subset:  IM    
MeSH Terms
Artificial Intelligence
Databases, Protein
Phosphopeptides / analysis*
Proteome / analysis*
Proteomics / methods*
Tandem Mass Spectrometry / methods*
Grant Support
R01 CA118705/CA/NCI NIH HHS; R01-CA118705/CA/NCI NIH HHS; U54-CA11297/CA/NCI NIH HHS
Reg. No./Substance:
0/Phosphopeptides; 0/Proteome

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
