Identification of proteins and characterization of their post-translational modifications.
Mass spectrometry-based protein identification has become an
invaluable tool for elucidating protein function, and several methods
have been developed for protein identification, including
sequence collection searching with masses of peptides or their
fragments, spectral library searching, and de novo sequencing.
The first step in protein identification is to find peaks in the
mass spectra that correspond to peptides and their fragments. It
is important to find all the relevant peaks while at the same time
minimizing the number of background peaks. This can be
achieved by scanning the spectra for peaks of the expected width,
selecting peaks above a signal-to-noise threshold,
and then picking the monoisotopic peak for each isotope cluster.
After picking the peaks, spectra with low information
content that could not produce any meaningful results can be
removed to increase the speed of subsequent analysis.
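The peak-picking steps described above can be sketched as follows. This is a minimal illustration, not the method used in the cited work: the function names, the use of the median intensity as a noise estimate, the fixed charge state, and the mass tolerance are all assumptions, and the sketch ignores the expected peak width and the isotope intensity pattern.

```python
from statistics import median

def pick_peaks(mz, intensity, snr_threshold=3.0):
    """Return (m/z, intensity, S/N) for local maxima above a signal-to-noise threshold."""
    # Assumption: the noise level is approximated by the median intensity.
    noise = median(intensity)
    if noise <= 0:
        noise = 1e-12  # avoid division by zero for empty baselines
    peaks = []
    for i in range(1, len(intensity) - 1):
        is_local_max = intensity[i - 1] < intensity[i] >= intensity[i + 1]
        snr = intensity[i] / noise
        if is_local_max and snr >= snr_threshold:
            peaks.append((mz[i], intensity[i], snr))
    return peaks

def monoisotopic_peaks(peaks, charge=1, tol=0.02):
    """Keep only the lowest-m/z peak of each isotope cluster."""
    # Approximate 13C-12C mass difference, divided by the assumed charge.
    spacing = 1.00335 / charge
    mono = []
    for p in sorted(peaks):
        # A peak belongs to the interior of a cluster if another picked
        # peak lies one isotope spacing below it.
        has_lower = any(abs((p[0] - q[0]) - spacing) <= tol for q in peaks)
        if not has_lower:
            mono.append(p)
    return mono
```

For example, a spectrum with picked peaks at m/z 500.0, 501.0, and 502.0 would be reduced to the single monoisotopic peak at 500.0 for an assumed charge of 1.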
In all mass spectrometry-based identification methods, a score
is calculated to quantify the match between the observed mass
spectrum and the collection of possible sequences. These scores
are highly dependent on the details of the algorithm used, and
they are not always easy to interpret because their meaning
depends on properties of the data and the search results.
Therefore, it is desirable to convert the score to a measure that is
easy to interpret, such as the probability that the result is random
and false. For this conversion, the distribution of random and false
scores is needed. Estimates of this distribution can be generated
by simulation, by collecting statistics during
the search, or by direct calculation.
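As a rough illustration of the simulation route, the sketch below generates a distribution of random scores and converts an observed score into an empirical probability of obtaining a score at least as high by chance. The scoring scheme (a count of chance fragment matches), the function names, and all parameter values are hypothetical and chosen only for illustration.

```python
import random

def simulate_random_scores(n_fragments, match_prob, n_trials, seed=0):
    """Simulate scores of random (false) matches.

    Assumption: the score is the number of theoretical fragments that
    match an observed peak, and each matches by chance with probability
    match_prob.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < match_prob for _ in range(n_fragments))
        for _ in range(n_trials)
    ]

def empirical_p_value(observed_score, random_scores):
    """Fraction of random scores at least as high as the observed score."""
    n_as_good = sum(1 for s in random_scores if s >= observed_score)
    # Add-one correction so a finite sample never yields exactly zero.
    return (n_as_good + 1) / (len(random_scores) + 1)

if __name__ == "__main__":
    null_scores = simulate_random_scores(n_fragments=20, match_prob=0.1, n_trials=10000)
    print(empirical_p_value(8, null_scores))
```

The resolution of such an estimate is limited by the number of simulated scores, so very small probabilities require either many trials or an extrapolation of the tail of the distribution.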
Figure 1. Mass spectrometry-based workflows for protein identification: (a) searching a protein sequence collection with
peptide mass information; (b) searching a protein sequence collection with peptide fragment mass information;
(c) searching a spectrum library with peptide fragment mass information; (d) de novo sequencing.
D. Fenyö, J. Eriksson, R. Beavis, "Mass spectrometric protein identification using the global proteome machine", Methods Mol Biol 673 (2010) 189-202.
D. Fenyö, B.S. Phinney, R.C. Beavis, "Determining the Overall Merit of Protein Identification Data Sets: rho-Diagrams and rho-Scores", J Proteome Res 6 (2007) 1997-2004.
R. Craig, J.C. Cortens, D. Fenyö, R.C. Beavis, "Using Annotated Peptide Mass Spectrum Libraries for Protein Identification", J Proteome Res 5 (2006) 1843-1849.
J. Eriksson, D. Fenyö, "Protein Identification in Complex Mixtures", J Proteome Res 4 (2005) 387-93.
D. Fenyö, R.C. Beavis, "A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes", Anal Chem 75 (2003) 768-74.
H.I. Field, D. Fenyö, R.C. Beavis, "RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database", Proteomics 2 (2002) 36-47.
J. Eriksson, B.T. Chait, D. Fenyö, "A statistical basis for testing the significance of mass spectrometric protein identification results", Anal Chem 72 (2000) 999-1005.
D. Fenyö, J. Qin, B.T. Chait, "Protein Identification using Mass Spectrometric Information", Electrophoresis 19 (1998) 998-1005.
Quantitation of proteins and peptides
Mass spectrometry (MS)-based quantitative proteomics has been
applied to solve a wide variety of biological problems, and several
MS-based workflows have been developed for protein and peptide
quantitation (Fig. 1). In mass spectrometric quantitation
methods it is usually assumed that the measured signal has a linear
dependence on the amount of material in the sample for the
entire range of amounts being studied. A prerequisite for accurate
quantitation is that unwanted experimental variations in
sample extraction, preparation, and analysis be minimized, and it
is therefore critical that each step in the workflow be optimized.
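The linearity assumption can be made concrete with a small calibration sketch: a straight line is fitted to signals measured for known amounts and then inverted to estimate an unknown amount. The function names and the example values are hypothetical, and an ordinary least-squares fit is only one of several possible choices.

```python
def fit_line(amounts, signals):
    """Ordinary least-squares fit of signal = slope * amount + intercept."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amount_from_signal(signal, slope, intercept):
    """Invert the fitted line to estimate the amount for a measured signal."""
    return (signal - intercept) / slope

if __name__ == "__main__":
    # Hypothetical calibration points: known amounts and measured signals.
    slope, intercept = fit_line([1, 2, 4, 8], [3.1, 4.9, 9.2, 16.8])
    print(amount_from_signal(11.0, slope, intercept))
```

Such an estimate is only meaningful for signals inside the range over which the linear dependence has been verified.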
Figure 1. Workflows for mass spectrometry-based protein and peptide quantitation.
(a) Metabolic labeling.
(b) Protein labeling.
(c) Chimeric recombinant protein labeling.
(d) Peptide labeling.
(e) Isobaric peptide labeling.
(f) Synthetic peptide labeling.
(g) Label-free quantitation (intensity of precursor ions).
(h) Label-free quantitation (standard curve).
(i) Label-free quantitation (intensity of fragment ions).
When quantitation of proteins in complex samples is based on the intensity of peptide precursor and fragment ions,
interference can distort the measurements. It is important to detect and correct for these interferences.
We used computer simulations as a tool to investigate the feasibility of correction for interference in MRM analyses.
In our simulations, it was assumed that the expected relative intensity of the transitions for a peptide is known.
Hypothetical interference was added to one or more transitions, and random noise was added to all transitions.
The distribution of the noise was obtained from repeated measurements.
Interference was detected by measuring the deviation of the intensity ratios of transitions from the expected ratios, and detecting outliers.
The transitions with interference were removed and the peptide quantity was calculated using only the transitions without interference.
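The simulation and correction procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in the study: the noise is modeled as Gaussian and proportional to the signal, outliers are detected by comparing each transition's abundance estimate with the median estimate, and all names, thresholds, and values are hypothetical.

```python
import random
from statistics import median

def simulate_transitions(true_amount, expected_rel, interference, noise_sd, rng):
    """Simulate observed transition intensities with noise and added interference."""
    # Assumption: multiplicative Gaussian noise; interference is additive.
    return [
        true_amount * e * (1 + rng.gauss(0, noise_sd)) + extra
        for e, extra in zip(expected_rel, interference)
    ]

def detect_interference(observed, expected_rel, threshold=0.2):
    """Flag transitions that deviate from the expected relative intensities."""
    # Each transition gives its own estimate of the peptide amount.
    estimates = [o / e for o, e in zip(observed, expected_rel)]
    center = median(estimates)
    return [abs(est - center) / center > threshold for est in estimates]

def quantify(observed, expected_rel, threshold=0.2):
    """Estimate the amount from the transitions not flagged as interfered."""
    flags = detect_interference(observed, expected_rel, threshold)
    estimates = [o / e for o, e in zip(observed, expected_rel)]
    kept = [est for est, bad in zip(estimates, flags) if not bad]
    if not kept:
        # Fall back to the median if every transition was flagged.
        return median(estimates), flags
    return sum(kept) / len(kept), flags

if __name__ == "__main__":
    rng = random.Random(1)
    expected_rel = [0.5, 0.3, 0.2]  # assumed known relative intensities
    observed = simulate_transitions(100.0, expected_rel, [0.0, 15.0, 0.0], 0.02, rng)
    print(quantify(observed, expected_rel))
```

The detection threshold sets the trade-off: a lower value removes more interfered transitions but also discards more unaffected ones because of noise.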
Figure 2. Correction for interference. The effect of using different interference detection thresholds.
Panels A and B: The corrected relative error in the quantitation as a function of the relative error before correction.
Panels C and D: The distribution of the corrected error in quantitation for relative error ranges of 0.3-0.7 and 1.3-1.7, respectively.
G. Zhang, B.M. Ueberheide, S. Waldemarson, S. Myung, K. Molloy, J. Eriksson, B.T. Chait, T.A. Neubert, D. Fenyö, "Protein quantitation using mass spectrometry", Methods Mol Biol 673 (2010) 211-22.
G. Zhang, D. Fenyö, T.A. Neubert, "Evaluation of the Variation in Sample Preparation for Comparative Proteomics Using Stable Isotope Labeling by Amino Acids in Cell Culture", J Proteome Res 8 (2009) 1285-1292.