Stowers Institute for Medical Research

Development of Label Free Quantitation Methods

We continually seek to improve the underlying methods we use for quantitative proteomics analyses. This began with adopting a spectral counting based approach for quantitative proteomics (Zybailov et al., 2005), which led to the development of the normalized spectral abundance factor (NSAF) (Zybailov et al., 2006) and, most recently, the distributed NSAF (dNSAF), which deals with peptides shared between multiple proteins (Zhang et al., 2010). We have also adopted GeneChip statistical tools such as PLGEM to identify differential protein expression (Pavelka et al., 2008; Fournier et al., 2010).

Label Free Quantitation

The dynamic changes of a proteome, or of fractions of a proteome such as organelles and protein complexes, can be analyzed via quantitative proteomic methods. We largely carry out label free quantitative proteomic analyses using spectral counting. In spectral counting, the total number of tandem mass spectra that match peptides to a particular protein is used to measure the abundance of that protein in a complex mixture. We have developed the normalized spectral abundance factor (NSAF) approach for using spectral counting in quantitative proteomics (Zybailov et al., 2006). This approach takes into account both the sample-to-sample variation observed across replicate analyses of a sample and the fact that longer proteins tend to have more peptide identifications than shorter proteins. Examples of the application of the NSAF approach to quantitative proteomic analysis include work on the expression changes of membrane proteins in S. cerevisiae (Zybailov et al., 2006), on the yeast SIN3/RPD3 complex (Florens et al., 2006), and on the human transcriptional regulatory complex, Mediator (Paoletti et al., 2006).
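The NSAF calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the lab's own software: the function name `nsaf` and the dictionary inputs are hypothetical, and the example counts are made up. Each protein's spectral count is divided by its length, and the result is divided by the sum of those length-corrected values over all proteins detected in the run.

```python
def nsaf(spectral_counts, lengths):
    """Return the NSAF of each protein in a single run.

    spectral_counts: dict mapping protein id -> spectral count (SpC)
    lengths:         dict mapping protein id -> protein length (L)
    (Both input structures are assumptions made for this sketch.)
    """
    # Spectral abundance factor: correct each count for protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}

    # Normalize by the run total so values are comparable across runs.
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectral counts in this run")
    return {p: value / total for p, value in saf.items()}


# Made-up example: A and B have equal counts, but B is twice as long.
example = nsaf({"A": 10, "B": 10, "C": 0}, {"A": 100, "B": 200, "C": 150})
```

Because every run is normalized to a total of 1, NSAF values from replicate analyses can be compared directly, and the longer protein B ends up with half the NSAF of protein A despite the identical spectral count.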

We have recently improved the NSAF approach to better deal with peptides shared between multiple proteins. We now use the distributed NSAF (dNSAF) approach for our quantitative proteomics analyses (Zhang et al., 2010): spectral counts from peptides that are shared between proteins are distributed to each protein in proportion to the number of unique spectral counts found for that protein.


In the dNSAF calculation, shared spectral counts (sSpC_i) are distributed based on the spectral counts unique to each protein i (uSpC_i) divided by the sum of all unique spectral counts for the M protein isoforms that share peptide j with protein i, and the result is normalized against protein i's length (Length_i).
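The distribution step can be illustrated with a short Python sketch. It follows the verbal description above rather than the published implementation, so the function name `dnsaf`, the input layout, and the handling of shared peptides whose proteins have no unique counts (they are skipped here) are all assumptions. The example numbers are made up.

```python
def dnsaf(unique_counts, shared_peptides, lengths):
    """Sketch of distributed NSAF for one run.

    unique_counts:   dict protein id -> unique spectral count (uSpC)
    shared_peptides: list of (shared spectral count, set of protein ids
                     that share the peptide)
    lengths:         dict protein id -> protein length
    (All three input structures are hypothetical.)
    """
    # Start from the counts that belong unambiguously to each protein.
    distributed = {p: float(c) for p, c in unique_counts.items()}

    for shared_count, proteins in shared_peptides:
        unique_total = sum(unique_counts[p] for p in proteins)
        if unique_total == 0:
            # Assumption: with no unique evidence there is no basis for
            # distributing the shared counts, so they are left out.
            continue
        for p in proteins:
            # Each protein receives a share proportional to its uSpC.
            distributed[p] += shared_count * unique_counts[p] / unique_total

    # Length-correct, then normalize over all proteins in the run.
    dsaf = {p: distributed[p] / lengths[p] for p in distributed}
    total = sum(dsaf.values())
    if total == 0:
        raise ValueError("no spectral counts in this run")
    return {p: value / total for p, value in dsaf.items()}


# Made-up example: one peptide with 8 spectra is shared by A and B.
result = dnsaf(
    unique_counts={"A": 30, "B": 10, "C": 20},
    shared_peptides=[(8, {"A", "B"})],
    lengths={"A": 100, "B": 100, "C": 200},
)
```

In this example A holds three quarters of the unique counts among the proteins sharing the peptide, so it receives 6 of the 8 shared spectra and B receives 2.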

PLGEM-STN Statistics

The PLGEM (Power Law Global Error Model) was initially developed to identify differentially expressed genes from microarray data. It is implemented in the open source R package "plgem", which is distributed through the Bioconductor project.

We have demonstrated that NSAF datasets share substantial statistical similarities with GeneChip data, suggesting that most GeneChip-specific statistical tools should be applicable to the analysis of NSAF datasets as well (Pavelka et al., 2008). This allows NSAF based MudPIT analyses to support the kinds of studies that have been done over the years with GeneChip™ data, and it provides a foundation for bioinformatics analysis of such datasets using established GeneChip™ tools.

The use of PLGEM-based standard deviations to calculate signal-to-noise (STN) ratios in an NSAF dataset improves our ability to determine protein expression changes. The PLGEM-STN statistic outperforms both fold-change (FC) and Standard-STN by being more conservative with proteins of low abundance than with proteins of high abundance. PLGEM is based on the following mathematical equation:

ln (SD) = k ln (mean) + c + ε

where SD is the standard deviation, k is the slope, c is the intercept, and ε represents a normally distributed residual error.
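To make the model concrete, the sketch below fits k and c by ordinary least squares on log-transformed means and standard deviations, then uses the fitted model to compute a signal-to-noise ratio. It is a simplified illustration, not the procedure used by the "plgem" package: the function names are hypothetical, the plain least-squares fit stands in for the package's own fitting method, and the STN form used here (difference of means divided by the sum of the modelled standard deviations) is an assumption.

```python
import math


def fit_power_law(means, sds):
    """Least-squares fit of ln(SD) = k * ln(mean) + c; returns (k, c)."""
    # Logs are only defined for strictly positive values.
    pairs = [(math.log(m), math.log(s))
             for m, s in zip(means, sds) if m > 0 and s > 0]
    if len(pairs) < 2:
        raise ValueError("need at least two positive (mean, SD) pairs")

    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("all means are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)

    k = sxy / sxx
    c = y_bar - k * x_bar
    return k, c


def model_sd(mean, k, c):
    """Standard deviation predicted by the fitted power law."""
    return math.exp(k * math.log(mean) + c)


def plgem_stn(mean_a, mean_b, k, c):
    """Signal-to-noise ratio using modelled SDs (assumed form)."""
    return (mean_a - mean_b) / (model_sd(mean_a, k, c) + model_sd(mean_b, k, c))


# Synthetic, noise-free data generated with k = 0.8 and c = -1.0.
true_k, true_c = 0.8, -1.0
means = [1e-4, 1e-3, 1e-2, 1e-1]
sds = [math.exp(true_k * math.log(m) + true_c) for m in means]
k, c = fit_power_law(means, sds)

# The same twofold change at high and at low abundance.
stn_high = plgem_stn(0.2, 0.1, k, c)
stn_low = plgem_stn(0.002, 0.001, k, c)
```

With a slope below 1, the modelled standard deviation is larger relative to the mean at low abundance, so the same twofold change yields a smaller STN for the low-abundance pair. This is the sense in which such a statistic is more conservative for low-abundance proteins than a plain fold change.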

Zybailov, B., Coleman, M.K., Florens, L., & Washburn, M.P. (2005) Correlation of Relative Abundance Ratios Derived from Peptide Ion Chromatograms and Spectrum Counting for Quantitative Proteomic Analysis using Stable Isotope Labeling. Anal. Chem., 77(19):6218-24.

Zybailov, B., Mosley, A.L., Sardiu, M.E., Coleman, M.K., Florens, L., & Washburn, M.P. (2006) Statistical Analysis of Membrane Proteome Expression Changes in Saccharomyces cerevisiae. J. Proteome Res., 5(9):2339-47.

Florens, L., Carozza, M.J., Swanson, S.K., Fournier, M., Coleman, M.K., Workman, J.L., & Washburn, M.P. (2006) Analyzing Chromatin Remodeling Complexes Using Shotgun Proteomics and Normalized Spectral Abundance Factors. Methods, 40(4):303-11.

Paoletti, A.C., Parmely, T.J., Tomomori-Sato, C., Sato, S., Zhu, D., Conaway, R.C., Conaway, J.W., Florens, L., & Washburn, M.P. (2006) Quantitative Proteomic Analysis of Distinct Mammalian Mediator Complexes using Normalized Spectral Abundance Factors. Proc. Natl. Acad. Sci. USA, 103(50):18928-33.

Pavelka, N., Fournier, M., Swanson, S.K., Pelizzola, M., Ricciardi-Castagnoli, P., Florens, L., & Washburn, M.P. (2008) Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol. Cell. Proteomics, 7(4):631-44.

Zhang, Y., Wen, Z., Washburn, M.P., & Florens, L. (2010) Refinements to Label Free Proteome Quantitation: How to Deal with Peptides Shared by Multiple Proteins. Anal. Chem., 82(6):2272-81.

Fournier, M.L., Paulson, A., Pavelka, N., Mosley, A.L., Gaudenz, K., Bradford, W.D., Glynn, E., Li, H., Sardiu, M.E., Fleharty, B., Seidel, C., Florens, L., & Washburn, M.P. (2010) Delayed Correlation of mRNA and Protein Expression in Rapamycin Treated Cells and a Role for Ggc1 in Cellular Sensitivity to Rapamycin. Mol. Cell. Proteomics, 9(2):271-84.