
Sample size calculations for designing clinical proteomic profiling studies using mass spectrometry.
MedLine Citation:
PMID:  22499705     Owner:  NLM     Status:  MEDLINE    
In cancer clinical proteomics, MALDI and SELDI profiling are used to search for biomarkers of potentially curable early-stage disease. A sufficient number of samples must be analysed to detect clinically relevant differences between cancers and controls with adequate statistical power. Clinical proteomic profiling studies typically provide expression data for each peak (protein or peptide) from two or more clinically defined groups of subjects. Exposure and confounder information on each subject is usually also available, the samples are usually not from randomized subjects, and the data are usually available in replicate. At the design stage, however, covariates are not typically available and are often ignored in sample size calculations. This leads to the use of insufficient numbers of samples and to reduced power when there are imbalances in the numbers of subjects between different phenotypic groups. A method is proposed for accommodating information on covariates, data imbalances and design characteristics, such as technical replication and the observational nature of these studies, in sample size calculations. It assumes knowledge of a joint distribution for the protein expression values and the covariates. When discretized covariates are considered, the effect of the covariates enters the calculations as a function of the proportions of subjects with specific attributes. This makes it relatively straightforward (even when pilot data on subject covariates are unavailable) to specify and to adjust for the effect of the expected heterogeneities. The new method suggests certain experimental designs that lead to the use of a smaller number of samples when planning a study. Analysis of data from the proteomic profiling of colorectal cancer reveals that fewer samples are needed when a study is balanced than when it is unbalanced, and when the IMAC30 chip-type is used.
The method is implemented in the clippda package and is available in R at:
Authors:  Stephen O Nyangoma; Stuart I Collins; Douglas G Altman; Philip Johnson; Lucinda J Billingham
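The abstract's points about group imbalance and technical replication can be illustrated with a textbook two-group calculation. The sketch below is not the method of the paper and does not use the clippda package; it assumes a normal approximation, a two-sided test, and a simple variance decomposition into a between-subject component and a within-subject (technical) component that is averaged over replicates. The function name `samples_per_group` and all parameter names are hypothetical, chosen only for this illustration, and covariate adjustment is deliberately left out.

```python
import math
from statistics import NormalDist


def samples_per_group(delta, sigma_between, sigma_within,
                      replicates=2, ratio=1.0, alpha=0.05, power=0.8):
    """Approximate subjects needed in each of two groups (illustrative only).

    delta         : difference in mean peak intensity to detect
    sigma_between : between-subject (biological) standard deviation
    sigma_within  : within-subject (technical replicate) standard deviation
    replicates    : technical replicates averaged per subject
    ratio         : allocation ratio n2 / n1 (1.0 means a balanced study)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power

    # Variance of a subject's mean over its technical replicates.
    variance = sigma_between ** 2 + sigma_within ** 2 / replicates

    # Standard normal-approximation formula with unequal allocation.
    n1 = (1 + 1 / ratio) * variance * (z_alpha + z_beta) ** 2 / delta ** 2
    n1 = math.ceil(n1)
    n2 = math.ceil(ratio * n1)
    return n1, n2


balanced = samples_per_group(1.0, 1.0, 1.0, replicates=2, ratio=1.0)
unbalanced = samples_per_group(1.0, 1.0, 1.0, replicates=2, ratio=3.0)
more_replicates = samples_per_group(1.0, 1.0, 1.0, replicates=4, ratio=1.0)

print("balanced:", balanced, "total", sum(balanced))
print("unbalanced (3:1):", unbalanced, "total", sum(unbalanced))
print("balanced, 4 replicates:", more_replicates)
```

Under these assumptions, the balanced design needs fewer subjects in total than the 3:1 design for the same power, and additional technical replicates reduce the requirement only to the extent that technical variance contributes to the total.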
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2012-02-10
Journal Detail:
Title:  Statistical applications in genetics and molecular biology     Volume:  11     ISSN:  1544-6115     ISO Abbreviation:  Stat Appl Genet Mol Biol     Publication Date:  2012  
Date Detail:
Created Date:  2012-04-13     Completed Date:  2012-08-15     Revised Date:  2014-02-20    
Medline Journal Info:
Nlm Unique ID:  101176023     Medline TA:  Stat Appl Genet Mol Biol     Country:  United States    
Other Details:
Languages:  eng     Pagination:  Article 2     Citation Subset:  IM    
MeSH Terms
Computer Simulation
Mass Spectrometry*
Models, Statistical
Proteins / metabolism*
Proteome / metabolism*
Proteomics / methods*
Research Design
Sample Size
Grant Support
G0500994//Medical Research Council; G0800808//Medical Research Council; RRAK11686//Medical Research Council
Reg. No./Substance:
0/Proteins; 0/Proteome

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
