Sample size calculations for designing clinical proteomic profiling studies using mass spectrometry.
MedLine Citation:
PMID:  22499705     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
In cancer clinical proteomics, MALDI and SELDI profiling are used to search for biomarkers of potentially curable early-stage disease. A sufficient number of samples must be analysed to detect clinically relevant differences between cancers and controls with adequate statistical power. Clinical proteomic profiling studies typically yield expression data for each peak (protein or peptide) from two or more clinically defined groups of subjects. Exposure and confounder information on each subject is usually also available, the samples are usually not from randomized subjects, and the data are usually available in replicate. At the design stage, however, covariates are not typically available and are often ignored in sample size calculations. This leads to the use of insufficient numbers of samples and reduced power when there are imbalances in the numbers of subjects between different phenotypic groups. A method is proposed for accommodating information on covariates, data imbalances and design characteristics, such as technical replication and the observational nature of these studies, in sample size calculations. It assumes knowledge of a joint distribution for the protein expression values and the covariates. When discretized covariates are considered, the effect of the covariates enters the calculations as a function of the proportions of subjects with specific attributes. This makes it relatively straightforward (even when pilot data on subject covariates are unavailable) to specify and to adjust for the effect of the expected heterogeneities. The new method suggests certain experimental designs that require fewer samples when planning a study. Analysis of data from the proteomic profiling of colorectal cancer reveals that fewer samples are needed when a study is balanced than when it is unbalanced, and when the IMAC30 chip-type is used.
The method is implemented in the R package clippda, which is available from Bioconductor at: http://www.bioconductor.org/help/bioc-views/release/bioc/html/clippda.html.
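To illustrate why group imbalance and technical replication matter for sample size, the following is a minimal sketch based on the textbook normal-approximation formula for comparing two group means. It is not the method proposed in the paper and does not use the clippda API. The function name, its parameters and the variance decomposition (between-subject variance plus within-subject variance divided by the number of replicates) are assumptions made for illustration only.

```python
from math import ceil
from statistics import NormalDist


def total_sample_size(delta, sigma_between, sigma_within,
                      replicates=2, p_cases=0.5, alpha=0.05, power=0.9):
    """Total number of subjects for a two-sided, two-group z-test.

    Hypothetical helper (not part of clippda). Assumes each subject's
    value is the mean of `replicates` technical replicates, so its
    variance is sigma_between**2 + sigma_within**2 / replicates.
    `p_cases` is the expected proportion of subjects who are cases.
    """
    z = NormalDist().inv_cdf
    var_subject = sigma_between ** 2 + sigma_within ** 2 / replicates
    z_sum = z(1 - alpha / 2) + z(power)
    # Var(difference in means) = var_subject / (N * p * (1 - p)),
    # so N grows as the case/control split moves away from 50/50.
    n_total = z_sum ** 2 * var_subject / (delta ** 2 * p_cases * (1 - p_cases))
    return ceil(n_total)


if __name__ == "__main__":
    balanced = total_sample_size(1.0, 1.0, 1.0, replicates=2, p_cases=0.5)
    unbalanced = total_sample_size(1.0, 1.0, 1.0, replicates=2, p_cases=0.2)
    print("balanced:", balanced, "unbalanced:", unbalanced)
```

Under these assumptions, the required total is smallest when cases and controls are equally represented, and extra technical replicates reduce only the within-subject part of the variance.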
Authors:
Stephen O Nyangoma; Stuart I Collins; Douglas G Altman; Philip Johnson; Lucinda J Billingham
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2012-02-10
Journal Detail:
Title:  Statistical applications in genetics and molecular biology     Volume:  11     ISSN:  1544-6115     ISO Abbreviation:  Stat Appl Genet Mol Biol     Publication Date:  2012  
Date Detail:
Created Date:  2012-04-13     Completed Date:  2012-08-15     Revised Date:  2014-02-20    
Medline Journal Info:
Nlm Unique ID:  101176023     Medline TA:  Stat Appl Genet Mol Biol     Country:  United States    
Other Details:
Languages:  eng     Pagination:  Article 2     Citation Subset:  IM    
MeSH Terms
Descriptor/Qualifier:
Algorithms
Computer Simulation
Humans
Mass Spectrometry*
Models, Statistical
Proteins / metabolism*
Proteome / metabolism*
Proteomics / methods*
Research Design
Sample Size
Grant Support
ID/Acronym/Agency:
G0500994//Medical Research Council; G0800808//Medical Research Council; RRAK11686//Medical Research Council
Chemical
Reg. No./Substance:
0/Proteins; 0/Proteome

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

