Power and sample size calculations for high-throughput sequencing-based experiments.

Li CI, Samuels DC, Zhao YY, Shyr Y, Guo Y
Brief Bioinform. 2018 19 (6): 1247-1255

PMID: 28605403 · PMCID: PMC6291796 · DOI:10.1093/bib/bbx061

Power/sample size (power) analysis estimates the likelihood that a study will detect a statistically significant effect in a data set. The importance of power analysis in the proper design of experiments is increasingly recognized. Power analysis is complex, yet necessary for the success of large studies, because a study must be designed to produce statistically accurate and reliable results. Power computation methods are well established for both microarray-based gene expression studies and genotyping microarray-based genome-wide association studies. High-throughput sequencing (HTS) has greatly enhanced our ability to conduct biomedical studies at the highest possible resolution (per nucleotide). However, power computations are much more complex for sequencing data than for the simpler genotyping array data. Methods of power computation for HTS-based studies have recently been developed but are not yet well known or widely used. In this article, we describe the power computation methods that are currently available for a range of HTS-based studies, including DNA sequencing, RNA sequencing, microbiome sequencing and chromatin immunoprecipitation sequencing. Most importantly, we review the methods of power analysis for several types of sequencing data and guide the reader to the relevant methods for each data type.

MeSH Terms (9)

Chromatin Immunoprecipitation · Genome-Wide Association Study · Heterozygote · High-Throughput Nucleotide Sequencing · Humans · Microbiota · Mutation · Poisson Distribution · Sequence Analysis, RNA
