Title:
Assessing differential expression in two-color microarrays: a resampling-based empirical Bayes approach.
Authors:
Li D (Office of Public Health Studies, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, Hawaii, United States of America), Le Pape MA, Parikh NI, Chen WX, Dye TD
Source:
PloS one [PLoS One] 2013 Nov 27; Vol. 8 (11), pp. e80099. Date of Electronic Publication: 2013 Nov 27 (Print Publication: 2013).
Publication Type:
Journal Article; Research Support, N.I.H., Extramural
Language:
English
Journal Info:
Publisher: Public Library of Science; Country of Publication: United States; NLM ID: 101285081; Publication Model: eCollection; Cited Medium: Internet; ISSN: 1932-6203 (Electronic); Linking ISSN: 19326203; NLM ISO Abbreviation: PLoS One; Subsets: MEDLINE
Imprint Name(s):
Original Publication: San Francisco, CA : Public Library of Science
References:
PLoS One. 2013 Jun 27;8(6):e67489. (PMID: 23826308)
BMC Bioinformatics. 2005 May 29;6:129. (PMID: 15921534)
Biom J. 2008 Oct;50(5):756-66. (PMID: 18932135)
Reprod Sci. 2012 Jan;19(1):6-13. (PMID: 22228737)
Proc Natl Acad Sci U S A. 2001 Apr 24;98(9):5116-21. (PMID: 11309499)
Stat Appl Genet Mol Biol. 2004;3:Article3. (PMID: 16646809)
Birth Defects Res A Clin Mol Teratol. 2011 Aug;91(8):728-36. (PMID: 21308978)
Nucleic Acids Res. 2002 Jan 1;30(1):207-10. (PMID: 11752295)
Biostatistics. 2001 Jun;2(2):183-201. (PMID: 12933549)
BMC Bioinformatics. 2009 Jun 28;10:198. (PMID: 19558706)
Grant Information:
G12 MD007601 United States MD NIMHD NIH HHS; U54 MD007584 United States MD NIMHD NIH HHS; P20 GM103516 United States GM NIGMS NIH HHS
Entry Date(s):
Date Created: 20131207 Date Completed: 20140911 Latest Revision: 20211021
Update Code:
20250114
PubMed Central ID:
PMC3842292
DOI:
10.1371/journal.pone.0080099
PMID:
24312198
Database:
MEDLINE

Further information

Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim to control both Type I and Type II error rates; however, real microarray data do not always satisfy their distributional assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, its fold change criteria are problematic and can critically alter the conclusion of a study, because the composition of the control data set used in the analysis changes. We propose a novel approach that combines resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but is also unaffected by the fold change threshold, since no control data set needs to be selected. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rate control among the approaches are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates than Smyth's parametric method when data are not normally distributed. The Resampling-based empirical Bayes Methods also offer higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large, for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next generation sequencing RNA-seq data analysis.
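
The general idea of pairing a moderated (empirical Bayes style) statistic with a resampled null distribution can be sketched as follows. This is a minimal illustration only and not the authors' implementation: the abstract does not specify the resampling scheme, so sign-flipping of two-color log-ratios is an assumption made here, the prior values d0 and s0_sq are hypothetical fixed constants rather than quantities estimated from the data, and all function names are invented for this example.

```python
import numpy as np

def moderated_t(log_ratios, d0=4.0, s0_sq=0.05):
    """One-sample moderated t per gene (rows = genes, columns = arrays).

    d0 and s0_sq are hypothetical fixed prior values; an empirical Bayes
    procedure would estimate them from the data instead.
    """
    n = log_ratios.shape[1]
    mean = log_ratios.mean(axis=1)
    var = log_ratios.var(axis=1, ddof=1)
    d = n - 1
    # Shrink each gene's sample variance toward the prior variance.
    post_var = (d0 * s0_sq + d * var) / (d0 + d)
    return mean / np.sqrt(post_var / n)

def sign_flip_pvalues(log_ratios, n_resamples=1000, seed=0):
    """Resampling p-values: flip the sign of whole arrays to build a null."""
    rng = np.random.default_rng(seed)
    observed = np.abs(moderated_t(log_ratios))
    n_genes, n_arrays = log_ratios.shape
    exceed = np.zeros(n_genes)
    for _ in range(n_resamples):
        signs = rng.choice([-1.0, 1.0], size=n_arrays)
        null_t = np.abs(moderated_t(log_ratios * signs))
        exceed += null_t >= observed
    # Add-one correction keeps every p-value strictly positive.
    return (exceed + 1) / (n_resamples + 1)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values for false discovery rate control."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity from the largest p-value downward.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out
```

Under these assumptions, genes could be flagged with something like benjamini_hochberg(sign_flip_pvalues(log_ratios)) < 0.05. Because the null distribution comes from the resampled statistics rather than from a t distribution, the p-values in this sketch do not rely on normality, which is the property the abstract emphasizes.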