Abstract:
Microarray data are high dimensional, which makes statistical inference from them challenging. Because most current statistical methods are designed for the "small p (number of variables), large n (sample size)" setting, they can be insufficient for drawing valid conclusions from microarray data. Nevertheless, some of these methods, such as ANOVA (F-test), are still widely used. Besides being high dimensional, microarray data also have a correlation structure. Most current methods either ignore the high-dimensional structure or fail to account efficiently for correlations among genes. In this paper, we propose modifying the classical F-test with an effective column size idea to handle both issues. We consider various magnitudes of correlation among genes in Monte Carlo simulation studies. We demonstrate the proposed test on real type 2 diabetes mellitus gene expression data, which were obtained from the Gene Expression Omnibus (GEO) database under accession number GSE25724.
Copyright © American Statistical Association.