Abstract:
|
Large amounts of biomedical data are generated by genomic and medical studies. Analyzing these data with innovative statistical methods helps scientists better understand disease risk, progression, and treatment outcomes. We present a re-analysis of gene-expression profiling data to predict patient responses to FOLFOX (FOLinic acid, Fluorouracil, OXaliplatin) therapy, a chemotherapy regimen used to treat colorectal cancer. We conduct an extensive search of colorectal cancer studies and use a data set available from the NCBI Gene Expression Omnibus (GEO) under accession number GSE28702 (original study: Tsuji, S. et al. (2012). Potential responders to FOLFOX therapy for colorectal cancer by Random Forests analysis. British Journal of Cancer, 106(1), 126-132. http://doi.org/10.1038/bjc.2011.505). The data set contains 83 patients, each with 17,920 gene-expression values. We perform exploratory data analysis, select significant predictor genes, and predict which patients respond to FOLFOX therapy. This re-analysis demonstrates interesting aspects of large-scale data analysis, including data formatting, pre-processing, and statistical learning. This research was funded by the National Science Foundation, Grant No. 1246818.
|