Abstract:
|
cDNA microarrays and high-density oligonucleotide chips are novel biotechnologies that allow the expression levels of thousands of genes in cells to be monitored simultaneously. Microarrays are increasingly applied in biological and medical research to address a wide range of problems, for example to study the molecular variations among tumors in cancer research. Microarray experiments generate large and complex multivariate datasets, and careful statistical design and analysis can greatly improve the efficiency and reliability of these experiments.
This talk will survey statistical issues arising in the classification of biological samples using microarray gene expression data. In the context of cancer research, outcomes of interest include tumor class, response to treatment, and survival. I will report the results of a study comparing different classifiers for tumor class prediction based on gene expression levels. The methods compared include nearest neighbor classifiers, linear and quadratic discriminant analysis, support vector machines, and random forests. I will also address the questions of variable selection and classifier performance assessment.
|