Statistical Methods for Subgroup Identification in Personalized Medicine
*James J. Chen, National Center for Toxicological Research, FDA 

Keywords: biomarker, subgroup identification, personalized medicine

Personalized medicine applies molecular technologies and statistical methods to identify genomic biomarkers in target patients, with the aim of assigning more effective therapies and avoiding adverse events. Subgroup identification partitions patients into subgroups defined by sets of biomarkers, where each subgroup corresponds to an optimal treatment. Subgroup identification for treatment selection consists of three components: 1) biomarker identification, 2) subgroup selection, and 3) performance and clinical utility assessment. Biomarker identification develops statistical test procedures to identify a candidate set of biomarkers that define patient subgroups. Subgroup selection develops a class prediction model that assigns patients to subgroups for treatment selection. Performance and clinical utility assessment evaluates 1) the accuracy of the classifiers and 2) the power to detect a treatment effect in the targeted subgroup. Statistical issues and challenges include experimental design, statistical models and tests to identify predictive biomarkers, classification model development to identify subgroups, classification with imbalanced subgroup sizes, and multiple testing.
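
As a rough illustration of the biomarker identification and multiple testing steps, the sketch below screens binary biomarkers for a treatment-by-biomarker interaction and then controls the false discovery rate. It is not the author's procedure. The large-sample z-test on the difference of treatment effects, the Benjamini-Hochberg adjustment, the simulated trial, and all function and parameter names (`interaction_pvalue`, `benjamini_hochberg`, `simulate_trial`, `effect`, `predictive`) are assumptions made for illustration.

```python
import math
import random
import statistics


def interaction_pvalue(outcome, treated, marker_pos):
    """Two-sided z-test for a treatment-by-biomarker interaction.

    The contrast is the treatment effect in marker-positive patients
    minus the treatment effect in marker-negative patients.
    """
    cells = {(t, m): [] for t in (0, 1) for m in (0, 1)}
    for y, t, m in zip(outcome, treated, marker_pos):
        cells[(t, m)].append(y)
    if any(len(v) < 2 for v in cells.values()):
        return 1.0  # a cell is too small to estimate its variance
    mean = {k: statistics.fmean(v) for k, v in cells.items()}
    contrast = (mean[1, 1] - mean[0, 1]) - (mean[1, 0] - mean[0, 0])
    var = sum(statistics.variance(v) / len(v) for v in cells.values())
    if var == 0:
        return 1.0
    z = contrast / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return sorted indices of hypotheses rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank  # step-up: keep the largest rank that passes
    return sorted(order[:cutoff])


def simulate_trial(n_patients=400, n_markers=20, predictive=0,
                   effect=1.5, seed=1):
    """Hypothetical randomized trial: only one marker modifies the effect."""
    rng = random.Random(seed)
    treated = [rng.randint(0, 1) for _ in range(n_patients)]
    markers = [[rng.randint(0, 1) for _ in range(n_patients)]
               for _ in range(n_markers)]
    outcome = [rng.gauss(0, 1) + effect * treated[i] * markers[predictive][i]
               for i in range(n_patients)]
    return outcome, treated, markers


outcome, treated, markers = simulate_trial()
pvals = [interaction_pvalue(outcome, treated, m) for m in markers]
selected = benjamini_hochberg(pvals)
print("biomarkers flagged as predictive:", selected)
```

In this sketch a marker is treated as predictive only when the treatment effect differs between marker-positive and marker-negative patients, which is why the test targets the interaction rather than the marker's main effect. With many candidate biomarkers, testing each at an unadjusted level would inflate false positives, so some form of multiplicity control is applied before the flagged markers are passed on to subgroup selection.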