Abstract:
I will introduce a new classification method for high-dimensional data, based on a careful combination of the results of applying an arbitrary base classifier to random projections of the feature vectors into a lower-dimensional space. More precisely, the random projections are divided into non-overlapping blocks, and within each block we select the projection yielding the smallest estimate of the test error. Our random projection ensemble classifier then aggregates the results of applying the base classifier to the selected projections, using a data-driven voting threshold to determine the final assignment. Our theoretical results elucidate the effect on performance of increasing the number of projections. In fact, in certain cases the ensemble can perform almost as well as the optimal classifier in the lower-dimensional space. I will present the results of a simulation study comparing the classifier empirically with several other popular high-dimensional classifiers, demonstrating its excellent finite-sample performance.
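The procedure described in the abstract (blocks of random projections, one selected projection per block, thresholded voting) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and not taken from the talk: the Gaussian projection matrices, the nearest-centroid base classifier, the hold-out estimate of the test error, the grid search for the voting threshold, and all function and parameter names (`fit_rp_ensemble`, `n_blocks`, `block_size`, and so on). Labels are assumed to be 0 or 1, and both classes are assumed to appear in the training half.

```python
import random


def project(A, x):
    """Apply a d x p projection matrix A (a list of rows) to a feature vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def gaussian_projection(d, p, rng):
    """Draw a d x p matrix with i.i.d. N(0, 1) entries (an assumed distribution)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(d)]


def fit_centroids(Z, y):
    """Stand-in base classifier: the mean vector of each class (labels 0 and 1)."""
    cents = {}
    for label in (0, 1):
        rows = [z for z, lab in zip(Z, y) if lab == label]
        cents[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return cents


def predict_centroids(cents, z):
    """Assign z to the class whose centroid is nearer in squared distance."""
    dist = {k: sum((a - b) ** 2 for a, b in zip(c, z)) for k, c in cents.items()}
    return 1 if dist[1] < dist[0] else 0


def vote_fraction(selected, x):
    """Fraction of selected projections whose base classifier votes for class 1."""
    votes = sum(predict_centroids(c, project(A, x)) for A, c in selected)
    return votes / len(selected)


def choose_threshold(votes, y):
    """Pick the cutoff on a grid in (0, 1) with the smallest training error."""
    grid = [k / 20 for k in range(1, 20)]
    return min(grid, key=lambda a: sum((v >= a) != bool(t) for v, t in zip(votes, y)))


def fit_rp_ensemble(X, y, d=2, n_blocks=20, block_size=10, seed=0):
    rng = random.Random(seed)
    p = len(X[0])
    # Random half/half split: fit on one half, estimate test error on the other.
    idx = list(range(len(X)))
    rng.shuffle(idx)
    tr, va = idx[: len(idx) // 2], idx[len(idx) // 2:]
    y_tr, y_va = [y[i] for i in tr], [y[i] for i in va]
    selected = []
    for _ in range(n_blocks):
        best = None
        for _ in range(block_size):
            A = gaussian_projection(d, p, rng)
            cents = fit_centroids([project(A, X[i]) for i in tr], y_tr)
            preds = [predict_centroids(cents, project(A, X[i])) for i in va]
            err = sum(q != t for q, t in zip(preds, y_va)) / len(va)
            # Keep the projection with the smallest estimated error in this block.
            if best is None or err < best[0]:
                best = (err, A, cents)
        selected.append((best[1], best[2]))
    # Data-driven voting threshold, chosen from the vote fractions on the sample.
    alpha = choose_threshold([vote_fraction(selected, x) for x in X], y)
    return selected, alpha


def predict_rp_ensemble(model, x):
    """Assign class 1 when the vote fraction reaches the fitted threshold."""
    selected, alpha = model
    return 1 if vote_fraction(selected, x) >= alpha else 0
```

A call such as `model = fit_rp_ensemble(X, y, d=2)` followed by `predict_rp_ensemble(model, x_new)` classifies a new point. Because the base classifier is arbitrary, `fit_centroids` and `predict_centroids` could be swapped for any other fit/predict pair without changing the rest of the sketch.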
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.