Abstract:
|
For data with an unknown model structure, machine learning classifiers can serve as an alternative or complement to model-based tests. Without relying heavily on statistical assumptions, a classifier can be used as a group-comparison test by statistically testing whether it predicts class labels more accurately than chance.
To measure classification accuracy, hold-out validation and k-fold cross-validation (CV) are the two most widely used methods. Hold-out validation offers a simple way to test an accuracy result against a binomial distribution, but because of its smaller test-sample size it has lower statistical power than k-fold CV (leave-one-out CV in particular). The accuracy results of k-fold CV, however, are not binomially distributed and show alpha inflation under the null hypothesis of no group difference. In this presentation, we propose a new validation method, Independent Validation (IV), that remedies the alpha inflation of k-fold CV while achieving higher statistical power than hold-out validation. With this method, any classifier can be conveniently used for hypothesis testing to compare groups without model specification.
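The hold-out test mentioned above can be sketched as follows: count correct predictions on the held-out set and compute a one-sided binomial p-value against chance-level accuracy. This is an illustrative sketch only; the function name, the 50% chance level (two balanced groups), and the example counts are assumptions, not the authors' implementation.

```python
from math import comb

def binomial_pvalue(n_correct, n_test, p_chance=0.5):
    """One-sided binomial test of hold-out accuracy against chance.

    Returns P(X >= n_correct) for X ~ Binomial(n_test, p_chance),
    i.e. the probability of observing at least this many correct
    predictions if the classifier were only guessing.

    Note: p_chance=0.5 assumes two balanced classes (an assumption
    for this sketch, not part of the abstract).
    """
    return sum(
        comb(n_test, k) * p_chance**k * (1 - p_chance) ** (n_test - k)
        for k in range(n_correct, n_test + 1)
    )

# Hypothetical example: 32 of 40 held-out samples classified correctly.
p_value = binomial_pvalue(32, 40)
```

A small p-value rejects the null hypothesis of no group difference. The k-fold CV accuracies discussed in the abstract cannot be tested this way, because repeated folds over the same data violate the independence that the binomial distribution assumes.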
|