Abstract:
Advances in technology have led to the production of massive data sets in many fields. In the analysis of such high-dimensional data, variable selection in high- and ultrahigh-dimensional settings has attracted considerable attention from statisticians. Although this problem has recently been addressed using penalized likelihood methods, in this paper we adopt a Bayesian approach, which we call MOMLogit, that exploits the properties of non-local prior densities on the regression coefficient vector. In our setting the response vector is binary. In simulation studies, MOMLogit improves performance in finding the true model and reduces prediction and estimation error rates. We also describe a novel approach for setting the hyperparameters of the prior and provide diagnostics for assessing the probability of finding the highest posterior probability model. Applied to real genomic data sets, our algorithm achieves high predictive accuracy while using far fewer explanatory variables than existing methods. We therefore expect MOMLogit to have useful applications in areas such as bioinformatics, text processing, and cancer genomics.
Copyright © American Statistical Association.