Abstract:
Widely used nonlinear regression models such as logistic regression, and more complex models such as Ising graphical models, rely on optimization algorithms for parameter estimation. For data of standard size, these algorithms can be run on a single machine. For big data with several million observations, however, standard iterative optimization is not feasible: it requires several passes over the data set, and the data do not fit into memory. The Alternating Direction Method of Multipliers (ADMM) is a scalable algorithm that can be applied to nonlinear models in big data. ADMM partitions the data, forms a consensus estimate, and then updates the estimates within each partition. Under some weak assumptions, the algorithm is guaranteed to converge within a reasonable number of iterations. The primary goal of this paper is to evaluate ADMM for two models, logistic regression and the Ising graphical model, using simulations and real data. We will also develop a generalized R optimization utility applicable to any likelihood-based model. The R utility can be used in the Hadoop MapReduce setting.
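To make the partition, consensus, and update cycle described in the abstract concrete, the following is a minimal sketch of consensus-form ADMM for logistic regression. It is an illustration only, not the R utility the abstract proposes: it is written in Python, it follows the standard scaled-dual consensus formulation, and every name in it (`local_update`, `consensus_admm`, `rho`, the simulated data, the gradient-descent local solver) is a hypothetical choice made for this example. Each partition's loss is taken as its mean negative log-likelihood, and the partitions are assumed to be of equal size.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def local_update(X, y, start, z, u, rho, steps=15, lr=1.0):
    """Approximately minimise, by plain gradient descent (an assumption),
    mean_nll(b) + (rho / 2) * ||b - z + u||^2 on one data partition."""
    b = list(start)
    n = len(X)
    for _ in range(steps):
        # Gradient of the quadratic penalty that ties b to the consensus z.
        grad = [rho * (bj - zj + uj) for bj, zj, uj in zip(b, z, u)]
        # Gradient of the mean logistic negative log-likelihood.
        for xi, yi in zip(X, y):
            r = sigmoid(sum(bj * xj for bj, xj in zip(b, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += r * xj / n
        b = [bj - lr * gj for bj, gj in zip(b, grad)]
    return b


def consensus_admm(partitions, dim, rho=0.5, iters=60):
    """Consensus ADMM with scaled dual variables u_i, one per partition."""
    N = len(partitions)
    z = [0.0] * dim
    b = [[0.0] * dim for _ in range(N)]
    u = [[0.0] * dim for _ in range(N)]
    for _ in range(iters):
        # Local step: each partition updates its own estimate (warm-started).
        b = [local_update(Xp, yp, b[i], z, u[i], rho)
             for i, (Xp, yp) in enumerate(partitions)]
        # Consensus step: average the local estimates plus scaled duals.
        z = [sum(b[i][j] + u[i][j] for i in range(N)) / N for j in range(dim)]
        # Dual step: accumulate each partition's disagreement with z.
        u = [[u[i][j] + b[i][j] - z[j] for j in range(dim)] for i in range(N)]
    return z


# Simulated data (illustrative values): intercept plus one Gaussian covariate.
rng = random.Random(0)
true_beta = [-0.5, 1.5]
X, y = [], []
for _ in range(400):
    xi = [1.0, rng.gauss(0.0, 1.0)]
    p = sigmoid(sum(bj * xj for bj, xj in zip(true_beta, xi)))
    X.append(xi)
    y.append(1.0 if rng.random() < p else 0.0)

# Four equal-sized partitions stand in for data blocks held by separate workers.
partitions = [(X[k:k + 100], y[k:k + 100]) for k in range(0, 400, 100)]
z = consensus_admm(partitions, dim=2)
print("consensus estimate:", z)
```

In a MapReduce-style deployment, the local step would plausibly correspond to the map phase and the consensus step to the reduce phase, though the sketch above runs everything in a single process.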
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.