Online Program Home
My Program

Abstract Details

Activity Number: 641 - Recent Advances in Density Mixture Modeling and EM-Like Algorithms: Frequentist and Bayesian Views
Type: Topic Contributed
Date/Time: Thursday, August 1, 2019 : 10:30 AM to 12:20 PM
Sponsor: Section on Nonparametric Statistics
Abstract #304127
Title: An Asynchronous Distributed Expectation Maximization Algorithm for Massive Data: The DEM Algorithm
Author(s): Sanvesh Srivastava* and Chuanhai Liu and Glen DePalma
Companies: University of Iowa and Purdue University and Purdue University
Keywords: Divide-and-conquer; iterative computations; EM-type algorithm; message passing interface (MPI); linear mixed-effects model; ECME

The E step of EM-type algorithms is time-consuming in massive data applications because it requires multiple passes through the full data. We address this problem by proposing an asynchronous and distributed generalization of the EM called the Distributed EM (DEM). Using DEM, existing EM-type algorithms are easily extended to massive data settings by exploiting the divide-and-conquer technique and widely available computing power, such as grid computing. The DEM algorithm reserves two groups of computing processes called "workers" and "managers" for performing the E step and the M step, respectively. The samples are randomly partitioned into a large number of disjoint subsets and are stored on the worker processes. The E step of DEM algorithm is performed in parallel on all the workers, and every worker communicates its results to the managers at the end of the local E step. The managers perform the M step after they have received results from an r-fraction of the workers, where r is a fixed constant in (0, 1]. The sequence of parameter estimates generated by the DEM algorithm retains many attractive properties of the EM algorithm. (Joint work with Chuanhai Liu and Glen DePalma)

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program