Abstract Details

Activity Number: 408 - SPAAC Poster Competition
Type: Topic Contributed
Date/Time: Tuesday, July 31, 2018 : 2:00 PM to 3:50 PM
Sponsor: Scientific and Public Affairs Advisory Committee
Abstract #328361
Title: Renewable Estimation and Incremental Inference in Generalized Linear Models with Streaming Data Sets
Author(s): Lan Luo* and Peter X.-K. Song
Companies: University of Michigan
Keywords: Incremental statistical analysis; Lambda architecture; Online learning; Stochastic gradient descent algorithm
Abstract:

This paper presents an incremental Newton-Raphson learning algorithm for analyzing streaming data sets with generalized linear models. The proposed method is formulated within a new framework of renewable estimation and incremental inference, in which estimates are renewed using the current data and summary statistics of historical data, without any use of historical subject-level data. For the implementation, we design a new paradigm of data storage and data processing, named the Rho architecture, as an extension of the currently popular Apache Spark Lambda architecture. It adds an inference layer that accommodates the inference-related statistics and communicates with the speed layer to facilitate sequential updating of estimation and inference. Both consistency and asymptotic normality of the renewable estimator are established, which justifies using the Wald test for incremental inference. Our methods are examined and illustrated by various numerical examples from both simulation experiments and a real-world data analysis.
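
The renewal idea described above can be illustrated with a minimal sketch. It assumes a logistic regression model (canonical link, dispersion fixed at 1) and an update that solves J_prev (beta_prev - beta) + U_b(beta) = 0 by Newton-Raphson, where U_b and J_b are the score and information of the current batch and J_prev is the information accumulated from earlier batches. The abstract does not state the exact update, so this form, along with the names `RenewableLogistic` and `score_and_info`, is an assumption made for illustration and not the authors' implementation.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def score_and_info(X, y, beta):
    """Score vector and information matrix of a logistic model on one batch."""
    mu = sigmoid(X @ beta)
    U = X.T @ (y - mu)
    J = (X * (mu * (1.0 - mu))[:, None]).T @ X
    return U, J


class RenewableLogistic:
    """Hypothetical sketch: keeps only the estimate and accumulated information."""

    def __init__(self, p):
        self.beta = np.zeros(p)
        self.J = np.zeros((p, p))  # summary statistic of all past batches

    def update(self, X, y, max_iter=50, tol=1e-8):
        beta_prev, J_prev = self.beta.copy(), self.J.copy()
        beta = beta_prev.copy()
        for _ in range(max_iter):
            U, J_b = score_and_info(X, y, beta)
            # Assumed estimating function: past data enter only via J_prev, beta_prev.
            g = J_prev @ (beta_prev - beta) + U
            step = np.linalg.solve(J_prev + J_b, g)
            beta = beta + step
            if np.max(np.abs(step)) < tol:
                break
        _, J_b = score_and_info(X, y, beta)
        self.beta = beta
        self.J = J_prev + J_b  # raw batch data can now be discarded
        return self.beta

    def wald(self):
        """Wald z-statistics and standard errors from the accumulated information."""
        se = np.sqrt(np.diag(np.linalg.inv(self.J)))
        return self.beta / se, se


# Simulated stream: 40 batches of 500 observations each (illustrative values).
rng = np.random.default_rng(0)
beta_true = np.array([0.5, -1.0, 0.8])
model = RenewableLogistic(p=3)
for _ in range(40):
    X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
    y = rng.binomial(1, sigmoid(X @ beta_true))
    model.update(X, y)
z_stats, std_errs = model.wald()
```

With no accumulated information (the first batch), the update reduces to ordinary Newton-Raphson on that batch alone. After each batch only the current estimate and the information matrix are retained, which is the sense in which no historical subject-level data are needed in this sketch.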


Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program