All Times EDT

Abstract Details

Activity Number: 246 - Data Science
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Computing
Abstract #318804
Title: Backfitting for Large-Scale Binary Regressions with Crossed Random Effects
Author(s): Swarnadip Ghosh* and Trevor John Hastie and Art Owen
Companies: Stanford University and Stanford University and Stanford University
Keywords: Crossed Random Effect; Logistic Regression; Backfitting; Scalability
Abstract:

The cost of both generalized least squares and Gibbs sampling in a crossed random effects model can easily grow as O(N^(3/2)) (or worse) for N observations. Ghosh et al. (2020) showed that the cost of a backfitting algorithm for regression problems with Gaussian errors is O(N) under some regularity conditions on the observation pattern. In this work we develop a scalable method for estimating the parameters when the response is binary. We approximate the likelihood using penalized quasi-likelihood and use a variant of a method developed by Schall (1991) to do inference for large crossed random effects structures with a binary response. The method we develop collapses the fixed effect and one random effect at each iteration and is computable in O(N) work for the generalized linear mixed model. We apply our method to a real dataset from Stitch Fix. The crossed random effects induce correlations that plain logistic regression ignores. This naivete can lead to variance estimates that are too small by a factor as high as 200 on these data. Plain logistic regression is also inefficient compared to a crossed random effects model, raising the variance of fixed effect parameter estimates by a factor of as much as 4.5.
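To make the O(N) claim concrete, the following is a minimal sketch of penalized quasi-likelihood combined with plain backfitting for a logistic model with crossed random effects, logit P(y = 1) = x'beta + a_i + b_j. It is not the authors' algorithm: it does not collapse the fixed effect with a random effect, and it holds the variance components sigma_a2 and sigma_b2 fixed instead of updating them with a Schall-type step. All function names, the simulated data, and the parameter values are assumptions made for illustration. The point it shows is that each sweep touches every observation a constant number of times, so the work per sweep is linear in N.

```python
import numpy as np


def simulate_crossed_logistic(n_rows, n_cols, n_obs, beta, sigma_a, sigma_b, seed=0):
    """Simulate binary responses with crossed row and column random effects (hypothetical data)."""
    rng = np.random.default_rng(seed)
    row = rng.integers(0, n_rows, size=n_obs)   # row level of each observation
    col = rng.integers(0, n_cols, size=n_obs)   # column level of each observation
    X = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])
    a = rng.normal(scale=sigma_a, size=n_rows)
    b = rng.normal(scale=sigma_b, size=n_cols)
    eta = X @ beta + a[row] + b[col]
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
    return row, col, X, y


def pql_working_response(eta, y):
    """Working response z and weights w of the logistic IRLS/PQL linearization."""
    mu = 1.0 / (1.0 + np.exp(-eta))
    w = np.clip(mu * (1.0 - mu), 1e-8, None)    # guard against zero weights
    z = eta + (y - mu) / w
    return z, w


def backfit_sweep(z, w, X, row, col, beta, a, b, sigma_a2, sigma_b2):
    """One backfitting sweep on the weighted working model; O(N) for a fixed number of covariates."""
    # Fixed effects: weighted least squares on the partial residual.
    r = z - a[row] - b[col]
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ r)
    # Row effects: shrunken weighted means of the partial residual.
    r = z - X @ beta - b[col]
    a = np.bincount(row, weights=w * r, minlength=a.size) / (
        np.bincount(row, weights=w, minlength=a.size) + 1.0 / sigma_a2)
    # Column effects: same update with the roles of rows and columns swapped.
    r = z - X @ beta - a[row]
    b = np.bincount(col, weights=w * r, minlength=b.size) / (
        np.bincount(col, weights=w, minlength=b.size) + 1.0 / sigma_b2)
    return beta, a, b


def fit_pql_backfitting(row, col, X, y, n_rows, n_cols,
                        sigma_a2, sigma_b2, n_outer=20, n_inner=5):
    """Alternate PQL linearization (outer loop) with backfitting sweeps (inner loop)."""
    beta = np.zeros(X.shape[1])
    a = np.zeros(n_rows)
    b = np.zeros(n_cols)
    for _ in range(n_outer):
        eta = X @ beta + a[row] + b[col]
        z, w = pql_working_response(eta, y)
        for _ in range(n_inner):
            beta, a, b = backfit_sweep(z, w, X, row, col, beta, a, b,
                                       sigma_a2, sigma_b2)
    return beta, a, b


if __name__ == "__main__":
    true_beta = np.array([-0.5, 1.0])
    row, col, X, y = simulate_crossed_logistic(50, 40, 2000, true_beta, 0.5, 0.5)
    beta_hat, a_hat, b_hat = fit_pql_backfitting(row, col, X, y, 50, 40, 0.25, 0.25)
    print("estimated fixed effects:", beta_hat)
```

The row and column updates use np.bincount to accumulate weighted sums per level, which avoids forming any N-by-N or level-by-level matrix. A full method in the spirit of the abstract would also re-estimate the variance components at each outer iteration.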


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program