Activity Number: 246 - Data Science
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Computing
Abstract #318804
Title: Backfitting for Large-Scale Binary Regressions with Crossed Random Effects
Author(s): Swarnadip Ghosh* and Trevor John Hastie and Art Owen
Companies: Stanford University and Stanford University and Stanford University
Keywords: Crossed Random Effect; Logistic Regression; Backfitting; Scalability
Abstract:
The cost of both generalized least squares and Gibbs sampling in a crossed random effects model can easily grow as O(N^(3/2)) (or worse) for N observations. Ghosh et al. (2020) have shown that the cost of a backfitting algorithm for regression problems with Gaussian errors is O(N) under some regularity conditions on the observation pattern. In this work we develop a scalable method to estimate the parameters for a binary response. We approximate the likelihood using penalized quasi-likelihood and use a variant of a method developed by Schall (1991) to do inference for large crossed random effects structures with a binary response. The method we develop collapses the fixed effect and one random effect at each iteration and is computable in O(N) work for the generalized linear mixed model. We apply our method to a real dataset from Stitch Fix. The crossed random effects induce correlations that plain logistic regression ignores. This naivete can lead to variance estimates that are too small by a factor as high as 200 on these data. Plain logistic regression is also inefficient compared to a crossed random effects model, raising the variance of fixed effect parameters by a factor of as much as 4.5.
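
To make the O(N) claim for the Gaussian case concrete, the following is a minimal illustrative sketch of backfitting for an intercept-only crossed random effects model, y = mu + a[row] + b[col] + error. It is not the authors' algorithm for binary responses: it omits covariates, the penalized quasi-likelihood step, and variance estimation, and it assumes the variance ratios are known. The function name `backfit_crossed` and the parameters `lam_a` and `lam_b` (error variance divided by each random effect variance) are hypothetical names introduced for illustration.

```python
import numpy as np

def backfit_crossed(y, row, col, lam_a, lam_b, n_sweeps=50):
    """Backfitting sketch for y = mu + a[row] + b[col] + error (Gaussian case).

    Minimizes sum(resid^2) + lam_a * sum(a^2) + lam_b * sum(b^2) by cycling
    over mu, a, and b. lam_a and lam_b are assumed known variance ratios.
    Each sweep touches every observation a constant number of times,
    so one sweep costs O(N).
    """
    n_rows = int(row.max()) + 1
    n_cols = int(col.max()) + 1
    n_i = np.bincount(row, minlength=n_rows)  # observations per row level
    n_j = np.bincount(col, minlength=n_cols)  # observations per column level
    mu = y.mean()
    a = np.zeros(n_rows)
    b = np.zeros(n_cols)
    for _ in range(n_sweeps):
        # Intercept: mean of the partial residuals.
        mu = np.mean(y - a[row] - b[col])
        # Row effects: shrunken means of the partial residuals within each row.
        a = np.bincount(row, weights=y - mu - b[col], minlength=n_rows) / (n_i + lam_a)
        # Column effects: shrunken means of the partial residuals within each column.
        b = np.bincount(col, weights=y - mu - a[row], minlength=n_cols) / (n_j + lam_b)
    return mu, a, b
```

The per-level sums are computed with `np.bincount`, which avoids forming any matrix whose size depends on the number of row or column levels. The number of sweeps needed for convergence is a separate question that this sketch does not address.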
Authors who are presenting talks have a * after their name.