Online Program

All Times ET

Thursday, June 3
Practice and Applications
Classification and Simulation: Methods, Analyses, and Applications
Thu, Jun 3, 10:00 AM - 11:35 AM
TBD
 

An Algorithm for Discrete Logistic Classification for Sparse Tables (309784)

Darcy Steeg Morris, U.S. Census Bureau
Eric Slud, U.S. Census Bureau 
*Yves Thibaudeau, U.S. Census Bureau 

Keywords: Logistic Regression, Classification, Sparse Data

Several authors (Walker and Smith 2020) have studied logistic regression classifiers in the presence of actual or induced data sparsity. We focus on computing logistic regression classifiers for sparse contingency tables classified in part by discrete latent variables, as is common in record-linkage applications. Such tables can be high-dimensional and typically include a large proportion of zero cells. Identifying logistic regressions that give a probabilistic characterization of the cells in the table can be challenging. Often the preferred nominal model is not estimable because of sparsity, and the number of alternative models can be large. We extend the methods of Fienberg and Rinaldo (2012), who estimate log-linear models in the presence of "likelihood zeros". We use their approach to identify estimable logistic regression models that are "optimal" among all submodels of the nominal model, in the sense that they exploit all the residual information characterizing the nominal model and do not introduce constraints on the surviving parameters.
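
As a rough illustration of why a zero cell can make a nominal model non-estimable, the sketch below uses a hypothetical 2x2 table (the counts, and the names `counts`, `log_likelihood`, and `zero_cells`, are assumptions made for this example and do not come from the abstract). It is not the authors' algorithm or the Fienberg and Rinaldo procedure; it only shows that, with an empty cell, the log-likelihood of the saturated logistic model keeps increasing as the slope grows, so no finite maximum likelihood estimate exists.

```python
import math

# Hypothetical table: counts[x] = (n_y0, n_y1) for a binary predictor x.
# The cell (x=1, y=0) is empty, which is the source of the problem.
counts = {0: (5, 3), 1: (0, 4)}

def log_likelihood(a, b, table):
    """Binomial log-likelihood of the model logit P(y=1 | x) = a + b*x."""
    total = 0.0
    for x, (n0, n1) in table.items():
        eta = a + b * x
        # Numerically stable evaluation of log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += n1 * eta - (n0 + n1) * log1pexp
    return total

def zero_cells(table):
    """Return the (x, y) cells with a zero count."""
    return [(x, y) for x, row in table.items() for y, n in enumerate(row) if n == 0]

# The intercept is estimable from the x=0 row alone: a = log(3/5).
a_hat = math.log(3 / 5)

# Evaluate the log-likelihood along increasing slopes b with a held at a_hat.
slopes = (1, 5, 10, 20)
profile = [log_likelihood(a_hat, b, counts) for b in slopes]

print("zero cells:", zero_cells(counts))
for b, ll in zip(slopes, profile):
    print(f"b = {b:>2}: log-likelihood = {ll:.8f}")
```

In this toy case the intercept remains estimable while the slope does not, which loosely mirrors the idea of retaining only the parameters that the data can still identify.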