All Times EDT

Abstract Details

Activity Number: 417 - Statistical Methods for Discovering Latent Structures in High-Dimensional and Complex Data
Type: Topic Contributed
Date/Time: Wednesday, August 10, 2022 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #322178
Title: Bayesian Pyramids: Identifiable Multilayer Discrete Latent Structure Models for Discrete Data
Author(s): Yuqi Gu* and David Dunson
Companies: Columbia University and Duke University
Keywords: Bayesian inference; deep generative models; identifiability; latent class; multivariate categorical data; interpretable machine learning

High-dimensional categorical data are routinely collected in the biomedical and social sciences. It is of great importance to build interpretable, parsimonious models that perform dimension reduction and uncover meaningful latent structures from such discrete data. Identifiability is a fundamental requirement for valid modeling and inference in such scenarios, yet it is challenging to address when there are complex latent structures. In this article, we propose a class of identifiable multilayer discrete latent structure models for discrete data, termed Bayesian pyramids. We establish the identifiability of Bayesian pyramids by developing transparent conditions on the sparsity structure of the pyramid-shaped directed graph. The proposed identifiability conditions can ensure Bayesian posterior consistency under suitable priors. As an illustration, we consider the two-latent-layer model and propose a Bayesian shrinkage estimation approach. Simulation results for this model corroborate the identifiability and estimability of the model parameters. Applications of the methodology to DNA nucleotide sequence data uncover useful discrete latent features that are highly predictive of sequence types.
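
To make the two-latent-layer structure concrete, the following is a minimal simulation sketch, not the authors' implementation. The abstract does not specify parametric forms, so everything here is an assumption chosen for illustration: a single latent class at the top, binary latent features in the middle that are independent given the class, a multinomial-logit link for the observed categorical variables, and a 0/1 matrix `G` encoding which latent features are parents of which observed variables. The function name `sample_two_layer_pyramid` and all parameter names are hypothetical.

```python
import numpy as np

def sample_two_layer_pyramid(n, G, n_classes=2, n_categories=4, seed=0):
    """Draw n observations from a toy two-latent-layer pyramid.

    G is a (p, K) 0/1 matrix: G[j, k] = 1 means observed variable j
    has binary latent feature k as a parent (assumed encoding).
    """
    rng = np.random.default_rng(seed)
    G = np.asarray(G)
    p, K = G.shape

    # Top layer (assumed): one latent class z per observation.
    class_probs = rng.dirichlet(np.ones(n_classes))
    z = rng.choice(n_classes, size=n, p=class_probs)

    # Middle layer (assumed): K binary features, independent given z.
    eta = rng.uniform(0.1, 0.9, size=(n_classes, K))
    alpha = (rng.random((n, K)) < eta[z]).astype(int)  # shape (n, K)

    # Bottom layer (assumed): multinomial-logit responses; coefficients
    # are zeroed wherever the graph G has no edge, giving sparsity.
    intercept = rng.normal(size=(p, n_categories))
    beta = rng.normal(scale=2.0, size=(p, K, n_categories)) * G[:, :, None]
    logits = intercept[None] + np.einsum("nk,pkc->npc", alpha, beta)
    probs = np.exp(logits - logits.max(axis=2, keepdims=True))
    probs /= probs.sum(axis=2, keepdims=True)

    # Inverse-CDF sampling: count how many of the first C-1 cumulative
    # thresholds fall below u, so y always lies in {0, ..., C-1}.
    cdf = probs.cumsum(axis=2)[..., :-1]
    u = rng.random((n, p, 1))
    y = (u > cdf).sum(axis=2)
    return z, alpha, y
```

For example, calling `sample_two_layer_pyramid(500, np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]))` would return class labels of shape `(500,)`, binary features of shape `(500, 2)`, and categorical observations of shape `(500, 5)`. Sparsity patterns in `G` of this kind are what the identifiability conditions in the abstract concern; the particular pattern used here is only an illustration and is not claimed to satisfy those conditions.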

Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program