Activity Number: 417 - Statistical Methods for Discovering Latent Structures in High-Dimensional and Complex Data
Type: Topic Contributed
Date/Time: Wednesday, August 10, 2022 : 10:30 AM to 12:20 PM
Sponsor: Section on Bayesian Statistical Science
Abstract #322178
Title: Bayesian Pyramids: Identifiable Multilayer Discrete Latent Structure Models for Discrete Data
Author(s): Yuqi Gu* and David Dunson
Companies: Columbia University and Duke University
Keywords: Bayesian inference; deep generative models; identifiability; latent class; multivariate categorical data; interpretable machine learning
Abstract:
High-dimensional categorical data are routinely collected in biomedical and social sciences. It is of great importance to build interpretable, parsimonious models that perform dimension reduction and uncover meaningful latent structures from such discrete data. Identifiability is a fundamental requirement for valid modeling and inference in such scenarios, yet is challenging to address when there are complex latent structures. In this article, we propose a class of identifiable multilayer discrete latent structure models for discrete data, termed Bayesian pyramids. We establish the identifiability of Bayesian pyramids by developing transparent conditions on the sparsity structure of the pyramid-shaped directed graph. The proposed identifiability conditions can ensure Bayesian posterior consistency under suitable priors. As an illustration, we consider the two-latent-layer model and propose a Bayesian shrinkage estimation approach. Simulation results for this model corroborate identifiability and estimability of the model parameters. Applications of the methodology to DNA nucleotide sequence data uncover useful discrete latent features that are highly predictive of sequence types.
Authors who are presenting talks have a * after their name.