
Abstract Details

Activity Number: 38 - Inference, Optimization, and Computation on Discrete Structures
Type: Invited
Date/Time: Sunday, August 8, 2021, 3:30 PM to 5:20 PM EDT
Sponsor: IMS
Abstract #316915
Title: Bayesian Pyramids: Identifying Interpretable Discrete Latent Structures from Discrete Data
Author(s): Yuqi Gu* and David Dunson
Companies: Columbia University and Duke University
Keywords: deep belief network; identifiability; interpretable machine learning; multivariate categorical data; latent variable model; tensor decomposition
Abstract:

Multivariate categorical data are routinely collected in biomedical and social sciences. It is of great importance to build interpretable models that perform dimension reduction and uncover meaningful latent structures from such discrete data. Identifiability is a fundamental requirement for valid modeling and inference, yet is challenging to address when there are complex latent structures. We propose a class of interpretable discrete latent structure models for discrete data and develop a general identifiability theory. Our theory is applicable to various types of latent structures, ranging from a single latent variable to deep layers of latent variables organized in a sparse graph (termed a Bayesian pyramid). As an illustration, we consider the two-latent-layer model and propose a Bayesian shrinkage estimation approach. Simulations corroborate identifiability, and applications to DNA nucleotide sequence data uncover discrete latent features that are both interpretable and highly predictive of sequence types. The proposed framework provides a recipe for interpretable unsupervised learning of discrete data and can be a useful alternative to popular machine learning methods.
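To make the layered latent structure concrete, below is a minimal generative sketch in Python/NumPy of a two-latent-layer, pyramid-style model: binary latent variables in a deep layer connect through a sparse graph to a shallower binary latent layer, which in turn generates multivariate categorical observations. All dimensions, the logistic link, and the conditional-probability-table parameterization are illustrative assumptions for exposition, not the authors' exact specification or estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not taken from the paper)
n, p = 500, 20          # samples, observed categorical variables
d = 4                   # categories per observed variable
K2, K1 = 2, 4           # deep-layer and shallow-layer binary latent dimensions

# Sparse binary graph from the deep layer to the shallow layer
G = (rng.random((K1, K2)) < 0.5).astype(float)
W = G * rng.normal(2.0, 0.5, size=(K1, K2))   # weights only on active edges

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Deep-layer binary latents z2 ~ Bernoulli(0.5)
z2 = rng.binomial(1, 0.5, size=(n, K2))

# Shallow-layer binary latents z1 | z2 via a logistic link on the sparse graph
z1 = rng.binomial(1, sigmoid(z2 @ W.T - 1.0))

# Each observed categorical variable depends on the shallow latent pattern
# through a conditional probability table (one row per latent configuration)
patterns = z1 @ (2 ** np.arange(K1))                 # encode latent pattern as an integer
cpt = rng.dirichlet(np.ones(d), size=(2 ** K1, p))   # (pattern, variable) -> category probabilities

X = np.empty((n, p), dtype=int)
for j in range(p):
    probs = cpt[patterns, j]                          # (n, d) category probabilities per sample
    X[:, j] = (probs.cumsum(axis=1) > rng.random((n, 1))).argmax(axis=1)

print(X.shape)  # (500, 20): multivariate categorical data generated by the two-layer structure
```

The sketch only illustrates the data-generating side; identifiability conditions and the Bayesian shrinkage estimation discussed in the abstract are not addressed here.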


Authors who are presenting talks have a * after their name.
