All Times EDT

Abstract Details

Activity Number: 107 - SPEED: Statistical Methods, Computing, and Applications Part 1
Type: Contributed
Date/Time: Monday, August 8, 2022 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #322976
Title: Interpretable Modeling of Genotype-Phenotype Landscapes with State-of-the-Art Predictive Power
Author(s): Peter Tonner* and David Ross and Abe Pressman
Companies: National Institute of Standards and Technology and National Institute of Standards and Technology and National Institute of Standards and Technology
Keywords: Interpretability; Machine Learning; Genotype-phenotype landscapes; Variational inference
Abstract:

Our ability to predict biological function (phenotype) from genetic background (genotype) impacts numerous biological domains. Black-box neural networks are currently the dominant modeling choice due to their unsurpassed ability to generate accurate out-of-sample predictions. These models suffer, however, from their inherent inability to explain their predictions. As an alternative, we developed a hierarchical Bayesian model that is inherently interpretable, called LANTERN. LANTERN learns a low-dimensional latent space where mutations combine additively, with the dimensionality learned automatically from the data through a hierarchical prior on the variance of each dimension. The latent phenotype is then transformed to observed phenotype measurements through a smooth, non-linear surface, which we learn with a nonparametric Gaussian process prior. To facilitate scalability to large-scale data, we adopted a stochastic variational inference approach. Through its design, LANTERN's predictions are easily decomposed into interpretable components. Despite this simplicity, LANTERN outperforms or equals the predictive accuracy of neural networks across multiple large-scale measurements.
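
The two-stage structure described in the abstract (additive mutation effects in a latent space, followed by a smooth non-linear surface) can be illustrated with a minimal sketch. This is not the LANTERN implementation: the mutation names, the effect values, the single latent dimension, and the fixed logistic surface are all assumptions made for illustration. In the abstract, the effects and the surface are learned from data, and the surface has a Gaussian process prior.

```python
import math

# Hypothetical per-mutation latent effects in a single latent dimension.
# These values are invented for illustration; the model in the abstract
# learns them (and the number of dimensions) from data.
LATENT_EFFECTS = {"A12G": 0.8, "T45C": -0.3, "G78D": 1.1}


def latent_position(mutations):
    """Sum the latent effects of the mutations carried by a variant."""
    return sum(LATENT_EFFECTS[m] for m in mutations)


def surface(z):
    """Stand-in smooth surface (logistic), used here in place of a learned GP."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(mutations):
    """Predicted phenotype: the surface evaluated at the additive latent position."""
    return surface(latent_position(mutations))


if __name__ == "__main__":
    # Wild type, two single mutants, and the corresponding double mutant.
    for variant in ([], ["A12G"], ["G78D"], ["A12G", "G78D"]):
        print(variant, round(latent_position(variant), 3), round(predict(variant), 3))
```

In this sketch the double mutant's latent position is exactly the sum of the two single-mutant positions, while its predicted phenotype is not the sum of the single-mutant phenotype changes. That separation is what lets a prediction be decomposed into a per-mutation contribution and a shared surface.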


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program