All Times EDT

Abstract Details

Activity Number: 42 - Recent Developments of Statistical Methods for Microbiome Research
Type: Invited
Date/Time: Sunday, August 7, 2022 : 4:00 PM to 5:50 PM
Sponsor: Biometrics Section
Abstract #319185
Title: Deep Learning to Predict the Biosynthetic Gene Clusters in Bacterial Genomes
Author(s): Hongzhe Li*
Companies: University of Pennsylvania
Keywords: data augmentation; functional microbiome; long short-term memory RNN; protein family domains
Abstract:

Biosynthetic gene clusters (BGCs) in bacterial genomes code for important small molecules and secondary metabolites. Based on validated BGCs and the corresponding sequences of protein family domains (Pfams), Pfam functions, and clan information, we develop a deep learning method, e-DeepBGC, that extends DeepBGC to detect BGCs and their biosynthetic classes in bacterial genomes. We show that, compared to DeepBGC, e-DeepBGC reduces the false positive rate in BGC identification and increases the sensitivity in identifying BGCs. We apply e-DeepBGC to 5,666 RefSeq bacterial genomes and detect a total of 170,685 BGCs, an average of 30.1 BGCs per genome. We summarize all the predicted BGCs, their functional classes, and the distributions of the BGCs across bacterial phyla.
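
As a quick consistency check on the reported figures, the per-genome average follows directly from the two totals. This is a minimal sketch in Python; the variable names are illustrative and do not come from the abstract.

```python
# Totals as reported in the abstract.
num_genomes = 5_666
num_bgcs = 170_685

# Mean number of predicted BGCs per genome, rounded to one decimal place.
mean_bgcs_per_genome = round(num_bgcs / num_genomes, 1)
print(mean_bgcs_per_genome)
```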


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program