All Times EDT

Abstract Details

Activity Number: 21 - Advances of Statistical Methodologies in Proteogenomic Research
Type: Topic Contributed
Date/Time: Sunday, August 7, 2022 : 2:00 PM to 3:50 PM
Sponsor: Biometrics Section
Abstract #323050
Title: HID Machine: A Random Forest-Based High Order Interaction Discovery Method for High-Dimensional Genomic Data
Author(s): Min Lu* and Yifan Sha and Xi Steven Chen
Companies: University of Miami and University of Miami and University of Miami
Keywords: random forests; variable selection; variable interaction; permutation importance; tree minimal depth
Abstract:

Although many methods are available for detecting two-way interactions in genomic datasets, detecting high-order interactions remains a major challenge because of the high dimensionality of the data. We developed a random forest-based algorithm, high-order interaction discovery (HID), to select interaction signals efficiently. The random forest framework allows our method to analyze different types of outcomes, including categorical, continuous, and survival outcomes. The method begins with a variable selection phase that uses pairwise minimal depth indices to choose potentially interacting features. For interaction selection, we proposed and evaluated two variable importance measures: minimal depth interaction importance (MDII) and permutation-based interaction importance (PBII). MDII serves as a fast filter for potential interaction candidates, and PBII is then used to finalize the ranking of the interaction terms. In our simulation studies, HID performed well and consistently in ranking interactions across the different outcome types.
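The abstract does not define PBII, so the following is only a minimal sketch of one generic way a permutation-based interaction score can be built: compare the error increase from permuting two features jointly with the sum of the increases from permuting each one alone. This is not the authors' measure. The names `interaction_score`, `permuted_error`, and `predict` are hypothetical, and `predict` stands in for any fitted model, such as a random forest.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` over the rows of X."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permuted_error(predict, X, y, cols, rng):
    """Error after shuffling the columns in `cols` together across rows."""
    order = list(range(len(X)))
    rng.shuffle(order)
    X_perm = [list(row) for row in X]
    for i, j in enumerate(order):
        for c in cols:
            X_perm[i][c] = X[j][c]
    return mse(predict, X_perm, y)


def interaction_score(predict, X, y, a, b, n_repeats=20, seed=0):
    """Joint permutation importance of (a, b) minus the sum of the two
    single-feature importances. Values far from zero, in either direction,
    suggest the model uses a and b non-additively. Assumed contrast only;
    not the PBII definition from the abstract."""
    rng = random.Random(seed)
    base = mse(predict, X, y)

    def importance(cols):
        errs = [permuted_error(predict, X, y, cols, rng)
                for _ in range(n_repeats)]
        return sum(errs) / n_repeats - base

    return importance([a, b]) - importance([a]) - importance([b])


# Toy data (hypothetical): features 0 and 1 interact, feature 2 is additive.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(400)]


def predict(row):
    # Stand-in for a fitted model's prediction function.
    return row[0] * row[1] + row[2]


y = [predict(row) for row in X]

score_interacting = interaction_score(predict, X, y, 0, 1)
score_additive = interaction_score(predict, X, y, 0, 2)
```

In this toy setup the score for the pair (0, 1) should be much farther from zero than the score for the pair (0, 2), because only the first pair enters the prediction as a product. Ranking candidate pairs by the absolute value of such a score is one plausible way to order interactions after a cheaper filtering step.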


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program