Abstract:
|
Although many methods are available for detecting two-way interactions in genomic datasets, detecting high-order interactions remains a major challenge due to high dimensionality. We developed a random forest-based algorithm, high-order interaction discovery (HID), to efficiently select interaction signals. The random forest framework allows our method to analyze different types of outcomes, including categorical, continuous and survival outcomes. In our method, an initial variable selection phase using pairwise minimal depth indices is applied to select potentially interacting features. For interaction selection, we proposed and evaluated two variable importance measures: minimal depth interaction importance (MDII) and permutation-based interaction importance (PBII). MDII serves as a fast filter for potential interaction candidates, and PBII is used to finalize the ranking of the interaction terms. Our simulation studies showed that HID performed well and consistently in ranking interactions across the different types of outcomes.
|