Abstract Details

Activity Number: 394 - Brushing up Your Skills in Genomic Data Analysis
Type: Topic Contributed
Date/Time: Tuesday, July 30, 2019 : 2:00 PM to 3:50 PM
Sponsor: Korean International Statistical Society
Abstract #307083
Title: Application of Machine Learning to Find Needle in a Genomic Haystack
Author(s): Kwang-Youn Kim*
Companies: Northwestern University
Keywords: gene expression; machine learning; classification; genetics; consulting; genes

Genetic data are characterized by their vast size and complexity. The human genome, for example, consists of three billion bases arranged in a complex three-dimensional structure. Traditional statistical methods such as linear regression may suffer from reduced power after correcting for multiple comparisons: in the most extreme case, testing for association at every single base would require three billion statistical tests. Classification techniques such as random forest, on the other hand, are suited to data sets with many inputs. We could apply such a method to genomic data, for example using gene expression as the input and disease phenotypes as the output. In this talk, we discuss examples of applying machine learning methods using various aspects of genomic data as inputs and outputs.
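
To make the multiple-comparisons point concrete, the following is a minimal sketch of how small the per-test significance threshold becomes when one test is run per base. It assumes a Bonferroni correction and a family-wise error rate of 0.05; neither is named in the abstract, and the function name bonferroni_threshold is hypothetical.

```python
# Illustrative sketch only. Bonferroni is one common multiple-comparison
# correction; the abstract does not say which correction is meant.

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test threshold that keeps the family-wise error rate <= alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


GENOME_BASES = 3_000_000_000  # three billion bases, as stated in the abstract
ALPHA = 0.05                  # assumed family-wise error rate (not from the abstract)

per_test = bonferroni_threshold(ALPHA, GENOME_BASES)
print(f"Per-test threshold for {GENOME_BASES:,} tests: {per_test:.3e}")
```

Under these assumptions each individual test would need a p-value on the order of 1e-11 to be declared significant, which illustrates why power drops so sharply when every base is tested separately.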

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program