All Times EDT

Abstract Details

Activity Number: 244 - Statistical methods for microbiome data analysis and beyond
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #318007
Title: Significant Gene Array Analysis and Cluster-Based Modeling for Disease Class Prediction
Author(s): Myrine A. Barreiro-Arevalo* and Hansapani Rodrigo
Companies: The University of Texas Rio Grande Valley and The University of Texas Rio Grande Valley
Keywords: gene expression analysis; disease prediction; Bayesian neural networks; random forest modeling; GXNA; model comparison
Abstract:

Gene expression analysis has been of major interest to biostatisticians for decades. Such studies are essential to disease risk assessment and prediction, helping medical professionals and scientists design treatment plans that lessen symptoms and perhaps even find cures. In this study, we investigate several gene expression analysis and machine learning techniques for disease class prediction, assess the predictive validity of the resulting models, and identify differentially expressed (DE) genes in the relevant datasets. Multiple gene expression datasets, all obtained on the Affymetrix U133A platform, will be used to test model accuracy. The methods considered are: (1) simple random forest modeling, (2) Gene eXpression Network Analysis (GXNA), (3) RF++, (4) LASSO regression, and (5) Bayesian neural networks. Significance Analysis of Microarrays (SAM) is used to identify potential disease biomarkers, and Principal Component Analysis (PCA) is used to check for any significant clusters before clustering techniques are applied. Our ultimate goal is to find co-expressed genes and to assess the effect of clustering analysis.
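
As a rough illustration of the biomarker-screening step, the sketch below computes a SAM-style relative difference statistic, d = (mean_a - mean_b) / (s + s0), for each gene and ranks genes by |d|. It is not the authors' pipeline. The gene names, expression values, and the fixed constant `s0 = 0.1` are all hypothetical; full SAM tunes s0 from the data and uses permutations to estimate false discovery rates, and neither step is shown here.

```python
from math import sqrt
from statistics import mean


def sam_statistic(group_a, group_b, s0=0.1):
    """SAM-style relative difference for one gene: (mean_a - mean_b) / (s + s0).

    s0 is a small positive constant that keeps genes with tiny variance from
    dominating the ranking. It is fixed here as an assumption; SAM itself
    chooses it from the data.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations across both classes.
    ss = sum((x - mean_a) ** 2 for x in group_a) + sum(
        (x - mean_b) ** 2 for x in group_b
    )
    # Pooled-variance standard error of the difference in means.
    s = sqrt((1.0 / n_a + 1.0 / n_b) * ss / (n_a + n_b - 2))
    return (mean_a - mean_b) / (s + s0)


# Hypothetical log-expression values: (disease samples, control samples).
expression = {
    "gene_up": ([8.1, 7.9, 8.3, 8.0], [5.0, 5.2, 4.9, 5.1]),
    "gene_flat": ([6.0, 6.2, 5.9, 6.1], [6.1, 5.9, 6.0, 6.2]),
    "gene_down": ([3.0, 3.2, 2.9, 3.1], [6.0, 6.1, 5.8, 6.2]),
}

scores = {gene: sam_statistic(a, b) for gene, (a, b) in expression.items()}
# Rank genes by absolute statistic, strongest candidate first.
ranked = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)

for gene in ranked:
    print(f"{gene}: d = {scores[gene]:.2f}")
```

In practice, the top-ranked genes from such a screen would be candidates for the feature set passed to the classifiers listed above, but how the authors combine these steps is not stated in the abstract.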


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program