
All Times EDT

Abstract Details

Activity Number: 428 - Clustering and Dimension-Reduction Methods: From Omics to Single-Cell Data
Type: Contributed
Date/Time: Wednesday, August 10, 2022 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #322489
Title: Exploring Regularized Regression Methods to Improve the Accuracy and Consistency of Component-Wise Sparse Mixture Regression Clustering for High-Dimensional Data
Author(s): Bo Zhang* and Devin Koestler and Prabhakar Chalise and Jianghua He and Jinxiang Hu
Companies: University of Kansas Medical Center and University of Kansas Medical Center and University of Kansas Medical Center and University of Kansas Medical Center and University of Kansas Medical Center
Keywords: Component-wise sparse mixture model; Regression based clustering; Relationship heterogeneity; Clustering
Abstract:

Genomic studies often seek to explore the association between molecular markers and biological phenotype(s) to gain insight into the molecular basis of health and disease. However, patient-level heterogeneity often obscures the relationship between molecular markers and a phenotype of interest (POI), since the same phenotype can be the product of completely different biological pathways. While Component-wise Sparse Mixture Regression (CSMR), a recently proposed regression-based clustering method, shows promise in detecting heterogeneous relationships between molecular markers and a POI, it sometimes yields inconsistent results when applied to high-dimensional data due to its inherent feature selection and regularization method. We explored different regularized regression methods within the CSMR framework and evaluated the internal consistency and accuracy of the resulting modifications using an extensive set of simulation studies. Across simulation scenarios where CSMR would yield inconsistent clusters, adaptive lasso improves cluster consistency and accuracy. Our modification of the CSMR method improves its ability to handle high-dimensional data, which are common in genomic studies.
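
For readers unfamiliar with the adaptive lasso named above, the following is a minimal sketch of the general technique for a single regression, not the authors' CSMR implementation. It assumes a two-stage scheme in which a plain lasso fit serves as the pilot estimate and each coefficient is then re-penalized with weight 1 / (|pilot| + eps)^gamma. The function names, the pilot choice, the tuning values, and the simulated data are all illustrative assumptions and do not come from the abstract.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j * |b_j|.

    X is a list of rows, y a list of responses (no intercept, for brevity).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-6):
    """Two-stage adaptive lasso (assumed variant: plain lasso as pilot)."""
    p = len(X[0])
    pilot = weighted_lasso(X, y, lam, [1.0] * p)
    # Small pilot coefficients get large penalties, large ones get small penalties.
    weights = [1.0 / (abs(b) + eps) ** gamma for b in pilot]
    return weighted_lasso(X, y, lam, weights)


# Illustrative simulated data: only the first two of ten features matter.
random.seed(0)
n, p = 80, 10
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[1] + random.gauss(0.0, 0.1) for row in X]

beta_hat = adaptive_lasso(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
```

In a mixture-regression setting, a fit of this kind would presumably be run separately within each component, using that component's observations or weights. How the penalty enters the CSMR framework is not specified in the abstract, so that step is left out of the sketch.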


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program