Abstract Details

Activity Number: 629 - New Developments in Nonparametric and Semiparametric Statistics
Type: Contributed
Date/Time: Thursday, August 2, 2018 : 8:30 AM to 10:20 AM
Sponsor: ENAR
Abstract #330689
Title: An Integrated Bayesian Nonparametric Method for Clustering of High-Dimensional Mixed Data
Author(s): Chetkar Jha* and Subharup Guha
Companies: University of Missouri and University of Florida
Keywords: Bayesian nonparametric; simultaneous clustering; mixed dataset; high dimension; GLM; integrative biology

Motivation :- Advances in next-generation sequencing methods have enabled researchers and agencies to collect a wide variety of sequence data across multiple platforms. The primary motivation for collecting such data is to analyze these datasets jointly to gain new insights into disease prevention, treatment, and cure. Clustering such datasets can provide much-needed insight into disease subtypes and biological associations. However, the differing scales, heterogeneity, and size of the mixed datasets are hurdles for such analysis.

Result :- We propose an integrated Bayesian nonparametric approach for clustering high-dimensional mixed data. We use generalized linear models (GLMs) and latent variable approaches to integrate the mixed dataset. Our method performs simultaneous clustering of high-dimensional mixed data, and we apply it to a glioblastoma multiforme dataset. Moreover, we show that cluster detection is a posteriori consistent as the number of covariates and subjects grows. As a byproduct of our work, we derive a working value approach for performing Bayesian beta regression.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program