Abstract Details

Activity Number: 528 - Analysis of Big Data
Type: Contributed
Date/Time: Wednesday, August 1, 2018 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #329690 Presentation
Title: Fusion Learning with High-Dimensionality
Author(s): Xin Gao* and Raymond J. Carroll
Companies: York University and Texas A&M University
Keywords: group penalization; information criterion; model selection; pseudolikelihood; large deviation; sparsity

We propose a fusion learning method that learns from multiple data sets across different experimental platforms through group penalization. The responses of interest may include a mix of discrete and continuous variables. The responses may share the same set of predictors, but the models and parameters differ across platforms. Integrating information from different data sets can enhance the power of model selection. The goal is to select the predictors that affect any of the responses, where the number of such informative predictors tends to infinity as the sample size increases. We specify a pseudolikelihood that combines the marginal likelihoods, and propose a pseudolikelihood information criterion. Under regularity conditions, we establish selection consistency for this criterion with unbounded true model size. The proposed method includes a Bayesian information criterion with an appropriate penalty term as a special case. Numerical results indicate that fusion learning can dramatically improve upon using only one data source. In the talk, we will demonstrate the use of the R package "FusionLearn" to perform the proposed fusion learning tasks.
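
The idea of scoring a candidate predictor set by a pseudolikelihood summed over platforms can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation or the FusionLearn API: all responses are treated as Gaussian (the abstract also allows discrete responses), the penalty is taken to be the log of the pooled sample size per platform-specific coefficient, and candidate sets are compared directly instead of being found by group penalization. The names `gaussian_loglik`, `pseudo_ic`, and `simulate` are hypothetical.

```python
import numpy as np


def gaussian_loglik(y, X):
    """Maximized Gaussian log-likelihood of a least-squares fit of y on X plus an intercept."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    sigma2 = rss / n  # maximum likelihood estimate of the error variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)


def pseudo_ic(platforms, subset, penalty_weight=None):
    """Illustrative pseudolikelihood information criterion (assumed form).

    platforms: list of (X, y) pairs, one per platform, sharing predictor columns.
    subset: indices of the predictors kept in every platform's model.
    """
    cols = np.asarray(subset, dtype=int)
    total_n = sum(len(y) for _, y in platforms)
    if penalty_weight is None:
        penalty_weight = np.log(total_n)  # assumed BIC-style weight
    # Pseudolikelihood: sum of the marginal log-likelihoods, one per platform.
    pseudo_loglik = sum(gaussian_loglik(y, X[:, cols]) for X, y in platforms)
    # Each selected predictor has its own coefficient in every platform.
    n_params = len(cols) * len(platforms)
    return -2 * pseudo_loglik + penalty_weight * n_params


# Simulated data (not from the paper): predictors 0 and 1 are informative
# in both platforms, with different coefficients in each.
rng = np.random.default_rng(0)


def simulate(n, coefs):
    X = rng.normal(size=(n, 5))
    y = X[:, :2] @ coefs + rng.normal(size=n)
    return X, y


platforms = [
    simulate(200, np.array([2.0, -1.5])),
    simulate(200, np.array([1.0, 2.0])),
]

for subset in ([], [0], [0, 1], [0, 1, 2, 3, 4]):
    print(subset, round(pseudo_ic(platforms, subset), 1))
```

Lower values indicate a preferred candidate set. Because a predictor is either kept in or dropped from all platforms at once, evidence from every platform contributes to each selection decision, which is the sense in which combining data sources can help.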

Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program