
All Times EDT

Abstract Details

Activity Number: 97 - New Methods for Structured Variable Selection
Type: Topic Contributed
Date/Time: Monday, August 8, 2022 : 8:30 AM to 10:20 AM
Sponsor: SSC (Statistical Society of Canada)
Abstract #322447
Title: Variable Selection in High-Dimensional Linear Regression Accounting for Heterogeneity in Covariate Effects Across Multiple Data Sources
Author(s): Tingting Yu*
Companies: Harvard Pilgrim Health Care Institute and Harvard Medical School
Keywords: Data heterogeneity; Variable selection; Coefficient clustering; K-means; ADMM

When data from multiple sources are combined for analysis, heterogeneity across the sources must be accounted for. We consider high-dimensional linear regression models for integrative data analysis in which heterogeneity across units (data sources) is modeled through unit-specific covariate effects. A fully heterogeneous model that assumes distinct covariate effects for every unit can be over-parameterized and may lose statistical power when the sample size is small and the number of predictors or units is large. Identifying sub-homogeneity among the heterogeneous covariate effects, that is, groups of units that share a common effect, is therefore necessary to build a more parsimonious model. We propose a new adaptive clustering penalty (ACP) method that simultaneously selects variables and clusters unit-specific covariate effects exhibiting sub-homogeneity. We show that the ACP estimator enjoys a strong oracle property under certain regularity conditions, and we develop an efficient alternating direction method of multipliers (ADMM) algorithm for parameter estimation. We conduct simulation studies comparing the performance of the proposed method with that of existing methods, and we apply the method to real datasets.
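
For readers unfamiliar with ADMM, the following is a minimal sketch of the general technique applied to a penalized linear regression. The abstract does not state the form of the ACP penalty, so this sketch substitutes the ordinary lasso penalty for a single data source; it is not the author's algorithm. The function names (`soft_threshold`, `lasso_admm`), the objective scaling, the simulated data, and the values of `lam` and `rho` are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_admm(X, y, lam, rho=1.0, n_iter=500):
    """Illustrative ADMM for the lasso (NOT the ACP method of the abstract).

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||z||_1 subject to b = z.
    """
    n, p = X.shape
    gram = X.T @ X / n
    xty = X.T @ y / n
    # The system matrix is the same in every iteration, so factor it once.
    chol = np.linalg.cholesky(gram + rho * np.eye(p))
    z = np.zeros(p)  # sparse copy of the coefficients
    u = np.zeros(p)  # scaled dual variable
    for _ in range(n_iter):
        # b-update: ridge-like least squares pulled toward z - u.
        rhs = xty + rho * (z - u)
        b = np.linalg.solve(chol.T, np.linalg.solve(chol, rhs))
        # z-update: soft-thresholding produces exact zeros.
        z = soft_threshold(b + u, lam / rho)
        # Dual update: accumulate the constraint residual b - z.
        u = u + b - z
    return z


# Hypothetical simulated data: 3 active predictors out of 20.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = lasso_admm(X, y, lam=0.1)
print(np.round(beta_hat, 3))
```

The split into a smooth least-squares step and a separate thresholding step is what makes ADMM attractive for penalties that are hard to handle directly. A clustering penalty on unit-specific effects would presumably replace the simple soft-thresholding step with a different update, but that detail is not given in the abstract.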

Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program