Abstract:
|
In modern scientific research, data heterogeneity is commonly observed owing to the abundance of complex data. We propose a factor regression model for data with heterogeneous subpopulations. The proposed model can be represented as a decomposition into heterogeneous and homogeneous terms. The heterogeneous term is driven by latent factors in different subpopulations. The homogeneous term captures common variation in the covariates and shares common regression coefficients across the subpopulations. Our proposed model thus strikes a balance between a global model, which ignores data heterogeneity, and a group-specific model, which fits each subgroup separately. Both theoretical and numerical results are used to demonstrate the performance of the proposed model. Finally, analysis of a dataset from the Alzheimer's Disease Neuroimaging Initiative further demonstrates the competitiveness and interpretability of our proposed factor regression model.
|