Abstract:
|
Heterogeneity occurs in many regression problems, where members of different latent subgroups respond differently to the covariates of interest (e.g., treatments) even after adjusting for other covariates. To identify such subgroups, we propose a Bayesian model based on a mixture of finite mixtures (MFM), in which the number of subgroups need not be specified a priori and is instead modeled as a random variable. The Bayesian MFM model was not commonly used in earlier applications, largely because of computational difficulties. Instead, an alternative Bayesian model, the Dirichlet Process Mixture Model (DPMM), has been widely used for clustering, although it is a misspecified model for many applications. The popularity of the DPMM is partly due to mathematical properties that enable efficient computing algorithms. We propose a class of conditional MFMs tailored to regression setups and solve the computing problem by extending the results of Miller and Harrison (2017). Using simulated and real data, we show the benefits of our conditional MFM, notably more reasonable clustering results, compared with existing frequentist methods, the DPMM, and the original MFM models in various setups.
|