Abstract:
|
Using sophisticated models to capture the intrinsic complexity of data is a common approach. In practice, however, this can be difficult when the data is high-dimensional. Striking a balance between parsimony and flexibility is essential to handle this complexity while maintaining satisfactory prediction performance. In this work, we propose Structured Mixture of Gaussian Locally Linear Mapping (SMoGLLiM) to estimate associations between high-dimensional predictors and low-dimensional responses. SMoGLLiM uses mixtures of linear associations to approximate nonlinear patterns, inverse regression to mitigate the complications of high dimensionality, and cluster refinement through cluster-size constraints and outlier trimming to achieve robustness. Its hierarchical structure allows smaller clusters to share covariance matrices and latent factors, which effectively reduces the number of parameters. Using three real-life datasets, we demonstrate its capability to model nonlinear mappings, resist outliers, and manage large-scale complex data. These examples illustrate the wide applicability of SMoGLLiM to different aspects of complex data structure.
|