Abstract:
|
We consider additive partially linear models (APLMs) for analyzing ultra-high-dimensional data in which both the number of linear components and the number of nonlinear components can be much larger than the sample size. We propose a two-step approach for estimation, selection, and simultaneous inference of the components in the APLM. In the first step, the nonlinear additive components are approximated using polynomial spline basis functions, and a doubly penalized procedure based on the adaptive LASSO is proposed to select the nonzero linear and nonlinear components. In the second step, local linear smoothing is applied to the data with the selected variables to obtain the asymptotic distribution of the estimators of the nonparametric functions of interest. Under regularity conditions, the proposed method selects the correct model with probability approaching one. The estimators of both the linear part and the nonlinear part are consistent and asymptotically normal, which enables us to construct confidence intervals and make inferences about the regression coefficients and the component functions. The performance of the method is evaluated by simulation studies, and the method is also applied to a maize gene dataset.
|
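For orientation, the model and the first-step criterion described in the abstract can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $Z_{ik}$ and $X_{il}$ denote the linear and nonlinear covariates, $B_{lj}$ the spline basis functions, $w_k$ and $v_l$ the adaptive weights, and $\lambda_1$, $\lambda_2$ the two tuning parameters. The exact penalty and normalization used by the authors may differ.

```latex
% Additive partially linear model (illustrative notation)
\[
  Y_i = \mu + \sum_{k=1}^{p_1} \beta_k Z_{ik}
        + \sum_{l=1}^{p_2} \phi_l(X_{il}) + \varepsilon_i,
  \qquad i = 1, \dots, n .
\]

% Step 1a: spline approximation of each nonlinear component
\[
  \phi_l(x) \approx \sum_{j=1}^{J_n} \gamma_{lj} B_{lj}(x).
\]

% Step 1b: doubly penalized least squares with adaptive weights
% (an L1 penalty on the linear coefficients and a group penalty
% on the spline coefficients of each nonlinear component)
\[
  \min_{\mu, \beta, \gamma}\;
  \sum_{i=1}^{n} \Bigl( Y_i - \mu - \sum_{k=1}^{p_1} \beta_k Z_{ik}
    - \sum_{l=1}^{p_2} \sum_{j=1}^{J_n} \gamma_{lj} B_{lj}(X_{il}) \Bigr)^{2}
  + n \lambda_1 \sum_{k=1}^{p_1} w_k \lvert \beta_k \rvert
  + n \lambda_2 \sum_{l=1}^{p_2} v_l \lVert \gamma_l \rVert_2 .
\]
```

In this sketch, a linear covariate is dropped when its coefficient $\beta_k$ is shrunk to zero, and a nonlinear covariate is dropped when its whole coefficient vector $\gamma_l$ is shrunk to zero. The second step then refits the selected components by local linear smoothing.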