Abstract:
|
Correlated data arise in many scientific studies. Rapid advances in modern technologies have sparked great interest, in both statistical theory and applications, in estimating and drawing inference on joint associations between high-dimensional covariates and correlated outcomes. We propose a novel inferential method for linear combinations of high-dimensional regression coefficients in generalized estimating equations, which have been widely adopted for correlated data analysis for decades. Our estimator, obtained by constructing a system of projected estimating equations, is shown to be asymptotically normally distributed under certain regularity conditions. Because procedures for effectively selecting the tuning parameter in the projection direction estimation are lacking, we design a practical data-driven cross-validation procedure for this setup. Extensive simulations demonstrate the reliable finite-sample performance of the proposed method, especially in estimation bias and confidence interval coverage, and we apply the method to a gene expression study on riboflavin production with Bacillus subtilis.
|