Abstract:
|
Longitudinal data are common in practice, but outcomes or time-dependent risk factors are often missing, which makes the data highly unbalanced and complex. Missing data may follow various missing patterns or mechanisms, and incorporating them properly to obtain unbiased and valid inference is a significant challenge. Here, we propose a novel semi-parametric framework for analyzing longitudinal data in which both responses and covariates are missing at random and intermittently, a general and widely encountered situation in observational studies. Within this framework, we propose multiple robust estimation procedures based on innovative calibrated propensity scores, which offer additional robustness to mis-specification of the missing data mechanisms and show more stable numerical performance. We also develop a corresponding robust information criterion for consistent variable selection under the proposed modeling. The proposed methods are evaluated both theoretically and through extensive simulation studies in a variety of settings, and we illustrate the utility of our approach by analyzing data with missing values collected in the HIV Epidemiology Research Study.
|