Abstract:
|
Multivariate clustered data are collected in many fields of applied science, and the outcomes often contain missing values. The missingness is nonignorable when it depends on the outcome itself, a concern that is especially acute when there is substantial cluster-level missingness, i.e., when the outcome of an entire cluster is missing. Moreover, the multiple outcomes are not independent of one another. In this work, we propose a multivariate selection model and an expectation conditional maximization (ECM) algorithm that explicitly estimate the missing-data mechanism and the correlations among outcomes. We evaluate the power of tests of association between covariates and outcomes, comparing the proposed method with the conventional analysis and with our previously proposed univariate method. We also assess the bias that results from ignoring the missing data. We apply the proposed method to the analysis of multiple peptides from each protein in a breast cancer proteomics dataset.
|
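As a rough illustration of why nonignorable missingness matters, the following Python sketch simulates a single outcome whose probability of being observed depends on its own value. It is not the proposed multivariate selection model or the ECM algorithm: the logistic missing-data mechanism, the standard normal outcome, the parameter names `alpha` and `beta`, and the function `simulate` are all assumptions made for this toy example, and clustering and covariates are left out entirely.

```python
import math
import random


def simulate(n=20000, alpha=0.0, beta=1.5, seed=0):
    """Toy simulation (hypothetical setup, not the paper's model).

    Draws n standard normal outcomes and observes each one with
    probability logistic(alpha + beta * y). Returns the mean of all
    outcomes, the mean of the observed outcomes, and the fraction observed.
    """
    rng = random.Random(seed)
    outcomes = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = []
    for y in outcomes:
        # Assumed logistic missing-data mechanism: when beta != 0 the
        # chance of observing y depends on y itself (nonignorable).
        p_obs = 1.0 / (1.0 + math.exp(-(alpha + beta * y)))
        if rng.random() < p_obs:
            observed.append(y)
    full_mean = sum(outcomes) / n
    observed_mean = sum(observed) / len(observed)
    return full_mean, observed_mean, len(observed) / n


# Nonignorable case: larger outcomes are more likely to be observed.
full_mean, observed_mean, frac_observed = simulate(beta=1.5)

# Ignorable case for comparison: missingness does not depend on the outcome.
full_mean_0, observed_mean_0, frac_observed_0 = simulate(beta=0.0)

print(f"beta=1.5: full mean {full_mean:.3f}, observed mean {observed_mean:.3f}")
print(f"beta=0.0: full mean {full_mean_0:.3f}, observed mean {observed_mean_0:.3f}")
```

With `beta=1.5` the mean of the observed values overstates the mean of all outcomes, whereas with `beta=0.0` the two means roughly agree. A selection model targets this kind of bias by modeling the missing-data mechanism jointly with the outcome rather than analyzing the observed values alone.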