Abstract:
|
Drug-drug interaction (DDI) is a situation in which one drug affects the activity of another. Drug-disease interaction (DDSI) is a situation in which a medication may exacerbate a pre-existing disease. Both DDIs and DDSIs are causes of adverse drug effects. Many health-related datasets used to identify DDIs and DDSIs suffer from missing data (e.g., lab results are only partially observed). Missing data can lead to biased estimates and weaken the generalizability of findings.
Common methods for handling missing data rely on a set of conditional or full model specifications to impute missing values. In high-dimensional settings, these approaches may not be appropriate. Covariate selection and dimensionality reduction (e.g., via matrix factorization) are two ways to reduce the dimensionality of the problem, but they usually assume non-informative missingness. We describe methods that combine multiple imputation with dimension reduction algorithms to impute missing values. These combinations are computationally efficient because follow-up analyses do not need to handle missing values, and they propagate the imputation error properly, resulting in valid operating characteristics.
|
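
To illustrate what "propagating the imputation error" can mean in practice, here is a minimal sketch of the standard pooling step used with multiple imputation (commonly known as Rubin's rules). The abstract does not say which pooling rule the described methods use, so this is an assumption for illustration only. The function name `pool_rubin` and its inputs are hypothetical: one point estimate and one variance per imputed dataset, obtained by running the same follow-up analysis on each completed dataset.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)        # average of the m estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    # Total variance adds the between-imputation spread, inflated by 1/m
    # to account for using a finite number of imputations.
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical numbers: three imputed datasets, same analysis run on each.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total)
```

The between-imputation term is what carries the uncertainty from the missing values into the final standard error; analyzing a single imputed dataset would omit it and understate the variance.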