Abstract:
|
Validation studies are often used to reduce measurement error and obtain more reliable information on variables of interest. These studies consist of selecting a sample of patients for whom error-prone records were previously collected, e.g. in an observational database, and performing either a more detailed measurement or a more refined data collection procedure. In practice, however, more than one round of data validation may be required, and direct application of standard design-based or multiple imputation techniques may lead to inefficient estimators, because information available from intermediate validation steps is ignored or only partially considered. We present two novel extensions of generalized regression estimators and a multiple imputation technique that make full use of all available data, and show through simulations that incorporating information from intermediate steps may lead to substantial gains in efficiency. This is illustrated using electronic health record data from 85,324 HIV-positive women, of whom 5,080 had their charts reviewed, and 1,285 of these also had a telephone interview to validate key variables for a study of contraceptive effectiveness.
|