Abstract:
|
Our work was motivated by analysis projects that use the linked US SEER-Medicare database to study treatment effects in men aged 65 years or older who were diagnosed with prostate cancer. Such data sets contain up to 100,000 human subjects and over 20,000 claims codes. The data are observational and were not randomized with regard to the treatment of interest, for example, radical prostatectomy versus conservative treatment. Previous instrumental variable (IV) analyses indicate that confounding most likely exists beyond the commonly captured clinical variables in the database, while the high-dimensional claims codes have been shown to contain rich information about the patients’ survival. Hence we aim to incorporate the high-dimensional claims codes into the estimation of the treatment effect. The orthogonal score method can be used for treatment effect estimation and inference despite the bias induced by regularization under a high-dimensional hazards outcome model and a high-dimensional treatment model. In addition, we show that with cross-fitting the approach has the rate doubly robust property in high dimensions.
|