Abstract:
|
Observational longitudinal studies are a common means to study treatment efficacy and safety in chronic mental illness. In many such studies, treatment changes may be initiated either by the patient or their clinician and can thus vary widely across patients in their timing, number, and type. Estimation of an optimal treatment regime using such data is challenging because one cannot naively pool together patients with the same treatment history, as required by inverse probability weighting; nor is it possible to apply backwards induction over the decision points, as is done in Q-learning. Current scientific theory for many chronic mental illnesses maintains that a patient's disease status can be conceptualized as transitioning among a small number of discrete states. We use this theory to inform the construction of a partially observable Markov decision process model of patient health trajectories wherein observed health outcomes are dictated by a patient's latent health state. Using this model, we derive an estimator of an optimal treatment regime under two common paradigms for quantifying long-term patient health, with application to the observational pathway of the STEP-BD study.
|