Keywords: propensity score, drug safety studies, machine learning
Propensity score (PS) methods have become a popular analytic choice for postmarket drug safety studies that use large observational databases. The PS, defined as the conditional probability of receiving treatment given covariates, is used to balance covariate distributions between non-equivalent drug user groups and thereby to account for confounding bias arising from a non-randomized design. While parametric regression models (e.g., a main-effects logistic model) are commonly used to estimate the PS in regulatory settings, there is growing interest in the utility of machine learning (ML) algorithms such as random forests, bagging, boosting, and neural networks for PS estimation. A common misconception, however, is that ML algorithms perform a task entirely without human intervention. In fact, commonly used ML algorithms require a number of tuning parameters (hyperparameters) to be specified before they learn from data. In this project, we use simulation studies to highlight the importance of tuning parameter specification when assessing ML-based PS methods. We also provide insights into which hyperparameters of ML algorithms can affect the performance of PS analysis in drug safety studies.
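As a rough illustration of why tuning parameters matter, the sketch below estimates the PS on simulated data with a k-nearest-neighbor learner and then checks covariate balance after inverse probability of treatment weighting. It is not the simulation design or any of the algorithms studied in this project: the k-nearest-neighbor estimator stands in for an ML learner because it has a single, easily interpreted tuning parameter, and the data-generating model, the function names (`simulate`, `knn_propensity`, `weighted_smd`), and the clipping threshold are all assumptions made for the example.

```python
import math
import random
import statistics


def simulate(n, seed=0):
    """Simulate one confounder x and a binary treatment t.

    Assumed data-generating model: true PS = 1 / (1 + exp(-x)).
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ts = [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0 for x in xs]
    return xs, ts


def knn_propensity(xs, ts, k):
    """Estimate the PS as the treated fraction among the k nearest
    neighbors in x (each subject counts as its own neighbor)."""
    n = len(xs)
    ps = []
    for x in xs:
        nearest = sorted(range(n), key=lambda j: abs(xs[j] - x))[:k]
        ps.append(sum(ts[j] for j in nearest) / k)
    return ps


def weighted_smd(xs, ts, ps, clip=0.05):
    """Standardized mean difference of x between treatment groups after
    inverse probability of treatment weighting.

    Estimated PS values are clipped to [clip, 1 - clip] (an assumed
    threshold) so that the weights stay finite.
    """
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for x, t, p in zip(xs, ts, ps):
        p = min(max(p, clip), 1.0 - clip)
        w = 1.0 / p if t == 1 else 1.0 / (1.0 - p)
        num[t] += w * x
        den[t] += w
    treated = [x for x, t in zip(xs, ts) if t == 1]
    control = [x for x, t in zip(xs, ts) if t == 0]
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2.0
    )
    return (num[1] / den[1] - num[0] / den[0]) / pooled_sd


if __name__ == "__main__":
    xs, ts = simulate(300)
    # k is the tuning parameter: the same learner, three different settings.
    for k in (1, 25, 300):
        ps = knn_propensity(xs, ts, k)
        print(f"k={k:3d}  weighted SMD of x = {weighted_smd(xs, ts, ps):.3f}")
```

With k equal to the sample size, every subject receives the same estimated PS (the overall treated fraction), so weighting cannot improve balance; with k = 1, the estimate is 0 or 1 for every subject and typically just reproduces each subject's own treatment status. The choice of k between these extremes, which the analyst has to make, determines how well the weighted groups are balanced.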