Abstract:
|
The opioid crisis represents a large and growing public health burden in the United States (US). Understanding the factors that drive opioid prescribing in outpatient settings is vital to implementing targeted interventions that reduce preventable harm from potentially inappropriate opioid prescriptions. The National Ambulatory Medical Care Survey (NAMCS), which collects annual, cross-sectional data from outpatient physician office visits, can be used to undertake such an investigation. Although the NAMCS data can be analyzed by logistic regression, a popular machine learning method, Random Forest (RF), offers an alternative. Logistic regression has the advantage of easy model interpretation, but when the number of covariates is large, model selection is needed to build a parsimonious model. The RF method does not suffer from the curse of dimensionality; however, its models tend to be difficult to interpret. In this study, we investigate the application of the RF approach to the NAMCS complex survey data set and compare it with penalized logistic regression. A simulation study will be conducted to evaluate the performance of the RF approach under various scenarios.
|