Abstract:
|
A recent meta-analysis demonstrated that suicide risk factors yield predictions only slightly better than chance, and thus called for a shift in focus towards machine learning-based risk algorithms. We present a data science approach for predicting future suicide attempts using all questions asked on an extensive survey. Data come from Waves 1 and 2 of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC; N = 34,653). At Wave 2, 222 participants reported having attempted suicide since the previous interview three years earlier. Our final predictive model, a balanced random forest, yielded a cross-validated AUC of 0.857. We identified four risk groups based on positive predictive value. The most important variables were previous suicidal ideation/behavior, recent periods of low energy and low mood, age, education, and a recent major financial crisis. Our results suggest that conventional machine learning methods, embedded in a data science pipeline, can be extended to the architecture of complex surveys to predict suicide attempts. This algorithmic approach can be used in future studies to target interventions for high-risk patients.
|