Abstract:
|
With a growing number of studies using machine learning (ML) to examine psychiatric disorders with brain imaging data, the reproducibility of their results has drawn considerable attention. Few studies, however, investigate the reproducibility of feature selection or of the estimated feature coefficients. In this work, we quantify the reproducibility of each feature and use it as a penalization weight in ML models. We propose a replicability index, computed as the coefficient of variation across bootstrap samples. Synthetic data with 20 informative features and a varying number of noise features are simulated from a multivariate normal distribution with varying pairwise correlations. Brain imaging data from 91 subjects are selected from the Philadelphia Neurodevelopmental Cohort, including 59 PTSD patients and 32 healthy controls. LASSO, elastic net, and their weighted versions are evaluated on both the simulated and the brain imaging data. The weighted models show higher prediction accuracy and smaller coefficient estimation error than the standard models, especially with a larger number of noise features. They also outperform the standard models on most of the brain imaging modalities and atlases.
|
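As a rough illustration of the replicability index described in the abstract, the sketch below computes a coefficient of variation (standard deviation divided by absolute mean) for each feature across bootstrap resamples. Several things here are assumptions and not taken from the source: the index is applied to per-feature coefficient estimates, a simple marginal least-squares slope stands in for the LASSO or elastic net fit, and the function names `marginal_slope` and `replicability_index`, the number of resamples, and the toy data are all hypothetical.

```python
import random
import statistics


def marginal_slope(x, y):
    """Least-squares slope of y on one feature x.

    Assumption: this is only a stand-in for a penalized-model coefficient.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx if sxx > 0 else 0.0


def replicability_index(X, y, n_boot=200, seed=0):
    """Coefficient of variation (std / |mean|) of each feature's
    coefficient across bootstrap resamples of the rows of X."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    draws = [[] for _ in range(p)]
    for _ in range(n_boot):
        # Resample subjects with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        for j in range(p):
            xb = [X[i][j] for i in idx]
            draws[j].append(marginal_slope(xb, yb))
    cv = []
    for d in draws:
        mean = statistics.fmean(d)
        std = statistics.stdev(d)
        cv.append(std / abs(mean) if mean != 0 else float("inf"))
    return cv


# Toy data (hypothetical): feature 0 is informative, feature 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
y = [2.0 * row[0] + data_rng.gauss(0, 1) for row in X]

cv = replicability_index(X, y)
print(cv)  # a smaller value means a more stable coefficient under resampling
```

In this toy setting the informative feature should receive a much smaller index than the noise feature. How such an index is mapped to a penalization weight in the weighted LASSO or elastic net is not specified in the abstract, so that step is left out of the sketch.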