Keywords: unsupervised learning, RBF kernel, neuroimaging, heterogeneity, bipolar disorder, schizophrenia
Background: Disease boundaries and diagnoses of severe psychotic disorders (bipolar disorder, schizophrenia, and schizoaffective disorder) remain largely subjective, guided by historically developed clinical categories. Because the underlying biological mechanisms are complex, the validity of existing diagnoses is often called into question. We detail a novel approach to identifying biologically distinct 'subtypes' of severe psychosis by applying a machine-learning technique to integrated laboratory and clinical phenotypic data.
Methods: Patients with a psychotic disorder (n=400) were recruited and underwent MRI, electroencephalography (EEG), and comprehensive clinical and neuropsychological assessments. A stratified unsupervised machine-learning (nonlinear K-means) approach using both clinical and laboratory information was tested against classifications built using clinical information alone. Kernel PCA with an RBF kernel was used to reduce dimensionality and account for non-linear interactions. To ascertain whether the identified clusters were indeed biologically distinct, we investigated differences in regional brain-thickness measures extracted from 3T MRI scans.
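As a rough illustration of the pipeline described above (RBF kernel PCA followed by K-means), the following minimal sketch uses scikit-learn on synthetic data. It is not the study's implementation: the feature counts, the number of retained components, the default `gamma`, the column-wise concatenation of clinical and laboratory features, and the helper name `cluster_features` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def cluster_features(features, n_components=5, n_clusters=3, gamma=None, seed=0):
    """Standardize, embed with RBF kernel PCA, then cluster with K-means.

    Hypothetical helper: every default here is an assumption, not a value
    reported in the abstract (apart from the three clusters).
    """
    scaled = StandardScaler().fit_transform(features)
    # gamma=None lets scikit-learn fall back to 1 / n_features.
    embedded = KernelPCA(
        n_components=n_components, kernel="rbf", gamma=gamma
    ).fit_transform(scaled)
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(embedded)
    # Silhouette is computed in the embedded space (an assumption).
    return labels, silhouette_score(embedded, labels)


# Synthetic stand-ins for the real data (400 patients, invented feature counts).
rng = np.random.default_rng(0)
clinical = rng.normal(size=(400, 12))
laboratory = rng.normal(size=(400, 30))
combined = np.hstack([clinical, laboratory])  # assumed way of integrating sources

labels, score = cluster_features(combined)
```

Running the same helper on `clinical` or `laboratory` alone gives the comparison arms; on random data like this the scores carry no meaning and only show the mechanics.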
Results: Three optimal clusters were identified. Using silhouette scores (and other clustering metrics) to evaluate cluster separation, we observed that the stratified approach (clinical + laboratory information) outperformed the others with a score of 0.523, compared with 0.230 for clinical information only and 0.290 for laboratory information only. The stratified clusters also differed more on regional brain thickness than the clinical classification did, with Group 3 showing the largest deficits (Cohen’s d = 0.4 to 1.1).
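For readers less familiar with the two statistics reported above, the sketch below shows one common way each is defined, using only the Python standard library. The abstract does not say which variant of Cohen's d was used, so the pooled-standard-deviation form is an assumption, as are the Euclidean distance in the silhouette and the function names `cohens_d` and `mean_silhouette`.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def mean_silhouette(points, labels):
    """Mean silhouette coefficient with Euclidean distance (assumed metric)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance to the point itself is 0, hence len(own) - 1).
        a = sum(math.dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(math.dist(point, other) for other in members)
            for other_label, members in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)
```

A silhouette near 1 indicates compact, well-separated clusters, while values near 0 indicate overlapping ones; for Cohen's d, values around 0.5 and 0.8 are conventionally read as medium and large effects.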
Conclusions: The findings point toward improved classification of psychotic disorders by augmenting existing clinical insights with biological markers. They also demonstrate that machine learning can be instrumental in improving diagnostic validity.