The UK Biobank presents many challenges related to data size, format complexity, feature heterogeneity, and sampling incongruence. We performed unsupervised clustering based on a signature vector of thousands of neuroimaging biomarkers. The 9,914 subjects separated into two distinct sub-cohorts based on 7,614 imaging, clinical, and phenotypic features. The twenty most salient features contributing to the cluster separation were identified using parametric and nonparametric tests that compared the biomarker distributions across clusters. We jointly modeled several significant clinical and demographic variables together with the selected salient neuroimaging features, and developed decision rules to predict the presence and progression of depression or other mental illnesses. This approach provides clinicians with additional decision support information, which may help pinpoint specific data elements to collect, study, model, and analyze for different disease phenotypes. External validation of this technique on different populations may help reduce healthcare expenses and improve the diagnosis, forecasting, and tracking of normal and pathological aging.
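The three analytic steps summarized above (cluster the subjects, rank features by how strongly they separate the clusters, derive a decision rule from a salient feature) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the study's pipeline: the data are synthetic, plain 2-means stands in for the unspecified clustering method, a Welch t-statistic stands in for the parametric and nonparametric tests, and a single-threshold rule stands in for the decision rules. All function and variable names are hypothetical.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def two_means(rows, n_iter=20):
    """Lloyd's algorithm with k=2 and a farthest-point initialization."""
    centers = [rows[0], max(rows, key=lambda r: sqdist(r, rows[0]))]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assign each subject to the nearest center.
        labels = [0 if sqdist(r, centers[0]) <= sqdist(r, centers[1]) else 1
                  for r in rows]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [mean(col) for col in zip(*members)]
    return labels

def welch_t(a, b):
    """Welch's t-statistic for two samples with unequal variances."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se

def rank_features(rows, labels, top):
    """Return indices of the `top` features with the largest |t| across clusters."""
    scores = []
    for j, col in enumerate(zip(*rows)):
        a = [v for v, lab in zip(col, labels) if lab == 0]
        b = [v for v, lab in zip(col, labels) if lab == 1]
        scores.append((abs(welch_t(a, b)), j))
    return [j for _, j in sorted(scores, reverse=True)[:top]]

def fit_stump(values, outcome):
    """Search for the rule 'value > t' (or its negation) with the best accuracy."""
    best_acc, best_t, best_sign = 0.0, None, 1
    for t in sorted(set(values)):
        acc = mean([int((v > t) == bool(y)) for v, y in zip(values, outcome)])
        for sign, a in ((1, acc), (-1, 1 - acc)):
            if a > best_acc:
                best_acc, best_t, best_sign = a, t, sign
    return best_acc, best_t, best_sign

# Synthetic cohort: 200 subjects, 10 features; only features 0 and 1 differ
# between the two latent groups. These numbers are illustrative assumptions.
rng = random.Random(42)
n_subjects, n_features = 200, 10
group = [i % 2 for i in range(n_subjects)]
rows = [[rng.gauss(8.0 * g if j < 2 else 0.0, 1.0) for j in range(n_features)]
        for g in group]
# Hypothetical binary outcome: matches the latent group except every 10th subject.
outcome = [g ^ (i % 10 == 0) for i, g in enumerate(group)]

labels = two_means(rows)
top_features = rank_features(rows, labels, top=2)
accuracy, threshold, sign = fit_stump([r[top_features[0]] for r in rows], outcome)

print("salient features:", sorted(top_features))
print("rule: feature", top_features[0], ">" if sign == 1 else "<=", round(threshold, 2))
print("training accuracy:", accuracy)
```

On real data with thousands of features, a multiple-testing correction and a held-out validation set would be needed before trusting either the feature ranking or the rule; the sketch reports training accuracy only.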