Abstract:
|
Recent technological advances in single-cell analysis motivate similar improvements in computational, visualization and interpretability tools. The increasingly large data sets being generated, with hundreds of thousands of data points, are hard to interpret because of their high dimensionality. With the renewed interest in innovative dimensionality reduction methods, there is an urgent need to assess the performance of these methods robustly. Some methods are powerful at preserving the intrinsic structure of the data but are time-consuming and generally lack a closed form that would allow the model to be reused on new data points for explicit low-dimensional embedding. Others are time-efficient but not visually interpretable. We define a multivariate metric for assessing the quality of a projection in terms of running-time efficiency, fidelity and closed-form representation of the data structure, as well as the quality of interpretability and visualization in terms of coverage and spread. We further motivate the use of autoencoders, a growing category of neural networks that allows for optimal data visualization, corrupt data detection and data correction while outperforming matrix factorization
|