Abstract:
|
Visualizing single-cell genomic data in a low-dimensional space, such as one produced by PCA or UMAP, is a crucial step in exploratory data analysis. However, visualization with scatterplots often introduces two biases. The first bias arises when the number of cells is large. Because cells are plotted sequentially, cells plotted earlier are masked by cells plotted later, so the scatterplot reflects only a small proportion of the information. The second bias arises when the numbers of cells in different samples are highly unbalanced, which gives a false impression of the cell distributions when comparing samples with different features. To address these biases, we developed SCUBI, an unbiased visualization method for single-cell data. To address the first bias, SCUBI splits the coordinate space into small non-overlapping bins and visualizes the aggregated information of the cells within each bin. To address the second bias, SCUBI divides the number of cells in each bin by the total number of cells and visualizes the difference in cell proportions across samples. We show that SCUBI more faithfully visualizes the true biological signal in a real single-cell RNA-seq dataset.
|
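The two steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the SCUBI implementation: the function names (`bin_index`, `bin_mean`, `proportion_difference`), the square grid, the `n_bins` parameter, the use of the mean as the aggregate, and the restriction to two sample groups are all assumptions made for illustration. The sketch also assumes that each bin count is divided by the total number of cells in its own group.

```python
import numpy as np


def bin_index(coords, n_bins=50):
    """Assign each cell to one bin of a shared n_bins x n_bins grid (hypothetical helper)."""
    coords = np.asarray(coords, dtype=float)
    lo = coords.min(axis=0)
    span = coords.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on a degenerate axis
    idx = np.floor((coords - lo) / span * n_bins).astype(int)
    idx = np.clip(idx, 0, n_bins - 1)  # cells on the upper edge go in the last bin
    return idx[:, 0] * n_bins + idx[:, 1]


def bin_mean(coords, values, n_bins=50):
    """Mean of `values` per bin; bins without cells are NaN."""
    flat = bin_index(coords, n_bins)
    counts = np.bincount(flat, minlength=n_bins * n_bins)
    sums = np.bincount(flat, weights=values, minlength=n_bins * n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = np.where(counts > 0, sums / counts, np.nan)
    return means.reshape(n_bins, n_bins)


def proportion_difference(coords, is_group_a, n_bins=50):
    """Per-bin cell proportion in group A minus that in group B.

    Assumption: each count is normalized by the total number of cells
    in its own group, so unequal group sizes do not drive the result.
    """
    flat = bin_index(coords, n_bins)  # both groups share the same grid
    is_group_a = np.asarray(is_group_a, dtype=bool)
    count_a = np.bincount(flat[is_group_a], minlength=n_bins * n_bins)
    count_b = np.bincount(flat[~is_group_a], minlength=n_bins * n_bins)
    prop_a = count_a / is_group_a.sum()
    prop_b = count_b / (~is_group_a).sum()
    return (prop_a - prop_b).reshape(n_bins, n_bins)
```

Either returned grid could then be drawn as a heatmap, so that every bin contributes one color regardless of how many cells overlap within it or in which order they would have been plotted.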