Activity Number: 186 - Statistical Methods for Assessing Genomic Heterogeneity
Type: Topic Contributed
Date/Time: Monday, August 8, 2022 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #320831
Title: Identifying Novel Cells in Annotating Single Cell RNA-Seq Data
Author(s): Ziyi Li* and Yizhuo Wang and Kim-Anh Do
Companies: The University of Texas MD Anderson Cancer Center and The University of Texas MD Anderson Cancer Center and MD Anderson Cancer Center
Keywords: single cell RNA sequencing; machine learning; cell annotation; feature selection; cluster
Abstract:
Single-cell RNA sequencing (scRNA-seq) has been widely used to decompose complex tissues into functionally distinct cell types. Recently, many supervised annotation methods have been developed and shown to be more convenient than unsupervised cell clustering. One challenge faced by all supervised methods is the identification of novel cell types. Existing methods usually label cells based on correlation coefficients or confidence scores, which sometimes results in an excessive number of unlabeled cells. We developed a straightforward yet effective method that combines an autoencoder with iterative feature selection to automatically identify novel cells in scRNA-seq data. Our method trains an autoencoder on the labeled training data and applies it to the testing data to obtain reconstruction errors. By iteratively selecting features that demonstrate a bimodal pattern and reclustering the cells using the selected features, our method can accurately identify novel cells that are not present in the training data. Extensive numerical experiments using five real datasets demonstrated favorable performance of the proposed method over existing methods.
Authors who are presenting talks have a * after their name.
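
The workflow described in the abstract (fit on labeled cells, compute reconstruction errors on test cells, keep features with a bimodal pattern, recluster) can be illustrated with a minimal sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: a rank-k linear autoencoder (equivalent to PCA) stands in for the neural autoencoder, Sarle's bimodality coefficient with the conventional 5/9 cutoff stands in for the unspecified bimodality criterion, a 1-D two-means split stands in for the unspecified clustering step, and all function names, parameters, and the simulated data are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X_train, k):
    """Assumption: a rank-k linear autoencoder (PCA) replaces the neural one."""
    mean = X_train.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:k]

def reconstruction_errors(X, mean, components):
    """Absolute reconstruction error for every cell (row) and gene (column)."""
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return np.abs(centered - reconstructed)

def bimodality_coefficient(E):
    """Sarle's coefficient per column: (skewness^2 + 1) / kurtosis."""
    c = E - E.mean(axis=0)
    m2 = (c ** 2).mean(axis=0)
    ok = m2 > 0                       # constant columns get a score of 0
    safe = np.where(ok, m2, 1.0)
    skew = (c ** 3).mean(axis=0) / safe ** 1.5
    kurt = (c ** 4).mean(axis=0) / safe ** 2
    return np.where(ok, (skew ** 2 + 1.0) / np.where(ok, kurt, 1.0), 0.0)

def two_means_1d(score, n_iter=100):
    """Split a 1-D score into two clusters; label 1 is the high-score cluster."""
    lo, hi = score.min(), score.max()
    labels = np.zeros(score.shape[0], dtype=int)
    if lo == hi:
        return labels
    for _ in range(n_iter):
        new = (np.abs(score - hi) < np.abs(score - lo)).astype(int)
        if np.array_equal(new, labels):
            break
        labels = new
        lo, hi = score[labels == 0].mean(), score[labels == 1].mean()
    return labels

def flag_novel_cells(X_train, X_test, k=3, bc_cutoff=5 / 9, max_rounds=10):
    """Alternate between feature selection and reclustering until stable."""
    mean, comps = fit_linear_autoencoder(X_train, k)
    E = reconstruction_errors(X_test, mean, comps)
    selected = bimodality_coefficient(E) > bc_cutoff
    if not selected.any():
        selected[:] = True            # fall back to all genes
    labels = two_means_1d(E[:, selected].mean(axis=1))
    for _ in range(max_rounds):
        if labels.min() == labels.max():
            break                     # a single cluster: nothing to refine
        # Keep bimodal genes whose error is higher in the candidate novel cluster.
        gap = E[labels == 1].mean(axis=0) - E[labels == 0].mean(axis=0)
        refined = selected & (gap > 0)
        if not refined.any() or np.array_equal(refined, selected):
            break
        selected = refined
        labels = two_means_1d(E[:, selected].mean(axis=1))
    return labels.astype(bool), selected

# Simulated data (hypothetical): known cells lie near a 3-D subspace of 30 genes;
# novel cells additionally over-express genes 0-4, a pattern absent from training.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 30))
X_train = rng.normal(size=(200, 3)) @ W + 0.1 * rng.normal(size=(200, 30))
X_known = rng.normal(size=(100, 3)) @ W + 0.1 * rng.normal(size=(100, 30))
X_novel = rng.normal(size=(50, 3)) @ W + 0.1 * rng.normal(size=(50, 30))
X_novel[:, :5] += 5.0
X_test = np.vstack([X_known, X_novel])

novel, selected = flag_novel_cells(X_train, X_test, k=3)
print("cells flagged as novel:", int(novel.sum()))
print("genes kept for clustering:", int(selected.sum()))
```

The bimodality coefficient is a crude heuristic that can also score strongly skewed unimodal features above the cutoff, so a real analysis would likely need a more careful bimodality test and a clustering step suited to scRNA-seq data.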