Abstract:
|
Patients at high risk of developing certain types of cancer, e.g., liver, pancreatic, and lung cancer, are often followed by physicians and regularly screened using costly imaging technologies, which may not even be accessible in low-resource settings. The circulating tumor cell (CTC) chip technology allows for efficient extraction, from whole blood samples, of leukocytes and of cancer cells shed into the bloodstream by an existing tumor. Subsequent RNA-sequencing (RNA-seq) provides genome-wide RNA transcript profiles, which should enable accurate classification of subjects into disease categories, thus enabling the replacement of costly imaging techniques with a diagnostic blood test. One challenge, however, is that the proportion of cell types in each sample is unknown, and a reference profile for CTCs is not available. To address this issue, we have developed a procedure based on principal component analysis (PCA) that identifies the signal from any CTCs present without relying on a reference profile for those CTCs, allowing for reliable prediction of disease status. We show that this novel procedure improves classification performance when identifying pancreatic cancer.
|