Abstract:
|
Cell-free DNA (cfDNA) in circulating blood has great clinical potential as a more specific biomarker for the diagnosis, prognosis, and early detection of cancer. Recently, next-generation sequencing (NGS) technologies have been applied to generate low-coverage DNA-sequencing data from cfDNA samples. However, inherent sampling variation makes the estimation of DNA copy numbers unstable, and the low percentage of tumor cells in a cfDNA sample makes it challenging to detect specific DNA copy alterations sensitively. Therefore, there is an urgent need for a robust analysis procedure for cfDNA data.
We develop an integrative data analysis approach that simultaneously normalizes the sequencing read counts, corrects the GC-content bias, optimizes the genomic bin width, estimates the tumor cell percentage, and infers DNA copy changes. Our algorithm employs multiple statistical methods, including smoothing and robust segmentation. We compare the performance of our method with that of several conventional techniques by analyzing prostate cancer cfDNA data.
|
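To make the steps named in the abstract more concrete, the following is a minimal sketch of three of them: read-count normalization, GC-content correction, and tumor-fraction estimation. It is not the authors' algorithm. The function names, the stratified-median GC correction (standing in for the smoothing the abstract mentions), and the tumor-fraction formula (which assumes a diploid normal background and a clonal alteration of known copy change) are all illustrative assumptions.

```python
from statistics import median


def normalize_counts(counts):
    """Scale per-bin read counts so that the median bin equals 1.0."""
    m = median(counts)
    if m <= 0:
        raise ValueError("median read count must be positive")
    return [c / m for c in counts]


def gc_correct(values, gc, n_strata=10):
    """Hypothetical GC correction: rescale each bin by the median of bins
    that fall in the same GC-fraction stratum (gc values in [0, 1])."""
    def stratum(g):
        return min(int(g * n_strata), n_strata - 1)

    groups = {}
    for v, g in zip(values, gc):
        groups.setdefault(stratum(g), []).append(v)
    stratum_median = {k: median(vs) for k, vs in groups.items()}
    overall = median(values)

    corrected = []
    for v, g in zip(values, gc):
        m = stratum_median[stratum(g)]
        # Leave the bin unchanged if its stratum median is not positive.
        corrected.append(v * overall / m if m > 0 else v)
    return corrected


def tumor_fraction_from_ratio(ratio, copy_change):
    """Tumor fraction implied by a segment's copy ratio.

    Assumes a diploid background and a clonal alteration, so that
    ratio = 1 + f * copy_change / 2, and therefore
    f = 2 * (ratio - 1) / copy_change. The result is clipped to [0, 1].
    """
    if copy_change == 0:
        raise ValueError("copy_change must be non-zero")
    f = 2.0 * (ratio - 1.0) / copy_change
    return min(max(f, 0.0), 1.0)
```

Under these assumptions, a segment with a copy ratio of 0.95 and a single-copy loss (`copy_change=-1`) implies a tumor fraction of about 10%. This illustrates why low tumor content makes copy alterations hard to detect: the expected shift in the ratio is only 5%.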