A solid tumor sample is a mixture of cancer and normal cells. The mixing proportion (the tumor "purity") introduces additional noise and needs to be accounted for in cancer genomics data analysis. Estimating and adjusting for tumor purity has gained tremendous interest lately. Several computational methods and software tools have been developed using gene expression, DNA methylation, copy number variation, or point mutation data.
We find that methylation measurements from the Illumina Infinium 450k array are highly informative for predicting tumor purity. We develop simple and efficient methods for tumor purity estimation and for differential methylation detection that adjusts for purity. Analyses of a number of datasets from The Cancer Genome Atlas (TCGA) demonstrate improved performance of the proposed methods.