Abstract:
|
Detecting differentially methylated regions (DMRs) is of great interest because it helps scientists better understand the connections between DNA methylation and various medical or biological conditions. During DMR detection preprocessing, many probes are filtered out for different reasons (e.g., low quality). One filtering step removes probes at potential single nucleotide polymorphism (SNP) sites because SNPs may influence probe readings. When both methylation and SNP data are collected on the same individual, this step can be refined: probes that do not actually contain a SNP are retained, and data imputation is applied to probes that do. In this work, we evaluate multiple imputation methods, including mean, k-nearest neighbor, linear regression, and regularized linear regression imputation. Using simulated data, we compare imputation accuracy as well as the effect of imputation on DMR detection.
|
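
As an illustration only, the two simplest imputation methods named in the abstract, mean and nearest-neighbor imputation, might be sketched as follows. The sketch assumes a samples-by-probes matrix of methylation values in which `None` marks a reading at a probe that carries a SNP for that individual. The function names, the distance measure, and the toy values are assumptions made for this example and are not taken from the study.

```python
import math


def mean_impute(data):
    """Replace each None with the mean of the observed values in its column (probe)."""
    n_cols = len(data[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in data if row[j] is not None]
        # A probe with no observed value at all cannot be imputed; leave it as None.
        col_means.append(sum(observed) / len(observed) if observed else None)
    return [[col_means[j] if v is None else v for j, v in enumerate(row)]
            for row in data]


def _distance(a, b):
    """Root-mean-square difference over the probes observed in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(data, k=2):
    """Replace each None with the average of that probe over the k most similar samples."""
    fallback = mean_impute(data)  # used when no comparable donor sample exists
    result = [list(row) for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are other samples that have an observed value at probe j.
            donors = [(_distance(row, other), other[j])
                      for m, other in enumerate(data)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:
                result[i][j] = sum(v for _, v in nearest) / len(nearest)
            else:
                result[i][j] = fallback[i][j]
    return result


# Toy data (hypothetical): 4 samples x 3 probes; sample 1 has a SNP at probe 1.
beta = [
    [0.10, 0.20, 0.30],
    [0.12, None, 0.32],
    [0.80, 0.90, 0.70],
    [0.11, 0.22, 0.31],
]

mean_filled = mean_impute(beta)       # missing entry becomes the probe mean, about 0.44
knn_filled = knn_impute(beta, k=2)    # missing entry becomes about 0.21 (samples 0 and 3)
```

In this toy example the mean is pulled upward by the dissimilar third sample, whereas the nearest-neighbor estimate draws only on the two samples that resemble the one being imputed.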