Abstract:
|
DNA methylation is an epigenetic mechanism that plays a role in the development and progression of many human diseases. Illumina arrays are commonly used to generate DNA methylation data, especially for large epidemiological studies in blood. Many studies have used both the older 450K platform and the newer EPIC platform, which poses challenges for data harmonization within a study. The pre-processing pipeline required before a statistical meta-analysis of studies using different platforms is not trivial. Our study examined various pre-processing steps using data from the Diabetes Autoimmunity Study in the Young (DAISY) cohort, which prospectively follows children at high genetic risk for the development of type 1 diabetes. DNA methylation was measured on a subset of the cohort using both the 450K and EPIC platforms. Technical replicates within and between platforms were assessed using Bland-Altman plots, correlation, and variance measures to inform recommendations. Of the different steps, we found that normalization and filtering had the largest effect on data harmonization. Correcting for genomic inflation also helped harmonize the two platforms.
|