Keywords: epigenetics, neural networks
DNA analysis is now a data-intensive discipline. New technology has transformed biomedical research by making large volumes of molecular data available quickly and at reduced cost. Large consortia and many individual laboratories have already generated vast datasets; for example, one such database, the Gene Expression Omnibus (GEO), contains more than 1.8 million samples. These data are publicly available, but analyzing them requires computational and statistical resources. This study uses neural networks to identify the genes that cause a skin cell to differ from a muscle cell. The factors that make one cell type different from another have been shown to have an epigenetic dimension: they influence gene activity rather than the DNA sequence itself. Gene analysis, and epigenetics in particular, is increasingly reliant on numerical analysis:
- Scientists are now able to identify epigenetic mechanisms that affect the behavior of a gene.
- We can now map these mechanisms and visualize the patterns they produce.
- These patterns have been shown to differ from one gene to another.
- These patterns are numerical and can be analyzed with standard statistical and computational tools.
- By analyzing these patterns, we will be able to differentiate between types of cells.
We developed a neural network model to identify cell identity genes. The input is a gene's epigenetic signal, and the output is binary: each gene is classified as either a cell identity gene or not.
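The classification setup described above can be illustrated with a minimal sketch. It is not the model used in this study: the two input features, the synthetic data, the single hidden layer of four units, and all names (`make_synthetic_genes`, `TinyNet`) are assumptions made for illustration. It only shows the general shape of the task, in which numerical epigenetic features go in and a binary label (cell identity gene or not) comes out.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_genes(n, rng):
    """Hypothetical data: each 'gene' has two made-up epigenetic features.

    Label 1 stands for a cell identity gene, label 0 for any other gene.
    Real features would come from measured epigenetic signal, not from here.
    """
    data = []
    for _ in range(n):
        label = 1 if rng.random() < 0.5 else 0
        center = 1.0 if label == 1 else -1.0
        features = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
        data.append((features, label))
    return data


class TinyNet:
    """One tanh hidden layer followed by a single sigmoid output unit."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, p

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the binary cross-entropy loss."""
        h, p = self.forward(x)
        d_out = p - y  # gradient of the loss w.r.t. the output pre-activation
        for j in range(len(h)):
            # Backpropagate through the old w2[j] before updating it.
            d_h = d_out * self.w2[j] * (1.0 - h[j] ** 2)
            self.w2[j] -= lr * d_out * h[j]
            for i in range(len(x)):
                self.w1[j][i] -= lr * d_h * x[i]
            self.b1[j] -= lr * d_h
        self.b2 -= lr * d_out

    def predict(self, x):
        """Return 1 (cell identity gene) or 0 (other gene)."""
        return 1 if self.forward(x)[1] >= 0.5 else 0


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_synthetic_genes(200, rng)
held_out = make_synthetic_genes(100, rng)

net = TinyNet(n_in=2, n_hidden=4, rng=rng)
for _ in range(30):
    for x, y in train:
        net.train_step(x, y, lr=0.1)

accuracy = sum(net.predict(x) == y for x, y in held_out) / len(held_out)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The sigmoid output gives a score between 0 and 1 for each gene, and thresholding it at 0.5 yields the binary call. With real data, the feature vector would be replaced by measured epigenetic signal, and the architecture would need to be chosen and validated accordingly.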
We have intriguing preliminary data showing that the gene MECOM manifests epigenetic signatures and expression patterns that are characteristic of cell identity genes of the endothelial lineage. Our work will be the first to systematically study MECOM function in skin cells.