Online Program

All Times EDT

Thursday, June 4
Machine Learning
Software & Data Science Technologies
Machine Learning and Software and Data Science Technologies Posters
Thu, Jun 4, 2:00 PM - 5:00 PM
TBD
 

Developing a Computational Framework for Precise TAD Boundary Prediction Using Genomic Elements (308483)

Mikhail Dozmorov, Virginia Commonwealth University 
*Spiro C Stilianoudakis, Virginia Commonwealth University 
Katarzyna Tyc, Virginia Commonwealth University 

Keywords: genomics, machine learning, random forest, computational statistics, hierarchical clustering

Chromosome conformation capture combined with high-throughput sequencing (Hi-C) has revealed that chromatin undergoes layers of compaction through DNA looping and folding, forming dynamic 3D structures. Among these are Topologically Associating Domains (TADs), which are known to play critical roles in cellular processes such as gene regulation and cell differentiation. Precise TAD mapping remains difficult because it depends strongly on Hi-C data resolution. Obtaining genome-wide chromatin interactions at high resolution is costly, so TAD-calling algorithms vary in where they place true TAD boundaries. To aid in the precise identification of TAD boundaries, we developed a computational framework built upon a random forest classifier that leverages the spatial relationships of many genomic elements defined by high-resolution ChIP-seq. Our framework precisely predicts chromosome-specific TAD boundaries in multiple cell types. We show that known molecular drivers of 3D chromatin organization, including CTCF, RAD21, and SMC3, are more enriched at our predicted TAD boundaries than at the boundaries identified by the popular ARROWHEAD TAD caller. Our results provide useful insights into the 3D organization of the human genome.
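
As a rough illustration of what "spatial relationship" features might look like, the sketch below computes, for each genomic bin, the log2 distance to the nearest site of each genomic element. This is an assumption made for illustration only: the abstract does not state which spatial features the framework uses, and the function names (`nearest_distance`, `distance_features`), bin centers, and peak coordinates are all hypothetical.

```python
from bisect import bisect_left
import math


def nearest_distance(position, sorted_sites):
    """Distance in bp from `position` to the closest site in a sorted list."""
    if not sorted_sites:
        raise ValueError("sorted_sites must not be empty")
    i = bisect_left(sorted_sites, position)
    candidates = []
    if i < len(sorted_sites):
        candidates.append(sorted_sites[i] - position)      # closest site at or to the right
    if i > 0:
        candidates.append(position - sorted_sites[i - 1])  # closest site to the left
    return min(candidates)


def distance_features(bin_centers, elements):
    """Build one row of log2(distance + 1) features per genomic bin.

    `elements` maps an element name (e.g. a ChIP-seq target) to a list of
    site positions on the same chromosome as the bins.
    """
    names = sorted(elements)
    sorted_sites = {name: sorted(elements[name]) for name in names}
    rows = [
        [math.log2(nearest_distance(center, sorted_sites[name]) + 1) for name in names]
        for center in bin_centers
    ]
    return names, rows


# Hypothetical toy data: three bin centers and made-up peak positions.
bin_centers = [5_000, 15_000, 25_000]
elements = {"CTCF": [4_000, 26_000], "RAD21": [15_000]}
names, rows = distance_features(bin_centers, elements)
```

A feature matrix of this shape, paired with boundary labels taken from a TAD caller, could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`. Whether the authors' framework is built this way is not stated in the abstract.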