Abstract:
|
The recently developed Hi-C technology enables a genome-wide view of the spatial organization of chromosomes and has yielded deep insights into genome structure and function. Although the technology is extremely promising, multiple sources of bias and uncertainty pose great challenges for data analysis. Statistical approaches for inferring three-dimensional (3D) chromosomal structure from Hi-C data are far from mature. Most existing models are highly over-parameterized, lack clear interpretation, and are sensitive to outliers. In this study, we propose poly-helix models for reconstructing 3D chromosomal structure from Hi-C data that are parsimonious, easy to interpret, and more flexible and robust. We also develop a negative binomial regression approach to account for over-dispersion in Hi-C data. When applied to a real Hi-C dataset, the poly-helix models achieve much better model adequacy scores than existing models. More importantly, these models reveal that the geometric properties of chromatin spatial organization, as well as chromatin dynamics, are closely related to genome function.
|