Abstract:
|
High-throughput chromosome conformation capture (Hi-C) has become the state-of-the-art technology for studying genome-wide chromatin structures. Despite rapid technological advances, Hi-C data is still highly noisy. Assessing the reproducibility of Hi-C data across replicates is an important way to monitor data quality. Although some statistical methods have been developed for assessing the reproducibility of findings from high-throughput experiments represented as ranked lists, these methods are not entirely suitable for Hi-C data, because Hi-C data has a strong spatial structure that the existing methods do not take into account.
In this work, we develop irreproducible discovery rate regression, a method for incorporating covariate information into the assessment of reproducibility. By modeling the spatial structure through covariates, this method effectively accounts for the spatial effect in Hi-C data and improves the identification of reproducible signals. Moreover, the method is generic and applicable to many other settings. We illustrate our method using both simulations and real data analyses.
|