Abstract:
|
An inference task that holds the key to numerous applications is the comparison of multiple data sets to identify underlying differences. A fundamental challenge in modern multi-sample comparison problems is the presence of many potential confounders, that is, extraneous sources of variation that contribute to differences across the distributions even within the same condition and that result in false positives in many applications. We consider an ANOVA design, under which replicate data sets are collected under each experimental setting, that allows the intrinsic (i.e., scientifically interesting) variation in the probability distributions to be distinguished from the extraneous variation. We introduce a flexible, multi-resolution, model-based framework for cross-group comparison that takes the experimental design into account using local hierarchical Binomial testing defined on scanning windows over a cascade of resolutions. We introduce a tree-structured graphical model hyperprior to incorporate spatial-scale dependency among the scanning windows, thereby allowing effective borrowing of strength among them. We apply the method to DNase-seq data to identify differences in transcription factor binding.
|