Abstract:
|
Multiple testing with structured hypotheses has recently attracted growing attention. In many application domains, hypotheses are connected to one another through structures such as group or graph relationships. We consider testing problems in which the hypotheses are nodes in a directed acyclic graph (DAG). For example, the Gene Ontology (GO) is a DAG in which each node is a set of genes known to be associated with a biological process, and an edge points from one biological process to another if the second is a special case of the first. Results of biological assays (e.g., gene expression experiments) are often interpreted in terms of "enrichment" with respect to a set of biological processes, which leads to a multiple testing problem on the GO DAG. Applying standard false discovery rate (FDR) procedures such as Benjamini-Hochberg (BH), which ignore the DAG structure, yields redundant discoveries, since many ancestors of relevant biological processes will also be discovered. For trees, Yekutieli (2008) proposed the outer nodes FDR as an error rate suited to this setting. In this work, we propose a novel methodology to control the outer nodes FDR on DAGs and demonstrate its utility using genomic data.
|
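The redundancy described in the abstract can be illustrated with a small sketch. It is not the methodology proposed in this work: it only runs plain BH on a toy DAG and then extracts the "outer nodes", taken here (as an assumption, following the usual reading of Yekutieli's definition) to be rejected nodes with no rejected descendants. The node names, p-values, and helper functions are hypothetical.

```python
from typing import Dict, List, Set


def benjamini_hochberg(pvals: Dict[str, float], alpha: float) -> Set[str]:
    """Plain BH step-up procedure; ignores any graph structure."""
    ordered = sorted(pvals.items(), key=lambda kv: kv[1])
    m = len(ordered)
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, (_, p) in enumerate(ordered, start=1):
        if p <= alpha * rank / m:
            k = rank
    return {name for name, _ in ordered[:k]}


def descendants(children: Dict[str, List[str]], node: str) -> Set[str]:
    """All nodes reachable from `node` by following edges (node excluded)."""
    seen: Set[str] = set()
    stack = list(children.get(node, []))
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(children.get(cur, []))
    return seen


def outer_nodes(children: Dict[str, List[str]], rejected: Set[str]) -> Set[str]:
    """Rejected nodes with no rejected descendants (assumed definition)."""
    return {v for v in rejected if not (descendants(children, v) & rejected)}


# Hypothetical toy DAG: edges point from a general process to a special case.
children = {"root": ["A", "B"], "A": ["C"], "B": ["C", "D"]}
# Hypothetical p-values, one per node.
pvals = {"root": 0.001, "A": 0.004, "B": 0.01, "C": 0.002, "D": 0.6}

rejected = benjamini_hochberg(pvals, alpha=0.05)
outer = outer_nodes(children, rejected)
print(sorted(rejected))  # ['A', 'B', 'C', 'root']
print(sorted(outer))     # ['C']
```

In this toy example BH reports four discoveries, but three of them are ancestors of the single most specific one, C. An error rate defined on the outer nodes counts only C.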