Abstract:
|
In recent years, it has become increasingly common for biologists to perform more than one type of measurement on a single set of observations. For instance, a researcher might profile gene expression, methylation, and DNA sequence on a single set of tissue samples. Given such multiple-view data, many authors have considered the task of determining whether there are subgroups, or clusters, among the observations. In this paper, we instead consider a more nuanced question: are the sets of clusters from the different data views related or independent? To answer this question, we propose a mixture model for multiple data views. We use this model to develop a pseudo likelihood ratio test of whether the clusterings of the observations in two data views are independent. We evaluate the performance of the proposed approach in a simulation study and in applications to multiple-view gene expression and DNA copy number data sets.
|