Abstract:
|
In contemporary biomedical research, it is increasingly common to collect "multi-view data": data in which multiple data types (e.g. gene expression, DNA sequence, clinical measurements) have been measured on a single set of observations (e.g. patients). We consider the following question: given a set of n observations with measurements on L data types, can a single clustering of the n observations be defined on all L data types, or does each data type have its own clustering of the observations? To answer this question, we introduce a general framework for modeling multi-view data, as well as hypothesis tests that characterize the extent to which the clusterings on the L data types are the same or different.
|