Keywords: computational statistics, classification, large-scale inference, composite likelihood
Genomic studies frequently seek to uncover relationships between many subjects of interest (e.g., genes) and to probe how these relationships change across multiple conditions (e.g., different tissues or cell types). To elucidate shared and differential gene expression patterns across many conditions, it is desirable to analyze all conditions jointly, as condition-by-condition analyses are severely underpowered to uncover shared or differential effects. Several joint analysis methods have been developed, yet each suffers from critical drawbacks: limited model flexibility and, because the number of latent classes in the model grows combinatorially, computational intractability for even a modest number of conditions. Using a novel limited information mixture model, we sidestep this computational intractability by paring down the collection of candidate latent classes. We then feed its output into an empirical Bayesian framework that simultaneously performs classification and parameter estimation without the restrictive assumptions made by competing methods.