Abstract:
|
We present a method for performing non-negative matrix factorization (NMF), motivated by the problem of normalizing omics data from diverse platforms. Recent work has suggested that rank (quantile) normalization can provide both comparisons to a baseline and robust discovery of differences between observations. Rank approaches should also be maximally robust to the use of different platforms, providing a way to compare data gathered with different technologies and at different times. However, such normalization is incompatible with current NMF methods, which is unfortunate given the success of NMF in identifying differences in biological process activities between individuals. Here we present a stochastic NMF method that operates on rank-normalized matrices to recover underlying patterns in omics data. We further demonstrate how the choice of metric affects the method's ability to recover meaningful patterns, and we explore the limitations of the method when applied to gene expression data. In addition, through simulations, we explore the stability of the matrix factorization under increasing noise levels.
|