Abstract:
|
In multi-platform genomic data analysis, multiple large sequences are observed from normal and abnormal subjects, while the sample sizes are relatively small. Analyzing such data jointly is challenging. Existing methods, such as the Lasso and the graphical Lasso, select only a small subset of the data for analysis, which results in information loss and possible bias. Here we propose two new methods: the integrative correlation, which characterizes the innate relationships within the sequences, and the empirical process/smoothing method, which combines information across the sequences for prediction. Both methods use the full data in the analysis and are simple to use. Simulation studies are conducted to evaluate the performance of the methods, and a real multi-platform genomic data set is analyzed to illustrate the application of the second method.
|