Abstract:
|
The interplay between regulatory elements and transcriptional activity is a central question in genomics. In a previous study, we leveraged gene expression measurements from human exon array data to predict chromatin accessibility and developed a big data regression model, BIRD. The GEO database contains far more microarray samples than exon array samples, so the number of samples with regulatory activity profiles could be expanded substantially if cross-platform prediction is feasible. Our goal is to predict chromatin accessibility across different gene expression profiling platforms. We first build the BIRD model with exon array data. However, applying this model directly to new microarray samples is inappropriate because of platform effects. We propose new normalization methods that transform the data matrix within each gene, and then incorporate a weighted ridge regression approach to extend the BIRD model. To demonstrate the feasibility of cross-platform prediction, we evaluate our method on microarray data. We also apply our method to RNA-seq data for further analysis.
|