Abstract:
|
Alternative splicing occurs during RNA processing and greatly increases the diversity of proteins encoded by the genome. RNA-binding proteins (RBPs) are known to play a central role in the regulation of splicing, but at the molecular level it remains unclear how these proteins interact and crosslink with RNA. The HITS-CLIP method allows genome-wide mapping of RBP-binding footprint regions at single-nucleotide resolution. Combining these data with the 3-dimensional structures of protein-RNA complexes, we can use statistical models to infer crosslinking at the amino-acid-nucleotide level, which is difficult to detect experimentally at large scale. In this paper, we introduce a multinomial logistic regression with latent responses to model the potential crosslinking between the 20 different amino acids and the nucleotide. We also introduce a set of variable selection indicators for each category. Under the Bayesian framework, we make inference on the latent responses and on the association between the explanatory variables and the response based on the posterior distribution. The results agree well with our current understanding of RBPs and with small-scale experiments.
|