Abstract:
|
Recent developments in high-throughput proteomics and genomics profiling make it possible to study the regulation of protein activities by genetic factors in a systematic manner. In this paper, we propose a new statistical method, ProMAP, a penalized multivariate linear mixed effects model for integrative proteo-genomic analysis. The motivating problem is to systematically characterize the regulatory relationships between proteins and DNA copy number alterations in breast tumor samples based on iTRAQ (isobaric tag for relative and absolute quantitation) data and SNP array data from CPTAC-TCGA studies. Because of the dynamic nature of the iTRAQ technique and mass spectrometry instruments, data from iTRAQ experiments usually have severe batch effects, high percentages of missing values, and non-ignorable missing-data patterns. Thus, we utilize a linear mixed effects model to account for the batch structure and explicitly incorporate the batch-level abundance-dependent missing-data mechanism of iTRAQ data in ProMAP. In addition, we employ a multivariate regression framework to characterize the multiple-to-multiple regulatory relationships between DNA copy number alterations and proteins. Moreo
|