Abstract:
|
One source of big data in medicine arises when clinical data are paired with other sources of data; for example, blood samples are used to obtain GWAS, gene expression, or RNA-seq data. In addition, healthcare provider databases include real-world patient data on the same disease that in many cases complement the clinical data. We propose new statistical methodology to help combine and analyze such big data. Specifically, we propose a scheme of weights attached to the variables, rather than to the observations, that helps combine the different sources of data in a more reasonable way. Our scheme is adaptable to penalized methods such as the Lasso or glmnet.
|