Abstract:
|
Advances in biotechnology have culminated in the development of large-scale genetic and sequencing association studies. Within the context of these studies, there is increasing interest in the effect of genetic markers on complex, high-dimensional, and potentially structured outcomes such as metabolomics, imaging, and microbiome data. We therefore propose an approach based on the kernel machine framework in which we assess the global association between a high-dimensional or complex outcome and a group of genetic variants, such as those within a gene. We encode both the genetic effects and the outcome data using kernels and construct a kernelized RV coefficient that measures the dependence between genetics and outcome. We approximate the finite-sample distribution of this dependence measure to facilitate p-value calculation. Simulations and real-data applications demonstrate that our method controls type I error while maintaining reasonable power.
|
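As a rough illustration of the idea described in the abstract, the sketch below computes a kernelized RV coefficient between a genotype kernel and an outcome kernel as the normalized trace inner product of the two centered kernel matrices. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`center`, `krv`, `linear_kernel`, `gaussian_kernel`, `krv_permutation_pvalue`), the kernel choices, the median-distance bandwidth, and the simulated data are all hypothetical. In particular, the p-value here comes from a simple permutation test, which is swapped in for the finite-sample distributional approximation mentioned in the abstract.

```python
import numpy as np


def center(K):
    """Double-center a kernel matrix: H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def krv(K, L):
    """Kernel RV coefficient: tr(Kc Lc) / sqrt(tr(Kc Kc) tr(Lc Lc))."""
    Kc, Lc = center(K), center(L)
    # For symmetric matrices, tr(A B) equals the sum of elementwise products.
    num = np.sum(Kc * Lc)
    den = np.sqrt(np.sum(Kc * Kc) * np.sum(Lc * Lc))
    return num / den


def linear_kernel(X):
    """Linear kernel X X^T (assumed choice, e.g. for genotypes)."""
    return X @ X.T


def gaussian_kernel(Y, bandwidth=None):
    """Gaussian kernel exp(-d^2 / bandwidth) (assumed choice for outcomes)."""
    sq = np.sum(Y ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Y @ Y.T), 0.0)
    if bandwidth is None:
        # Assumption: median squared distance as a default bandwidth.
        bandwidth = np.median(d2[d2 > 0])
    return np.exp(-d2 / bandwidth)


def krv_permutation_pvalue(K, L, n_perm=999, seed=0):
    """Permutation p-value for KRV (a stand-in for an analytic approximation)."""
    rng = np.random.default_rng(seed)
    Kc, Lc = center(K), center(L)
    # The denominator is invariant to relabeling samples, so compute it once.
    den = np.sqrt(np.sum(Kc * Kc) * np.sum(Lc * Lc))
    observed = np.sum(Kc * Lc) / den
    n = K.shape[0]
    exceed = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permute rows and columns of the centered outcome kernel together.
        stat = np.sum(Kc * Lc[np.ix_(p, p)]) / den
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Usage on simulated data (all values below are made up for illustration).
rng = np.random.default_rng(42)
n_samples, n_variants, n_outcomes = 60, 8, 30
G = rng.binomial(2, 0.3, size=(n_samples, n_variants)).astype(float)
effects = rng.normal(size=(n_variants, n_outcomes))
Y = G @ effects + rng.normal(size=(n_samples, n_outcomes))

statistic, p_value = krv_permutation_pvalue(
    linear_kernel(G), gaussian_kernel(Y), n_perm=499
)
print(f"KRV = {statistic:.3f}, permutation p-value = {p_value:.4f}")
```

Working with kernel matrices rather than raw features means the same statistic applies whether the outcome is a metabolite panel, an image summary, or a microbiome distance converted to a kernel; only the kernel construction changes.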