Abstract:
|
Much effort has been devoted to developing statistical models for identifying important gene-environment (G×E) interactions. While the commonly adopted marginal approach cannot accommodate the joint effects of a large number of genetic markers, existing joint-effect approaches may be limited by violating the "main effects, interaction" hierarchical structure and by relying on inefficient techniques. In this study, we propose a Bayesian sparse group lasso model to identify pivotal G×E interactions and main effects. Our approach respects strong hierarchy: if an interaction term is identified, both of the corresponding main effects are also identified. This type of hierarchy is suitable for G×E interactions because the environmental factors are often low-dimensional and predetermined as important. We establish the theoretical properties of the proposed approach and demonstrate its advantage over alternatives through extensive simulation studies. An analysis of the Nurses' Health Study with SNP measurements shows that the proposed approach identifies markers with important implications.
|