Abstract:
|
With the advent of next-generation sequencing, investigators have access to more reliable genetic information. However, conducting an entire study with next-generation sequencing can still be prohibitively expensive. One potential remedy is to combine next-generation sequencing data from cases with publicly available sequencing data for controls, but case-control status could then be completely confounded with differences in data quality, such as sequencing depth. We propose a regression calibration-based method and also consider a maximum-likelihood method for conducting association studies with such a combined sample. Both methods allow adjustment for non-confounding covariates as well as for population stratification. Both methods control the type I error and, when sequencing depths are sufficiently high but different, have power comparable to that of an analysis conducted using the true genotypes. Under certain circumstances, the regression calibration method allows for analysis with a naive variance estimate and standard software.
|