Abstract:
|
In recent years, the use of electronic health records (EHRs) in US hospitals has neared complete coverage. With the advent of EHRs, not only do physicians have faster access to patient information, but patients’ consent to the use of their information for research also opens exciting new opportunities for public health researchers. However, using EHR data for epidemiological or clinical research presents various challenges, one of the most troublesome being that these data are a convenience sample from the population. Hence, they can yield biased inference on, for example, the association between disease and exposure. In this paper, we propose a Bayesian hierarchical spatial model that uses a log Gaussian Cox process for the geocoded locations of EHR subjects; the intensity function of this process is in turn used to derive sampling weights for the EHR data. By melding the appropriately weighted EHR data with publicly available data on exposure and risk factors, our model allows us to estimate the association between disease and exposure while adjusting for sampling bias. We apply our model to EHR data to explore the association between smoking and lung cancer incidence.
|