As enthusiasm for conducting research using electronic health record (EHR) data has grown, so has recognition of the many limitations of this data source. EHR data are incomplete and error-prone, and they arise via biased sampling. Despite these limitations, EHRs contain a wealth of information, motivating the search for statistical approaches that support valid inference using this challenging data source. One promising approach is to pair EHR data, which are incomplete and error-prone but contain a broad range of measures, with population disease-registry data, which are extensively curated but provide information on only a single, narrowly focused disease area. In this talk, we will use the example of EHR data paired with cancer registry information. We will discuss exposures that can be extracted from EHRs and sources of bias that arise if these data are naively analyzed in relation to cancer diagnoses in registry data. Through simulation studies and a real-world study of second breast cancers in women with a prior breast cancer, we will demonstrate alternative methods to address these biases. We will conclude with an overview of considerations for working with linked EHR and registry data.