Since 2010, clinical medicine and public health have benefited from a rapid surge of clinical research on chronic diseases using data from electronic health records (EHRs). EHRs are appealing because they offer large sample sizes, timely information, and a wealth of clinical detail beyond what health surveys or administrative data provide. However, although large EHR networks include millions of patient records, these records are not population-representative random samples, a limitation that has restricted their utility for population health research. The non-representative nature of EHR patient populations also poses a major challenge for cross-site validation of EHR-based findings, because study findings tend to reflect the unique characteristics of the populations served by specific health care systems. We propose to assess EHR-based risk predictions for the purpose of population inference and to develop individualized absolute risk predictions.