According to the Institute of Medicine (IOM), the electronic health record (EHR) is a longitudinal collection of electronic health information for and about persons, used to support efficient processes for health care delivery. As adoption of EHRs has increased substantially, large quantities of structured, coded electronic data have become available for clinical research. Opportunities span quality improvement of clinical care, study recruitment, public health initiatives, replication of trial results, and big data research. However, EHR-based research is more challenging than traditional retrospective studies: records are frequently inaccurate and incomplete, and the data are complex. A thorough understanding of traditional epidemiologic approaches to study design is essential for conducting reproducible research with EHR data. Poor data quality, poor study design, bias, and confounding all substantially affect causal inference from EHR data. We highlight the major challenges of designing and implementing EHR-based studies, with examples from the literature showing how these pitfalls affect our ability to interpret findings.