Online Program

All Times EDT

Friday, June 5
Practice and Applications
Practice and Applications Posters, Part 1
Fri, Jun 5, 10:00 AM - 1:00 PM
TBD
 

The Children's National Data Lake (CNDL): A Partnership with Amazon Web Services and Cerner (308228)

*James Bost, Children's National Hospital 
Marcin Gierdalski, Children's National Hospital 
Dongkyu Kim, Children's National Hospital 
Hiroki Edward Morizono, Children's National Hospital 

Keywords: big data, machine learning, cloud computing, EMR

Children's National Hospital (CNH) launched an initiative with Cerner and Amazon Web Services (AWS) to access a new Cerner product, HealtheDataLab. It is a single environment where data scientists and researchers can access their HealtheIntent data (the Cerner Electronic Medical Record) along with other ad hoc datasets. HealtheDataLab leverages AWS along with open-source tools to provide an elastic environment where computing power can scale based on project needs. We will migrate the Health Facts database (de-identified Cerner EMR data covering more than 500 hospitals and 10 years) into HealtheDataLab, along with ad hoc datasets such as AHRQ HCUP, which can be analyzed in conjunction with the HealtheIntent data for retrospective research projects.

We will use the AWS Glue service to extract, transform, and load (ETL) data from HealtheDataLab into the Children's National Data Lake (CNDL) and into other data collection tools such as REDCap. We will construct the data lake within the AWS environment, providing researchers with their own secure data folders. The ultimate goals are to use our EMR data to identify potential patients for future research studies and/or retrospective review; to upload and use Health Facts® for machine learning modeling; and to automate the pull of identified data for research and registry support, with the capability to migrate it directly into REDCap or OpenClinica®. For each researcher, we will create a place to store data assembled for research studies and data registries across multiple conditions. The CNDL portal will provide a user-friendly dashboard for researchers to request data and access their data folders. CNDL agents will provide training and consulting services. The CNDL data scientists will implement data extraction scripts using the AWS notebook. The CNDL biostatisticians will work with the researcher and the data scientist to develop analysis-friendly data extracts.