Online Program

Friday, October 20
Fri, Oct 20, 7:30 AM - 8:30 AM
Aventine Ballroom G
Continental Breakfast and Speed Poster 2 sponsored by Bank of America

Reproducible research with R: application of tumor registry data from the Surveillance, Epidemiology, and End Results (SEER) Program. (304049)

*Zhanna Galochkina, UNM Comprehensive Cancer Center 

The SEER Program of the National Cancer Institute (NCI) provides publicly available cancer data. These registry data contain more than 9 million records covering the last four decades. Generating analysis reports from the SEER data can be highly iterative when the target study population changes over time, for example when choosing a particular cancer type, years of diagnosis, ethnicity/race, or age groups. To handle this challenge we developed an R program, used together with RStudio, to generate reports. Reports in the form of HTML, PDF, MS Word, and MS PowerPoint with built-in R code will be presented, and their strengths and weaknesses will be summarized. In addition, we will provide examples and code for drawing customizable geo-maps using cartographic boundary shapefiles in the same R environment, since investigators may want to accompany a report with visuals for spatial data. Using our R program will enhance the integrity of statistical analysis. We hope that this program increases the efficiency of statisticians' time, reduces potential errors, and enhances research reproducibility in SEER data analysis.
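
As a rough illustration of the iterative population selection described above, the following minimal R sketch keeps the study-population settings in one place so that a change is made once and the analysis is rerun. It is not the authors' program: the data frame, its column names (`site`, `year_dx`, `age_group`), the `study` list, and the `select_population()` helper are all hypothetical and stand in for a real SEER extract.

```r
# Toy stand-in for a SEER extract; column names and values are hypothetical.
cases <- data.frame(
  site      = c("Breast", "Lung", "Breast", "Colon", "Lung"),
  year_dx   = c(1995, 2003, 2010, 2012, 1988),
  age_group = c("50-64", "65+", "50-64", "65+", "50-64"),
  stringsAsFactors = FALSE
)

# Study-population settings gathered in one list (hypothetical names),
# so changing the cancer type, years, or age groups is a one-line edit.
study <- list(
  site       = "Breast",
  years      = c(1990, 2012),
  age_groups = c("50-64", "65+")
)

# Return the rows matching the requested site, diagnosis years, and age groups.
select_population <- function(data, p) {
  keep <- data$site == p$site &
    data$year_dx >= p$years[1] & data$year_dx <= p$years[2] &
    data$age_group %in% p$age_groups
  data[keep, , drop = FALSE]
}

cohort <- select_population(cases, study)

# Simple summary that a report section might display.
print(table(cohort$year_dx))
```

In an R Markdown workflow, settings like these could instead be declared as document parameters and supplied through the `params` argument of `rmarkdown::render()`, so the same source file produces a report for each study population.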