St. James Ballroom
Using Google APIs to Automate Data Extraction for Sampling Frames (303865)
Heather Driscoll, ICF
*Adam Lee, ICF
Robynne Locke, ICF
Randy ZuWallack, ICF
Keywords: API, Python, Web Scraping, Data Science, Sampling, Data Cleaning, ETL
In this poster, we present a novel approach that uses data science techniques to build a geocoded dataset (an area sampling frame) through the Google Maps and Street View APIs. In 2018, we began work on a study of Multi-Unit Residential Buildings (MURBs) across 8 major metropolitan areas in Canada. No definitive data source could identify MURBs, so we had to build our own sampling frame. The traditional approach to this type of sampling frame development would be to physically canvass an area and review buildings to determine their eligibility for the study. Because the study area spanned 6 provinces, covering it with field staff would have been a significant undertaking. As an alternative, we virtualized the canvassing by creating a custom interface to the Google Maps API that allowed us to extract geocoding data for buildings, and a secondary program that bulk-extracted Google Street View images so that each building's eligibility could be evaluated. This process saved significant time and resources.
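
The two-step workflow described above (geocode a building, then request its Street View image) might look roughly like the following sketch. This is not the authors' code: the function names, the sample address, and the image size are hypothetical, and the sketch assumes the public Google Geocoding API and Street View Static API endpoints. It only builds request URLs and parses a decoded response, so it makes no network calls and needs no real API key.

```python
from urllib.parse import urlencode

# Public endpoints for the Geocoding API and the Street View Static API.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"
STREETVIEW_URL = "https://maps.googleapis.com/maps/api/streetview"


def geocode_url(address, api_key):
    """Build a Geocoding API request URL for one street address (hypothetical helper)."""
    return GEOCODE_URL + "?" + urlencode({"address": address, "key": api_key})


def parse_geocode(payload):
    """Return (lat, lng) from a decoded Geocoding API JSON response, or None if no match."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def streetview_url(lat, lng, api_key, size="640x640"):
    """Build a Street View Static API image URL for one coordinate pair (hypothetical helper)."""
    params = {"size": size, "location": f"{lat},{lng}", "key": api_key}
    return STREETVIEW_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder address and key, for illustration only.
    print(geocode_url("100 Main St, Ottawa", "YOUR_API_KEY"))
    print(streetview_url(45.4215, -75.6972, "YOUR_API_KEY"))
```

In a bulk run, each candidate address would be geocoded first, and the resulting coordinates passed to the image request so that a reviewer can judge eligibility from the saved picture.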