We draw upon recent advances in computer vision algorithms and in the availability and resolution of geospatial big data to improve the way data are collected for social science research. High-quality surveys require a list of housing units from which to select a sample. Each year, several US studies send field staff out to create these lists. Not only is this process expensive, inefficient, and redundant, but it also tends to miss some housing units, particularly in rural areas.
Our approach uses computer vision techniques to detect dwellings in satellite images. We trained an algorithm to detect housing units in several counties in North Carolina. The approach achieves high accuracy and does not require large amounts of training data. We are now working to extend the approach to infer demographic characteristics of households in rural areas (e.g., income, presence of children) to assist surveys that want to oversample particular demographic groups.