Abstract:
|
One shortcoming of survey sampling frames, such as list frames, is that they may not cover the entire target population for a given survey, a problem statisticians refer to as undercoverage. Research organizations sometimes use area frames to address potential undercoverage in list frames, but area frames are costly and require significant resources to build and maintain. To explore whether list-frame undercoverage can be addressed with less resource-intensive methods, the National Agricultural Statistics Service (NASS) built a sampling frame from scratch using data gathered through web scraping. This paper details how NASS used several data science methods to transform raw, limited web-scraped data into a robust survey sampling frame from which a complex survey sample was selected. Survey results from the selected sample showed that this approach successfully addressed undercoverage in some areas. A path forward using this methodology is discussed in the conclusion of this paper.
|