Andrew J. Dau
United States Department of Agriculture, National Agricultural Statistics Service
58 – Leading the Dance with Dirty Data
Dancing with the Software: Selecting Your Imputation Partner
Darcy Miller
United States Department of Agriculture
The United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS), in conjunction with the USDA Economic Research Service (ERS), conducts the three-phase Agricultural Resource Management Survey (ARMS) to study the economic well-being of farm households. Because of item nonresponse, some ARMS data are missing. Prior to 2015, NASS formed a complete data set through a mixture of conditional mean imputation and manual imputation. Since 2015, Iterative Sequential Regression (ISR), a multivariate imputation methodology, has been used for the third phase of ARMS (ARMS 3). ISR is software developed in-house, and it requires significant support to maintain. In addition, ISR was developed for continuous and semi-continuous data, whereas NASS also needs to impute other data types, including categorical and ordinal data. NASS is therefore exploring alternative commercial off-the-shelf (COTS) imputation approaches, specifically IVEware, a product of the University of Michigan, and SAS® PROC MI. ISR, IVEware, and PROC MI are compared empirically for use in ARMS 3, with attention given not only to data quality but also to ease of implementation and maintainability.