697 – Collection and Linkage Challenges in Data Acquisition
Outlier Detection for the Manufacturing, Mining, and Construction Sectors in the 2012 Economic Census
Nicole Czaplicki
U.S. Census Bureau
Katherine Jenny Thompson
U.S. Census Bureau
In 2002, the U.S. Census Bureau began using a modified Hidiroglou-Berthelot (HB) edit to identify outlying tabulations in the Geographic Area Series (GAS) reports of the Economic Census. This outlier-detection procedure compares ratios of tabulations, either of the same item over two time periods (historic ratios) or of two different but related items from the current time period (current cell ratios). The methodology implemented in production was developed by a group of subject matter experts and methodologists from five of the eight trade areas covered by the Economic Census. Seeking to expand the use of this methodology for the 2012 Economic Census, we conducted a feasibility study for the manufacturing, mining, and construction sectors to determine whether they could also use this approach or a further modified version. The data collected by these sectors differ from the service-sector data in several meaningful ways, such as the number of items collected and the correlation between historic ratio pairs. This paper presents the results of our empirical investigation along with our conclusions.
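For readers unfamiliar with the HB edit, the following is a minimal sketch of the textbook Hidiroglou-Berthelot procedure applied to historic ratios. It is not the modified version used in production, whose details are not given here. The function name `hb_outliers`, the parameter defaults (U = 0.5, A = 0.05, C = 4), and the choice of quartile interpolation are illustrative assumptions, and all input values are assumed to be strictly positive.

```python
from statistics import quantiles


def hb_outliers(prev, curr, U=0.5, A=0.05, C=4.0):
    """Return indices flagged by a textbook HB edit (illustrative sketch).

    prev, curr: strictly positive tabulations for the prior and current period.
    U, A, C: tuning parameters; the defaults here are assumptions, not
    the values used in any production system.
    """
    # Historic ratios: current period over prior period.
    ratios = [c / p for p, c in zip(prev, curr)]

    # Median ratio, taken as the middle quartile cut point.
    _, r_med, _ = quantiles(ratios, n=4, method="inclusive")

    # Symmetrizing transformation centered at zero, then the size-weighted
    # "effect", which gives larger units more influence when U > 0.
    effects = []
    for p, c, r in zip(prev, curr, ratios):
        s = 1 - r_med / r if r < r_med else r / r_med - 1
        effects.append(s * max(p, c) ** U)

    # Quartiles of the effects (interpolation method is an assumption).
    q1, e_med, q3 = quantiles(effects, n=4, method="inclusive")

    # The A term guards against very small interquartile distances.
    d_low = max(e_med - q1, abs(A * e_med))
    d_high = max(q3 - e_med, abs(A * e_med))

    lower = e_med - C * d_low
    upper = e_med + C * d_high
    return [i for i, e in enumerate(effects) if e < lower or e > upper]
```

In this sketch, larger values of C widen the acceptance region, while U controls how strongly the size of a tabulation weights its ratio. The same function could be applied to current cell ratios by passing the two related items in place of the two time periods.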