Online Program

All Times ET

Program is Subject to Change

Wednesday, June 16
Wed, Jun 16, 10:30 AM - 12:00 PM
Leveraging Machine Learning to Improve Economic Surveys and Programs

Industry Classification Using Machine Learning: An Application to the Economic Census (308052)

*Anne Sigda Russell, U.S. Census Bureau
Javier Miranda, U.S. Census Bureau
Justin Clifford Smith, U.S. Census Bureau

Keywords: Machine Learning, Natural Language Processing, Industry Classification, NAICS, Census Bureau, establishments, firms, classifier

The U.S. Census Bureau spends considerable time and resources identifying and classifying the industry of establishments in the U.S. using the North American Industry Classification System (NAICS). This information is critical to Census Bureau statistical products, and the burden of collecting it falls heavily on businesses as well. We have developed a natural language processing and machine learning pipeline that uses a novel combination of proprietary and public data to predict complete NAICS codes. We focus on how this new approach can be integrated into the typical analyst workflow, how it can automate the easy-to-solve cases so that analysts have more resources for difficult-to-classify establishments, and how it could eliminate entire survey operations while maintaining, and conceivably improving, data quality and lowering costs.
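
The idea of automating easy cases while leaving hard ones to analysts can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a classifier has already produced a probability for each candidate NAICS code, and it routes each establishment by comparing the top probability to a cutoff. The function name `route_cases`, the 0.90 threshold, the establishment IDs, and the scores are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff; the abstract does not say how "easy" cases are identified.
AUTO_CODE_THRESHOLD = 0.90


def route_cases(
    scored: Dict[str, Dict[str, float]],
    threshold: float = AUTO_CODE_THRESHOLD,
) -> Tuple[Dict[str, str], List[str]]:
    """Split establishments into auto-coded cases and an analyst review queue.

    `scored` maps an establishment ID to a non-empty dict of
    {NAICS code: predicted probability} from some upstream classifier.
    """
    auto_coded: Dict[str, str] = {}
    needs_review: List[str] = []
    for est_id, probs in scored.items():
        # Take the most probable code for this establishment.
        best_code = max(probs, key=probs.get)
        if probs[best_code] >= threshold:
            auto_coded[est_id] = best_code   # confident: accept the prediction
        else:
            needs_review.append(est_id)      # uncertain: send to an analyst
    return auto_coded, needs_review


# Made-up scores for illustration only; these are not results from the talk.
example_scores = {
    "est-001": {"722511": 0.97, "722513": 0.03},
    "est-002": {"541511": 0.55, "541512": 0.45},
}
auto_coded, needs_review = route_cases(example_scores)
```

With these made-up inputs, `est-001` would be coded automatically and `est-002` would go to the analyst queue. In practice the threshold would presumably be tuned against labeled data to balance analyst workload against classification error.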