572 – Working with Imperfect Data
Edit and Imputation Processing for Ethnocultural Variables: The Experience of the 2011 Canadian National Household Survey
Sean Crowe
Statistics Canada
Chunxiao (William) Liu
Statistics Canada
The ethnocultural (EC) questions on the 2011 Canadian National Household Survey (NHS), which complements the Canadian Census, measured the ethnic and cultural characteristics of the Canadian population. These characteristics are closely interrelated. In prior Canadian Censuses, the five topics of Immigration and Citizenship, Place of Birth of Parents, Aboriginal, Ethnic Origin and Population Group were each processed separately, one after another, through edit and imputation (E&I). Separate processing was inefficient, and it also left an unacceptable number of outlier combinations in the imputed data, which required many manual post-E&I fixes. For the 2011 NHS, the five EC topics were combined into one unified topic, with the goal of simplifying E&I processing, improving the internal coherence of the imputed data and reducing manual intervention after imputation. This paper describes the challenges faced and the solutions developed to accomplish this task.