Hibiscus B
Quality Control Measures in Data Cleaning (303329)
*Josefina Venegas Almeda, University of the Philippines
Keywords: quality control measures, data cleaning, double entry data, percentage of errors
Data collection for surveys is a rigorous procedure that is prone to errors, so data quality deserves particular attention. Incomplete and inaccurate data are unacceptable because they undermine the analysis and conclusions of the study. Quality assurance is a continuing process throughout the survey, from preparation, sampling, data collection, data cleaning, data processing, and data analysis through report writing. This paper focuses on quality control measures in the cleaning of questionnaires submitted from the field. To achieve maximum quality, data cleaning passes through several stages, namely: 1) preparation of a code book on how to clean the data; 2) manual editing of questionnaires by survey specialists; 3) testing of the database program by a database expert; 4) encoding and checking of encoded forms vis-a-vis the database; 5) spot checking of data encoders’ work; 6) checking and compilation of the database by the database expert; 7) entering the data twice; 8) identifying discrepancies and/or errors in the double-entry data; 9) processing of tables by a programmer; and 10) documentation of all the rules applied during the cleaning process, validation, and computation of the percentage of errors. For each stage, this paper provides several data cleaning protocols. These cover checking the completeness of accomplished questionnaires, naming and labeling the data, checking unique identifiers, range checking and setting variable bounds, checking skip patterns and missing data, checking logical consistency, standardizing the coding of string variables, treating missing data, performing double entry, computing the percentage of errors, and reshaping the data. The protocols reduce errors in data collection, enforce strict monitoring of survey rules and procedures, and refine data collection expertise.
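As an illustration of two of the protocols named above (range checking and double entry with a percentage of errors), the following is a minimal Python sketch, not the paper's own implementation. The abstract does not state how the percentage of errors is computed; the sketch assumes it is the number of mismatched fields divided by the number of fields compared, times 100. The function names, the field names (`age`, `sex`), the questionnaire IDs, and the bounds are all hypothetical.

```python
from typing import Dict, List, Tuple

# One keyed questionnaire: field name -> value as typed by the encoder.
Record = Dict[str, str]


def compare_double_entry(
    first: Dict[str, Record],
    second: Dict[str, Record],
    fields: List[str],
) -> Tuple[List[Tuple[str, str, str, str]], float]:
    """Compare two independent keyings of the same questionnaires.

    Returns the list of discrepancies as (id, field, first value, second value)
    and the percentage of errors, ASSUMED here to be
    100 * mismatched fields / fields compared.
    """
    discrepancies = []
    compared = 0
    # Only questionnaires present in both passes can be compared.
    for qid in sorted(first.keys() & second.keys()):
        for field in fields:
            compared += 1
            a = first[qid].get(field, "").strip()
            b = second[qid].get(field, "").strip()
            if a != b:
                discrepancies.append((qid, field, a, b))
    pct = 100.0 * len(discrepancies) / compared if compared else 0.0
    return discrepancies, pct


def out_of_range(
    records: Dict[str, Record], field: str, low: int, high: int
) -> List[str]:
    """Return IDs whose value for `field` is non-numeric or outside [low, high]."""
    flagged = []
    for qid in sorted(records):
        try:
            value = int(records[qid].get(field, ""))
        except ValueError:
            flagged.append(qid)  # missing or non-numeric entry
            continue
        if not low <= value <= high:
            flagged.append(qid)
    return flagged


# Hypothetical data: the second encoder transposed the digits of one age.
first_entry = {
    "Q001": {"age": "34", "sex": "F"},
    "Q002": {"age": "27", "sex": "M"},
}
second_entry = {
    "Q001": {"age": "43", "sex": "F"},
    "Q002": {"age": "27", "sex": "M"},
}

discrepancies, pct_error = compare_double_entry(
    first_entry, second_entry, ["age", "sex"]
)
flagged_ages = out_of_range(second_entry, "age", 15, 40)
```

In this sketch each discrepancy would be resolved by going back to the paper questionnaire, and the flagged IDs would be reviewed against the bounds recorded in the code book.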