Online Program

All Times ET

Thursday, June 3
Practice and Applications
Classification and Simulation: Methods, Analyses, and Applications
Thu, Jun 3, 10:00 AM - 11:35 AM
TBD
 

Auto-classification of occupational data (309770)

Presentation

*Ning Chong, Ministry of Manpower 
Jeremy Heng, Ministry of Manpower 

Keywords: classification, occupation, machine-learning

With governments and policymakers around the world demanding more timely and accurate statistics, more surveys have to be conducted, with larger sample sizes and more data items. One data item that is particularly onerous and complicated to collect is the occupation of individuals. For the compilation of official statistics in Singapore, occupational data is classified under the Singapore Standard Occupational Classification (SSOC), whereby each occupation of the civilian working population is assigned a unique 5-digit code.

To ensure that the data collected is accurate, consistency among interviewers and respondents is important. Previously, respondents provided information on their job title and job duties, and interviewers manually classified the job information into the most appropriate SSOC code. This added a layer of subjectivity, as interviewers had to rely on their own understanding and interpretation of occupational details to classify each occupation, often leading to inconsistent and inaccurate SSOC codes.

The Singapore Ministry of Manpower (MOM) has developed an automatic classification system that can classify occupations based on occupational details without human intervention. By applying language modelling to historical text inputs on occupations, we fine-tune a Bidirectional Encoder Representations from Transformers (BERT) model, leveraging the language semantics that BERT derives from training on unlabelled corpora. With the model, we are able to predict the most appropriate SSOC code from a variety of text descriptions.
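
As an illustration only, the final prediction step of such a classifier can be sketched as follows: per-code scores from a fine-tuned model are converted to probabilities and ranked. This is a minimal sketch and not the MOM system; the function names, the scores, and the 5-digit codes are placeholders assumed for the example, not real SSOC assignments or model outputs.

```python
import math


def softmax(scores):
    """Convert raw classifier scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_codes(logits, codes, k=3):
    """Return the k most probable (code, probability) pairs, best first.

    `logits` stands in for the output of a fine-tuned classifier
    (hypothetical here); `codes` lists the candidate codes in the
    same order as the logits.
    """
    if len(logits) != len(codes):
        raise ValueError("logits and codes must have the same length")
    probs = softmax(logits)
    ranked = sorted(zip(codes, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Placeholder 5-digit strings, NOT real SSOC codes.
codes = ["11111", "22222", "33333", "44444"]
# Made-up scores for one job description.
logits = [0.2, 3.1, 1.4, -0.5]

predictions = top_k_codes(logits, codes, k=2)
for code, prob in predictions:
    print(f"{code}: {prob:.3f}")
```

Returning several ranked candidates rather than a single code is one possible design choice: it would let low-confidence cases be routed to a human reviewer, although the abstract does not say whether the MOM system does this.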

The implementation of the automatic classification system has reduced the man-hours spent on data collection and brought about cost savings for the organization. We look at the challenges of collecting occupational data, the concept behind the automatic classification system, and how it helps to enhance data quality and operational efficiency.