Statistical institutes such as the UK’s Office for National Statistics (ONS) play a critical role in producing the data, statistics and analysis that inform the biggest decisions across government and industry. Despite competition from a rich range of administrative and alternative (big) data sources, high-quality surveys continue to provide high-impact insight and understanding.
Data science and machine learning approaches offer the potential to strengthen all phases of survey programmes, from design through collection and processing to outputs. This presentation will review how the ONS is experimenting with and implementing new approaches, including: error and anomaly detection; automated coding and classification using natural language processing and machine learning; producing plausible synthetic data to develop and test pipelines; and combining survey, administrative and big data sources to strengthen outputs.
We will illustrate these approaches with examples from producing weekly publications from the world’s largest Covid Infection Survey, developing and validating Census 2021 outputs during the pandemic, and supporting high-priority work across government.