Abstract:
|
Federal statistical agencies are increasingly integrating data from multiple sources for analyses. These data sources often include data not collected for statistical purposes, such as data used for public or private program administration. The Federal Committee on Statistical Methodology’s 2020 report “A Framework for Data Quality” provides valuable guidance for standardizing data quality assessment practices. One area the report identifies for further study is best practices for communicating quality, including graphical and interactive tools. We emphasize that exploratory data analysis is an important aspect of data quality assessment for understanding data sources’ strengths and weaknesses. Applying methods informed by the data quality literature, we present graphical methods for exploring a data file and assessing three data quality components: accuracy, completeness of records, and comparability of the data over time and among subgroups. Further, we discuss the Data File Orientation Dashboard, developed at NORC at the University of Chicago, which allows users to interactively explore their data files, apply data quality analyses, and interpret the results to assess the files’ quality.
|