Abstract:
|
State and local administrative data sources, including data used for managing benefit programs, are increasingly recognized as powerful resources for evidence-building, either as standalone data sources or through linkage to other sources. Evaluating administrative data quality is critical for agencies to make proper use of their data and to improve the data for future use. However, state and local agencies often lack the resources and staff training needed to conduct rigorous evaluations of data quality. We present an R-based toolkit that helps researchers working with these administrative datasets assess data quality, providing guidance and code for checks on data accuracy, the completeness of the records, and the comparability of the data over time and across subgroups of interest. The data quality assessment methods draw from the literature, incorporating descriptive statistics, data visualization, and exploratory data analysis to identify sets of records or variables for which quality may be a concern. Further, we discuss principles for undertaking customized data quality analyses for a specific data source that go beyond these general tools.
|