Abstract:
|
The Maryland Longitudinal Data System (MLDS) is a central repository of student and workforce data, including data provided by the Maryland State Department of Education, the Maryland Higher Education Commission, and the Maryland Department of Labor, Licensing and Regulation. The Institute of Education Sciences is funding a project to produce and release synthetic versions of selected longitudinal state-level datasets. Synthetic data is of increasing interest to many state longitudinal integrated data systems that also seek to balance researcher access with data privacy concerns. Practical tools that implement generic synthesis methods exist. Nevertheless, synthesizing large integrated data presents specific methodological and practical challenges. Longitudinal integrated data involves a large number of variables, redundant (and at times inconsistent) information, specific and often non-random missing-data patterns, and different levels or dimensions. We propose to detail the nature and implications of these challenges and describe the solutions we are applying in our ongoing MLDS Synthetic Data Project.
|