605 – Recent Developments in Analysis of Psychiatric and Other Health Outcomes
Estimation for Cells Suppressed in Tabulation with Application to Output Disclosure Treatment of the NSF Survey of Earned Doctorates
Avi Singh
NORC at the University of Chicago
Joshua M. Borton
NORC at the University of Chicago
Stephen Cohen
National Science Foundation
Vince Welch Jr.
NORC at the University of Chicago
G Brianna
Y Lin
NORC at the University of Chicago
The National Center for Science and Engineering Statistics uses cell suppression in tabulation (cs-tabulation) for disclosure treatment of sensitive cells in the Survey of Earned Doctorates (SED). The symbol 'D' replaces every suppressed cell value, including complementary suppressed cells. However, the impact of suppression on estimates for underrepresented racial/ethnic minority groups and for women can be quite severe. To alleviate this concern, we propose enhancing cs-tabulation by estimating the 'D' cells via log-linear modeling. The resulting complete table is expected to have high utility for users at large because the suppressed cells are filled with model-based best predictions derived from the available unsuppressed information. In particular, the estimates can be used to check underlying trends across subgroups cross-sectionally and within a given subgroup longitudinally. The proposed method uses only information released under cs-tabulation and therefore does not increase disclosure risk. It also preserves all unsuppressed cells and marginal counts. Applications to NSF-SED data are discussed.
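The idea of filling 'D' cells from released information only can be illustrated with a minimal sketch. The abstract does not specify the log-linear model used, so the example below assumes the simplest one: a quasi-independence model (row and column main effects only) fitted to the suppressed cells by iterative proportional fitting. The function name `estimate_suppressed`, the toy counts, and the use of `None` to stand for 'D' are all hypothetical and are not taken from the SED tables.

```python
def estimate_suppressed(table, row_totals, col_totals, max_iter=200, tol=1e-10):
    """Fill suppressed cells (None) under an assumed quasi-independence
    log-linear model, using iterative proportional fitting (IPF).

    Only released information is used: the unsuppressed cells and the
    published row and column totals.
    """
    n_rows, n_cols = len(table), len(table[0])

    # Residual margins: the part of each published total that must be
    # carried by the suppressed cells in that row or column.
    row_res = [row_totals[i] - sum(v for v in table[i] if v is not None)
               for i in range(n_rows)]
    col_res = [col_totals[j] - sum(table[i][j] for i in range(n_rows)
                                   if table[i][j] is not None)
               for j in range(n_cols)]

    # Start at 1 in suppressed cells and 0 elsewhere, so that scaling
    # never moves mass into an unsuppressed position.
    fit = [[1.0 if table[i][j] is None else 0.0 for j in range(n_cols)]
           for i in range(n_rows)]

    for _ in range(max_iter):
        # Scale each row to match its residual row margin.
        for i in range(n_rows):
            s = sum(fit[i])
            if s > 0:
                fit[i] = [x * row_res[i] / s for x in fit[i]]
        # Scale each column to match its residual column margin.
        for j in range(n_cols):
            s = sum(fit[i][j] for i in range(n_rows))
            if s > 0:
                for i in range(n_rows):
                    fit[i][j] *= col_res[j] / s
        # Columns match after the column step; stop once rows match too.
        gap = max(abs(sum(fit[i]) - row_res[i]) for i in range(n_rows))
        if gap < tol:
            break

    # Published cells are returned unchanged; only 'D' cells are estimated.
    return [[fit[i][j] if table[i][j] is None else table[i][j]
             for j in range(n_cols)] for i in range(n_rows)]


# Hypothetical 3x3 table; None marks a 'D' (suppressed) cell.
released = [
    [50, None, None],
    [30, None, None],
    [20, 10, 5],
]
row_totals = [60, 45, 35]   # published row margins (hypothetical)
col_totals = [100, 22, 18]  # published column margins (hypothetical)

completed = estimate_suppressed(released, row_totals, col_totals)
for row in completed:
    print([round(v, 2) for v in row])
```

Because the fitted values are rescaled only within the suppressed positions, the completed table reproduces every published cell and every published margin, which mirrors the preservation property stated in the abstract. A richer log-linear model (for example, one borrowing strength from other years or subgroups) would replace the uniform starting values with model-based ones.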