Title: Methods for Identifying Outliers for Carry-Forward Imputation in the Survey of Graduate Students and Postdoctorates in Science and Engineering
The Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS) is an annual census of postsecondary academic institutions in the United States. It collects counts of graduate students, postdoctoral appointees, and doctorate-holding non-faculty researchers in science, engineering, and selected health fields at the unit level, where units are departments, affiliated research centers, and health care facilities. Carry-forward imputation is a predominant method used to impute missing values from historical data: a unit's prior-year values are adjusted by an inflation factor to predict its current-year values. Each inflation factor is the ratio of the current-year total to the prior-year total over a carefully formed group of units. The inflation factors are meant to capture general trends, but outlying reported values may unduly affect them. This presentation compares four alternative methods for identifying outliers: the Median Absolute Deviation (MAD), Tukey's method, the standard deviation (SD), and the adjusted boxplot. Our results indicate that a threshold of the mean plus 8 SD is the most appropriate for GSS data because it takes into account distributional changes in the data each year.
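
The ratio-based inflation factor and three of the outlier rules named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GSS production procedure: the function names, the toy counts, the choice to screen year-over-year changes, and the default cutoffs for the MAD and Tukey rules are all hypothetical. The adjusted boxplot is omitted because it requires a skewness measure (the medcouple) that the standard library does not provide.

```python
from statistics import mean, median, quantiles, stdev

def inflation_factor(prior, current, exclude=()):
    """Ratio of current-year total to prior-year total over a group of units.
    `exclude` holds indices of units dropped as outliers (hypothetical option)."""
    keep = [i for i in range(len(prior)) if i not in exclude]
    return sum(current[i] for i in keep) / sum(prior[i] for i in keep)

def carry_forward(prior_value, factor):
    """Impute a missing current-year value from the unit's prior-year value."""
    return prior_value * factor

def sd_outliers(values, k=8.0):
    """Flag values above mean + k * SD (k = 8 is the cutoff named in the abstract)."""
    cutoff = mean(values) + k * stdev(values)
    return [v > cutoff for v in values]

def mad_outliers(values, k=3.0):
    """Flag values more than k scaled MADs from the median (k is an assumption)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scaled = 1.4826 * mad  # consistency constant for normally distributed data
    return [abs(v - med) > k * scaled for v in values]

def tukey_outliers(values, k=1.5):
    """Flag values outside Tukey's fences, Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [v < q1 - k * iqr or v > q3 + k * iqr for v in values]

# Toy counts for eight units in one group (invented for illustration only).
prior = [10, 12, 8, 15, 11, 9, 14, 10]
current = [11, 12, 10, 14, 13, 90, 15, 9]
changes = [c - p for p, c in zip(prior, current)]

flags = mad_outliers(changes)
outlier_idx = [i for i, flagged in enumerate(flags) if flagged]

factor_all = inflation_factor(prior, current)
factor_clean = inflation_factor(prior, current, exclude=outlier_idx)
imputed = carry_forward(20, factor_clean)  # unit with prior-year count 20
```

In this toy group, the single unit that jumps from 9 to 90 pushes the inflation factor from 1.05 to nearly 2, which is the distortion that outlier screening is meant to prevent. With only eight values, no observation can lie more than about 2.5 SD above the mean, so the 8 SD cutoff flags nothing here; it is meaningful only on groups far larger than this example.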

