Abstract Details

Activity Number: 558
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Education
Abstract #319380
Title: Methods for Identifying Outliers for Carry-Forward Imputation in the Survey of Graduate Students and Postdoctorates in Science and Engineering
Author(s): Jiantong Wang* and Kimberly Ault and Rachel Harter
Companies: Research Triangle Institute International and Research Triangle Institute International and RTI International
Keywords: missing data ; influential points ; outlier detection ; item nonresponse ; skewness ; data distribution

The Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS) is an annual census of postsecondary academic institutions in the United States. It collects the number of graduate students, postdoctoral appointees, and doctorate-holding non-faculty researchers in science, engineering, and selected health fields at the unit level (departments, affiliated research centers, and health care facilities). Carry-forward imputation is a predominant method used to impute missing values based on historical data: a unit's prior-year values are adjusted by an inflation factor to predict the current-year values. The inflation factors are ratios of the current-year totals to the prior-year totals over a carefully formed group of units. Although the inflation factors represent general trends, outlying reported values may unduly affect them. This presentation compares four alternative methods for identifying outliers: the median absolute deviation (MAD), Tukey's method, the standard deviation (SD) method, and the adjusted boxplot. Our results indicate that a cutoff of the mean plus 8 SD is the most appropriate for GSS data because it takes into account distributional changes of the data each year.
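
The carry-forward step and three of the four outlier rules named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy counts, the multipliers for the MAD and Tukey rules (3 and 1.5), and the choice to apply the rules to current-year counts are all hypothetical. The adjusted boxplot is omitted because it needs a skewness measure (the medcouple) that the standard library does not provide.

```python
import statistics


def inflation_factor(prior, current):
    """Ratio of the current-year total to the prior-year total for one group."""
    return sum(current) / sum(prior)


def carry_forward(prior_value, factor):
    """Predict a missing current-year value from the unit's prior-year value."""
    return prior_value * factor


def sd_outliers(values, k=8.0):
    """Flag values above mean + k * SD (k = 8 is the cutoff named in the abstract)."""
    cutoff = statistics.mean(values) + k * statistics.stdev(values)
    return [v > cutoff for v in values]


def mad_outliers(values, k=3.0):
    """Flag values more than k scaled MADs from the median (k is an assumption)."""
    med = statistics.median(values)
    # 1.4826 makes the MAD comparable to the SD under normality.
    mad = 1.4826 * statistics.median([abs(v - med) for v in values])
    return [abs(v - med) > k * mad for v in values]


def tukey_outliers(values, k=1.5):
    """Flag values outside the quartiles -/+ k * IQR (k is an assumption)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v < q1 - k * iqr or v > q3 + k * iqr for v in values]


# Hypothetical counts for eight units in one imputation group.
prior = [10, 12, 8, 11, 10, 12, 10, 9]
current = [11, 12, 9, 12, 10, 13, 11, 90]

# Factor computed from every reporting unit, including the extreme one.
raw_factor = inflation_factor(prior, current)

# Factor recomputed after dropping the units flagged by the MAD rule.
flags = mad_outliers(current)
kept = [(p, c) for p, c, bad in zip(prior, current, flags) if not bad]
robust_factor = inflation_factor([p for p, _ in kept], [c for _, c in kept])

# Imputed current-year value for a nonresponding unit that reported 20 last year.
imputed = carry_forward(20, robust_factor)
```

In this toy group the single extreme report roughly doubles the raw factor, while the factor computed without it stays close to 1. With only eight units the mean plus 8 SD cutoff flags nothing, so the sketch says nothing about how the rules compare on the actual GSS data.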

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

Copyright © American Statistical Association