| Activity Number: | 547 |
| Type: | Contributed |
| Date/Time: | Thursday, August 2, 2007 : 10:30 AM to 12:20 PM |
| Sponsor: | Social Statistics Section |
| Abstract - #310264 |
| Title: | Examination of Two Issues Regarding Electronic Essay Scoring |
| Author(s): | Sandip Sinharay*+ and Shelby Haberman and Jiahe Qian |
| Companies: | Educational Testing Service and Educational Testing Service and Educational Testing Service |
| Address: | Educational Testing Service, Princeton, NJ, 08541 |
| Keywords: | Press statistics; linear regression; E-rater |
| Abstract: | Electronic essay scoring usually involves applying a linear regression model fitted to a sample of essays; the human rater's score for an essay serves as the dependent variable, and several numerical features of the essay serve as the independent variables. We examine two aspects of essay scoring. First, we consider how to determine the minimum sample size that allows essays to be scored with sufficient precision. This involves a study of the PRESS statistic, which estimates the error in predicting a new observation. Second, we examine when human intervention is needed in electronic essay scoring. For example, because a linear regression model is used, an ordinary essay with an extremely large value of a numerical feature may receive a high score. We employ outlier analysis to set up rules for flagging essays with unusual numerical features. |
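
The two ideas in the abstract can be illustrated with a minimal sketch in Python with NumPy. It is not the authors' procedure. It assumes an ordinary least-squares fit with an intercept, computes the PRESS statistic from the standard leave-one-out identity (residual divided by one minus leverage), and uses a squared Mahalanobis distance as one possible outlier rule. The function names, the simulated data, and the threshold value are all hypothetical.

```python
import numpy as np


def press_statistic(X, y):
    """Return the PRESS statistic and leverages for an OLS fit with intercept.

    Uses the identity e_(i) = e_i / (1 - h_ii), so no model is refitted.
    Assumes the design matrix has full column rank.
    """
    X1 = np.column_stack([np.ones(len(y)), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta                        # ordinary residuals
    Q, _ = np.linalg.qr(X1)                      # reduced QR of the design
    leverage = np.sum(Q ** 2, axis=1)            # diagonal of the hat matrix
    loo_resid = resid / (1.0 - leverage)         # leave-one-out residuals
    return float(np.sum(loo_resid ** 2)), leverage


def flag_unusual_essays(X_train, X_new, threshold):
    """Flag new essays whose features lie far from the training sample.

    The rule (squared Mahalanobis distance above `threshold`) is an
    assumption made for illustration, not the rule from the abstract.
    """
    mean = X_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_train, rowvar=False))
    cov_inv = np.linalg.inv(cov)
    diff = X_new - mean
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return d2 > threshold


if __name__ == "__main__":
    # Simulated stand-in for essay features and human rater scores.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 3))
    scores = 3.0 + features @ np.array([0.8, 0.4, -0.3]) + rng.normal(scale=0.5, size=200)

    # Watch how the average squared prediction error changes with sample size.
    for n in (25, 50, 100, 200):
        press, _ = press_statistic(features[:n], scores[:n])
        print(n, press / n)

    # The second essay has one extreme feature value and should be flagged.
    new_essays = np.array([[0.1, -0.2, 0.3], [12.0, 0.0, 0.0]])
    print(flag_unusual_essays(features, new_essays, threshold=16.0))
```

Dividing PRESS by the sample size gives an estimate of mean squared prediction error, which is one way to compare candidate sample sizes. A flagged essay would be routed to a human rater instead of receiving the regression score.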