Abstract:
|
Prediction and classification challenges are an exciting feature of the statistical and machine learning community. In a probabilistic binary classification challenge, competitors are asked to submit a vector of marginal probabilities, where each element gives the probability that the corresponding case belongs to class 1. For such a competition to be considered worth entering, however, the challenge organizers must be seen to evaluate the submissions in a fair and open manner. We discuss a class of proper scoring rules, called linear scoring rules, that are specifically adapted to probabilistic binary classification. We show that, when applied in competition settings, all linear scoring rules essentially balance the needs of organizers and competitors. Furthermore, since scoring rules have a statistical decision-theoretic foundation, a linear scoring rule can be constructed for any user-defined misclassification loss function. Finally, given the scoring rule that will be used on the test cases, we discuss how to train and calibrate the classifier.
|