Abstract:
|
The supervised prediction problem consists of predicting the target (outcome) for the cases (observations) where it is unknown, based on a data sample (training data set) where both input variables (predictors) and target are known. Traditional approaches to this problem include neural networks, logistic regression (for ordinal target), and decision trees (classification and regression trees). For interval target, the problem can also be solved with multiple linear, nonlinear, and nonparametric regression. A new nonparametric method for the supervised prediction problem is proposed. On a fictitious mortgage scoring data set (with binary target) described in Data Mining Using Enterprise MinerTM Software: A Case Study Approach, SAS Institute (2000), the method was compared with logistic regression, neural networks (multilayer perceptron), and decision tree algorithms. In terms of predictive accuracy, on those data, the new method turned out to be competitive with neural networks and superior to both logistic regression and decision trees.
|