Abstract:
|
Historically, variable selection algorithms have been tuned to maximize the predictive ability of the model. In some applications, such as medical research, prediction is not the primary aim of model development; instead, the goal is to correctly identify the variables on the causal pathway. Wu, Boos, and Stefanski (2007) developed an approach for tuning variable selection algorithms through the addition of pseudovariables. We extend their approach to select the tuning parameter that maximizes the F-score, a classification-quality measure that combines precision and recall. We also present a version of the method that avoids generating pseudovariables in forward selection. We present numerical simulations evaluating the performance of our method.
|
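For readers unfamiliar with the F-score in a variable selection setting, the following is a minimal sketch of how it could be computed when the truly relevant variables are known, as in a simulation. The function name `f_score`, the `beta` weighting parameter, and the variable labels are illustrative assumptions; the abstract does not state which form of the F-score the method uses.

```python
def f_score(selected, truly_active, beta=1.0):
    """F-score of a selected variable set against the known relevant set.

    Hypothetical helper for illustration only; beta=1 gives the usual
    F1 score (harmonic mean of precision and recall).
    """
    selected, truly_active = set(selected), set(truly_active)
    true_pos = len(selected & truly_active)  # relevant variables that were selected
    if true_pos == 0:
        return 0.0  # no correct selections: define the score as zero
    precision = true_pos / len(selected)      # share of selections that are relevant
    recall = true_pos / len(truly_active)     # share of relevant variables recovered
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Example: three variables selected, two of which are among four relevant ones.
score = f_score(selected={"x1", "x2", "x5"},
                truly_active={"x1", "x2", "x3", "x4"})
```

In this example precision is 2/3 and recall is 1/2, so a selection that is either too liberal or too conservative is penalized, which is what makes the F-score a natural target when identification rather than prediction is the goal.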