As the volume and complexity of data continue to grow, more attention is being focused on solving so-called big data problems. One field where this focus is pertinent is credit card fraud detection, where model selection approaches can identify key predictors for preventing fraud. Stagewise selection is a classic model selection technique that has seen renewed interest due to its computational simplicity and flexibility. Over a sequence of simple learning steps, stagewise techniques build a sequence of candidate models in a manner that is less greedy than the stepwise approach.
This paper introduces a new stochastic stagewise technique that integrates a sub-sampling approach into the stagewise framework, yielding a simple tool for model selection when working with big data. Simulation studies demonstrate that the proposed technique offers a reasonable trade-off between computational cost and predictive performance. We apply the proposed approach to synthetic credit card fraud data to illustrate its use.
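To make the general idea concrete, the following is a minimal sketch of how sub-sampling might be combined with a forward stagewise update. It is not the procedure proposed in this paper: it assumes a least-squares loss, a fixed step size, and a fresh random subsample at every step, and the function name `stochastic_stagewise` and its parameters (`n_steps`, `step_size`, `subsample`, `seed`) are hypothetical choices made for illustration only.

```python
import numpy as np

def stochastic_stagewise(X, y, n_steps=500, step_size=0.01, subsample=0.1, seed=0):
    """Illustrative forward stagewise regression with per-step sub-sampling.

    Assumptions (not taken from the paper): squared-error loss, roughly
    standardized columns of X, fixed step size, sampling without replacement.
    Returns an array of shape (n_steps + 1, p) holding the coefficient path.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    m = max(1, int(subsample * n))      # size of each random subsample
    beta = np.zeros(p)
    path = [beta.copy()]                # candidate models, one per step
    for _ in range(n_steps):
        idx = rng.choice(n, size=m, replace=False)
        resid = y[idx] - X[idx] @ beta  # residuals on the subsample only
        corr = X[idx].T @ resid / m     # predictor/residual inner products
        j = int(np.argmax(np.abs(corr)))          # most correlated predictor
        beta[j] += step_size * np.sign(corr[j])   # small step, not a full refit
        path.append(beta.copy())
    return np.array(path)

# Toy usage on simulated data (hypothetical setup: two active predictors).
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.standard_normal(2000)
path = stochastic_stagewise(X, y, n_steps=800)
print(np.round(path[-1], 2))
```

Each row of the returned path is a candidate model, so a final model could be chosen from the path with a criterion such as validation error. Because every step touches only a fraction of the rows, the per-step cost scales with the subsample size rather than the full sample size, which is the kind of computational saving the sub-sampling idea targets.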