Abstract:
|
Many statisticians agree that building models that predict well should be a high priority (Harville, 2014; Stern, 2014; Berry & Berry, 2014). The purpose of this paper is to test the predictive ability of various Bayesian models using two sets of professional sports data. The first data set contains the scores of 22 closely matched members of the Professional Golfers' Association (PGA) playing on 18 different golf courses in 2014. The second contains home runs per at-bat for Major League Baseball players with at least 100 at-bats in the 2015 season. We fit six models to determine which predicts best across these two disparate data sets. Model complexity varies along two dimensions. In the first, we fit model intercepts using parametric Bayesian, nonparametric Bayesian, or hierarchical Bayesian methods. In the second, we either include sport-specific covariates or omit them. We then use these models to predict performance in the following season, both for the same golfers and baseball players and for other golfers and players. Preliminary results indicate that the nonparametric Bayesian methods predict marginally better than the alternatives.
|