Abstract:
|
In Major League Baseball (MLB), a dominant pitcher can shut down an opposing team and control the game. Predicting the next pitch can therefore give a team a competitive advantage, so much so that controversy has developed in the MLB around teams stealing the signs that relay which type of pitch will be thrown next. While the MLB prohibits attempts to steal signs from opposing teams, its decision to allow iPads in the dugout opens the door to the use of predictive modeling techniques. The MLB stores data on every game played in the online PITCHf/x database. The aim of this research is to build a model capable of predicting the pitch type classification recorded in PITCHf/x. A Bayesian hierarchical multinomial regression model was fit to data scraped from the PITCHf/x database; pitchers were clustered based on physical characteristics and player profile. Predicted pitch type probabilities were compared to empirical probabilities based on high-dimensional cross-classification arrays and evaluated using the log loss function. The analyses showed that the model outperformed these empirical probabilities. The utility of this model and practical applications of the results are explored.
|