Online Program

Friday, February 21
CS15 Rumors and Recommendations Fri, Feb 21, 3:15 PM - 4:45 PM
Bayshore VI

Are the Rumors True? Using Text Mining to Predict Future Baseball Trades (302740)

*Michael Greene, Deloitte Consulting 

Keywords: Text mining, predictive modeling, web spider, screen scraping, sports analytics, rumor mining

With instant updates on blogs, Twitter, and other media outlets, the latest inside tip on a potential transaction in the sports world is more valuable than ever. Trade rumors are as old as baseball itself, and professionals and fans alike imagine possible “dream teams” during the “hot stove” season. With expanding computing power and advanced statistical and machine learning algorithms, we can now begin to mine this mountain of data to uncover trade insights buried within the public text. We demonstrate that raw blog text can predict the outcomes of potential trades and free-agent signings in baseball, and we identify which words and phrases signal an upcoming trade or signing. This presentation will focus on how to create a database of blog entries on potential trades and free-agent signings and how to use text mining methods to generate structured data from raw text. Then, leveraging network analyses and logistic regression models, we will illustrate that certain trades and signings are more likely than others to happen. This analysis shows which rumors are more likely to come true, giving both fans and front offices a better chance to prepare for various future scenarios.
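
As a rough illustration of the pipeline the abstract describes (raw text turned into structured features, then a logistic regression), the following minimal Python sketch may help. It is not the presenter's implementation: the four example posts, their labels, the function names, and the training settings are all invented for illustration, and a plain bag-of-words count is assumed as the text mining step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(posts):
    """Assign each distinct token a column index, in order of first appearance."""
    vocab = {}
    for post in posts:
        for token in tokenize(post):
            vocab.setdefault(token, len(vocab))
    return vocab


def vectorize(post, vocab):
    """Turn one post into a term-count vector over the vocabulary."""
    counts = Counter(tokenize(post))
    return [counts.get(token, 0) for token in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias):
    """Estimated probability that the rumor in this post comes true."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=500, l2=0.01):
    """Fit an L2-regularized logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = predict(x, weights, bias) - y
            bias -= lr * err
            weights = [w - lr * (err * v + l2 * w) for w, v in zip(weights, x)]
    return weights, bias


# Hypothetical rumor posts; label 1 means the rumored move later happened.
posts = [
    "sources say the deal is close and a physical is scheduled",
    "teams have agreed in principle pending a physical",
    "club has kicked the tires but talks are preliminary",
    "no traction in talks and the asking price is too high",
]
labels = [1, 1, 0, 0]

vocab = build_vocabulary(posts)
rows = [vectorize(post, vocab) for post in posts]
weights, bias = fit_logistic(rows, labels)
probabilities = [predict(x, weights, bias) for x in rows]

# Words with the largest positive weights are the strongest "signal" terms.
ranked_words = sorted(vocab, key=lambda token: weights[vocab[token]], reverse=True)
print(ranked_words[:5])
print([round(p, 2) for p in probabilities])
```

On real data the posts would come from a scraped blog database and the labels from transaction records, and the model would be evaluated on rumors held out from training rather than on the posts it was fit to.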