258 – Applications in Big Data
Machine Learning for Machine Data from a CATI Network
Sou-Cheng Choi
NORC at the University of Chicago
We present machine learning methods for high-accuracy prediction of rare events in semi-structured or unstructured log files, which NORC's computer-assisted telephone interviewing (CATI) network produces at high velocity and high volume. These machine log files are generated by our internal Voxco servers for a telephone survey. We adapt natural language processing (NLP) techniques and data-mining methods to train powerful learning and prediction models for error messages in the log files, in the absence of source code, up-to-date documentation, and relevant dictionaries.
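
To make the idea of treating log lines as text concrete, the following is a minimal sketch, not the method used in this work. It assumes a bag-of-words tokenization with digit masking and a multinomial naive Bayes classifier with Laplace smoothing, both chosen here only for illustration. The log lines, the labels, and the names `tokenize`, `NaiveBayes`, and `TRAIN` are hypothetical and do not reflect the actual Voxco log format.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(line):
    """Lowercase a log line, mask digit runs as <num>, and return word tokens."""
    masked = re.sub(r"\d+", "<num>", line.lower())
    return re.findall(r"<num>|[a-z_]+", masked)


class NaiveBayes:
    """Multinomial naive Bayes over log-line tokens (illustrative only)."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # label -> token frequencies
        self.class_counts = Counter()             # label -> number of lines
        self.vocab = set()

    def fit(self, lines, labels):
        for line, label in zip(lines, labels):
            tokens = tokenize(line)
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, line):
        total_lines = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        tokens = tokenize(line)
        best_label, best_score = None, -math.inf
        for label, n_lines in self.class_counts.items():
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(n_lines / total_lines)
            denom = sum(self.token_counts[label].values()) + vocab_size
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written log lines; NOT the real Voxco log format.
TRAIN = [
    ("10:15:01 call 4411 connected on line 7", "ok"),
    ("10:15:09 interview 982 started", "ok"),
    ("10:16:44 call 4412 completed normally", "ok"),
    ("10:17:02 interview 983 started", "ok"),
    ("10:17:30 timeout waiting for dialer on line 3", "error"),
    ("10:18:12 connection to database lost", "error"),
]

model = NaiveBayes().fit([line for line, _ in TRAIN], [lab for _, lab in TRAIN])

if __name__ == "__main__":
    print(model.predict("10:20:00 timeout waiting for database"))
    print(model.predict("10:21:00 interview 990 started"))
```

Masking digit runs keeps identifiers and timestamps from inflating the vocabulary, which matters when no dictionary of message formats is available. With genuinely rare events, the class prior in a model like this would be heavily skewed, so a real system would likely need resampling or cost-sensitive training, which this sketch omits.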