
Keywords: text mining, predictive model, binary classifier
Litigated claims are the most costly claims for insurance companies. Predicting which claimants are likely to litigate allows claims to be handled appropriately before an attorney becomes involved. Using small business claims data from an insurance company, we develop predictive models that indicate whether a claimant will litigate. Our models combine machine learning algorithms with natural language processing, drawing on both quantitative data and text data. By comparing the performance of these models, an insurance company can choose which one to use when deciding which claims and claimants need appropriate assignment.