All Times EDT

Abstract Details

Activity Number: 474 - Emerging Methods and Applications in Insurance Data Science
Type: Topic Contributed
Date/Time: Wednesday, August 10, 2022 : 2:00 PM to 3:50 PM
Sponsor: Casualty Actuarial Society
Abstract #320971
Title: Statistical Learning Approaches to Analyzing Textual Insurance Data
Author(s): Gee Y. Lee*
Companies: Michigan State University
Keywords: Textual data analysis; Statistical learning; Insurance claims analytics; Loss modeling; Generalized additive model; Group lasso

In this talk, the speaker will provide an overview of methods for predicting the ultimate insurance loss amount from the textual description of a claim, using a large number of words found in that description. Initial insurance losses are often reported with a textual description of the claim. To transform words into numeric vectors, the proposed method uses word cosine similarities and word embedding matrices. When one considers all unique words found in the training dataset and imposes a generalized additive model on the resulting explanatory variables, the resulting design matrix is high-dimensional. For this reason, statistical learning approaches, such as the group lasso, are used to reduce the number of coefficients retained in the model. Details of the implementation of the estimation routine using the Rcpp library will be explained during the talk.
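
As a rough illustration of two ingredients named in the abstract, the Python sketch below computes cosine-similarity features from a word embedding table and applies the group soft-thresholding step that group lasso solvers typically use. It is not the speaker's implementation, which the abstract says uses Rcpp. The toy embeddings, the function names, and the choice to aggregate similarities with a maximum are all assumptions made here for illustration. The sketch also assumes that each word's feature would be expanded into several basis functions in the additive model, so that the coefficients for one word form one group.

```python
import math

# Hypothetical toy embedding table (not from the talk): word -> vector.
EMBEDDINGS = {
    "fire": [1.0, 0.0],
    "smoke": [0.8, 0.6],
    "water": [0.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention assumed here for zero vectors
    return dot / (norm_u * norm_v)


def description_features(tokens, embeddings, vocabulary):
    """One feature per vocabulary word: the largest cosine similarity
    between that word and any token of the claim description that has
    an embedding. The max aggregation is an assumption of this sketch."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    return [
        max((cosine_similarity(embeddings[w], e) for e in known), default=0.0)
        for w in vocabulary
    ]


def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2. The whole group of
    coefficients is shrunk toward zero, and set exactly to zero when its
    Euclidean norm is at most the threshold."""
    norm = math.sqrt(sum(b * b for b in group))
    if norm <= threshold:
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * b for b in group]


# Usage: features for a short description, then shrinkage of two groups.
features = description_features(["smoke", "damage"], EMBEDDINGS, ["fire", "water"])
kept = group_soft_threshold([3.0, 4.0], 1.0)     # norm 5 exceeds the threshold
dropped = group_soft_threshold([0.3, 0.4], 1.0)  # norm 0.5 is below the threshold
```

Because the threshold acts on the norm of a whole group, all basis coefficients attached to one word are removed together, which is how a group penalty can drop uninformative words from a high-dimensional design matrix.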

Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program