Abstract Details

Activity Number: 634
Type: Topic Contributed
Date/Time: Thursday, August 7, 2014 : 10:30 AM to 12:20 PM
Sponsor: Government Statistics Section
Abstract #311518
Title: Topics in Time: Exploring Trends in Accident Reports Using Document Clustering
Author(s): Wendy Martinez*+
Companies: Bureau of Labor Statistics
Keywords: clustering ; text analysis ; unsupervised learning ; trend analysis
Abstract:

Semi-structured or unstructured text fields in government records often go unused. In this talk, I explore ways to analyze text in survey data. In particular, I discuss the ideas behind document clustering (a type of unsupervised learning), which groups documents so that those in the same group share similar topics. I describe how this analysis can aid computer-assisted coding, editing, and verification of survey records. A motivating example using accident reports from the Occupational Safety and Health Administration (OSHA) is presented. I show how we can cluster accident reports at different time intervals, assign topics to the clusters, and use this information to assess accident trends over time.
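
The abstract does not say which representation or clustering algorithm the talk uses, so the following is only a minimal sketch of the general workflow it describes: cluster the reports within each time interval, label each cluster with its highest-weighted terms, and compare cluster sizes across intervals. TF-IDF weighting, spherical k-means with farthest-first seeding, the stopword list, and every function name are assumptions made for illustration, and the sample reports are invented toy text, not OSHA data.

```python
import math
from collections import Counter, defaultdict

# Assumed, deliberately tiny stopword list (illustrative only).
STOPWORDS = {"a", "an", "the", "of", "from", "on", "in", "and", "to", "by"}

def tokenize(text):
    """Lowercase, split on whitespace, drop stopwords and non-alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]

def tfidf_vectors(docs):
    """Return one L2-normalized sparse TF-IDF vector (a dict) per document."""
    token_lists = [tokenize(d) for d in docs]
    df = Counter(t for toks in token_lists for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in token_lists:
        tf = Counter(toks)
        # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def kmeans(vectors, k, iters=20):
    """Spherical k-means with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the document least similar to its closest existing seed.
        far = min(range(len(vectors)),
                  key=lambda i: max(cosine(vectors[i], c) for c in centroids))
        centroids.append(vectors[far])
    labels = None
    for _ in range(iters):
        new_labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            summed = defaultdict(float)
            for v, lab in zip(vectors, labels):
                if lab == j:
                    for t, w in v.items():
                        summed[t] += w
            if summed:  # keep the old centroid if a cluster is empty
                norm = math.sqrt(sum(w * w for w in summed.values())) or 1.0
                centroids[j] = {t: w / norm for t, w in summed.items()}
    return labels, centroids

def top_terms(centroid, n=2):
    """Highest-weighted centroid terms, used here as a crude topic label."""
    ranked = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:n]]

def topics_by_period(reports_by_period, k=2):
    """Cluster each period separately; return (topic terms, cluster size) pairs."""
    result = {}
    for period, docs in reports_by_period.items():
        labels, centroids = kmeans(tfidf_vectors(docs), k)
        result[period] = [(top_terms(centroids[j]), labels.count(j))
                          for j in range(k)]
    return result

# Invented toy reports, NOT real OSHA records.
reports = {
    "2012": ["worker fell from ladder", "employee fell from roof",
             "chemical burn on hand", "chemical burn from acid spill"],
    "2013": ["worker fell from scaffold", "chemical burn on arm",
             "chemical burn from solvent", "chemical burn on face"],
}
topics = topics_by_period(reports)
for period, clusters in topics.items():
    print(period, clusters)
```

In this toy data the fall-related cluster shrinks and the chemical-burn cluster grows between the two periods, which is the kind of shift a trend analysis would flag. Clusters are fit independently per period here, so matching a topic in one period to a topic in the next (for example by comparing top terms or centroid similarity) would be a separate step that the sketch leaves out.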


Authors who are presenting talks have a * after their name.

Copyright © American Statistical Association.