Abstract Details
Activity Number: 634
Type: Topic Contributed
Date/Time: Thursday, August 7, 2014 : 10:30 AM to 12:20 PM
Sponsor: Government Statistics Section
Abstract #311518
Title: Topics in Time: Exploring Trends in Accident Reports Using Document Clustering
Author(s): Wendy Martinez*+
Companies: Bureau of Labor Statistics
Keywords: clustering; text analysis; unsupervised learning; trend analysis
Abstract:
Semi-structured or unstructured text fields in government records are often unutilized. In this talk, I explore ways we can analyze text in survey data. In particular, I discuss the ideas behind document clustering (a type of unsupervised learning), where we want to group documents together, such that documents in a group have similar topics. I describe how this analysis can be used as an aid for computer-assisted coding, editing, and verification of survey records. A motivating example using accident reports from the Occupational Safety and Health Administration (OSHA) is presented. I show how we can cluster accident reports at different time intervals, assign topics to the clusters, and use this information to assess accident trends over time.
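The workflow described in the abstract (group reports by similarity, label each group with a topic, then follow the groups over time) might be sketched as below. This is only an illustration under stated assumptions, not the method used in the talk: the six narratives are invented stand-ins for OSHA reports, the TF-IDF weighting, cosine-based k-means, and farthest-first seeding are choices made here for brevity, and all reports are clustered together before being tabulated by year, which is a simplification of clustering each time interval separately.

```python
import math
from collections import Counter

# Hypothetical toy narratives (year, text); NOT real OSHA records.
REPORTS = [
    (2010, "worker fell from ladder and fractured leg"),
    (2010, "employee fell from roof ladder broke arm"),
    (2011, "worker fell from scaffold ladder"),
    (2011, "forklift struck employee in warehouse"),
    (2012, "forklift overturned and struck worker in warehouse"),
    (2012, "employee struck by forklift in warehouse aisle"),
]
STOPWORDS = {"and", "from", "in", "by"}  # tiny illustrative stop list


def tokenize(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]


def normalize(vec):
    """Scale a sparse vector (dict) to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec


def tfidf(docs):
    """One unit-length TF-IDF dict per tokenized document."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    return [
        normalize({t: c * (math.log(n / df[t]) + 1.0)
                   for t, c in Counter(doc).items()})
        for doc in docs
    ]


def cosine(u, v):
    # Both inputs are unit length, so the dot product is the cosine.
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def kmeans(vecs, k, max_iter=20):
    """Cosine-based k-means with deterministic farthest-first seeding."""
    centroids = [vecs[0]]
    while len(centroids) < k:
        far = min(range(len(vecs)),
                  key=lambda i: max(cosine(vecs[i], c) for c in centroids))
        centroids.append(vecs[far])
    labels = None
    for _ in range(max_iter):
        new = [max(range(k), key=lambda j: cosine(v, centroids[j]))
               for v in vecs]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for j in range(k):
            total = {}
            for v, lab in zip(vecs, labels):
                if lab == j:
                    for t, w in v.items():
                        total[t] = total.get(t, 0.0) + w
            if total:
                centroids[j] = normalize(total)
    return labels, centroids


def top_terms(centroid, n=3):
    """Highest-weight centroid terms, used here as a crude topic label."""
    return sorted(centroid, key=lambda t: (-centroid[t], t))[:n]


def trend(years, labels):
    """Count reports per cluster within each year."""
    table = {}
    for year, lab in zip(years, labels):
        counts = table.setdefault(year, {})
        counts[lab] = counts.get(lab, 0) + 1
    return table


years = [year for year, _ in REPORTS]
vectors = tfidf([tokenize(text) for _, text in REPORTS])
labels, centroids = kmeans(vectors, k=2)
trend_table = trend(years, labels)

for j, centroid in enumerate(centroids):
    print("cluster", j, "topic terms:", top_terms(centroid))
for year in sorted(trend_table):
    print(year, trend_table[year])
```

On this toy data, one cluster collects the fall narratives and the other the forklift narratives, and the per-year counts show the mix shifting from the first to the second. Real narrative fields would need more careful preprocessing and a principled choice of the number of clusters.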
Authors who are presenting talks have a * after their name.
For information, contact jsm@amstat.org or phone (888) 231-3473.
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.