Friday, February 16
CS04 Working with Messy Data
Fri, Feb 16, 9:15 AM - 10:45 AM
Practical Time-Series Clustering for Messy Data in R (303480)
Keywords: time-series clustering, time-series visualization, classification, event data
Identifying patterns in a pool of event data is hard. Time-series clustering provides a set of lenses for changing the level of detail at which a collection of event data is viewed and analyzed. Jonathan Page will present time-series clustering techniques using the dtwclust package in R and share his experience applying these techniques to mobile banking data in Kenya.
Attendees will receive a handout with graphs of time series to classify by hand as an exercise. After the classification exercise, Page will present two important clustering techniques, Dynamic Time Warping with Sakoe-Chiba constraints and the more recent k-Shape clustering algorithm, and then display the cluster assignments each algorithm produces for the same series. The exercise and the discussion that follows give participants an intuition for the benefits and drawbacks of each algorithm.
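To make the contrast between the two techniques concrete, here is a minimal sketch of the distance measures behind them. It is written in plain Python for illustration only and is not the dtwclust implementation used in the session. The function names `dtw_distance` and `sbd`, the `window` parameter, and the toy series are all assumptions introduced here. The first function computes Dynamic Time Warping restricted to a Sakoe-Chiba band. The second computes the shape-based distance that k-Shape relies on, which is one minus the maximum normalized cross-correlation over all shifts.

```python
import math


def dtw_distance(x, y, window=None):
    """DTW with absolute-difference local cost.

    `window` is the Sakoe-Chiba band half-width: point i of x may only
    align with points j of y where |i - j| <= window. None = unconstrained.
    """
    n, m = len(x), len(y)
    if window is None:
        window = max(n, m)
    # The band must be at least |n - m| wide or the end cell is unreachable.
    window = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in x only
                                 cost[i][j - 1],      # step in y only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def sbd(x, y):
    """Shape-based distance for equal-length, non-constant-zero series."""
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    # Cross-correlation at every shift s of y relative to x.
    best = max(
        sum(x[i] * y[i + s] for i in range(n) if 0 <= i + s < n)
        for s in range(-(n - 1), n)
    )
    return 1.0 - best / norm


# Toy series (made up for this sketch): the same bump, shifted by two steps.
a = [0, 0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 0, 1, 2, 1, 0]
b_scaled = [2 * v for v in b]

lockstep = dtw_distance(a, b, window=0)   # no warping allowed
banded = dtw_distance(a, b, window=2)     # band wide enough for the shift
scaled_dtw = dtw_distance(a, b_scaled)    # DTW still sees the amplitude gap
scaled_sbd = sbd(a, b_scaled)             # SBD ignores shift and scale
```

On these toy series a band of width 0 reduces DTW to a point-by-point comparison, while a band of width 2 is just wide enough to absorb the shift. The scaled copy illustrates one trade-off: DTW remains sensitive to amplitude, whereas the shape-based distance treats the two series as the same shape.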
The code for generating the presented time-series analysis will be posted to GitHub. Participants will also receive a handout with a step-by-step process that takes them from messy event data to organized time series, then to classified time series, and finally to visualizations useful for presenting the resulting clusters.
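The first step of that process, turning messy event records into organized time series, might look roughly like the sketch below. This is an illustration under stated assumptions, not the handout's code: the record layout (entity id, timestamp string, amount), the timestamp format, the daily bin size, and the function name `events_to_daily_series` are all hypothetical, and the sample records are invented.

```python
from collections import defaultdict
from datetime import date, datetime


def events_to_daily_series(events, start, end):
    """Bin (entity_id, timestamp_str, amount) records into daily totals.

    Returns {entity_id: [total for each day from start to end inclusive]}.
    Malformed records are skipped; days with no events are filled with 0.0.
    """
    n_days = (end - start).days + 1
    totals = defaultdict(lambda: [0.0] * n_days)
    for entity, stamp, amount in events:
        try:
            day = datetime.strptime(stamp.strip(), "%Y-%m-%d %H:%M:%S").date()
            value = float(amount)
        except (ValueError, AttributeError, TypeError):
            continue  # unparseable timestamp or amount: drop the record
        idx = (day - start).days
        if 0 <= idx < n_days:  # ignore events outside the study period
            totals[entity][idx] += value
    return dict(totals)


# Invented sample records, including typical mess: stray whitespace,
# a bad timestamp, and a missing amount.
events = [
    ("A", "2018-02-01 09:15:00", "100"),
    ("A", "2018-02-01 17:40:00", "50"),
    ("A", "2018-02-03 08:00:00", "25"),
    ("B", " 2018-02-02 12:00:00 ", 10),
    ("B", "not a date", "999"),
    ("B", "2018-02-03 12:00:00", None),
]
series = events_to_daily_series(events, date(2018, 2, 1), date(2018, 2, 3))
```

Each entity ends up with a fixed-length series on a shared time grid, which is the shape clustering algorithms generally expect. Whether to skip, repair, or flag malformed records is a design choice that depends on the data. Skipping is simply the shortest option for a sketch.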