Online Program

Thursday, May 17
Computing Science
Reasoning with Data
Thu, May 17, 1:30 PM - 3:00 PM
Grand Ballroom D
 

Task-Centric Document Curation based on Node Embeddings from a Graphical Representation of Workflows (304539)

*Paul Jones, Laboratory for Analytic Sciences 

Keywords: graphical representation; classification; workflows

We present a graphical representation of workflows that allows documents to be associated with tasks in team-based, multi-tasking environments using a semi-supervised classification algorithm. Previous approaches to automatically applying task labels to documents have been limited to small feature spaces or have not taken multi-user environments into account. Many different clues to potential task associations are available through user, task and document similarity metrics, as well as through temporal patterns in individual and team workflows. We describe a graph-based classification algorithm for automatic task-centric document curation and show how it can drive a 'recent-work dashboard' interface, which organizes users' documents and gathers feedback from those users. Our approach efficiently computes representations of users, tasks and documents in a common vector space, and can readily incorporate many different types of association by adding edges to a multi-layer graph. We have demonstrated the effectiveness of this approach using labelled document corpora from three empirical studies with students and intelligence analysts. We have also shown that leveraging relationships between different entity types increases classification accuracy by up to 20% over a simpler baseline, even when as little as 10% of the data is labelled.
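
The general idea of graph-based semi-supervised task labelling can be illustrated with a minimal sketch. It is not the method described in the abstract: it uses plain label propagation as a stand-in for the node-embedding step, and every node name, edge weight and function name below is a hypothetical example. Users, tasks and documents share one graph, each kind of association (authorship, document similarity, known task label) contributes edges, and task scores spread from the task nodes to unlabelled documents.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted adjacency map from (node, node, weight) triples."""
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = adj[a].get(b, 0.0) + w
        adj[b][a] = adj[b].get(a, 0.0) + w
    return adj


def propagate(adj, seeds, tasks, iterations=50):
    """Label propagation: each non-seed node repeatedly takes the weighted
    average of its neighbours' task scores; seed nodes stay clamped."""
    scores = {node: {t: 0.0 for t in tasks} for node in adj}
    for node, task in seeds.items():
        scores[node][task] = 1.0
    for _ in range(iterations):
        updated = {}
        for node, nbrs in adj.items():
            if node in seeds:
                updated[node] = scores[node]
                continue
            total = sum(nbrs.values())
            updated[node] = {
                t: sum(w * scores[m][t] for m, w in nbrs.items()) / total
                for t in tasks
            }
        scores = updated
    return scores


# Hypothetical toy data: two tasks, two users, four documents.
tasks = ["report", "survey"]
edges = [
    # Layer 1: known task labels (labelled documents only).
    ("doc:d1", "task:report", 1.0),
    ("doc:d2", "task:survey", 1.0),
    # Layer 2: document similarity.
    ("doc:d1", "doc:d3", 0.8),
    ("doc:d2", "doc:d4", 0.8),
    # Layer 3: user activity on documents.
    ("user:alice", "doc:d1", 1.0),
    ("user:alice", "doc:d3", 1.0),
    ("user:bob", "doc:d2", 1.0),
    ("user:bob", "doc:d4", 1.0),
    ("user:alice", "doc:d4", 0.2),  # weak cross-task link
]
seeds = {"task:report": "report", "task:survey": "survey"}

adj = build_graph(edges)
scores = propagate(adj, seeds, tasks)

# Assign each unlabelled document the task with the highest score.
predicted = {
    doc: max(scores[doc], key=scores[doc].get)
    for doc in ("doc:d3", "doc:d4")
}
print(predicted)
```

In this toy graph the unlabelled documents pick up the task of their most strongly connected neighbours. Adding a new kind of evidence, such as temporal proximity between edits, would only require adding more edges to the list.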