Abstract:
|
Extracting topics from text is a valuable step toward making use of all available data. With document-topic relationship matrices, we can build applications such as cross-reference lists, recommended reading lists, and optimal reading lists of a given length. This work shows the application-building process, from text extraction and preparation to topic discovery and applications. We focus on the statistical aspects of working with multiple and hierarchical document-topic relationships and of building recommendation systems from this information. The authors use abstracts from past JSM meetings to train an unstructured model, then apply this model to the current JSM program to suggest a customized schedule based on an individual's interests. The authors also share best practices for gathering, parsing, and preparing text data for this type of analysis.
|