The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Abstract Details
Activity Number: 73
Type: Contributed
Date/Time: Sunday, July 31, 2011 : 4:00 PM to 5:50 PM
Sponsor: Section on Bayesian Statistical Science
Abstract - #301871
Title: Probabilistic Modeling of Text Data: A Review
Author(s): Shibasish Dasgupta*+
Companies: University of Florida at Gainesville
Address: Department of Statistics, Gainesville, FL, 32608, USA
Keywords: probabilistic inference; text classification; information retrieval; document generalization
Abstract:
The management of large and growing collections of information is a central goal of modern statistical science. Text repositories have become widely accessible, creating a need for good methods of retrieval, organization, and exploration. Probabilistic models have been central to these tasks, with applications in text classification, information retrieval, text segmentation, and information extraction.
These methods entail two stages: (1) estimate or compute the posterior distribution of the parameters of a probabilistic model from a collection of text; and (2) for new documents, answer the question at hand (e.g., classification or retrieval) via probabilistic inference.
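As an illustration only, the two stages can be sketched with a multinomial naive Bayes classifier. The abstract does not name a specific model, so this choice is an assumption, as are the function names `fit` and `classify`, the toy corpus, and the symmetric Dirichlet prior with parameter `alpha`. Stage 1 is simplified to the posterior mean of the word probabilities rather than the full posterior distribution.

```python
import math
from collections import Counter


def fit(docs, labels, alpha=1.0):
    """Stage 1: posterior-mean word probabilities per class.

    Assumes a symmetric Dirichlet(alpha) prior over each class's word
    distribution (an illustrative choice, not taken from the abstract).
    """
    vocab = sorted({w for d in docs for w in d.split()})
    log_prior, log_word = {}, {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        counts = Counter(w for d in class_docs for w in d.split())
        total = sum(counts.values()) + alpha * len(vocab)
        # Dirichlet-multinomial posterior mean: (count + alpha) / total
        log_word[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return log_prior, log_word


def classify(doc, log_prior, log_word):
    """Stage 2: pick the class with the highest posterior score.

    Words outside the training vocabulary are ignored.
    """
    scores = {
        c: log_prior[c]
        + sum(log_word[c][w] for w in doc.split() if w in log_word[c])
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus, invented for this sketch.
docs = [
    "prior posterior likelihood",
    "posterior sampling prior",
    "query ranking index",
    "index search query",
]
labels = ["bayes", "bayes", "retrieval", "retrieval"]

model = fit(docs, labels)
print(classify("posterior prior query", *model))
```

A fully Bayesian treatment would carry the whole posterior into stage 2 instead of a point estimate; the plug-in version is used here only to keep the sketch short.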
The goal of such modeling is document generalization. Given a new document, how is it similar to the previously seen documents? Where does it fit within them? What can one predict about it? Efficiently answering such questions is the focus of the statistical analysis of document collections. This talk will consider the problem of modeling text corpora.
The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.