Abstract Details

Activity Number: 104
Type: Invited
Date/Time: Monday, August 1, 2016 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318132
Title: Automatic Summarization
Author(s): Junhui Wang and Xiaotong Shen* and Yiwen Sun and Annie Qu
Companies: University of Illinois at Chicago and University of Minnesota and University of Minnesota and University of Illinois at Urbana-Champaign
Keywords: Text mining ; Unstructured predictors ; Large margin

Automatic summarization by keywords and phrases creates a summary of a document that retains the essential parts of the original. A traditional method extracts a summary from a single document by examining the relative importance of its words. In this presentation, we present a method that instead learns the summarization process from a variety of documents to provide more efficient summarization. In particular, we introduce a loss to measure the discrepancy between predicted and actual tag sets, expressed as a weighted sum of pairwise margins between two tags, with weights given by the degrees of similarity between them. On this basis, we construct a regularized empirical loss that incorporates certain linguistic knowledge, and identify a tagger maximizing the separations between the pairwise margins. As a result, the proposed method is capable of detecting novel tags absent from a training sample by exploring similarity among existing tags. Computational and theoretical aspects of the proposed method will also be discussed.
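
As a rough illustration of a similarity-weighted sum of pairwise margins, the sketch below scores each tag, compares every actual tag against every other candidate tag, and penalizes small margins with a hinge. It is not the authors' formulation: the hinge form, the choice of weight as one minus similarity (so that confusing two similar tags costs less), and the names `pairwise_margin_loss`, `scores`, and `similarity` are all assumptions made for this example.

```python
def hinge(margin):
    """Hinge penalty: zero once the margin reaches 1, linear below that."""
    return max(0.0, 1.0 - margin)


def pairwise_margin_loss(scores, true_tags, similarity):
    """Similarity-weighted sum of pairwise hinge losses (illustrative only).

    scores:     dict mapping each candidate tag to a real-valued score
    true_tags:  set of tags that actually belong to the document
    similarity: dict mapping frozenset({tag_a, tag_b}) to a value in [0, 1];
                pairs that are missing are treated as dissimilar (0.0)
    """
    loss = 0.0
    for t in true_tags:
        for u in scores:
            if u in true_tags:
                continue  # only compare an actual tag with a non-actual tag
            # Assumption: similar tag pairs get a smaller weight.
            weight = 1.0 - similarity.get(frozenset((t, u)), 0.0)
            loss += weight * hinge(scores[t] - scores[u])
    return loss


# Hypothetical scores for one document whose only actual tag is "regression".
example_scores = {"regression": 2.0, "lasso": 1.5, "genomics": -1.0}
example_similarity = {frozenset(("regression", "lasso")): 0.8}
example_loss = pairwise_margin_loss(
    example_scores, {"regression"}, example_similarity
)
```

In this toy case only the pair ("regression", "lasso") has a margin below 1, and its penalty is scaled down because the two tags are marked as similar. A similarity table of this kind is also one way a tagger could assign a score to a tag it never saw in training, by borrowing from similar existing tags.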

Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program
