JSM 2015 Preliminary Program


Abstract Details

Activity Number: 514
Type: Invited
Date/Time: Wednesday, August 12, 2015, 10:30 AM to 12:20 PM
Sponsor: General Methodology
Abstract #314210
Title: Large-Scale Tagging of Unstructured Data
Author(s): Junhui Wang* and Xiaotong Shen and Yiwen Sun and Annie Qu
Companies: University of Illinois at Chicago and University of Minnesota and University of Minnesota and University of Illinois at Urbana-Champaign
Keywords: ADMM algorithm ; Large margin learners ; Large n and p ; Natural language processing ; Tagging ; Unstructured data
Abstract:

Tagging summarizes information, such as text, graphs, or videos, with keywords. It has gained wide popularity with the growth of web search, social networking, and graphical information sharing. In this talk, we introduce a novel matching framework for large-scale tagging, in which a matching function is constructed to measure the similarity between tags and pieces of information, and prior knowledge about the similarity between predictors and tags is incorporated to improve tagging accuracy. This prior knowledge enters through relational penalties on the matching function coefficients. One advantage is that the proposed method can detect new tags that do not appear in the training samples. The proposed method is implemented via a scalable alternating direction method of multipliers (ADMM) algorithm based on sparse word representations. We demonstrate its numerical performance in a variety of simulated experiments and on a large-scale example, the 20 Newsgroups dataset. In addition, the asymptotic properties are illustrated, confirming the advantages of the proposed tagging framework.
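
The abstract does not state the form of the matching function, so the following is only a minimal sketch under assumptions: it supposes a bilinear score f(x, t) = x^T W t between a document feature vector x and a tag feature vector t. The names `matching_score` and `rank_tags`, the coefficient matrix `W`, and all numeric values are hypothetical and chosen for illustration. It omits the relational penalties and the ADMM fitting step, and only shows how a tag that was absent from training could still be scored once it has a feature vector.

```python
# Hypothetical sketch: bilinear matching function f(x, t) = x^T W t.
# The bilinear form, W, and all feature values are illustrative assumptions,
# not the authors' actual model.

def matching_score(x, W, t):
    """Return x^T W t for document features x, coefficients W, tag features t."""
    return sum(x[i] * W[i][j] * t[j]
               for i in range(len(x))
               for j in range(len(t)))

def rank_tags(x, W, tag_features):
    """Return tag names sorted by decreasing matching score against document x."""
    scores = {name: matching_score(x, W, t) for name, t in tag_features.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy coefficients: 3 document features (e.g. word counts) by 2 tag features.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]

tag_features = {
    "sports": [1.0, 0.0],
    "politics": [0.0, 1.0],
    # A tag never seen in training can still be scored from its features.
    "olympics": [0.8, 0.2],
}

doc = [2.0, 0.0, 1.0]
ranking = rank_tags(doc, W, tag_features)
print(ranking)
```

In this toy setup the unseen tag is ranked purely through its feature vector, which is one way a matching formulation could extend beyond the tags observed in training.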


Authors who are presenting talks have a * after their name.
