Abstract:
Tagging summarizes information with keywords, where the information may be text, graphs, or videos. It has gained wide popularity with the growth of web search, social networking, and graphical information sharing. In this talk, we introduce a novel matching framework for large-scale tagging, in which a matching function is constructed to measure the similarity between tags and pieces of information, and prior knowledge about the similarity between predictors and tags is incorporated to improve tagging accuracy. This prior knowledge enters through relational penalties on the coefficients of the matching function. One advantage is that the proposed method can detect new tags that do not appear in the training samples. The proposed method is implemented via a scalable alternating direction method of multipliers (ADMM) algorithm based on sparse word representations. We demonstrate its numerical performance in a variety of simulated experiments and on a large-scale example, the 20 Newsgroups dataset. In addition, asymptotic properties are illustrated, confirming the advantages of the proposed tagging framework.
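To make the idea of a matching function concrete, the following is a minimal sketch only. The abstract does not state the functional form, so the bilinear score x^T W t, the names `match_score` and `rank_tags`, and the toy vectors are all assumptions made for illustration. The relational penalties and the ADMM fitting step are not implemented here; the coefficient matrix `W` is simply fixed by hand.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def match_score(x: Vector, W: Matrix, t: Vector) -> float:
    """Hypothetical bilinear matching score x^T W t between a
    document vector x and a tag vector t (an assumed form)."""
    return sum(
        x[i] * W[i][j] * t[j]
        for i in range(len(x))
        for j in range(len(t))
    )


def rank_tags(x: Vector, W: Matrix, tag_vectors: Dict[str, Vector]) -> List[str]:
    """Order candidate tags by decreasing matching score."""
    return sorted(
        tag_vectors,
        key=lambda tag: match_score(x, W, tag_vectors[tag]),
        reverse=True,
    )


# Toy coefficients (3 document features x 2 tag features), fixed by hand
# rather than estimated; in the talk they would be fitted from data.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5]]

# "hockey" plays the role of a tag unseen in training: it can still be
# scored because it has a vector representation close to "sports".
tag_vectors = {
    "sports": [1.0, 0.0],
    "politics": [0.0, 1.0],
    "hockey": [0.9, 0.1],
}

doc = [1.0, 0.0, 0.2]
ranking = rank_tags(doc, W, tag_vectors)
print(ranking)
```

Because the score depends on a tag only through its vector, a tag absent from the training samples can be ranked alongside known tags, which is one plausible reading of how new tags might be detected under this assumed form.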
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.