Abstract Details
Activity Number: 433
Type: Contributed
Date/Time: Tuesday, August 11, 2015 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Learning and Data Mining
Abstract #314962
Title: WITHDRAWN: Words Segmentation in Chinese Language Processing
Author(s): Xinxin Shu and Annie Qu and Junhui Wang and Xiaotong Shen
Companies: Merck and University of Illinois at Urbana-Champaign and University of Illinois at Chicago and University of Minnesota
Keywords: Language processing ; words segmentation ; cutting-plane algorithm
Abstract:
Digital information in this "Big Data" era has become an essential part of modern life, from scientific research, entertainment, and product marketing to national security. Fast, automatic information extraction is therefore in high demand. Chinese is the second most popular language among internet users but remains severely understudied. In this paper, we address word segmentation, a problem arising in Chinese language processing. One major challenge is that Chinese is highly context-dependent, which makes segmentation of Chinese documents crucial. We propose an optimization model with linguistically embedded features and computationally feasible losses. The proposed model is evaluated on the Peking University corpus. The results show that the proposed method outperforms the two current top performers.
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.