Abstract Details
Activity Number: 257
Type: Contributed
Date/Time: Monday, August 10, 2015, 2:00 PM to 3:50 PM
Sponsor: Section on Statistical Graphics
Abstract #317647
Title: Exploratory Data Analysis of a Large Parallel Corpus: A Case Study on the Open Document System of the UN
Author(s): Mario Morales and Roxana Gib*
Companies: Stanford University/Mount Sinai School of Medicine and West University of Timisoara
Keywords: Exploratory Data Analysis; word2vec; Machine Translation; Natural Language Processing; EDA; ODS
Abstract:
Recent advances in natural language processing, in the area of machine translation, have produced measures called word2vec that are based on PCA. In our study, we use exploratory data analysis techniques to try to uncover the mechanisms hidden in the datasets that allow these measures to work. Beyond describing patterns and anomalies in the data, our other goal is to create an interactive tutorial that current and future students and practitioners can use.
Authors who are presenting talks have a * after their name.
The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.
Copyright © American Statistical Association.