JSM 2011 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 442
Type: Invited
Date/Time: Wednesday, August 3, 2011 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistics in Defense and National Security
Abstract - #300216
Title: Joint Embedding of Disparate Information with Applications in Text Analysis
Author(s): David Marchette*+ and Carey Priebe
Companies: Naval Surface Warfare Center and The Johns Hopkins University
Keywords: multidimensional scaling ; disparate data fusion ; dimensionality reduction ; text data mining ; implicit translation
Abstract:

The analysis of scientific documents often yields features of very different types. For example, in addition to standard text analysis features such as those used in bag-of-words models, features derived from natural language processing methods, or similar text-based features, one may extract graphs such as co-authorship or citation networks, or features from images or figures within the documents. One way to perform inference on such different and complex data is to embed the data into a single space so that the different types of information can be combined in a useful way. We discuss several approaches to this problem, derived from a dissimilarity framework and using multidimensional scaling ideas to define the embeddings. We illustrate these techniques on several text document datasets.
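
As a rough illustration of the dissimilarity-based embedding idea in the abstract, the sketch below fuses a text dissimilarity and a graph dissimilarity and embeds the result with classical multidimensional scaling. The abstract does not specify the authors' approaches, so this is not their method. The weighted-average fusion rule, the function names fuse_dissimilarities and classical_mds, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed n objects in R^k from an n x n dissimilarity matrix."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    B = -0.5 * J @ (D ** 2) @ J               # double-centered squared dissimilarities
    w, V = np.linalg.eigh(B)                  # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]             # indices of the k largest eigenvalues
    w_top = np.clip(w[idx], 0.0, None)        # guard against tiny negative eigenvalues
    return V[:, idx] * np.sqrt(w_top)

def fuse_dissimilarities(D_list, weights):
    """Hypothetical fusion rule: weighted average of matrices, each scaled to a maximum of 1."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    fused = np.zeros_like(D_list[0], dtype=float)
    for D, w in zip(D_list, weights):
        fused += w * D / D.max()
    return fused

# Toy data (made up): 4 documents with bag-of-words counts and a co-authorship graph.
X = np.array([[2, 1, 0, 0],
              [1, 2, 0, 0],
              [0, 0, 1, 2],
              [0, 0, 2, 1]], dtype=float)
norms = np.linalg.norm(X, axis=1)
D_text = 1.0 - (X @ X.T) / np.outer(norms, norms)   # cosine dissimilarity
np.fill_diagonal(D_text, 0.0)

A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)            # co-authorship adjacency
D_graph = 1.0 - A                                    # crude graph dissimilarity
np.fill_diagonal(D_graph, 0.0)

D = fuse_dissimilarities([D_text, D_graph], weights=[0.5, 0.5])
Y = classical_mds(D, k=2)                            # one 2-D point per document
```

In this toy case, documents 0 and 1 share both vocabulary and a co-authorship link, so they land close together in the embedding and far from documents 2 and 3. Changing the weights shifts how much each information source shapes the joint space.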


The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

