Online Program

Thursday, May 17
Data Visualization
Text Data Analytics and Visualization
Thu, May 17, 3:30 PM - 5:00 PM
Grand Ballroom E

Fast k Nearest Neighbor Graph Construction Experiments on a Large Scientific Publication Corpus (304664)

*Avory Bryant, Naval Surface Warfare Center 

Keywords: fast k nearest neighbor graph construction; text analysis

The performance of several existing fast k nearest neighbor graph construction approaches is investigated using a large scientific publication corpus (on the order of tens of millions of publications). Particular attention is given to the domain-specific case of cosine similarity between document vectors (i.e., sparse, high-dimensional, non-negative vectors). Both exact and approximate methods are included, and performance is reported in terms of running time and k nearest neighbor recall. Further analysis examines how the methods respond to varying corpus size, choice of k, and dimensionality (i.e., feature selection).
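
To make the evaluation terms concrete, the following is a minimal sketch of an exact (brute-force) k nearest neighbor graph under cosine similarity, together with a recall measure for comparing an approximate graph against it. It is an illustration only, not the implementation used in the experiments: the function names (cosine, exact_knn_graph, knn_recall), the dict-based sparse vector representation, the toy documents, and the particular definition of recall (fraction of exact neighbors recovered) are all assumptions made for this example.

```python
import heapq
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {feature: weight} dicts."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector when taking the dot product
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def exact_knn_graph(docs, k):
    """Brute-force O(n^2) baseline: for each document, the k most similar others."""
    graph = []
    for i, d in enumerate(docs):
        sims = [(cosine(d, e), j) for j, e in enumerate(docs) if j != i]
        # Highest similarity first; ties go to the lower document index.
        top = heapq.nlargest(k, sims, key=lambda s: (s[0], -s[1]))
        graph.append([j for _, j in top])
    return graph

def knn_recall(approx, exact):
    """Fraction of exact neighbors that also appear in the approximate graph."""
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx, exact))
    total = sum(len(e) for e in exact)
    return hits / total if total else 1.0

# Hypothetical toy corpus: four documents as non-negative term-weight vectors.
docs = [
    {"graph": 1.0, "neighbor": 1.0},
    {"graph": 1.0, "neighbor": 1.0, "text": 1.0},
    {"text": 1.0, "corpus": 1.0},
    {"corpus": 1.0},
]
exact = exact_knn_graph(docs, k=1)
approx = [[1], [0], [1], [2]]  # made-up output of some approximate method
print(exact)                      # [[1], [0], [3], [2]]
print(knn_recall(approx, exact))  # 0.75
```

The brute-force baseline compares every pair of documents, which is impractical at a scale of tens of millions of publications; that cost is the motivation for the fast and approximate approaches the abstract compares.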