Thursday, May 17

Text Data Analytics and Visualization

Thu, May 17, 3:30 PM - 5:00 PM
Grand Ballroom E

Fast k Nearest Neighbor Graph Construction Experiments on a Large Scientific Publication Corpus (304664)

*Avory Bryant, Naval Surface Warfare Center

Keywords: fast k nearest neighbor graph construction; text analysis

The performance of several existing fast k nearest neighbor graph construction approaches is investigated using a large scientific publication corpus (on the order of tens of millions of publications). Particular attention is given to the domain-specific case of cosine similarity between document vectors (i.e., sparse, high-dimensional, non-negative vectors). Both exact and approximate methods are included, and performance is reported with respect to construction time and k nearest neighbor recall. Additional analysis of the methods is reported with respect to varying corpus size, choice of k, and dimensionality (i.e., feature selection).
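
To make the two reported quantities concrete, the following is a minimal sketch of an exact (brute-force) k nearest neighbor graph under cosine similarity, together with a recall measure for scoring an approximate graph against it. It is an illustration only and not any of the methods evaluated in the talk. The function names (normalize, cosine, exact_knn_graph, knn_recall), the dictionary representation of sparse vectors, and the toy documents are all assumptions introduced here.

```python
import math
from heapq import nlargest


def normalize(vec):
    """Scale a sparse vector (dict: term -> non-negative weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm > 0 else {}


def cosine(u, v):
    """Cosine similarity of two unit-normalized sparse vectors (a sparse dot product)."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def exact_knn_graph(docs, k):
    """Brute-force baseline: compare every pair of documents, O(n^2) similarities."""
    unit = [normalize(d) for d in docs]
    graph = []
    for i, u in enumerate(unit):
        scored = ((cosine(u, v), j) for j, v in enumerate(unit) if j != i)
        graph.append([j for _, j in nlargest(k, scored)])
    return graph


def knn_recall(approx, exact):
    """Fraction of true neighbors (from the exact graph) found by the approximate graph."""
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx, exact))
    total = sum(len(e) for e in exact)
    return hits / total if total else 0.0


# Hypothetical toy corpus: four documents as sparse term-weight vectors.
docs = [
    {"graph": 2.0, "neighbor": 1.0},
    {"graph": 1.0, "neighbor": 1.0},
    {"corpus": 3.0, "text": 1.0},
    {"corpus": 1.0, "text": 1.0, "graph": 0.5},
]
exact = exact_knn_graph(docs, k=1)

# A made-up approximate result in which the last document's neighbor is wrong.
approx = [[1], [0], [3], [0]]
recall = knn_recall(approx, exact)
```

The quadratic cost of the brute-force loop is what makes fast construction methods necessary at the scale of tens of millions of documents. Recall computed this way requires an exact graph (or an exact sample of it) as the reference.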

ASA Meetings Department