2014 Joint Statistical Meetings - Statistics: Global Impact - Past, Present and Future

Activity Number:	9
Type:	Invited
Date/Time:	Sunday, August 3, 2014 : 2:00 PM to 3:50 PM
Sponsor:	Section on Statistical Computing
Abstract #310798
Title:	Learning Binary Representations for Fast Similarity Search in Massive Databases
Author(s):	Sanjiv Kumar*+
Companies:	Google
Keywords:	Nearest Neighbor Search ; Binary Coding ; Hashing ; Graph Laplacian
Abstract:	Approximate nearest neighbor (ANN) search based on binary coding has recently attracted much attention for huge databases because of its fast query time and drastically reduced storage needs. Developing a good ANN search system poses several challenges. A fundamental question that often arises is: how difficult is ANN search on a given dataset? In other words, which properties of the data affect the quality of ANN search, and how? Moreover, different application scenarios call for different types of learning methods. I will also discuss nearest neighbor search when the data lie on a manifold, i.e., when the given distance metric applies only in local neighborhoods. This requires manipulating a large graph, a problem that is solved approximately using Anchor Graphs. Preliminary experimental results on real-world data verify the effectiveness of the proposed method.
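
The fast query time and small storage footprint mentioned in the abstract can be illustrated with a minimal sketch. It does not reproduce the learned codes or the Anchor Graph method of the talk; as a stand-in it assumes random-hyperplane sign hashing, and the function names (make_hyperplanes, encode, hamming, search) and all parameter values are hypothetical choices made for illustration only.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions.

    Assumption: random projections stand in for the learned hash
    functions that the talk is about.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def encode(vec, hyperplanes):
    """Pack the sign pattern of the projections into one integer code."""
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0.0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def search(query, codes, hyperplanes, k=3):
    """Return indices of the k database codes closest to the query in Hamming distance."""
    q = encode(query, hyperplanes)
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return ranked[:k]


# Toy usage: 200 random 16-dimensional points stored as 32-bit codes.
rng = random.Random(1)
database = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
planes = make_hyperplanes(dim=16, n_bits=32)
codes = [encode(v, planes) for v in database]
query = database[42]
top = search(query, codes, planes, k=3)
```

Each point is stored as a single 32-bit integer instead of 16 floating-point values, and comparing two points reduces to an XOR followed by a bit count. The linear scan used here is only for clarity; a production system would typically index the codes rather than rank all of them.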

Authors who are presenting talks have a * after their name.

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.