Online Program

All Times ET

Friday, June 4
Computational Statistics
New Models and Methods
Fri, Jun 4, 1:20 PM - 2:55 PM
TBD
 

Multi-scale Affinities with Missing Data: Estimation and Applications (309818)

Eric Chi, North Carolina State University 
Gal Mishne, University of California, San Diego 
*Min Zhang, North Carolina State University 

Keywords: Missing data, Kernels, Penalized estimation

Many machine learning algorithms depend on weights that quantify row and column similarities of a data matrix. The choice of weights can dramatically affect the effectiveness of the algorithm, yet the problem of choosing them has received relatively little study. When a data matrix is completely observed, Gaussian kernel affinities can be used to quantify the local similarity between pairs of rows and pairs of columns. Computing weights in the presence of missing data, however, is challenging. In this paper, we propose a new method that constructs row and column affinities even when data are missing by building on a co-clustering technique. The method solves the optimization problem for multiple pairs of cost parameters and fills in the missing values with increasingly smooth estimates, exploiting the coupled similarity structure among both the rows and columns of the data matrix. We show that these affinities can be used to perform tasks such as data imputation, matrix completion on graphs, and clustering.
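
For readers unfamiliar with the fully observed baseline mentioned in the abstract, the following is a minimal sketch of Gaussian kernel affinities between rows (and, by transposing, columns) of a complete data matrix. It does not implement the proposed missing-data method. The function name `gaussian_affinities`, the bandwidth parameter `sigma`, and the scaling exp(-d^2 / sigma^2) are illustrative assumptions; the abstract does not specify a particular kernel parameterization.

```python
import numpy as np

def gaussian_affinities(X, sigma=1.0):
    """Pairwise Gaussian kernel affinities between the rows of X.

    Assumes X is fully observed (no missing entries). The bandwidth
    `sigma` and the exp(-d^2 / sigma^2) scaling are assumptions made
    for illustration only.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise differences between rows via broadcasting: shape (n, n, p).
    diff = X[:, None, :] - X[None, :, :]
    # Squared Euclidean distance between every pair of rows: shape (n, n).
    sq_dists = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dists / sigma ** 2)

# Small fully observed example: 3 rows, 2 columns.
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [3.0, 4.0]])

W = gaussian_affinities(X)         # row affinities, shape (3, 3)
W_cols = gaussian_affinities(X.T)  # column affinities, shape (2, 2)
```

Nearby rows receive affinities close to 1 and distant rows receive affinities close to 0. With missing entries the squared distances above are not directly computable, which is the difficulty the abstract addresses.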