
All Times EDT

Abstract Details

Activity Number: 32 - Nonparametric Methods with High-Dimensional Data
Type: Contributed
Date/Time: Sunday, August 7, 2022 : 2:00 PM to 3:50 PM
Sponsor: Section on Nonparametric Statistics
Abstract #322272
Title: Variable Selection Using Kernel Partial Correlation Coefficient
Author(s): Zhen Huang* and Nabarun Deb and Bodhisattva Sen
Companies: Columbia University
Keywords: model-free variable selection; conditional dependence; geometric graph; nearest neighbor methods; reproducing kernel Hilbert spaces; conditional mean embedding
Abstract:

Suppose we have a response Y and 1,000 covariates Xi, and Y is a function of the first three covariates plus noise; Y and the Xi need not even be Euclidean. How can we identify exactly the three "true" predictive variables from 200 iid observations? Our variable selection algorithm, which we call KFOCI, can do the job without requiring the number of variables to select to be specified in advance; it substantially outperforms its predecessors and is provably consistent under a sparsity assumption. KFOCI is an application of the kernel partial correlation coefficient (KPC) we propose, a number between 0 and 1 measuring the strength of conditional dependence: the KPC between two random variables Y and Z given a third variable X is 0 if and only if Y is conditionally independent of Z given X, and 1 if and only if Y is a measurable function of Z and X. Given the predictors already selected, KFOCI selects the next predictor Xi that maximizes the sample KPC between Y and Xi given the selected predictors, and stops when all such sample KPC values are negative. Both KPC and KFOCI are easily accessible through our package KPC, available on CRAN.
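The greedy loop described above (add the covariate maximizing a sample conditional-association score given the current set; stop when the best score is non-positive) can be sketched as follows. This is a stand-in illustration, not the authors' method: the score used here is a penalized drop in residual sum of squares from a linear fit, NOT the sample KPC estimator, and the function names and the RIC-style penalty are assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def rss(Xsub, y):
    """Residual sum of squares of a least-squares fit with intercept."""
    n = len(y)
    A = np.ones((n, 1)) if Xsub is None else np.column_stack([np.ones(n), Xsub])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

def forward_select(X, y, penalty=None):
    """KFOCI-style greedy loop with a stand-in score: relative RSS drop
    minus a RIC-like penalty (hypothetical choice, not from the paper).
    Stops when no remaining covariate has a positive score."""
    n, p = X.shape
    if penalty is None:
        penalty = 2.0 * np.log(p) / n
    selected = []
    while True:
        base = rss(X[:, selected] if selected else None, y)
        best_j, best_score = None, 0.0
        for j in range(p):
            if j in selected:
                continue
            score = (base - rss(X[:, selected + [j]], y)) / base - penalty
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:          # every score <= 0: stop
            return selected
        selected.append(best_j)

# Toy data mirroring the abstract's setup (smaller p for speed):
# y depends only on the first three of 50 covariates, plus noise.
n, p = 200, 50
X = rng.normal(size=(n, p))
y = X[:, 0] + 2 * X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=n)
print(sorted(forward_select(X, y)))  # the three true variables should appear
```

In the actual KPC package on CRAN, the score is the nearest-neighbor/kernel-based sample KPC, which handles non-Euclidean and nonlinear relationships that the linear stand-in above cannot.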


Authors who are presenting talks have a * after their name.

Back to the full JSM 2022 program