
Abstract Details

Activity Number: 616 - New Advances in Semiparametric Modeling and Testing for Complex Data
Type: Topic Contributed
Date/Time: Thursday, August 2, 2018 : 8:30 AM to 10:20 AM
Sponsor: Section on Nonparametric Statistics
Abstract #330163 Presentation
Title: The Role of Kernels in Data Analysis
Author(s): Marianthi Markatou*
Companies: University at Buffalo
Keywords: kernels; goodness of fit; two-sample tests; k-sample tests; Neyman-Pearson lemma for kernels; non-centrality index
Abstract:

Kernels are essential elements in the construction of learning systems and machine learning algorithms. In statistics, we use kernels as tools to achieve specific data analytic goals. We address the inferential potential of kernels and elaborate on their role in goodness of fit, density estimation, and clustering. In the context of goodness of fit, we show that classical goodness-of-fit statistics, such as Cramér-von Mises and Kolmogorov-Smirnov, can be written as functions of specific kernels. We introduce the concept of a root kernel and discuss the considerations that enter into the design of a kernel. We derive an easy-to-use normal approximation to the power of kernel-based tests and use it to construct a non-centrality index, an analogue of the traditional non-centrality parameter. This leads to a method, akin to the Neyman-Pearson lemma, for constructing optimal kernels for specific alternatives. We introduce a "mid-power" analysis as a device for choosing optimal degrees of freedom (DOF) for a family of alternatives of interest, and present goodness-of-fit tests for the two- and k-sample problems, along with a comparison with tests appearing in the machine learning literature.
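
To make the idea of a kernel-based two-sample test concrete, the following is a minimal sketch and not the construction presented in the talk. It assumes a Gaussian kernel with a fixed bandwidth h, one-dimensional samples, and a permutation p-value; the function names (gaussian_kernel, kernel_two_sample_stat, permutation_pvalue) are hypothetical and chosen only for illustration. The statistic is the generic quadratic form mean K(x, x') + mean K(y, y') - 2 mean K(x, y), which is of the kind used in the machine learning literature.

```python
import numpy as np


def gaussian_kernel(a, b, h=1.0):
    """Kernel matrix K[i, j] = exp(-(a_i - b_j)^2 / (2 h^2)) for 1-D samples.

    The Gaussian form and the bandwidth h are illustrative assumptions.
    """
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * h**2))


def kernel_two_sample_stat(x, y, h=1.0):
    """Quadratic kernel distance between the empirical distributions of x and y."""
    kxx = gaussian_kernel(x, x, h).mean()
    kyy = gaussian_kernel(y, y, h).mean()
    kxy = gaussian_kernel(x, y, h).mean()
    return kxx + kyy - 2.0 * kxy


def permutation_pvalue(x, y, h=1.0, n_perm=500, seed=0):
    """Permutation p-value: reshuffle the pooled sample and recompute the statistic."""
    rng = np.random.default_rng(seed)
    observed = kernel_two_sample_stat(x, y, h)
    pooled = np.concatenate([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if kernel_two_sample_stat(perm[:n], perm[n:], h) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

Calling permutation_pvalue(x, y) on two NumPy arrays returns a value in (0, 1]; small values suggest that the two samples come from different distributions. The choice of kernel and bandwidth affects power, which is the design question the abstract addresses.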


Authors who are presenting talks have a * after their name.
