
All Times EDT

Abstract Details

Activity Number: 139 - Challenges and Breakthroughs in Biomedical High-Dimensional Data Analysis in the Big Data Era
Type: Invited
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 11:50 AM
Sponsor: Caucus for Women in Statistics
Abstract #309366
Title: Smoothing Kernels for Categorical and Mixed-Scale Data in Density Estimation
Author(s): Marianthi Markatou*
Companies: University at Buffalo
Keywords: diffusion; kernels; density estimation; canonical diffusion kernels
Abstract:

Kernels are essential elements in the construction of learning systems and have received considerable attention in machine learning. In statistics, kernels are used as tools for achieving specific data analytic goals, such as density estimation. The literature includes methods for constructing multivariate kernels for interval-scale data. We discuss the construction and properties of a special class of kernels, the class of diffusion kernels. We first offer a statistical definition of this class and present an important subclass, the set of canonical diffusion kernels. Using these kernels, we present an algorithm to construct kernels for categorical data, whether nominal or ordinal. We further extend this construction to obtain kernels appropriate for mixed-scale data, that is, data with both categorical and interval-scale variables. Our algorithm uses ideas from the theory of continuous-time Markov processes and the theory of Toeplitz matrices. We illustrate the construction of these kernels in high-dimensional density estimation. Time permitting, we will indicate the construction of test statistics akin to chi-squared tests for independence.
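
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one standard way a diffusion kernel on a categorical variable can arise from a continuous-time Markov chain; it is not the author's construction. It assumes a symmetric rate matrix Q on the k category levels and takes the kernel to be the transition matrix exp(tQ), with t acting as a bandwidth. The function names (generator, diffusion_kernel, smoothed_pmf), the unit jump rates, and the choice of a complete graph for nominal levels and a path graph for ordinal levels are all illustrative assumptions.

```python
import numpy as np

def generator(k, ordinal=False):
    """Rate matrix of a continuous-time Markov chain on k category levels.

    Assumed for illustration: unit jump rates, with jumps between all pairs
    of levels (nominal) or only between adjacent levels (ordinal).
    """
    if ordinal:
        # Path graph: only neighbouring levels communicate.
        A = np.diag(np.ones(k - 1), 1) + np.diag(np.ones(k - 1), -1)
    else:
        # Complete graph: every pair of distinct levels communicates.
        A = np.ones((k, k)) - np.eye(k)
    # Subtract the row sums on the diagonal so each row of Q sums to zero.
    return A - np.diag(A.sum(axis=1))

def diffusion_kernel(k, t, ordinal=False):
    """Kernel matrix K_t = exp(t Q), computed from the eigendecomposition of Q."""
    Q = generator(k, ordinal)
    w, V = np.linalg.eigh(Q)          # valid because Q is symmetric
    return (V * np.exp(t * w)) @ V.T  # V diag(exp(t w)) V^T

def smoothed_pmf(x, k, t, ordinal=False):
    """Kernel-smoothed estimate of a probability mass function.

    x holds integer category codes in 0..k-1; t is the bandwidth.
    """
    K = diffusion_kernel(k, t, ordinal)
    counts = np.bincount(x, minlength=k)
    # Average the kernel rows belonging to the observed categories.
    return counts @ K / counts.sum()
```

Under these assumptions, t = 0 returns the raw relative frequencies, and larger t moves probability mass toward unobserved levels: toward all levels equally in the nominal case, and toward neighbouring levels first in the ordinal case. The nominal kernel is a symmetric Toeplitz matrix with a constant off-diagonal value.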


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program