Keywords: discriminant analysis, kernel method, polynomial kernel, Gaussian kernel, classification, feature mappings
Fisher’s linear discriminant analysis is a classical method for classification, but it is limited to capturing linear features. Kernel discriminant analysis (KDA), its extension, is known to alleviate this limitation through a nonlinear feature mapping. We study the geometry of nonlinear embeddings for KDA with polynomial and Gaussian kernels by identifying the theoretical discriminant function given the data distribution. To obtain the theoretical discriminant function, we solve a generalized eigenvalue problem with between-class and within-class variation operators. For an explicit description of the discriminant function, we use a particular representation for Gaussian kernels based on the exponential generating function for Hermite polynomials. We also show that the discriminant function for Gaussian kernels can be approximated using randomized projections of the data. Our results illuminate how the data distribution and the kernel interact in determining the nonlinear embedding, and provide guidance for choosing the kernel and its parameters.
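The empirical counterpart of the procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it forms the kernelized between-class and within-class scatter matrices from a precomputed kernel matrix and solves the resulting generalized eigenvalue problem; the ridge term `reg` is an assumption added for numerical stability.

```python
import numpy as np
from scipy.linalg import eigh

def kda(K, y, reg=1e-6):
    """Sample-level kernel discriminant analysis.

    K   : (n, n) kernel matrix, K[i, j] = k(x_i, x_j)
    y   : (n,) class labels
    reg : ridge added to the within-class operator (stability assumption)

    Returns eigenvectors (coefficients of the discriminant functions)
    and eigenvalues, sorted by decreasing eigenvalue.
    """
    n = K.shape[0]
    m_star = K.mean(axis=1)                 # overall mean of kernel columns
    M = np.zeros((n, n))                    # between-class variation
    N = np.zeros((n, n))                    # within-class variation
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        n_c = len(idx)
        K_c = K[:, idx]
        d = (K_c.mean(axis=1) - m_star)[:, None]
        M += n_c * d @ d.T
        H = np.eye(n_c) - np.ones((n_c, n_c)) / n_c   # centering matrix
        N += K_c @ H @ K_c.T
    N += reg * np.eye(n)
    w, V = eigh(M, N)                       # solves M a = lambda N a
    return V[:, ::-1], w[::-1]

# Example with a Gaussian kernel on two synthetic classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)                       # bandwidth sigma = 1 (assumed)
A, w = kda(K, y)
proj = K @ A[:, 0]                          # leading discriminant scores
```

The fitted discriminant function evaluates as f(x) = Σ_i a_i k(x, x_i), so the kernel choice and its parameters (e.g. the Gaussian bandwidth) directly shape the embedding, which is the interplay the abstract analyzes at the population level.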