Abstract:
|
A popular strategy for visually summarizing bivariate data is to plot contours of an estimated density surface. Most commonly, the density is estimated with a kernel density estimator (KDE), and the plotted contours correspond to equally spaced levels of the estimated density's height. Notably, this is the case for geom_density_2d() and geom_density_2d_filled() from ggplot2. The proposed ggdensity package extends ggplot2, providing more interpretable visualizations of bivariate density estimates using highest density regions (HDRs). Its functions geom_hdr() and geom_hdr_lines() serve as drop-in replacements for the aforementioned ggplot2 functions, plotting density contours chosen to be inferentially relevant. By default, they plot the smallest regions containing 50%, 80%, 95%, and 99% of the estimated probability mass (the HDRs). ggdensity also estimates and plots HDRs from estimators other than the standard KDE: densities can be estimated with histograms, frequency polygons, or a parametric bivariate normal model. Also included are geom_hdr_fun() and geom_hdr_fun_lines(), which plot HDRs of user-specified probability density functions.
|
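For readers unfamiliar with the term, the "smallest region containing a given share of the estimated density" can be stated formally. The display below is a sketch of the standard HDR definition. The symbols $\hat{f}$ (the estimated bivariate density), $\alpha$ (one minus the coverage probability), and $c_\alpha$ (the density threshold) are notation assumed here for illustration and do not appear in the abstract itself.

$$
R_\alpha = \left\{ x \in \mathbb{R}^2 : \hat{f}(x) \ge c_\alpha \right\},
\qquad
c_\alpha = \sup\left\{ c : \int_{\{x \,:\, \hat{f}(x) \ge c\}} \hat{f}(x)\, dx \;\ge\; 1 - \alpha \right\}
$$

Under this reading, the default regions correspond to $1 - \alpha \in \{0.50, 0.80, 0.95, 0.99\}$. Each region is bounded by a contour of $\hat{f}$, but the contour heights $c_\alpha$ are set by the enclosed probability rather than by equal spacing in density height.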