Abstract: |
In the current computing age, models often have hundreds or even thousands of parameters. With such large models comes the risk of losing the ability to understand and communicate the precise meaning of individual parameters. In a frequentist setting, one can use an L1 penalty to reduce the number of parameters in a model, but comparable methods have not been developed for Bayesian settings, where the quantity of interest involves an integral over the posterior distribution. We introduce a new method that uses a penalized 2-Wasserstein distance to reduce the dimensionality of the parameter space while still obtaining a distribution over the remaining dimensions. Our method allows users to select a budget on how many parameters they wish to understand, interpret, and communicate to an audience and then, in a data-dependent way, selects a reduced posterior of that dimension that minimizes the distance to the full posterior. We provide simulation results comparing performance to other posterior summary methods and apply the method to cancer and environmental health data.
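As a rough sketch of the optimization the abstract describes (the notation $\pi$, $p$, $k$, $S$, $\nu$, and $\lambda$ is ours, not necessarily the paper's): given the full posterior $\pi$ on $\mathbb{R}^p$ and a user-selected budget of $k$ parameters, one plausible constrained formulation searches over coordinate subsets of size at most $k$ and distributions degenerate off the chosen subset, minimizing the 2-Wasserstein distance to $\pi$; a penalized variant would trade the hard budget for a penalty on the number of active coordinates.

% Hypothetical formalization only; symbols are our own notation,
% not taken from the paper itself.
\[
  \hat{\nu} \;=\; \operatorname*{arg\,min}_{\substack{S \subseteq \{1,\dots,p\},\ |S| \le k \\ \nu\left(\{x \in \mathbb{R}^p \,:\, x_j = 0 \ \forall j \notin S\}\right) = 1}} W_2(\nu, \pi),
  \qquad
  W_2^2(\nu, \pi) \;=\; \inf_{\gamma \in \Gamma(\nu, \pi)} \int \lVert x - y \rVert_2^2 \, \mathrm{d}\gamma(x, y),
\]
where $\Gamma(\nu, \pi)$ is the set of couplings of $\nu$ and $\pi$; replacing the constraint $|S| \le k$ with an added penalty term $\lambda \lvert S \rvert$ gives the corresponding penalized objective. The constrained and penalized forms are related in the usual Lagrangian sense: sweeping $\lambda$ traces out solutions at different effective budgets.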