Authors: CJ Alverson, CR Rose
It is well known that estimates of the mean are susceptible to distortion when extreme values are present, and this distortion carries over to parameter estimates from mean-based models. Several remedies exist for this problem in modelling, including quantile (median) regression, modelling the data on a transformed scale (e.g., the log scale), and modifying the data prior to fitting models (including, for example, the gamma hurdle model). Health economics is a natural setting for this problem: health care costs are of interest, and extreme values are plausible. Using simulated cost data, we explore how different modelling approaches yield different results when extreme values are present and discuss the potential implications for inference.
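As a minimal illustration of the opening claim, the sketch below simulates right-skewed cost data, inflates a handful of observations, and compares how far the mean and the median move. It is not the authors' simulation design: the gamma parameters, the contamination factor, and the function names `simulate_costs` and `location_shift` are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_costs(n=1000, n_extreme=10, factor=100.0, seed=0):
    """Return (base, contaminated) cost samples.

    Assumption: costs follow a gamma distribution, and extreme values are
    created by multiplying the first `n_extreme` observations by `factor`.
    """
    rng = np.random.default_rng(seed)
    base = rng.gamma(shape=2.0, scale=500.0, size=n)
    contaminated = base.copy()
    contaminated[:n_extreme] *= factor
    return base, contaminated


def location_shift(base, contaminated):
    """Change in the mean and in the median caused by the contamination."""
    return {
        "mean_shift": float(np.mean(contaminated) - np.mean(base)),
        "median_shift": float(np.median(contaminated) - np.median(base)),
    }


base, contaminated = simulate_costs()
shift = location_shift(base, contaminated)
print(shift)
```

With only 1% of observations inflated, the mean moves substantially while the median barely changes, which is the sensitivity that motivates considering alternatives to mean-based models.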