Abstract:
|
Binary response data, encountered in applications ranging from market research to toxicology, are typically analyzed using logistic or probit regression. Both models imply a latent threshold variable that follows a parametric distribution (logistic or Gaussian, respectively), and they can produce misleading results if the true threshold distribution does not match that assumption. The “pool adjacent violators” method provides a nonparametric maximum likelihood estimate of the threshold cumulative distribution function (CDF), which also minimizes the mean squared error relative to the observed data. We offer an alternative based on kernel regression followed by an evolutionary algorithm that guarantees a monotone estimate of the CDF. We conjecture that this method minimizes the sum of squared errors relative to the true CDF, and we present supporting evidence from simulations with various threshold distributions.
|
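For readers unfamiliar with the “pool adjacent violators” baseline named in the abstract, the following is a minimal Python sketch of the standard weighted algorithm. It is not the authors' code. The function name `pava`, its arguments, and the example proportions are illustrative assumptions: `y` holds observed response proportions ordered by stimulus level, and `w` holds positive weights such as the number of trials per level.

```python
def pava(y, w=None):
    """Weighted pool-adjacent-violators (illustrative sketch).

    Returns the nondecreasing sequence closest to y in weighted least squares.
    Assumes y is ordered by stimulus level and all weights are positive.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of pooled points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    # Expand each block back to one fitted value per original point.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Hypothetical response proportions at six increasing stimulus levels.
proportions = [0.1, 0.3, 0.2, 0.6, 0.5, 0.9]
cdf_estimate = pava(proportions)
```

With equal weights, the two out-of-order pairs (0.3, 0.2) and (0.6, 0.5) are each pooled to their means, so the fitted values are nondecreasing. The result is a step function, which is one reason a smoother monotone estimate may be of interest.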