Abstract:
|
Many large binary data sets exhibit groups of observations with large intergroup correlations. Conditional random effects binary logistic regression models, which model groups of observations directly within a logistic regression framework, are sometimes used in this setting. We show that these models are useful for estimating intragroup correlation and, through a natural extension, intergroup correlation as well. Another approach to estimating intragroup correlation models the energy as a quadratic function of sigmoids; we naturally extend this approach to estimate intergroup correlations simultaneously. We apply our results to the problem of predicting whether passengers booked on an airline flight show up, a setting in which groups of passengers exhibiting intragroup correlation arise naturally. We train the aforementioned models on this task and compare them with a variety of alternative models. We show that incorporating the proposed group correlation structure improves predictive performance over these alternatives.
|