Activity Number: 417 - Contributed Poster Presentations: Section on Statistics in Epidemiology
Type: Contributed
Date/Time: Tuesday, August 1, 2017 : 2:00 PM to 3:50 PM
Sponsor: Section on Statistics in Epidemiology
Abstract #324380
Title: A Strategy for Evaluating Goodness-of-Fit for a Logistic Regression Model Using the Hosmer-Lemeshow Test on Samples from a Large Data Set
Author(s): Michael Pennell* and Adam Bartley and Stanley Lemeshow and Gary Phillips
Companies: Ohio State University and Department of Health Sciences Research, Mayo Clinic and College of Public Health, The Ohio State University and Center for Biostatistics, The Ohio State University
Keywords: big data; binary outcome; power; regression modeling; resampling
Abstract:
The Hosmer-Lemeshow test is a commonly used method for assessing the goodness-of-fit of logistic regression models. Despite its wide use, the test can reject acceptable models at a high rate when applied to large samples. Several studies have suggested working around this by performing the test on random samples of fewer observations drawn from the original data. This procedure is easy to carry out and would certainly reduce the power of the test, but no guidelines were given for how to implement it or how to interpret the results. At least two studies have used this technique with little justification for their conclusions. The purpose of this study is to evaluate the proposed method and give a recommendation for implementation. Results of a simulation study suggest that when 100 subsets of 5,000 observations are taken, the model should be considered suspect if more than 10 of the subsets have significant Hosmer-Lemeshow tests.
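The subsampling procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code, and several details are assumptions the abstract does not state: the fitted probabilities come from the model fit to the full data set (the model is not refit on each subset), subsets are drawn without replacement, the test uses 10 groups with the conventional 8 degrees of freedom, and "significant" means p < 0.05. The function names are hypothetical.

```python
import math
import random


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow_p(y, p, groups=10):
    """Hosmer-Lemeshow p-value from 0/1 outcomes y and fitted probabilities p.

    Observations are sorted by fitted probability and split into `groups`
    bins of nearly equal size. Assumption: df = groups - 2.
    """
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        n_g = len(idx)
        observed = sum(y[i] for i in idx)
        expected = sum(p[i] for i in idx)
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:  # skip degenerate bins where every fitted value is 0 or 1
            stat += (observed - expected) ** 2 / denom
    return chi2_sf_even_df(stat, groups - 2)


def count_significant_subsets(y, p, n_subsets=100, subset_size=5000,
                              alpha=0.05, seed=0):
    """Count random subsets whose Hosmer-Lemeshow test is significant.

    Assumptions: sampling without replacement; alpha = 0.05.
    """
    rng = random.Random(seed)
    n = len(y)
    count = 0
    for _ in range(n_subsets):
        idx = rng.sample(range(n), subset_size)
        y_sub = [y[i] for i in idx]
        p_sub = [p[i] for i in idx]
        if hosmer_lemeshow_p(y_sub, p_sub) < alpha:
            count += 1
    return count


def model_is_suspect(n_significant, threshold=10):
    """Decision rule from the abstract: suspect if more than 10 of 100 subsets reject."""
    return n_significant > threshold
```

With outcomes `y` and full-data fitted probabilities `p` in hand, `model_is_suspect(count_significant_subsets(y, p))` applies the rule of more than 10 significant tests among 100 subsets of 5,000 observations. The threshold of 10 is specific to that configuration and should not be assumed to carry over to other subset counts or sizes.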
Authors who are presenting talks have a * after their name.