JSM 2017 Online Program

Abstract Details

Activity Number:	417 - Contributed Poster Presentations: Section on Statistics in Epidemiology
Type:	Contributed
Date/Time:	Tuesday, August 1, 2017 : 2:00 PM to 3:50 PM
Sponsor:	Section on Statistics in Epidemiology
Abstract #324380
Title:	A Strategy for Evaluating Goodness-of-Fit for a Logistic Regression Model Using the Hosmer-Lemeshow Test on Samples from a Large Data Set
Author(s):	Michael Pennell*, Adam Bartley, Stanley Lemeshow, and Gary Phillips
Companies:	Ohio State University; Department of Health Sciences Research, Mayo Clinic; College of Public Health, The Ohio State University; Center for Biostatistics, The Ohio State University
Keywords:	big data ; binary outcome ; power ; regression modeling ; resampling
Abstract:	The Hosmer-Lemeshow test is a commonly used method for assessing the goodness-of-fit of logistic regression models. Despite its wide use, the test can reject acceptable models at a high rate when applied to large samples. Several studies have suggested working around this by performing the test on smaller random samples drawn from the original data. This procedure is easy to carry out and certainly reduces the power of the test, but those studies gave no guidelines for how to implement it or how to interpret the results. At least two studies have used the technique with little justification for their conclusions. The purpose of this study is to evaluate the proposed method and recommend how to implement it. Results of a simulation study suggest that, when one hundred subsets of five thousand observations each are taken, the model should be considered suspect if more than 10 of the subsets yield significant Hosmer-Lemeshow tests.
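The subsampling procedure described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how groups are formed, what significance level is used, whether subsets are drawn with or without replacement, or whether the model is refit on each subset. The sketch assumes ten equal-size groups ordered by fitted probability, a 0.05 level, sampling without replacement, and fitted probabilities taken from the model fit to the full data. All function and parameter names are hypothetical.

```python
import math
import random


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability via the closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow_pvalue(y, p, groups=10):
    """Hosmer-Lemeshow p-value for 0/1 outcomes y and fitted probabilities p.

    Assumes groups - 2 is even (true for the default of 10 groups).
    """
    # Order by fitted probability only; the stable sort keeps ties in input order.
    pairs = sorted(zip(p, y), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        if n_g == 0:
            continue
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        denom = expected * (1.0 - expected / n_g)
        if denom <= 0:
            continue  # degenerate group (assumed choice: skip it)
        stat += (observed - expected) ** 2 / denom
    return chi2_sf_even_df(stat, groups - 2)


def count_significant_subsets(y, p, n_subsets=100, subset_size=5000,
                              alpha=0.05, seed=0):
    """Count random subsets whose Hosmer-Lemeshow test is significant."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_subsets):
        # Assumed: each subset is drawn without replacement from the full data.
        idx = rng.sample(range(len(y)), subset_size)
        pval = hosmer_lemeshow_pvalue([y[i] for i in idx], [p[i] for i in idx])
        if pval < alpha:
            rejections += 1
    return rejections
```

With the defaults of 100 subsets of 5,000 observations, the abstract's suggested rule would correspond to treating the model as suspect when `count_significant_subsets(y, p) > 10`. The cutoff of 10 comes from the abstract and applies to that configuration; it should not be assumed to carry over to other subset counts or sizes.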

Authors who are presenting talks have a * after their name.
