34 – Advances in Analysis of Categorical Data
A Power Study of the GFfit Statistic as a Lack-of-Fit Diagnostic for Sparse Two-Way Subtables
Junfei Zhu
Arizona State University
Mark Reiser
Arizona State University
Maduranga Dassanayake
Arizona State University
Silvia Cagnone
University of Bologna
The Pearson and likelihood ratio statistics are commonly used to test goodness of fit for models applied to data from a multinomial distribution. When the data come from a table formed by cross-classifying a large number of variables, these statistics may have low power and an inaccurate Type I error rate because many cells of the table are sparse. It has been proposed to assess model fit with a new version of the GFfit statistic, based on orthogonal components of the Pearson chi-square statistic, as a diagnostic for examining fit on two-way subtables. However, when variables have a large number of categories and the sample size is small, even the GFfit statistic may have low power and an inaccurate Type I error rate because of sparseness in the two-way subtable. In this paper, a method based on choosing different orthogonal components for the GFfit statistic on the subtables is developed to improve its performance. Simulation results for power and Type I error rate are presented for several cases, along with comparisons to other diagnostics.
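As a point of reference for the two standard statistics named above, the following is a minimal sketch of the Pearson and likelihood ratio statistics computed on a single two-way subtable. It is not the GFfit statistic and does not reproduce the authors' method: it assumes a simple independence model for the subtable, and the function name `pearson_and_lr` and the example counts are hypothetical, chosen only to show how small expected counts arise when variables have many categories and the sample is small.

```python
import numpy as np


def pearson_and_lr(table):
    """Pearson X^2 and likelihood ratio G^2 for a two-way table.

    Assumption (illustrative only): the fitted model is independence of the
    row and column variables, and no row or column total is zero.
    """
    obs = np.asarray(table, dtype=float)
    n = obs.sum()
    # Expected counts under independence: row total * column total / n
    expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / n
    x2 = ((obs - expected) ** 2 / expected).sum()
    # Cells with an observed count of zero contribute 0 to G^2
    mask = obs > 0
    g2 = 2.0 * (obs[mask] * np.log(obs[mask] / expected[mask])).sum()
    df = (obs.shape[0] - 1) * (obs.shape[1] - 1)
    return x2, g2, df, expected


# Hypothetical sparse 4x4 subtable with only 16 observations:
# every row and column total is 4, so every expected count is 1.
sparse = [
    [3, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 3],
]
x2, g2, df, expected = pearson_and_lr(sparse)
print(f"X^2 = {x2:.3f}, G^2 = {g2:.3f}, df = {df}")
print("smallest expected count:", expected.min())
```

With expected counts this small, the chi-square reference distribution on `df` degrees of freedom is a poor approximation for either statistic, which is the sparseness problem the abstract describes.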