Online Program

Friday, February 16
CS02 Practical Considerations for Modeling Fri, Feb 16, 9:15 AM - 10:45 AM
Salons BC

Evaluating Model Fit for Predictive Validity (303555)

John L. Gatta, Northwestern University 
*Katherine M. Wright, Northwestern University 

Keywords: Model selection, generalizability, model validation, predictive modeling

Statistical modeling typically aims to describe, to explain, or to predict. In practice, these goals are often conflated, which can be problematic. Overly complex models may explain large amounts of variation in a development sample, but they do not necessarily yield the best predictive power when implemented in a production environment. Disentangling these goals is crucial to building optimal models. This paper presents both theoretical and empirical explorations of the tradeoffs between model complexity and generalizability.

Using a real longitudinal dataset, we evaluate the model fit and predictive accuracy of linear models of varying complexity. Data were randomly partitioned into development and replication samples so that models could be built on one sample and evaluated on the other. Comparisons show that unnecessarily complex models predict more poorly than simpler models with fewer parameters. Motivated by these findings, we present pragmatic and technical guidance for predictive model-building in the context of a real-world production environment.
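
The development/replication comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic (a linear signal plus noise), and the polynomial degrees, the sample size, and the helper names `design` and `r_squared` are assumptions chosen only for illustration. Each model is fit by ordinary least squares on the development sample and then scored on both samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the longitudinal dataset from the abstract):
# a truly linear signal plus Gaussian noise.
n = 60
x = rng.uniform(-1.0, 1.0, size=n)
y = 2.0 * x + rng.normal(scale=0.5, size=n)

# Randomly partition the rows into development and replication samples.
idx = rng.permutation(n)
dev, rep = idx[: n // 2], idx[n // 2 :]


def design(x, degree):
    """Polynomial design matrix with columns 1, x, ..., x**degree."""
    return np.vander(x, degree + 1, increasing=True)


def r_squared(y_true, y_pred):
    """Proportion of variance explained; can be negative out of sample."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Fit models of increasing complexity on the development sample only,
# then score each one on both samples.
results = {}
for degree in (1, 3, 8):
    beta, *_ = np.linalg.lstsq(design(x[dev], degree), y[dev], rcond=None)
    results[degree] = (
        r_squared(y[dev], design(x[dev], degree) @ beta),
        r_squared(y[rep], design(x[rep], degree) @ beta),
    )

for degree, (r2_dev, r2_rep) in results.items():
    print(f"degree={degree}: dev R^2={r2_dev:.3f}, replication R^2={r2_rep:.3f}")
```

Because the models are nested and fit by least squares, development-sample R^2 cannot decrease as terms are added, so it says little about predictive validity. The replication-sample column is the one to compare across models.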