Since 2001, most tuberculosis (TB) cases in the United States have occurred among non-U.S.-born persons. This situation has become increasingly common in industrialized countries, as latent TB infection acquired before migration may progress to disease years later.
TB incidence varies by country of origin and, within each of these subpopulations, depends on calendar year, current age, and time since entry. These differences in risk reflect TB epidemiology and secular trends in disease burden and migration. However, estimating incidence at this level of granularity is difficult because the subpopulations are small for many countries of origin that nonetheless contribute relatively large numbers of incident TB cases. In this talk, we describe a generalized additive model framework applied to cases reported to the National TB Surveillance System from 2000 to 2016, controlling for population size estimated from U.S. Census Bureau data. After comparing a variety of models by cross-validation, we adopted a thin-plate spline regression with tensor product interactions and a negative binomial response distribution. Our analysis identifies high-risk groups and provides estimates at a resolution unavailable from the raw data.
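To make the model structure concrete, the following minimal sketch shows two ingredients the abstract mentions: a tensor product interaction basis built from two marginal bases, and a log-link mean with a log-population offset. It is an illustration under stated assumptions, not the analysis code. The function names, knot locations, toy data, and baseline rate are all hypothetical, and the truncated-linear marginal basis is a simple stand-in for the thin-plate basis named above. No smoothing penalty or negative binomial fitting step is included.

```python
import numpy as np

def marginal_basis(x, knots):
    """Hypothetical marginal basis: intercept, linear term, and
    truncated-linear (hinge) terms at the given knots.
    This is a stand-in, not a thin-plate spline basis."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

def tensor_product_basis(A, B):
    """Row-wise Kronecker product of two marginal design matrices.
    Row i of the result is the flattened outer product of A[i] and B[i]."""
    n = A.shape[0]
    return (A[:, :, None] * B[:, None, :]).reshape(n, -1)

def expected_cases(X, beta, population):
    """Log-link mean with a log-population offset:
    mu = exp(log(population) + X @ beta) = population * exp(X @ beta)."""
    return np.exp(np.log(population) + X @ beta)

# Toy strata (made-up values, for illustration only).
age = np.array([20.0, 35.0, 50.0, 65.0])
years_since_entry = np.array([1.0, 5.0, 10.0, 20.0])
population = np.array([1000.0, 2000.0, 1500.0, 500.0])

A = marginal_basis(age, knots=[30.0, 60.0])           # 4 columns
B = marginal_basis(years_since_entry, knots=[5.0])    # 3 columns
X = tensor_product_basis(A, B)                        # 4 * 3 = 12 columns

# With only the intercept coefficient nonzero, every stratum has the
# same rate, so expected cases scale directly with population size.
beta = np.zeros(X.shape[1])
beta[0] = np.log(1e-4)  # hypothetical baseline rate: 10 per 100,000
mu = expected_cases(X, beta, population)
```

The offset is what lets case counts from subpopulations of very different sizes be modeled on a common rate scale. In a full fit, the coefficients would be estimated under a smoothing penalty with a negative binomial likelihood, which this sketch omits.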