The Surveillance, Epidemiology, and End Results (SEER) Program collects data on about 34% of cancer cases diagnosed in the US. SEER variables include demographic information for individuals diagnosed with cancer, such as sex, age at diagnosis and county of residence, as well as cancer type and related information, such as stage at diagnosis, tumor location(s), initial treatments, survival and cause of death. Public health officials, cancer researchers and the public use SEER data to understand and investigate trends in cancer incidence and survival.
Collecting variables not currently in SEER, such as comorbidities and background variables like body mass index and smoking history, would enable new insights into US cancer incidence and survival. Unfortunately, collecting new variables for all SEER cancer cases is unlikely to be feasible. However, statistical sampling can support sound inferences from data collected for a small, carefully chosen subset of cases.
Using data from the Louisiana State University Health Sciences Center Louisiana Tumor Registry, this talk explores sampling methods that could guide collection of new variables for a small subset of SEER cancer cases.