Abstract:
|
Sociological research has long studied how social determinants, measured at the level of geographic location, affect outcomes. Geocoding is a well-known technique for extracting such information: a geographical location is mapped to a unit such as a census tract, and relevant community information is then pulled from tract-level databases. When location is unknown, however, geocoding cannot be done. Modern genomic databases, which seek to probe the effects of genomic alterations on outcomes, are one example: they typically do not store location information on patients. For some diseases, such as cancer, statewide registries exist, and these provide a strategy for building a linking model between a set of analysis observations and a reference sample drawn from the registry. This linking model can then be used to predict locations for the analysis observations. We call this predictive geocoding. We detail the methodology, study its empirical performance via a series of simulations, and then perform predictive geocoding on the Florida Cancer Data Systems (FCDS) registry. The results indicate the methodology works very well. A major consequence is that large genomic databases like The Cancer
|