Abstract:
|
The Bayesian Improved Surname Geocoding (BISG) method uses Bayes' rule to estimate a vector of six race/ethnicity probabilities (White, Black, Hispanic, Asian, American Indian/Alaska Native (AI/AN), and multiracial) for an individual from surname and address alone. The distribution of self-reported race/ethnicity by surname is available from the 2000 US census for surnames with at least 100 occurrences; these surnames account for almost 90% of all individuals in the US. Addresses are geocoded to the block group level, and the distribution of block groups by self-reported race/ethnicity can be derived from 2010 US census data. An extension also incorporates an imperfect administrative measure of race/ethnicity from Social Security (SS) data. This talk will describe recent work to improve the performance of the method by (a) allowing the association of SS data with race/ethnicity to vary by age and (b) handling compound surnames, which the method had not previously addressed. Further analyses incorporate additional data elements, explore alternative functional forms, and consider classification rules. We evaluate the alternative methods using data from the 2014 Medicare CAHPS surveys, a large, nationally representative dataset of Medicare beneficiaries.
|
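As a rough illustration of the Bayes' rule step described in the abstract, the sketch below treats the surname-based distribution as a prior over the six race/ethnicity groups and updates it with the probability of living in a given block group conditional on each group. It is a minimal sketch, not the authors' implementation: the function name `bisg_posterior` and all numeric values are hypothetical and are not taken from census data, and the sketch omits the SS-data extension, the age interaction, and compound-surname handling.

```python
# Minimal sketch of a BISG-style Bayes update (illustrative only).
# All probabilities below are made-up placeholders, not census figures.

RACES = ["White", "Black", "Hispanic", "Asian", "AI/AN", "Multiracial"]


def bisg_posterior(p_race_given_surname, p_blockgroup_given_race):
    """Combine a surname prior with a geographic likelihood via Bayes' rule.

    p_race_given_surname:    P(race | surname), one entry per group in RACES.
    p_blockgroup_given_race: P(block group | race), one entry per group.
    Returns P(race | surname, block group), assuming surname and block group
    are conditionally independent given race.
    """
    if len(p_race_given_surname) != len(p_blockgroup_given_race):
        raise ValueError("prior and likelihood must have the same length")

    # Numerator of Bayes' rule: prior times likelihood, group by group.
    unnormalized = [
        prior * lik
        for prior, lik in zip(p_race_given_surname, p_blockgroup_given_race)
    ]
    total = sum(unnormalized)
    if total <= 0:
        raise ValueError("prior and likelihood have no overlapping support")

    # Normalize so the six probabilities sum to one.
    return [u / total for u in unnormalized]


# Hypothetical inputs for one person (placeholder values).
surname_prior = [0.05, 0.01, 0.90, 0.02, 0.01, 0.01]
block_group_likelihood = [0.0001, 0.0004, 0.0002, 0.0001, 0.0001, 0.0001]

posterior = bisg_posterior(surname_prior, block_group_likelihood)
for race, prob in zip(RACES, posterior):
    print(f"{race:12s} {prob:.4f}")
```

The output is a probability vector rather than a single label; turning it into a classification would require a separate rule, which is one of the questions the abstract mentions.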