Online Program

Saturday, February 22
Sat, Feb 22, 9:15 AM - 10:45 AM
Regency B
Statistical Methods in Health Care

Efficient Nonparametric Estimation of Population Size from Incomplete Lists (303974)

*Manjari Das, Carnegie Mellon University 
Edward Kennedy, Carnegie Mellon University 

Keywords: efficient influence function, population size, unbiased estimation

Estimation of total population size using incomplete lists has long been an important problem across many biological and social sciences. For example, partial and overlapping lists of casualties in the Syrian civil war are constructed by multiple organizations, and it is of great interest to use this information to estimate the magnitude of the war's destruction. Earlier approaches to this kind of problem have used either strong parametric assumptions or suboptimal plug-in-type nonparametric techniques; however, both approaches can lead to substantial bias, the former via model misspecification and the latter via smoothing. Under an identifying assumption that two lists are conditionally independent given covariate information, we make the following advances. First, we derive a nonparametric efficiency bound for estimating the capture probability, based on the efficient influence function. Then we construct a bias-corrected estimator that attains this bound under weak nonparametric conditions. Finally, finite-sample properties of the proposed estimator are studied with simulations, and we apply our methods to estimate HIV prevalence in Alameda County, California, in 2013.
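To make the two-list setting concrete, the sketch below computes a classical stratified plug-in (Lincoln-Petersen type) estimate of population size, assuming the covariate is a single discrete stratum label and that the two lists are independent within each stratum. This is not the bias-corrected estimator described in the abstract; it illustrates the simpler kind of plug-in approach the abstract contrasts against. The function name `stratified_two_list_estimate` and the record layout are hypothetical choices made for illustration.

```python
from collections import defaultdict


def stratified_two_list_estimate(records):
    """Plug-in population size estimate from two incomplete lists.

    records: iterable of (stratum, in_list1, in_list2), one entry per
    observed unit, i.e. a unit appearing on at least one list.
    Assumes the lists are independent within each stratum (an
    illustrative stand-in for conditional independence given covariates).
    """
    # Per-stratum counts: [on list 1, on list 2, on both lists]
    counts = defaultdict(lambda: [0, 0, 0])
    for stratum, in1, in2 in records:
        if not (in1 or in2):
            raise ValueError("each observed unit must be on at least one list")
        c = counts[stratum]
        c[0] += int(in1)
        c[1] += int(in2)
        c[2] += int(in1 and in2)

    total = 0.0
    for stratum, (n1, n2, n12) in counts.items():
        if n12 == 0:
            # Without overlap, the stratum size cannot be estimated this way.
            raise ValueError(f"no overlap between lists in stratum {stratum!r}")
        # Lincoln-Petersen estimate for this stratum: n1 * n2 / n12
        total += n1 * n2 / n12
    return total


# Made-up counts, for illustration only.
records = (
    [("a", True, True)] * 20 + [("a", True, False)] * 20 + [("a", False, True)] * 10
    + [("b", True, True)] * 5 + [("b", True, False)] * 15 + [("b", False, True)] * 5
)
n_hat = stratified_two_list_estimate(records)
capture_prob = len(records) / n_hat  # share of the population seen on any list
```

With a continuous or high-dimensional covariate, the stratum counts would have to be replaced by smoothed estimates, which is where the smoothing bias mentioned in the abstract arises.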