Activity Number: 78 - Nonparametric Modeling
Type: Contributed
Date/Time: Sunday, July 28, 2019 : 4:00 PM to 5:50 PM
Sponsor: Section on Nonparametric Statistics
Abstract #306452
Title: Targeted Learning of the Population Size Based on Capture-Recapture Designs
Author(s): Yue You* and Mark van der Laan and Nicholas Jewell and Robin Mejia
Companies: Biostatistics, UC Berkeley and UC Berkeley and Biostatistics, UC Berkeley and Carnegie Mellon University
Keywords: Asymptotically linear estimator; capture-recapture; MLE; influence curve; targeted maximum likelihood estimation (TMLE); population size

We propose a method to estimate population size based on capture-recapture designs with K samples. The observed data are formulated as a biased sample of n i.i.d. K-dimensional vectors of binary indicators, drawn from the conditional distribution given that the vector is not 0, where the k-th component indicates whether the subject was caught by the k-th sample. The target quantity is the probability that the vector is not 0. We cover models that assume a single general constraint on the K-dimensional distribution, so that the target quantity is identified while the statistical model remains unrestricted. We present worked-out solutions for common constraints (K-way additive interaction = 0 and conditional independence). We show that the choice of constraint has a dramatic impact on the value of the estimand, so it is crucial that the constraint holds by design. For the K-way multiplicative interaction = 0 constraint, the MLE suffers from the curse of dimensionality. We propose a targeted MLE that uses machine learning to smooth across the 2^K cells while targeting the fit towards the parameter of interest. For each problem, we provide simulations with respect to assumption violations, inference with confidence intervals, and experimental designs, along with software.
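
As a rough illustration of how a single constraint identifies the target quantity, the sketch below computes a plain plug-in estimate under the K-way multiplicative interaction = 0 constraint. It is not the targeted MLE described above. The function name `plugin_population_size` and the input format (a dict from capture-history tuples to counts) are assumptions made for this example. Writing q(b) for the observed proportion of capture history b, the constraint gives (1 - psi)/psi = exp(sum over nonzero b of (-1)^(|b|+1) log q(b)), where psi is the probability that the vector is not 0, and the population size is then estimated by n/psi.

```python
import math
from itertools import product


def plugin_population_size(counts, K):
    """Plug-in estimate assuming the K-way multiplicative interaction is 0.

    counts: dict mapping each observed capture history (a length-K tuple of
            0/1, not all zero) to the number of subjects with that history.
    Returns (psi, N_hat): the estimated probability of being captured at
    least once, and the estimated population size n / psi.
    """
    n = sum(counts.values())
    log_ratio = 0.0
    for b in product((0, 1), repeat=K):
        if not any(b):
            continue  # the all-zero history is never observed
        c = counts.get(b, 0)
        if c == 0:
            # log(0) is undefined: the plug-in needs every observable cell
            raise ValueError(f"empty cell {b}: plug-in estimate undefined")
        # sign is (-1)^(|b|+1): +1 for an odd number of captures, else -1
        sign = 1 if sum(b) % 2 == 1 else -1
        log_ratio += sign * math.log(c / n)
    psi = 1.0 / (1.0 + math.exp(log_ratio))
    return psi, n / psi


# Hypothetical counts for K = 2 (illustrative numbers, not from the study)
psi, N_hat = plugin_population_size({(1, 0): 30, (0, 1): 20, (1, 1): 10}, K=2)
```

For K = 2 this reduces to the classical two-sample estimate n + n10 * n01 / n11. The error raised for an empty cell hints at why a plug-in of this kind becomes fragile as K grows, since all 2^K - 1 observable histories must appear in the data.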

