Abstract:

We propose a method to estimate population size based on capture-recapture designs with K samples. The observed data are formulated as a biased sample of n i.i.d. K-dimensional vectors of binary indicators, drawn from the conditional distribution given that the vector is not 0, where the kth component indicates whether the subject was caught in the kth sample. The target quantity is the probability that the vector is not 0. We cover models that assume a single general constraint on the K-dimensional distribution, so that the target quantity is identified while the statistical model remains unrestricted. We present worked-out solutions for common constraints (K-way additive interaction = 0 and conditional independence). We show that the choice of constraint has a dramatic impact on the value of the estimand, so it is crucial that the constraint hold by design. For the K-way multiplicative interaction = 0 constraint, maximum likelihood estimation (MLE) suffers from the curse of dimensionality. We propose a targeted MLE that uses machine learning to smooth across the 2^K cells while targeting the fit towards the parameter of interest. For each problem, we provide simulations under violations of the assumptions, inference with confidence intervals, and experimental designs, along with software.
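
To make the identification idea concrete, the following is a minimal sketch of a naive plug-in estimator. It assumes that "K-way multiplicative interaction = 0" means the highest-order term of a saturated log-linear model is zero, i.e. that the sum over all cells b of (-1)^(K-|b|) log p(b) equals 0. It is not the targeted MLE proposed here. The function name `psi_no_kway_interaction` and the example counts are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def psi_no_kway_interaction(counts):
    """Plug-in estimate of psi = P(vector != 0).

    Assumes (illustrative convention, not taken from the paper) that the
    K-way log-linear interaction is zero:
        sum_b (-1)^(K - |b|) * log p(b) = 0.
    Writing p(b) = psi * q(b) for b != 0 and p(0) = 1 - psi gives
        (1 - psi) / psi = exp( sum_{b != 0} (-1)^(|b| + 1) * log q(b) ).

    counts: dict mapping each nonzero capture history (tuple of 0/1)
            to its observed count.
    """
    n = sum(counts.values())
    K = len(next(iter(counts)))
    log_ratio = 0.0
    for b in product((0, 1), repeat=K):
        if not any(b):
            continue  # the all-zero cell is never observed
        c = counts.get(b, 0)
        if c == 0:
            # An empty cell makes the plug-in undefined, which hints at
            # why the unsmoothed MLE struggles as 2^K grows.
            raise ValueError(f"empty cell {b}: plug-in estimator undefined")
        sign = 1 if sum(b) % 2 == 1 else -1
        log_ratio += sign * math.log(c / n)
    return 1.0 / (1.0 + math.exp(log_ratio))


# Hypothetical K = 2 example: 30 caught only in sample 1,
# 20 only in sample 2, 10 in both.
counts = {(1, 0): 30, (0, 1): 20, (1, 1): 10}
psi_hat = psi_no_kway_interaction(counts)
n_hat = sum(counts.values()) / psi_hat  # estimated population size
print(psi_hat, n_hat)
```

For K = 2 this assumed constraint reduces to independence of the two samples, so the sketch gives the same answer as the classical Lincoln-Petersen estimator: about 120 for the counts above, which equals (30 + 10)(20 + 10) / 10.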
