Within the large literature on estimating the number of kinds, solutions typically assume either full sampling data (where the number of individuals sampled and the kind of each are observed) or temporal data (where only the times between successive discoveries of new kinds are observed). However, one of the most important applications of this problem, estimating the number of species, does not fit neatly into either category: full sampling data are not available, but discovery dates are known and there is proxy information about how much sampling might have occurred.
In this paper we consider a hybrid model in which a latent process governs the number of individuals sampled, and noisy data on that process are available from a proxy, for example the effort going into discovering new species as measured by the number of authors publishing papers in the field.
Bayesian learning is implemented by approximate Bayesian computation (ABC) with rejection sampling. The performance of the implementation is assessed using simulated data, and the method is applied to two large species discovery data registers: the Catalogue of Life and the World Register of Marine Species.