Abstract:
|
How does one quantify the number of identifiable deaths in the ongoing Syrian conflict? What might seem an easy question to answer conceals a complex problem in the statistical field of record linkage, which seeks to combine multiple sources of data in cases where records cannot be matched easily. Simply put, war zone records are messy and large: they are full of inaccuracies and duplicates, span multiple data sets from multiple sources, and contain hundreds of thousands of records. Further, existing data sets are far from complete and represent only a fraction of the overall number of fatalities. In this talk we will use Bayesian record linkage methods to address the problem of entity resolution for four databases of casualties from the Syrian civil war and provide multiple evaluations of the validity of the methods.
|