Abstract:
Unique modeling and computational challenges arise in locating the geographic origin of individuals based on their genetic backgrounds. In our original program, OriGen, we tackled this problem for single-nucleotide polymorphisms (SNPs). Here we present work extending the model to microsatellites. Specifically, we divide the region into pixels and operate locus by locus. We estimate allele frequencies across the landscape by maximizing a sum of multinomial loglikelihoods penalized by nearest-neighbor interactions. This penalization smooths the allele frequency estimates over the region and allows estimation at pixels with no data. Additionally, we explore a penalty on alleles of similar lengths, since such alleles tend to have similar frequencies. Maximization is accomplished by a minorize-maximize (MM) algorithm. Once the allele frequency surfaces are estimated, we apply Bayes' rule to compute the posterior probability that each pixel is the pixel of origin of an individual. For admixed individuals, we estimate the fractional contribution of each pixel to a person's genome using another MM algorithm. We apply this model to several datasets and present the results.
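The Bayes' rule step can be illustrated with a minimal sketch. It is not OriGen's implementation: the function name `pixel_posterior`, the data layout, the uniform prior, Hardy-Weinberg genotype probabilities, and independence across loci are all assumptions made here for illustration, and the allele frequencies are assumed strictly positive (as smoothed estimates typically are).

```python
import math

def pixel_posterior(freqs, genotype, prior=None):
    """Posterior probability that each pixel is the pixel of origin.

    freqs[l][k][a]: assumed frequency of allele a at locus l in pixel k
                    (hypothetical layout; must be strictly positive).
    genotype[l]:    pair (a1, a2) of allele indices carried at locus l.
    prior:          prior probability per pixel; uniform if omitted.
    """
    n_pixels = len(freqs[0])
    if prior is None:
        prior = [1.0 / n_pixels] * n_pixels

    log_post = []
    for k in range(n_pixels):
        lp = math.log(prior[k])
        for l, (a1, a2) in enumerate(genotype):
            p1, p2 = freqs[l][k][a1], freqs[l][k][a2]
            # Assumed Hardy-Weinberg proportions:
            # p^2 for homozygotes, 2*p1*p2 for heterozygotes.
            lp += math.log(p1 * p2 * (1.0 if a1 == a2 else 2.0))
        log_post.append(lp)

    # Normalize on the log scale to avoid underflow with many loci.
    m = max(log_post)
    weights = [math.exp(v - m) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Toy example: one locus, two pixels, three alleles.
toy_freqs = [[[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]]
toy_genotype = [(2, 2)]  # homozygous for allele index 2
posterior = pixel_posterior(toy_freqs, toy_genotype)
```

In this toy case the second pixel receives most of the posterior mass, because the individual's allele is far more common there.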
ASA Meetings Department
732 North Washington Street, Alexandria, VA 22314
(703) 684-1221 • meetings@amstat.org
Copyright © American Statistical Association.