All Times EDT

Abstract Details

Activity Number: 81 - Contributed Poster Presentations: Section on Statistics in Epidemiology
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #313130
Title: An EM-Based Method for Genotyping and SNP Discovery in Allotetraploids
Author(s): Yudi Zhang*
Companies:
Keywords: allotetraploids; mixture model; EM; SNPs; genotyping; logistic regression
Abstract:

Genotyping and single nucleotide polymorphism (SNP) discovery via short-read technology are vital for selection and breeding in crop species. These tasks are challenging in allotetraploids, such as cotton and peanut, where alignment ambiguities lead to an excess of heterozygous calls. We propose a model-based method to infer the genomic origin of each read and to genotype allotetraploids in targeted resequencing projects. We model sequencing errors with multinomial logistic regression, use a Poisson process to penalize indels, estimate the parameters via an EM algorithm, and determine optimal haplotypes by maximizing the likelihood. The method is implemented in Rcpp and takes a SAM alignment file as input. Peanut resequencing data from several inbred lines, where heterozygosity is not expected, were used to validate the method. When applied to 12 target locations, each about 400 bp long, for one inbred line, the method demonstrates high accuracy in calling SNPs. Only 2 heterozygous calls were made in two targets. This compares favorably to samtools-based genotyping, which provides the typical input for SNP calling in allotetraploids and called around 40 heterozygous positions in one target location.
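
The idea of inferring each read's genomic origin with EM can be illustrated with a deliberately simplified sketch. It is not the method described in the abstract: it assumes a fixed, independent per-base error rate in place of the multinomial logistic regression error model, ignores indels, and treats the two subgenome haplotypes as known. The function name `em_assign_reads` and all of its parameters are hypothetical and chosen for illustration only.

```python
import math


def em_assign_reads(mismatches_a, mismatches_b, read_len, err=0.01, n_iter=50):
    """Toy two-component EM (illustrative assumption, not the authors' model).

    mismatches_a[i], mismatches_b[i]: mismatch counts of read i against the
    candidate haplotypes of subgenome A and subgenome B.
    Returns the estimated share of reads from A and each read's posterior
    probability of originating from A.
    """

    def log_lik(m):
        # log P(read | source) under a constant per-base error rate (assumption)
        return m * math.log(err) + (read_len - m) * math.log(1.0 - err)

    log_a = [log_lik(m) for m in mismatches_a]
    log_b = [log_lik(m) for m in mismatches_b]

    pi_a = 0.5  # initial mixing proportion for subgenome A
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from subgenome A
        post = []
        for la, lb in zip(log_a, log_b):
            wa = math.log(pi_a) + la
            wb = math.log(1.0 - pi_a) + lb
            mx = max(wa, wb)  # subtract the max for numerical stability
            ea, eb = math.exp(wa - mx), math.exp(wb - mx)
            post.append(ea / (ea + eb))
        # M-step: update the mixing proportion, kept away from 0 and 1
        pi_a = min(max(sum(post) / len(post), 1e-9), 1.0 - 1e-9)
    return pi_a, post


# Four toy reads of length 100: three match A exactly, one matches B exactly.
pi_a, post = em_assign_reads([0, 0, 0, 3], [3, 3, 3, 0], read_len=100)
```

In this toy input the first three reads receive posteriors close to 1 for subgenome A and the last read a posterior close to 0, so reads are attributed to a subgenome rather than piled onto a single reference, which is the ambiguity that inflates heterozygous calls.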


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program