Online Program Home
  My Program

Abstract Details

Activity Number: 43 - Statistical Methods for Complex Networks
Type: Invited
Date/Time: Sunday, July 30, 2017 : 4:00 PM to 5:50 PM
Sponsor: National Science Foundation
Abstract #321989
Title: DNA Seriation Under Planted Hamiltonian Path Model
Author(s): Jiaming Xu*
Companies: Purdue University
Keywords: DNA sequencing ; planted models ; information-theoretic limits ; spectral methods ; computational barriers ; statistical optimality
Abstract:

One of the most pressing challenges in genomics is to reconstruct a long, and contiguous genome sequence from short contigs. Recent work shows that long-range link information contained in Hi-C data and Chicago date can be exploited to recovering the correct orders among contigs. In particular, there tends to be higher Hi-C link or Chicago link densities between pairs of two closely located contigs. For studying the problem of ordering contigs based on long-range link information, we propose a planted Hamiltonian path model, where edges on the hidden Hamiltonian path tend to be more heavily weighted than other edges. The goal is to recover the hidden Hamiltonian path from observation of edge weights. We identify the sharp information-theoretic limits of exact recovery and show that a simple greedy algorithm achieves the exact recovery limit within a factor of 2.


Authors who are presenting talks have a * after their name.

Back to the full JSM 2017 program

 
 
Copyright © American Statistical Association