Abstract:
|
Next-generation sequencing has been used to address a diverse range of biological problems. However, its error rates are often higher than those of conventional sequencing, which affects downstream genomic analysis. Recently, a shadow regression approach was proposed to estimate the error rates under the assumption of a linear relationship between the number of error-free sequenced reads and the number of reads containing errors (denoted as shadows). However, this linear read-shadow relationship may not be appropriate for all types of sequencing data. A more reliable approach that estimates sequencing error rates without assuming linearity is therefore needed. In this study, we propose an empirical error rate estimation approach that does not rely on the linearity assumption and provides more accurate error rate estimates for next-generation sequencing data.
|