Metrics to benchmark sequencing integrity and fidelity of data in targeted NGS.
*Bonnie LaFleur, HTG Molecular Diagnostics, Inc.; Dean Billheimer, University of Arizona; Dominic LaRoche, HTG Molecular Diagnostics, Inc.; Kurt Michels, HTG Molecular Diagnostics, Inc.; Shripad Sinari, University of Arizona

Keywords:

Recent work by Lovell (2015) and others (Billheimer et al.) has outlined strategies for analyzing data generated by biological measurement systems, such as next-generation sequencing (NGS), in the context of compositional data. Compositional data arise when measurements are described in terms of relative quantities, or parts of a whole. For mRNA NGS applications, quantitation of genes or probes is constrained by sequencing capacity, which makes sequencing depth a sample-level attribute. The compositional framework is well suited to understanding this quantitation while accounting for the constraint of sequencing depth. Within this framework, quality control can be performed and technical variation can be analyzed to identify outlying samples. Here, technical variation can be considered a special case of a batch effect when the variation is analyzed for run quality. Technical replicates, both within and between sequencing runs, are used to benchmark run quality. We demonstrate that relative abundance, coupled with multivariate methods such as principal components analysis, provides a more intuitive and useful evaluation of technical variation.
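As an illustration of the approach described above, the following is a minimal sketch, not the authors' implementation. It assumes the centered log-ratio (CLR) transform as the compositional representation, a pseudocount of 0.5 to handle zero counts, and a simple distance-from-median rule for flagging replicates; none of these choices is stated in the abstract. The function names (`closure`, `clr`, `pca_scores`) and the simulated replicate counts are hypothetical.

```python
import numpy as np


def closure(counts):
    """Scale each sample (row) to sum to 1, removing sequencing depth."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of each sample (row).

    The pseudocount is an assumption made here to avoid log(0).
    """
    comp = closure(np.asarray(counts, dtype=float) + pseudocount)
    log_comp = np.log(comp)
    return log_comp - log_comp.mean(axis=1, keepdims=True)


def pca_scores(x, n_components=2):
    """Principal component scores via SVD of the column-centered matrix."""
    centered = x - x.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Simulated technical replicates: one shared expression profile over 50
# probes, sequenced at very different depths (all values are made up).
rng = np.random.default_rng(0)
profile = rng.dirichlet(np.ones(50))
depths = [100_000, 200_000, 500_000, 1_000_000, 2_000_000, 300_000]
counts = np.array([rng.multinomial(d, profile) for d in depths])

# Perturb the last replicate so that it mimics a technical outlier.
bad_profile = profile.copy()
bad_profile[:5] *= 10
bad_profile /= bad_profile.sum()
counts[-1] = rng.multinomial(depths[-1], bad_profile)

# Replicates that differ only in depth should cluster in PCA space;
# distance from the median score is one simple way to rank outliers.
scores = pca_scores(clr(counts))
dist_from_median = np.linalg.norm(scores - np.median(scores, axis=0), axis=1)
suspect = int(np.argmax(dist_from_median))
```

Because the transform operates on relative abundances, the tenfold to twentyfold differences in simulated depth do not separate the well-behaved replicates; only the replicate whose composition was altered should stand apart.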