Abstract Details

Activity Number: 563
Type: Contributed
Date/Time: Wednesday, August 3, 2016 : 11:35 AM to 12:20 PM
Sponsor: Section on Statistics in Genomics and Genetics
Abstract #321804
Title: Clustering Functional Data from High-Throughput Sequencing Assays
Author(s): Emery Goossens* and Heejung Shim
Companies: and Purdue University
Keywords: clustering ; functional data analysis ; Bayesian hierarchical modeling ; wavelets ; genomics
Abstract:

Clustering methods are an essential part of exploratory analysis of genomic high-throughput sequencing data. K-means and hierarchical clustering are frequently applied to summary statistics computed from a predefined 'window' (e.g., a gene or a read-count peak) with a dissimilarity measure such as Euclidean distance. Although traditional clustering methods can use higher-resolution information within these windows, such as read counts per base pair, they cannot model the spatial structure of the data. We consider several functional clustering methods that account for spatial structure through Bayesian wavelet-based modeling. Wavelet-based approaches have been shown to model high-throughput sequencing data effectively using a sparse representation. We investigate their performance in multiple applications to ATAC-seq, which measures genome-wide chromatin accessibility. One example is clustering diverse patterns of (co-)transcription factor binding. We also explore improving functional clustering methods in the genomics context by including gene annotation and sequence information.
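
As a rough illustration of why the shape of the read counts within a window can matter, the sketch below clusters four toy windows that share the same total read count but peak on opposite sides. It is not the Bayesian hierarchical model described in the abstract: it swaps in a plain orthonormal Haar transform, drops the finest-scale coefficients as a crude stand-in for wavelet shrinkage, and runs ordinary k-means on what remains. The toy counts, the function names (`haar_transform`, `kmeans`), and the choice to keep only the coarse half of the coefficients are all assumptions made for this example.

```python
import math


def haar_transform(signal):
    """Orthonormal Haar transform of a sequence whose length is a power of two.

    Returns [approximation, coarsest detail, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(x) for x in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail coefficients record local differences at the current scale;
        # coarser scales are prepended so the finest scale ends up last.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm with a deterministic start (first k points as centers)."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical per-base read counts for four windows of 8 bases each.
windows = [
    [9, 8, 1, 0, 0, 1, 0, 1],  # signal concentrated on the left
    [0, 1, 0, 1, 1, 0, 8, 9],  # signal concentrated on the right
    [8, 9, 0, 1, 1, 0, 1, 0],  # left again
    [1, 0, 1, 0, 0, 1, 9, 8],  # right again
]

# A total-count summary statistic cannot separate these windows.
totals = [sum(w) for w in windows]

# Keep only the coarse half of the Haar coefficients (assumed simplification).
features = [haar_transform(w)[: len(w) // 2] for w in windows]
labels = kmeans(features, k=2)
print(totals, labels)
```

Because the full Haar transform is orthonormal, Euclidean distances on all coefficients equal those on the raw counts. Any benefit in this sketch therefore comes only from discarding fine-scale coefficients, which is a much cruder device than the model-based shrinkage the abstract refers to.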


Authors who are presenting talks have a * after their name.

Back to the full JSM 2016 program

 
 
Copyright © American Statistical Association