Abstract:
|
The completion of the human genome two decades ago gave birth to the expansive and cross-disciplinary field of Genomics, and along with it, our own community of Statistical Genomics. From microarrays to high-throughput sequencing, from genome-wide association studies to the recent advances in single-cell profiling, wave after wave of technological innovation has fed Statistics with new data challenges that spurred methodological and theoretical developments. In this lecture, I will focus on two specific topics in Genomics, single-cell sequencing and DNA copy number profiling, and describe the critical role of Statistics in their scientific development. I will start with DNA copy number profiling in bulk tissues, review the scientific background and early models, and describe how these models have adapted to the shifting sands of technological change. I will briefly survey the statistical developments that were seeded by these scientific inquiries, from change-point detection to multi-channel scan statistics to latent variable modeling. On the scientific side, I will focus on DNA copy number profiling in cancer and its role in the study of cancer cell evolution.
Despite our best computational efforts, bulk tissue sequencing can only tell us so much about how DNA copy number varies between single cancer cells within a tumor. Cancer is a Darwinian evolution of cells driven by somatic mutations, so it is important to detect and study these cell-to-cell DNA copy number variations. In the second half of my talk, I will turn to the modeling of data from single-cell technologies, which have revolutionized the field of biology during the last decade. I will describe how the large, sparse data matrices from single-cell experiments have inspired new models and statistical problems. I will also describe, in some detail, a specific method that we developed for allele-specific copy number estimation at the single-cell level. The method, Alleloscope [1], has enabled the discovery of previously hidden types of variation within tumor cell populations.
[1] Wu C-Y, Lau BT, Kim H, Sathe A, Grimes SM, Ji HP, Zhang NR. Integrative single-cell analysis of allele-specific copy number alterations and chromatin accessibility in cancer. Nature Biotechnology, May 20, 2021. https://doi.org/10.1038/s41587-021-00911-w
|