Abstract:
|
Data normalization is an important preprocessing step for transcriptomics data that contain unwanted variation due to experimental handling. There has been a critical yet overlooked disconnect between the use of data normalization and the goals of subsequent analysis: on one hand, normalization methods that were developed for group comparison are frequently used ‘off-label’ for other analysis goals such as sample subtyping; on the other hand, analyses are often performed on normalized data without regard to potential normalization ‘side-effects’ such as over-compressed data variability. Bridging the two is made possible by a unique pair of microRNA array datasets on the same set of tumor tissue samples that were collected at Memorial Sloan Kettering Cancer Center. In this talk, I will illustrate the use of this dataset pair to study the impact of data normalization on the development of molecular signatures for tumor subtyping.
|