Abstract:
|
Data normalization is a widely used technique for aligning the scales of measurements from different samples, with applications in genomic data, test scoring, patient-reported outcomes, and questionnaire data in the social sciences. What these data have in common is that the instrument used to measure the outcome may change the scale from sample to sample or from test to test. This introduces high variability that may obscure the signal that standard analyses such as modeling or clustering would otherwise detect. To address this issue, the data are transformed by a normalization method such as z-scores or linear or nonlinear quantile normalization, among others. We show that the traditional transformation methods (linear transformations, z-scores, quantile normalization) are not always adequate for this task, and we propose a new data normalization method based on the Fisher-Yates transformation. We performed a simulation study to compare these methods and to illustrate situations where Fisher-Yates normalization would be more appropriate than the other methods. Finally, we illustrate the Fisher-Yates normalization on data from clinical trials and DNA microarrays.
|
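As a rough illustration of the idea behind a rank-based normalization of this kind, the following Python sketch maps each sample to normal scores computed from its ranks. The abstract does not state the exact formula, so the sketch assumes the common approximation Φ⁻¹(r / (n + 1)), where r is the rank of an observation within its sample and n is the sample size. The function names `average_ranks` and `rank_normal_scores` and the two toy samples are hypothetical and serve only this example.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the block [start, end] over all entries tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_normal_scores(values):
    """Map one sample to normal scores using the assumed form inv_cdf(r / (n + 1))."""
    n = len(values)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


# Two toy samples on very different scales but with the same ordering.
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [10.0, 200.0, 3000.0, 40000.0, 500000.0]

scores_a = rank_normal_scores(sample_a)
scores_b = rank_normal_scores(sample_b)
print(scores_a == scores_b)  # both samples land on the same scale
```

Because the scores depend only on the ranks within each sample, any strictly increasing distortion of the measurement scale leaves them unchanged, which is the property that makes a rank-based approach attractive when the instrument alters the scale between samples.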