Formalin-fixed, paraffin-embedded (FFPE) samples have great potential for cancer biomarker discovery and retrospective studies of other diseases. However, their application is hindered by the unsatisfactory performance of traditional gene expression profiling techniques on the damaged RNA extracted from these samples. The NanoString nCounter platform is a medium-throughput technology that measures gene expression with high sensitivity. The platform is compatible with FFPE samples, which may turn the large collections of FFPE samples into valuable resources for academic research and clinical applications. However, statistical methods for normalizing NanoString nCounter data generated from FFPE samples lag far behind those for traditional technologies such as microarrays.
In this paper, we construct an integrated system of random-coefficient hierarchical regression models, called RCRnorm, to capture the main patterns and characteristics observed in real NanoString nCounter data from FFPE samples, and we develop a Bayesian approach to estimate the model parameters and normalize gene expression across samples. The performance of RCRnorm is validated on simulated datasets and real data.