Reproducible Research: Case Studies in Forensic Bioinformatics
*Keith Baggerly, UT M.D. Anderson Cancer Center 

Keywords: reproducibility, forensic bioinformatics

High-throughput biological assays let us ask detailed questions about disease, and promise to let us personalize therapy. Data processing, however, is often not described well enough to allow for reproduction, leading to exercises in “forensic bioinformatics” where raw data and results are used to infer what the methods must have been. Unfortunately, poor documentation can shift from an inconvenience to an active danger when it obscures not just methods but errors.

We examine several related papers using array-based signatures of drug sensitivity derived from cell lines to predict patient response. Clinical trials are being run based on these results. However, the results incorporate several errors that may be putting patients at risk. The most common errors are simple (e.g., row or column offsets). We briefly discuss steps we are taking to avoid such errors in our own investigations.
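To make the idea of a row offset concrete, the following is a minimal sketch of one way such a slip could be detected. It is not the procedure used in the work described here. The function name `best_row_offset`, the `max_shift` parameter, and the probe identifiers are all hypothetical and were invented for illustration. The sketch compares a reported list of identifiers against a reference ordering and reports the shift under which the most labels line up.

```python
def best_row_offset(reference_ids, reported_ids, max_shift=3):
    """Return (shift, matches) for the shift k in [-max_shift, max_shift]
    that maximizes the number of positions i where
    reported_ids[i] == reference_ids[i + k].

    Shifts are tried in order of increasing magnitude, so a shift of 0
    wins any tie and a perfectly aligned list is reported as aligned.
    """
    best_shift, best_matches = 0, -1
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        matches = 0
        for i, reported in enumerate(reported_ids):
            j = i + shift
            # Count a match only when the shifted index is inside the reference.
            if 0 <= j < len(reference_ids) and reference_ids[j] == reported:
                matches += 1
        if matches > best_matches:
            best_shift, best_matches = shift, matches
    return best_shift, best_matches


# Hypothetical probe identifiers, invented for illustration only.
reference = ["probe_a", "probe_b", "probe_c", "probe_d", "probe_e"]
# A reported list whose labels have slipped by one row relative to the reference.
reported = ["probe_b", "probe_c", "probe_d", "probe_e"]

shift, matches = best_row_offset(reference, reported)
print(f"best shift: {shift}, matching labels: {matches}")
```

A nonzero best shift suggests that labels and values were paired by position rather than by identifier, which is the kind of simple indexing error the abstract refers to. Joining tables on an explicit identifier column, rather than relying on row order, avoids this class of mistake.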