Receiver Operating Characteristic (ROC) curves are often used to assess the performance of binary classification systems, allowing stakeholders to understand the tradeoff between Type I (false positive) and Type II (false negative) errors. This works well in textbook cases. In real-world experiments, however, ROC curves can have unexpected subtleties that make them difficult to construct and interpret. For example, the Department of Defense is sponsoring the development of novel sensors and software to identify Unexploded Ordnance (UXO) among clutter. UXO are duds: munitions that were armed and fired but failed to explode. UXO can still pose a risk of detonation even decades later, threatening the safety of nearby humans, animals, vegetation, and structures. Blind tests that demonstrate finding UXO are subject to safety, logistical, and cost constraints that make it difficult to construct textbook ROC curves. Yet, with careful planning and a few key assumptions, ROC-like curves can still be crafted to quickly tell the story of how well novel sensors and software can detect, classify, and locate UXO versus clutter in terrestrial and underwater experiments.