Abstract Details

Activity Number: 85 - Machine Learning in Biomedical Data
Type: Contributed
Date/Time: Sunday, July 28, 2019 : 4:00 PM to 5:50 PM
Sponsor: ENAR
Abstract #307008
Title: Assessment of Classifier Performance Using a Reference Classifier with Known Performance and an Unlabeled Dataset
Author(s): Alexej Gossmann* and Weijie Chen and Berkman Sahiner
Companies: U.S. Food and Drug Administration, Center for Devices and Radiological Health and Food and Drug Administration and U.S. Food and Drug Administration, Center for Devices and Radiological Health
Keywords: performance assessment; unlabeled data; imperfect reference; diagnostic tests; classification algorithms; performance comparison

Classification performance is usually estimated by comparing a set of predicted outcomes to the corresponding ground truth labels. However, acquiring the ground truth can be difficult, unethical, or impossible in many settings (e.g., medical applications). We derive lower bounds on the sensitivity and specificity of a new test (e.g., a biomarker or an AI classifier) based on a reference test with known performance and the predictions of the two tests on the same unlabeled data. We also propose hypothesis tests for comparing the performance of a new test with that of a reference test when ground truth labels are unavailable. Our methods are model-free and rely only on basic assumptions about the dependency between the outcomes of the two tests that can reasonably be expected to hold. We perform simulations and case studies on real data to demonstrate the performance of our methods and to compare them with alternative approaches. This methodology is potentially useful for assessing whether a new test meets a pre-specified performance goal, or is superior in performance to a reference test, when no dataset with ground truth is available but a reference test with known performance exists.
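
To make the idea of a label-free lower bound concrete, the following is a minimal illustrative sketch and not the authors' derivation, which the abstract does not spell out. It assumes the reference test's sensitivity and specificity are known exactly, treats the observed agreement fractions as population probabilities (ignoring sampling variability), and uses no assumption about the dependency between the two tests, only the fact that cases on which both tests are positive can include at most all of the reference test's false positives (and symmetrically for negatives). The function name and its parameters are hypothetical.

```python
def naive_lower_bounds(p_both_pos, p_both_neg, p_ref_pos, se_ref, sp_ref):
    """Worst-case lower bounds on a new test's sensitivity and specificity.

    Hypothetical helper for illustration only.
    p_both_pos: fraction of unlabeled cases where both tests are positive
    p_both_neg: fraction of unlabeled cases where both tests are negative
    p_ref_pos:  fraction of unlabeled cases where the reference is positive
    se_ref, sp_ref: known sensitivity and specificity of the reference test
    """
    youden = se_ref + sp_ref - 1.0
    if youden <= 0.0:
        raise ValueError("reference test must be better than chance")

    # P(ref+) = prev * se_ref + (1 - prev) * (1 - sp_ref), solved for prev.
    prev = (p_ref_pos - (1.0 - sp_ref)) / youden
    if not 0.0 < prev < 1.0:
        raise ValueError("implied prevalence is outside (0, 1)")

    # Joint probabilities of the reference test's errors.
    ref_false_pos = (1.0 - prev) * (1.0 - sp_ref)  # P(ref+, not diseased)
    ref_false_neg = prev * (1.0 - se_ref)          # P(ref-, diseased)

    # P(new+, diseased) >= P(both+) - P(ref+, not diseased), and
    # P(new-, not diseased) >= P(both-) - P(ref-, diseased).
    se_lower = max(0.0, (p_both_pos - ref_false_pos) / prev)
    sp_lower = max(0.0, (p_both_neg - ref_false_neg) / (1.0 - prev))
    return prev, se_lower, sp_lower


# Made-up inputs: reference with Se = 0.90, Sp = 0.95, positive on 22% of cases.
prev, se_lower, sp_lower = naive_lower_bounds(0.19, 0.75, 0.22, 0.90, 0.95)
print(f"prevalence={prev:.3f}  Se>={se_lower:.3f}  Sp>={sp_lower:.3f}")
```

Because this sketch assumes nothing about how the two tests' errors relate, its bounds are worst-case; the numbers in the usage lines are invented purely to exercise the function.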

Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program