
All Times EDT

Abstract Details

Activity Number: 398 - Statistical and Computational Methods to Tackle Complex Diagnostic Challenges
Type: Invited
Date/Time: Thursday, August 12, 2021 : 2:00 PM to 3:50 PM
Sponsor: WNAR
Abstract #315526
Title: Information-Theoretic Classification Accuracy: A Data-Driven Approach to Combining Ambiguous Outcome Labels
Author(s): Jingyi Jessica Li*
Companies: UCLA
Keywords: outcome prediction; ambiguous labels; label combination; information theory; multi-class classification
Abstract:

Outcome labeling ambiguity is ubiquitous in real-world datasets. Practitioners commonly combine ambiguous outcome labels based on expert knowledge to improve the accuracy of machine-learning classification. However, such an approach is ad hoc, and no principled method exists to guide the combination by any optimality measure. To address this problem, we propose the information-theoretic classification accuracy (ITCA), a measure of “information” conditional on outcome prediction, to guide practitioners on how to combine ambiguous outcome labels. ITCA indicates an optimal balance in the trade-off between prediction accuracy (how well predicted labels match observed labels) and prediction resolution (how many labels are predictable). We also develop two search strategies for finding the optimal outcome combination indicated by ITCA. Both ITCA and the search strategies can be used with any machine-learning classification algorithm. We investigate the theoretical properties of ITCA for the oracle classifier and the linear discriminant analysis classifier. Extensive simulations and real-data applications verify the effectiveness and broad application potential of ITCA and the search strategies.
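
The abstract does not state the ITCA formula, so the following is only a minimal sketch of the accuracy-versus-resolution trade-off it describes. It assumes a score in which each combined class contributes its entropy term, -p log p, weighted by the accuracy within that class. The function name `combined_label_score`, the toy labels, and the shortcut of mapping existing predictions through the combination (rather than retraining a classifier on the combined labels) are all assumptions made for illustration.

```python
import math
from collections import Counter


def combined_label_score(y_true, y_pred, mapping):
    """Entropy-weighted per-class accuracy for one label combination.

    Assumed form (not taken from the abstract):
        sum over combined classes k of  -p_k * log(p_k) * acc_k
    where p_k is the share of observations whose true label falls in k
    and acc_k is the fraction of those observations predicted as k.
    """
    true_c = [mapping[y] for y in y_true]
    # Simplification: reuse the original predictions instead of retraining.
    pred_c = [mapping[y] for y in y_pred]
    n = len(true_c)
    class_counts = Counter(true_c)
    correct_counts = Counter(t for t, p in zip(true_c, pred_c) if t == p)
    score = 0.0
    for label, count in class_counts.items():
        proportion = count / n
        accuracy = correct_counts[label] / count
        score += -proportion * math.log(proportion) * accuracy
    return score


# Toy data: label A is common and easy; B and C are rare and confused.
y_true = ["A"] * 16 + ["B", "B"] + ["C", "C"]
y_pred = ["A"] * 16 + ["B", "C"] + ["C", "B"]

# Hypothetical candidate combinations, from finest to coarsest.
candidates = {
    "keep A, B, C": {"A": "A", "B": "B", "C": "C"},
    "combine B and C": {"A": "A", "B": "BC", "C": "BC"},
    "combine all": {"A": "ABC", "B": "ABC", "C": "ABC"},
}

scores = {
    name: combined_label_score(y_true, y_pred, mapping)
    for name, mapping in candidates.items()
}
best = max(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
print("selected:", best)
```

On this toy data, merging the two confused labels scores higher than keeping all three, while merging everything scores zero: accuracy becomes perfect but no resolution remains. An exhaustive comparison like this is only feasible for a handful of labels, which is presumably why the abstract mentions dedicated search strategies.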


Authors who are presenting talks have a * after their name.

Back to the full JSM 2021 program