Abstract:
|
The presence of label noise in data has been a long-standing problem in many classification applications. It has been shown that errors in labels can impair the effectiveness of many widely used classification methods. In this work, we focus on label noise in asymmetric binary classification, specifically under the Neyman-Pearson paradigm. Suppose every observation has a true label and a corrupted label. Under a mixture model assumption for the corrupted classes, the usual Neyman-Pearson classifiers yield, with respect to the true labels, an unnecessarily small type I error and possibly an unacceptably large type II error. We therefore propose a theory-backed algorithm that keeps the type I error under control with high probability while adjusting for the effect of label noise.
|