
All Times EDT

Abstract Details

Activity Number: 288 - SLDS CSpeed 5
Type: Contributed
Date/Time: Wednesday, August 11, 2021 : 1:30 PM to 3:20 PM
Sponsor: Section on Statistical Learning and Data Science
Abstract #318089
Title: WITHDRAWN Assessing Classification Uncertainty on Astronomical Objects with Measurement Error
Author(s): Sarah Shy and Hyungsuk Tak and Eric Feigelson and John Timlin and Jogesh Babu
Companies: Pennsylvania State University and Pennsylvania State University and Pennsylvania State University and Pennsylvania State University and Pennsylvania State University
Keywords: Bayesian; posterior predictive distribution; random forest; support vector machine; quasar
Abstract:

In astronomical data, measurement uncertainties are often reported alongside the measurements themselves. However, many popular classification methods cannot account for this property. We propose a model-agnostic method to incorporate heteroscedastic measurement error into existing classification methods. First, we simulate pseudo-datasets from the Bayesian posterior predictive distribution of a measurement error model. Then, the classifier is fit to each simulated dataset. The variation of any quantity across the simulations reflects the uncertainty propagated from the errors in both the training and test sets. We demonstrate the approach via two studies: (1) a simulation study applying the procedure to a support vector machine (SVM) and a random forest, and (2) the identification of high-z (2.9 ≤ z ≤ 5.1) quasars from a merged catalog of the Sloan Digital Sky Survey, the Spitzer IRAC Equatorial Survey, and the Spitzer-HETDEX Exploratory Large-area survey. The proposed method reveals that, of 10,520 high-z quasars identified by a random forest without incorporating measurement error, 2,273 are potential misclassifications. In addition, of ~1.8 million objects not identified as high-z quasars, 765 can be considered new candidates.
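
The simulate-then-refit loop described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the authors' work: a single feature, a Gaussian error model with known per-object standard deviations and a flat prior (so each pseudo-value is drawn from Normal(observed, sigma^2)), and a nearest-centroid classifier standing in for the SVM and random forest named in the abstract. The function names and the toy data are hypothetical.

```python
import random
import statistics


def fit_centroids(xs, labels):
    """Stand-in classifier (not SVM or random forest): per-class feature mean."""
    return {
        c: statistics.mean([x for x, lab in zip(xs, labels) if lab == c])
        for c in set(labels)
    }


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def draw_pseudo_values(observed, sigmas, rng):
    """Assumed error model: one Normal(observed_i, sigma_i^2) draw per object."""
    return [rng.gauss(x, s) for x, s in zip(observed, sigmas)]


def class_one_frequency(x_train, s_train, y_train, x_test, s_test,
                        n_sims=200, seed=0):
    """Refit on each pseudo-dataset; return how often each test object is class 1."""
    rng = random.Random(seed)
    votes = [0] * len(x_test)
    for _ in range(n_sims):
        # Perturb the training set, then fit the classifier to that simulation.
        model = fit_centroids(draw_pseudo_values(x_train, s_train, rng), y_train)
        # Perturb the test set as well, so test-set errors are propagated too.
        for i, x in enumerate(draw_pseudo_values(x_test, s_test, rng)):
            if predict(model, x) == 1:
                votes[i] += 1
    return [v / n_sims for v in votes]


# Hypothetical toy data: two well-separated classes with small errors.
x_train = [0.0, 0.2, -0.1, 0.1, 4.0, 3.9, 4.2, 4.1]
s_train = [0.1] * 8
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

# Three test objects: clearly class 1, near the boundary with a large error,
# and clearly class 0.
x_test = [4.0, 2.1, 0.0]
s_test = [0.1, 1.5, 0.1]

freqs = class_one_frequency(x_train, s_train, y_train, x_test, s_test)
print(freqs)
```

In this toy setup, a frequency close to 0 or 1 indicates a label that is stable under the assumed measurement errors, while an intermediate value flags an object whose classification is sensitive to them. Because only the fit and predict steps touch the classifier, any other classifier could be substituted without changing the loop.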



Back to the full JSM 2021 program