Abstract Details

Activity Number: 136 - Recent Advances in Dimension Reduction
Type: Contributed
Date/Time: Monday, July 29, 2019 : 8:30 AM to 10:20 AM
Sponsor: Section on Statistical Learning and Data Science
Abstract #304416
Title: Representative Approach for Big Data Dimension Reduction with Binary Responses
Author(s): Xuelong Wang*
Companies: University of Illinois at Chicago
Keywords: Sufficient Dimension Reduction; Binary response; Representative approach; Clustering; Big data
Abstract:

Sufficient dimension reduction (SDR) reduces the dimensionality of the data without specifying a regression model, and is therefore called "sufficient" for regression analysis. Most SDR approaches, such as Sliced Inverse Regression (SIR) and Sliced Average Variance Estimation (SAVE), work well with continuous responses but not with binary responses, because a binary response permits at most two slices. In this article, we develop a novel SDR approach, called the representative approach, to deal with binary responses. Converting each block of data points into a single representative data point makes the corresponding binary responses continuous and significantly reduces the size of the data. The proposed representative approach is therefore well suited to big data dimension reduction and can be combined naturally with the classical SDR approaches. Through both theoretical justification and simulation studies, we show that the proposed approach recovers the central subspace better than the original SDR methods. To handle massive datasets, we develop a streaming algorithm for big data dimension reduction and apply it to a real large dataset, the Airline on-time performance data.
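The block-to-representative step can be illustrated with a short sketch. The abstract does not say how blocks are formed or how a representative is computed, so the following is an assumption-laden illustration and not the author's algorithm: it takes each representative to be the block mean of the predictors and of the binary responses, and it accumulates running sums so that the data are read in a single pass. The names `stream_representatives` and `assign_block`, and the toy binning rule in the usage lines, are hypothetical.

```python
import math
from typing import Callable, Dict, Hashable, Iterable, List, Sequence, Tuple

Point = Sequence[float]


def stream_representatives(
    rows: Iterable[Tuple[Point, int]],
    assign_block: Callable[[Point], Hashable],
) -> List[Tuple[List[float], float, int]]:
    """Collapse (x, y) rows into one representative per block in one pass.

    Assumption: a representative is the block mean of x and of the 0/1
    response y, so the response becomes a proportion in [0, 1].
    Returns (mean of x, mean of y, block size) in order of first appearance.
    """
    x_sums: Dict[Hashable, List[float]] = {}
    y_sums: Dict[Hashable, float] = {}
    counts: Dict[Hashable, int] = {}

    for x, y in rows:
        key = assign_block(x)  # hypothetical block rule supplied by the caller
        if key not in x_sums:
            x_sums[key] = [0.0] * len(x)
            y_sums[key] = 0.0
            counts[key] = 0
        sums = x_sums[key]
        for j, value in enumerate(x):
            sums[j] += value
        y_sums[key] += y
        counts[key] += 1

    # Only per-block sums are kept in memory, never the raw rows.
    return [
        ([s / counts[key] for s in sums], y_sums[key] / counts[key], counts[key])
        for key, sums in x_sums.items()
    ]


# Toy usage: blocks are unit-width bins on the first predictor (an assumption).
data = [
    ((0.25, 1.0), 0),
    ((0.50, 3.0), 1),
    ((0.75, 2.0), 1),
    ((1.50, 0.0), 0),
    ((1.75, 2.0), 0),
]
reps = stream_representatives(data, lambda x: math.floor(x[0]))
for x_bar, y_bar, size in reps:
    print(x_bar, y_bar, size)
```

The representative responses take values between 0 and 1 instead of only 0 or 1, so they can be sliced into more than two groups before being passed to a classical method such as SIR or SAVE. In practice the block rule would more plausibly come from a clustering step, as the keywords suggest, and the block sizes could serve as weights.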


Authors who are presenting talks have a * after their name.

Back to the full JSM 2019 program