Online Program

Friday, February 20
CS07 Text Analytics and Dimension Reduction Methods Fri, Feb 20, 11:00 AM - 12:30 PM
Napoleon C

Sparse Partial Robust M Regression (302898)

Christophe Croux, KU Leuven
Peter Filzmoser, TU Vienna
Irene Hoffmann, TU Vienna
*Sven Serneels, BASF Corp.

Keywords: Regression, sparsity, robust statistics, PLS, biplot.

As data dimensions increase, so does the need to automate parts of the statistical analysis. At the same time, higher dimensionality makes it more likely that the data contain uninformative parts and anomalies. Efficient analysis of Big Data therefore requires estimators that automatically select and visualize informative subspaces in the data without being distorted by a moderate fraction of outliers. Sparse partial robust M regression, introduced in this talk, is a new data analysis method tailored to exactly those needs. It combines the advantages of three widespread methods: partial least squares, sparse regression, and robust regression. Partial least squares regression is a regression tool based on latent subspace estimation, so its results can be visualized and interpreted easily through biplots. Sparse regression methods such as the LASSO automatically deselect uninformative variables. Robust regression methods yield regression estimates unaffected by outliers. With the new method, these three steps can be combined into a single analysis, opening windows toward automation.
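
To make the combination of the three ingredients concrete, the following is a minimal one-component sketch in Python. It is not the algorithm presented in the talk. It only illustrates how a partial least squares direction, a sparsity step, and robust case weights can interact. The function names (`soft_threshold`, `huber_weights`, `sparse_robust_pls1`), the sparsity parameter `eta`, the choice of Huber-type weights on residuals, and the synthetic data are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def soft_threshold(w, eta):
    """Shrink w toward zero; entries below eta * max|w| become exactly 0."""
    cut = eta * np.max(np.abs(w))
    return np.sign(w) * np.maximum(np.abs(w) - cut, 0.0)

def huber_weights(r, c=1.345):
    """Case weights in (0, 1]: 1 for small scaled residuals, c/|u| beyond c."""
    centered = r - np.median(r)
    scale = 1.4826 * np.median(np.abs(centered))  # MAD-based scale estimate
    u = np.abs(centered) / max(scale, 1e-12)
    return np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))

def sparse_robust_pls1(X, y, eta=0.4, n_iter=20):
    """Illustrative single latent component; eta must lie in [0, 1)."""
    omega = np.ones(len(y))  # start with equal case weights
    for _ in range(n_iter):
        # Weighted centering, so that outlying cases barely shift the means.
        Xc = X - np.average(X, axis=0, weights=omega)
        yc = y - np.average(y, weights=omega)
        # PLS-type direction: weighted covariance of each variable with y,
        # soft-thresholded so that uninformative variables drop out.
        w = soft_threshold(Xc.T @ (omega * yc), eta)
        w /= np.linalg.norm(w)
        t = Xc @ w  # latent scores
        slope = np.sum(omega * t * yc) / np.sum(omega * t * t)
        # Update case weights from the residuals of the current fit.
        omega = huber_weights(yc - slope * t)
    return w, slope, omega

# Synthetic data (hypothetical): only the first two of ten variables matter,
# and the first ten responses are shifted to act as outliers.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.standard_normal(n)
outliers = np.arange(10)
y[outliers] += 50.0

w, slope, omega = sparse_robust_pls1(X, y)
print("direction:", np.round(w, 3))
print("largest outlier weight:", omega[outliers].max())
```

In this sketch the soft-thresholding step plays the role of variable deselection, the case weights limit the influence of the shifted responses, and the scores `t` are the kind of latent quantity that a biplot would display.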