Abstract #300174


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





JSM 2002 Abstract #300174
Activity Number: 88
Type: Invited
Date/Time: Monday, August 12, 2002 : 10:30 AM to 12:20 PM
Sponsor: SSC
Title: The Challenge of Non-linear Regression on Large Datasets with Asymmetric Heavy Tails
Author(s): Yoshua Bengio*+ and Ichiro Takeuchi and Ronan Collobert
Affiliation(s): Université de Montréal and Mie University and Université de Montréal
Address: CP 6128, Succursale Centre-ville, Montréal, Quebec, H3C 3J7, Canada
Keywords: robust regression; data mining; machine learning; asymmetric noise; neural networks; non-linear regression
Abstract:

As is typical in data mining, this paper studies a problem in which statistical and computational issues are entangled. A problem that arises in insurance and in finance is non-linear regression in the presence of asymmetric noise with heavy tails. Traditional robust regression methods reduce variance by downweighting outliers, which is fine if the noise is symmetric. However, in many important applications (e.g., claim amounts in insurance, asset returns in finance) the outliers lie on only one side of the distribution, so downweighting them yields heavily biased estimators. We study new approaches based on combinations of models that are individually biased but whose combination is unbiased. Because of unknown high-order dependencies, we apply these ideas using artificial neural networks as the building blocks. These methods have been applied to very large datasets (millions of examples), which raise additional, more computational issues. Methods that require quadratic training time (e.g., SVMs) must be ruled out. To address this issue, we present results on new divide-and-conquer learning algorithms whose training time appears to be linear.
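
The bias that robust estimators incur under one-sided heavy tails can be seen in a small simulation. What follows is a minimal sketch and not the authors' method: it assumes log-normal draws as a stand-in for asymmetric heavy-tailed data such as claim amounts, and the `trimmed_mean` helper, the `simulate` function, and all parameter values are hypothetical choices made only for illustration. It compares the true mean with the sample mean and with two common outlier-resistant location estimates.

```python
import math
import random
import statistics


def trimmed_mean(values, fraction):
    """Mean after dropping `fraction` of the points from each end (hypothetical helper)."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    return statistics.fmean(ordered[k:len(ordered) - k])


def simulate(n=200_000, mu=0.0, sigma=1.5, seed=0):
    """Draw log-normal data and compare location estimates with the true mean."""
    rng = random.Random(seed)
    # Assumption: log-normal noise, which is strictly positive and has
    # a heavy tail on the right-hand side only.
    sample = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    return {
        # Closed-form mean of a log-normal distribution.
        "true_mean": math.exp(mu + sigma ** 2 / 2),
        "sample_mean": statistics.fmean(sample),
        # Symmetric trimming discards the large values that carry much of the mean.
        "trimmed_mean": trimmed_mean(sample, 0.05),
        "median": statistics.median(sample),
    }


results = simulate()
for name, value in results.items():
    print(f"{name:>12}: {value:.3f}")
```

Under these assumptions the sample mean should land near the true mean, while the trimmed mean and the median should fall well below it, because the values they discard or ignore all sit on the same side of the distribution. With symmetric noise the same estimators would not show this systematic shortfall.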


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.



Revised March 2002