JSM 2005 Online Program

Abstract #302378

This is the preliminary program for the 2005 Joint Statistical Meetings in Minneapolis, Minnesota. It currently includes the technical program (the schedule of invited, topic-contributed, regular contributed, and poster sessions), Continuing Education courses (August 7-10, 2005), and Committee and Business Meetings. This online program will be updated frequently to reflect the most current revisions.

To View the Program:
You may choose to view all activities of the program or just parts of it at any one time. All activities are arranged by date and time.

The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.

In the program, each meeting room name is preceded by letters designating the facility in which the room is located:

Minneapolis Convention Center = “MCC”
Hilton Minneapolis Hotel = “H”
Hyatt Regency Minneapolis = “HY”


Activity Number:	298
Type:	Invited
Date/Time:	Tuesday, August 9, 2005 : 2:00 PM to 3:50 PM
Sponsor:	IMS
Abstract - #302378
Title:	Inferring Label Sampling Mechanisms and Automatic Bayes Carpentry Using Unlabeled Data
Author(s):	Hui Zou*+ and Saharon Rosset and Ji Zhu and Trevor Hastie
Companies:	Stanford University and IBM and University of Michigan and Stanford University
Address:	Department of Statistics, Stanford, CA, 94305
Keywords:	semi-supervised ; unlabeled data ; sampling mechanism ; method of moments ; Bayes carpentry
Abstract:	In semi-supervised learning, we are given a set of labeled data and a huge amount of unlabeled data. Typically, the labeled data are assumed to be a random sample from an underlying joint distribution of the response and features. In this work, we consider the situation where the "label sampling" mechanism depends stochastically on the true response (and potentially on the features as well). Ignoring this violation of the random sampling assumption produces misleading results. For example, when the labeled data are collected by biased sampling, supervised learning algorithms are no longer Bayes consistent because of an inherent bias. We suggest a method of moments for estimating the stochastic dependence using the unlabeled data. Using the inferred mechanism, we propose a universal carpentry technique that ensures Bayes consistency for most popular supervised classifiers, including boosting and kernel machines. Numerical experiments support the proposed methodology.
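
The abstract gives no formulas, so the following is only an illustrative sketch of how a method-of-moments estimate of a label sampling mechanism might look. It is not necessarily the authors' estimator. It assumes a binary response, a single scalar feature, and a labeling probability p_k = P(labeled | y = k) that depends on the response only. Under those assumptions, weighting each labeled point by 1/p_k should reproduce the feature moments of the full (labeled plus unlabeled) sample, which gives a linear system in the inverse probabilities. The function names, the moment functions g(x) = (1, x), and the simulated data are all hypothetical.

```python
import numpy as np

def estimate_label_rates(x_all, x_lab, y_lab):
    """Estimate p_k = P(labeled | y = k) for k in {0, 1} by matching moments.

    Assumption (not from the abstract): labeling depends on y only, so
    sum over labeled points of g(x_i) / p_{y_i} should match the sum of
    g(x_i) over all points, for the moment functions g(x) = (1, x).
    """
    g_all = np.column_stack([np.ones_like(x_all), x_all])
    g_lab = np.column_stack([np.ones_like(x_lab), x_lab])
    # Column k holds the labeled moment sums for class k.
    A = np.column_stack([g_lab[y_lab == k].sum(axis=0) for k in (0, 1)])
    b = g_all.sum(axis=0)
    w = np.linalg.solve(A, b)  # w_k = 1 / p_k
    return 1.0 / w

def simulate(n=200_000, p=(0.1, 0.4), seed=0):
    """Hypothetical data: class 1 is labeled four times as often as class 0."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    x = rng.normal(loc=2.0 * y, scale=1.0)
    labeled = rng.random(n) < np.where(y == 1, p[1], p[0])
    # Responses are returned only for the labeled subset.
    return x, x[labeled], y[labeled]

x_all, x_lab, y_lab = simulate()
p_hat = estimate_label_rates(x_all, x_lab, y_lab)

# Inverse-probability weights that a downstream classifier could use
# to offset the biased label sampling.
weights = 1.0 / p_hat[y_lab]
print("estimated labeling rates:", p_hat)
```

In this sketch the estimated rates feed inverse-probability weights for the labeled points, which is one plausible reading of correcting a supervised classifier for biased label sampling. The abstract does not specify how its carpentry technique is constructed.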

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.
