Abstract #300742

This is the preliminary program for the 2003 Joint Statistical Meetings in San Francisco, California. It currently includes the "technical" program (the schedule of invited, topic-contributed, regular contributed, and poster sessions); Continuing Education courses (August 2-5, 2003); and Committee and Business Meetings. This online program will be updated frequently to reflect the most recent revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





JSM 2003 Abstract #300742
Activity Number: 359
Type: Topic Contributed
Date/Time: Wednesday, August 6, 2003 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistical Computing
Title: Intelligent Information Filtering: Learning Prospective User Profiles
Author(s): Sofus Attila Macskassy*+ and Haym Hirsh and Foster Provost
Companies: Stern Business School, New York University and Rutgers University and Stern Business School, New York University
Address: Information, Ops. & Mgmt. Sciences Dept., New York, NY, 10012-1126
Keywords: information filtering; machine learning; data mining; user profiling
Abstract:

In many applications, large volumes of time-sensitive textual information require triage: rapid, approximate prioritization for subsequent action. We explore the use of user-specified prospective criteria of document importance to build a filtering and ranking system. By prospective, we mean importance that can be assessed by actions or events that occur in the future (e.g., a news story may be judged important based on events that occurred after it appeared, such as a stock price plummeting). We describe a process for creating and evaluating procedures based on prospective criteria. Using this process, it is possible to build very large labeled training corpora automatically, which can then be used to train text classifiers. We illustrate the process with two case studies, demonstrating the ability to predict whether a news story will be followed by many very similar news stories, and whether the stock price of one or more companies associated with a news story will move significantly following the appearance of that story. We conclude by discussing how the comprehensibility of the learned classifiers can be critical to success.
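The automatic labeling step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Story` type, the `label_by_price_move` function, the one-day horizon, the 5% threshold, and the toy price data are all assumptions made for illustration. Each story receives its label from a price change observed after publication, so no manual annotation is needed.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Story:
    # Hypothetical record type: one news story tied to one company ticker.
    ticker: str
    published: date
    text: str


def label_by_price_move(stories, prices, horizon_days=1, threshold=0.05):
    """Attach a prospective label to each story.

    `prices` maps (ticker, date) to a closing price. A story is labeled True
    when the absolute relative price change between its publication date and
    `horizon_days` later is at least `threshold` (both values are assumed).
    Stories with missing price data are skipped.
    """
    labeled = []
    for story in stories:
        later = story.published + timedelta(days=horizon_days)
        before = prices.get((story.ticker, story.published))
        after = prices.get((story.ticker, later))
        if before is None or after is None or before == 0:
            continue
        change = abs(after - before) / before
        labeled.append((story.text, change >= threshold))
    return labeled


# Toy data, invented for this example only.
stories = [
    Story("ACME", date(2003, 8, 4), "ACME recalls flagship product"),
    Story("ACME", date(2003, 8, 5), "ACME opens new office"),
]
prices = {
    ("ACME", date(2003, 8, 4)): 100.0,
    ("ACME", date(2003, 8, 5)): 90.0,
    ("ACME", date(2003, 8, 6)): 91.0,
}
corpus = label_by_price_move(stories, prices)
```

The resulting `(text, label)` pairs could then be passed to any text classifier as training data; the choice of horizon and threshold determines what counts as a significant move.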


  • The address information is for authors who have a + after their name.
  • Authors who are presenting talks have a * after their name.


For information about JSM 2003, contact meetings@amstat.org or phone (703) 684-1221. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2003