JSM 2004 - Toronto

Abstract #300987

This is the preliminary program for the 2004 Joint Statistical Meetings in Toronto, Canada. Currently included in this program is the "technical" program, schedule of invited, topic contributed, regular contributed and poster sessions; Continuing Education courses (August 7-10, 2004); and Committee and Business Meetings. This on-line program will be updated frequently to reflect the most current revisions.


The views expressed here are those of the individual authors
and not necessarily those of the ASA or its board, officers, or staff.





Activity Number: 325
Type: Topic Contributed
Date/Time: Wednesday, August 11, 2004 : 10:30 AM to 12:20 PM
Sponsor: Section on Statistics and Marketing
Abstract - #300987
Title: Effect of Simpson's Paradox on Market Basket Analysis
Author(s): Yuejiao Ma*+ and Dennis K.J. Lin
Companies: Pennsylvania State University and Pennsylvania State University
Address: 326 Thomas Bldg., University Park, PA 16802
Keywords: Market Basket Analysis; association rule; confidence; improvement; Simpson's Paradox; common improvement
Abstract:

Pruning association rules is one of the well-studied problems in data mining for Market Basket Analysis. Association rule discovery is an important data-mining technique that finds interesting associations or correlations among a set of items. It is a powerful method for Market Basket Analysis, which aims to find regularities in the shopping behavior of customers of supermarkets, online shops, and the like. A useful association rule usually has to satisfy three criteria: its support must reach a minimum support, its confidence must reach a minimum confidence, and its improvement must be greater than one. These measures are computed from the aggregated dataset. As a result, a decision-maker can easily misinterpret the real relationship that the nonaggregated data reflect and miss potentially useful association rules. This well-known phenomenon is Simpson's Paradox, described by Simpson in 1951. For example, based on the aggregated dataset, we find that the rule {if a customer buys product A, then this customer will also buy product B} does not satisfy the requirements at all.
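The effect described in the abstract can be illustrated with a small sketch. It is not the authors' method or data: the counts, the two customer segments, and the function name rule_metrics are hypothetical, and the measures use the usual textbook definitions (support = share of baskets containing both A and B, confidence = share of baskets with A that also contain B, improvement = confidence divided by the share of baskets containing B). In this made-up example the rule A -> B has improvement greater than one within each segment, yet improvement below one once the segments are pooled.

```python
from fractions import Fraction


def rule_metrics(n, n_a, n_b, n_ab):
    """Return (support, confidence, improvement) for the rule A -> B.

    n    : number of baskets
    n_a  : baskets containing A
    n_b  : baskets containing B
    n_ab : baskets containing both A and B
    """
    support = Fraction(n_ab, n)
    confidence = Fraction(n_ab, n_a)
    improvement = confidence / Fraction(n_b, n)
    return support, confidence, improvement


# Hypothetical counts (n, n_a, n_b, n_ab) for two customer segments.
strata = {
    "segment 1": (1000, 800, 100, 90),
    "segment 2": (1000, 100, 800, 90),
}

# Pool the segments by summing each count column.
aggregated = tuple(sum(column) for column in zip(*strata.values()))

for name, counts in list(strata.items()) + [("aggregated", aggregated)]:
    support, confidence, improvement = rule_metrics(*counts)
    print(
        f"{name:>10}: support={float(support):.3f} "
        f"confidence={float(confidence):.3f} "
        f"improvement={float(improvement):.3f}"
    )
```

With these counts, a pruning step that only looks at the pooled data would discard A -> B for having improvement below one, even though the rule shows a positive association inside each segment.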


  • The address information is for the authors that have a + after their name.
  • Authors who are presenting talks have a * after their name.


JSM 2004 For information, contact jsm@amstat.org or phone (888) 231-3473. If you have questions about the Continuing Education program, please contact the Education Department.
Revised March 2004