NAME: Lotto 6/42 Selections from Individuals, Irish National Lottery,
and S-Plus Simulation
TYPE: Six-tuples selected without replacement from {1, 2, 3, ..., 42},
plus a group identifier
SIZE: 234 + 264 + 264 observations, 1 + 6 variables
DESCRIPTIVE ABSTRACT:
The dataset consists of samples of size six taken without replacement
from the integers {1, 2, 3, ..., 42}. There are actually three
datasets from three different sources, and in each case the six-tuples
are (in theory) random selections or samples. The observations in each
sample are given in the order in which they were obtained or selected.
SOURCES:
The data were obtained from three sources. The first 234 samples
(coded with a 1) were obtained from university students in a large
statistics class. After discussing the difficulties of being truly
random in making selections, the 234 students were asked to act (once
each) as a random generator for the Lotto 6/42 game by writing down in
any order six numbers selected from {1, 2, 3, ..., 42}. The next 264
samples (coded with a 2) were the actual winning combinations for the
Irish National Lottery game Lotto 6/42 during the period from September
24, 1994, until March 8, 1997. The final 264 samples (coded with a 3)
were obtained by computer simulation using the package S-Plus.
VARIABLE DESCRIPTIONS:
Columns    Description
1          Code for source of sample (1, 2, or 3)
9 - 10     First selection in sample
17 - 18    Second selection in sample
25 - 26    Third selection in sample
33 - 34    Fourth selection in sample
41 - 42    Fifth selection in sample
49 - 50    Sixth selection in sample
Values are aligned and delimited by blanks.
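
As a minimal sketch of how one record might be read, the Python function
below slices a line at the column positions listed above. The function
name parse_record and the example line are hypothetical; the example is
constructed for illustration and is not taken from the data file.

```python
# Column positions from VARIABLE DESCRIPTIONS (1-indexed, inclusive).
COLUMNS = [(1, 1), (9, 10), (17, 18), (25, 26), (33, 34), (41, 42), (49, 50)]


def parse_record(line):
    """Return (source_code, [six selections]) for one fixed-width record."""
    # int() tolerates the blank padding around one-digit values.
    fields = [int(line[start - 1:end]) for start, end in COLUMNS]
    code, selections = fields[0], fields[1:]
    if code not in (1, 2, 3):
        raise ValueError(f"unexpected source code: {code}")
    # Selections are made without replacement from {1, ..., 42}.
    if len(set(selections)) != 6 or not all(1 <= s <= 42 for s in selections):
        raise ValueError(f"not six distinct values in 1..42: {selections}")
    return code, selections


# Hypothetical record laid out in the documented columns (not real data).
example = "1" + " " * 7 + (" " * 6).join(f"{v:2d}" for v in [7, 13, 21, 28, 35, 42])
print(parse_record(example))
```

Because the values are also delimited by blanks, splitting each line on
whitespace should give the same fields; slicing by column is shown here
only because it follows the layout as documented.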
STORY BEHIND THE DATA:
We thought it might be an interesting classroom exercise to get
students to attempt to be random generators of numbers, and also to
test their intuition about randomness. In a Lotto n/N game, n numbers
are drawn from {1, 2, ..., N}; the winning numbers are chosen by a
highly sophisticated mechanical device, which appears to select numbers
randomly. This is, of course, one of
the appealing aspects of the Lotto game -- because the winning
numbers are randomly selected, each of the possible winning n-tuples is
equally likely to be chosen. A consequence is that the typical gambler
has the same chance of winning on the basis of one n-tuple as any
so-called expert. In practice, however, players often select numbers
related to birthdays, ages, favorite dates, car registration numbers,
etc., so their selections are clearly not random.
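
To make "equally likely" concrete, the short Python sketch below counts
the possible winning combinations, treating a winning combination as an
unordered set of n numbers. The helper name winning_probability is
hypothetical, and math.comb assumes Python 3.8 or later.

```python
from math import comb


def winning_probability(n, N):
    """Chance that one ticket matches all n winning numbers drawn from 1..N."""
    # Every unordered selection of n numbers is assumed equally likely.
    return 1 / comb(N, n)


print(comb(42, 6))                 # 5245786 possible combinations in Lotto 6/42
print(winning_probability(6, 42))  # chance of a single ticket matching all six
```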
An interesting experiment for students in a class is to challenge them
to attempt to act as truly random generators for a Lotto game. A
comparison of the results from a group of students with the actual
winning combinations from the Irish National Lottery game Lotto 6/42
(as well as some data simulated with a statistical package such as
S-Plus) can, with the use of basic graphs and descriptive statistics,
lead to
interesting insights into hidden (and sometimes subtle) biases that
individuals possess, even when they attempt to be random.
Additional information about this dataset can be found in the article
"Trying to Be Random in Selecting Numbers for Lotto" in the _Journal
of Statistics Education_ (Boland and Pawitan 1999).
PEDAGOGICAL NOTES:
These datasets provide a rich source of examples for classroom
discussions on bias in the selection of random numbers. Many aspects
of the datasets can be explored through the use of basic graphs and
descriptive statistics.
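
One possible starting point, sketched in Python under stated assumptions:
tabulate how often each number is chosen within a source group, and find
the share of selections above 31 (numbers that cannot be a day of the
month). If every number were equally likely, that share would be 11/42,
about 0.26. The function names and the toy records below are made up
for illustration and are not part of the dataset.

```python
from collections import Counter


def number_frequencies(records, code):
    """Count how often each number appears among samples with this source code."""
    counts = Counter()
    for source, selections in records:
        if source == code:
            counts.update(selections)
    return counts


def share_above(records, code, threshold=31):
    """Proportion of selections in one source group that exceed threshold."""
    picks = [s for source, sel in records if source == code for s in sel]
    if not picks:
        raise ValueError(f"no records with source code {code}")
    return sum(s > threshold for s in picks) / len(picks)


# Toy records in (source_code, selections) form; NOT real data.
toy = [
    (1, [7, 13, 21, 28, 3, 17]),
    (1, [7, 12, 25, 31, 40, 2]),
    (3, [5, 33, 41, 18, 36, 9]),
]
print(number_frequencies(toy, 1).most_common(3))
print(share_above(toy, 1), share_above(toy, 3))
```

Comparing these summaries across the three source codes is one simple
way to look for the kinds of bias discussed above.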
SUBMITTED BY:
Philip J. Boland and Yudi Pawitan
Department of Statistics
National University of Ireland - Dublin
Belfield, Dublin 4, Ireland
Philip.J.Boland@ucd.ie, Yudi@ucd.ie