NAME: Lotto 6/42 Selections from Individuals, Irish National Lottery,
      and S-Plus Simulation
TYPE: Six-tuples selected without replacement from {1, 2, 3, ..., 42},
      plus a group identifier
SIZE: 234 + 264 + 264 observations, 1 + 6 variables

DESCRIPTIVE ABSTRACT:
The dataset consists of samples of size six taken without replacement
from the integers {1, 2, 3, ..., 42}. There are actually three datasets
from three different sources, and in each case the six-tuples are (in
theory) random selections or samples. The observations in each sample
are given in the order in which they were obtained or selected.

SOURCES:
The data were obtained from three sources. The first 234 samples (coded
with a 1) were obtained from university students in a large statistics
class. After discussing the difficulties of being truly random in
making selections, the 234 students were asked to act (once each) as a
random generator for the Lotto 6/42 game by writing down in any order
six numbers selected from {1, 2, 3, ..., 42}. The next 264 samples
(coded with a 2) were the actual winning combinations for the Irish
National Lottery game Lotto 6/42 during the period from September 24,
1994, until March 8, 1997. The final 264 samples (coded with a 3) were
obtained by computer simulation using the package S-Plus.

VARIABLE DESCRIPTIONS:
Columns
 1         Code for source of sample (1, 2, or 3)
 9 - 10    First selection in sample
17 - 18    Second selection in sample
25 - 26    Third selection in sample
33 - 34    Fourth selection in sample
41 - 42    Fifth selection in sample
49 - 50    Sixth selection in sample

Values are aligned and delimited by blanks.

STORY BEHIND THE DATA:
We thought it might be an interesting classroom exercise to get
students to attempt to be random generators of numbers, and also to
test their intuition about randomness.

In the Lotto n/N game, the winning numbers are chosen by a highly
sophisticated mechanical device, which appears to select numbers
randomly.
This is, of course, one of the appealing aspects of the Lotto game --
because the winning numbers are randomly selected, each of the possible
winning n-tuples is equally likely to be chosen. A consequence is that
the typical gambler has the same chance of winning on the basis of one
n-tuple as any so-called expert.

In Lotto, individuals often select numbers related to birthdays, ages,
favorite dates, car registration numbers, etc. -- and hence the numbers
are clearly not randomly selected. An interesting experiment for
students in a class is to challenge them to attempt to act as truly
random generators for a Lotto game. A comparison of the results from a
group of students with the actual winning combinations from the Irish
National Lotto 6/42 game (as well as some data simulated from a
statistical package like S-Plus) can, with the use of basic graphs and
descriptive statistics, lead to interesting insights into hidden (and
sometimes subtle) biases that individuals possess, even when they
attempt to be random.

Additional information about this dataset can be found in the article
"Trying to Be Random in Selecting Numbers for Lotto" in the _Journal of
Statistics Education_ (Boland and Pawitan 1999).

PEDAGOGICAL NOTES:
These datasets provide a rich source of examples for classroom
discussions on bias in the selection of random numbers. Many aspects of
the datasets can be explored through the use of basic graphs and
descriptive statistics.

SUBMITTED BY:
Philip J. Boland and Yudi Pawitan
Department of Statistics
National University of Ireland - Dublin
Belfield, Dublin 4, Ireland
Philip.J.Boland@ucd.ie, Yudi@ucd.ie
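
The column layout given under VARIABLE DESCRIPTIONS can be read with a
short script. The following is only a minimal sketch in Python, not
part of the original submission. The function names parse_row and
format_row are hypothetical, and the two demo records are made up for
illustration; they are not rows from the dataset. The sketch assumes
each record has the source code in column 1 and the six selections
right-aligned in the two-character fields listed above.

```python
# Column positions from VARIABLE DESCRIPTIONS, converted from 1-based
# inclusive columns to 0-based Python slices.
SOURCE_COL = slice(0, 1)
PICK_COLS = [slice(start, start + 2) for start in (8, 16, 24, 32, 40, 48)]


def parse_row(line):
    """Return (source_code, picks) for one fixed-width record."""
    source = int(line[SOURCE_COL])
    picks = [int(line[cols]) for cols in PICK_COLS]
    if source not in (1, 2, 3):
        raise ValueError(f"unexpected source code: {source}")
    # Each sample is six distinct numbers drawn from 1..42.
    if len(set(picks)) != 6 or not all(1 <= p <= 42 for p in picks):
        raise ValueError(f"not six distinct numbers in 1..42: {picks}")
    return source, picks


def format_row(source, picks):
    """Build a record in the assumed layout (used only for the demo)."""
    # Code in column 1; the first number ends in column 10, and each
    # later number ends 8 columns further right (18, 26, ..., 50).
    fields = [f"{p:>9}" if i == 0 else f"{p:>8}" for i, p in enumerate(picks)]
    return str(source) + "".join(fields)


if __name__ == "__main__":
    # Made-up records for illustration only; NOT taken from the dataset.
    demo_lines = [
        format_row(1, [7, 14, 21, 28, 35, 42]),
        format_row(3, [3, 11, 19, 27, 36, 40]),
    ]
    for line in demo_lines:
        print(parse_row(line))
```

Because the values are delimited by blanks, splitting each line on
whitespace would also work; slicing by column is used here only because
it mirrors the documented layout and fails loudly if a record is
misaligned.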