Copyright (c) 1994 by Bruce E. Trumbo, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.
Restrictions. The programs accompanying this article are also copyrighted 1994 by Bruce E. Trumbo, all rights reserved. These programs may be freely shared among individuals and used for any non-commercial educational purpose, provided they are not republished in any medium without express written consent from the author and advance notification of the editor.
Although extensively tested, these computer programs should be regarded as developmental. They may contain errors, some of which may cause unpredictable results, including computer "crashes." Source code is not available. Please report errors to the author. NO WARRANTY OR REPRESENTATION OF FITNESS FOR ANY PURPOSE IS MADE OR IMPLIED. The user assumes all risks of any kind in connection with the use of these programs and waives the right to claim damages.
Requirements. An IBM-compatible computer equipped with EGA graphics or better is required.
Key Words: Law of large numbers; Poisson process; Simulation; Computer program; Pedagogy.
Graphical, computational, interactive, and simulation capabilities of computers can be successfully employed in the teaching of elementary probability, either as classroom demonstrations or as exploratory exercises in a computer laboratory. In this first paper of a contemplated series, two programs for EGA-equipped IBM-PC compatible machines are included with indications of their pedagogical uses. Concepts illustrated include the law of large numbers, the frequentist definition of a probability, the Poisson distribution and process, and intuitive approaches to independence and randomness. (Commands for rough equivalents to the programs using Minitab are shown in the Appendix.)
1 Students often arrive in their first probability course with seriously deficient or confused intuitive ideas about the random phenomena being studied. Perhaps this is partly due to the human tendency to seek patterns even where none exist, and partly due to the vested interests of the gambling industry in cultivating erroneous impressions about chance events. Whatever the source of these misconceptions, the teacher of an elementary course in probability has the difficult task of eradicating them and helping to build the sound intuition that leads to self-confidence in understanding theory and making applications.
2 The straightforward way to show how chance events really behave is by firsthand observation, but non-computer demonstrations have serious drawbacks. For example, suppose that one wishes to show that the "heads ratio" settles to 1/2 in a long sequence of tosses, i.e., to illustrate the Law of Large Numbers (LLN). It is unlikely that in-class experiments with actual coins can be large enough to show the effect convincingly. A textbook may show a graphic representation of a single large simulation run, but the inevitable eccentric results of any one simulation experiment are as likely to confuse students as to make a convincing case for the principle supposedly being illustrated.
3 A computer simulation of sufficient size to be convincing (one based on several thousand tosses) can be quickly repeated several times so that the general principle stands out above the peculiarities of any one run. Here I include two programs for IBM-PC computers equipped with EGA graphics or better, which (1) simulate LLN experiments and (2) show how a Poisson process develops in space and time. Each section below begins with a brief description of what one program does and ends with an indication of how that program has been successfully used in elementary and intermediate probability classes.
4 Each of the approximately eight programs contemplated in this series is called from a Main Menu activated by the command PROBDEMO. The menu program is written to detect the presence of the two programs supplied in conjunction with this article as well as others that may be released later; it displays only programs currently loaded into the same directory. The title page of each program tells how to start running it. The F1 and F2 keys give technical details and additional options for more advanced users.
5 [Note: Some of the topics of programs planned for inclusion with future papers in this series are a demonstration of the Central Limit Theorem, a program that allows students to see bivariate data clouds with selected population correlations, a graphical presentation of binomial likelihoods, a Keno game (including expected values), a utility program for graphing density functions from well-known families with user-selected parameters, and a program (mainly for intermediate probability students) that explores what happens to the plane under various bivariate transformations commonly used in deriving probability distributions. The appearance of these additional programs in this journal is, of course, subject to editorial review.]
6 These programs are simple to operate, and students can use them successfully in exploratory laboratory sessions with only brief oral or (preferably) written instructions. With suitable large monitors or projection equipment, however, the programs also can be used to present demonstrations in large lecture sessions. With either method care must be taken to set the stage for the use of the computer program so that the message does not get lost in the fascination of watching the program run.
7 Because I teach mainly small classes and have an excellent lab available, I have had the most experience with the lab approach. An outline of some explorations I have encouraged students to try is given below the description of each program in abbreviated "lab manual" style. But it should not be difficult to adapt the text in lab-manual style to provide commentary for a lecture demonstration.
8 I have used these programs at a variety of levels and for students pursuing many different majors. This flexibility of use is due to several factors:
(a) Most of the programs can be understood at multiple levels. For example, the demonstration of the law of large numbers can be taken in the intuitive sense of illustrating or justifying the frequentist definition of probability at the beginning of a first probability course or in the probability chapter of a very elementary statistics course. It can also be used later in a first probability course, with somewhat closer scrutiny, when the law of large numbers is proved. Similarly, the program demonstrating a Poisson process can be used at a very elementary level to give an intuitive feel for randomness by watching events appear at random in space, or some of its more advanced features can be used in a course on stochastic processes when it is proved that the sum of two Poisson processes is a Poisson process.
(b) The menu system is presented at two levels. Usually, only the commands that will be needed by even the most elementary users appear on the screen. This simplicity at first sight helps to make the programs user friendly for the greatest number of potential users, including those with little mathematics and no computer experience. Users who want to graduate to the more advanced features of the programs can find out how to do so by reading the F2 information pages mentioned above (or by reading this paper).
9 Students in a first statistics course that has a high school algebra prerequisite can benefit from the basic parts of most of the programs. Students taking a first probability course that has a calculus prerequisite will be able to benefit from most features of all of the contemplated programs, although a few of the more advanced features will probably be most useful in a second probability course.
10 These programs are intended as flexible supplements to standard textbooks. As mentioned above, depending on equipment available, they can be used as demonstrations within lectures or in separate sessions conducted in a computer laboratory. They do not constitute a complete course, and they are not intended to substitute for either texts or lectures. Instructors interested in teaching an elementary probability course in which computer demonstrations of probability ideas are integrated at almost every stage may want to consider the text by Snell (1988), which contains many demonstration programs written in the True BASIC language.
11 These programs are written and compiled in Microsoft QuickBasic. They use the resident pseudorandom number generator, which is adequate for classroom demonstration purposes but is not of research quality. None of the programs is so sophisticated that similar ones could not be written by any reasonably competent programmer. In fact, one aim of this paper is to encourage others to write and contribute similar programs for educational use.
12 My programs do, however, have some advantages: attention has been given to making them look nice on the screen; they have been tested extensively so that some of the programming and pedagogical bugs have been worked out; they have been modified to take advantage of student input; and several teachers with a variety of styles have successfully used them in their present form. Undoubtedly, some bugs remain, and I would much appreciate being told about them so that I can distribute corrected versions. Source code is not available.
13 The package of probability demonstration programs included with this article consists of the following: BRUN40.EXE (Microsoft utility), PROBDEMO.EXE (Entry program), PDMEN.EXE (Menu program), PDLLN.EXE (Part 1), and PDPPR.EXE (Part 2).
14 This is a simulation of tossing (possibly biased) coins that illustrates the Law of Large Numbers. The user begins by selecting a value of p = P(Heads) = .1, .2, ..., .9, by pressing a number key from 1 to 9. One usually begins with a fair coin, p = .5. The default number of tosses is n = 4000, but values of n = 500, 1000, 2000, ..., 32000 may be easily selected instead. (Press H to halve the number of tosses, D to double.)
15 The fundamental concept of this program is based on William Feller's pioneering use of computers as an aid to understanding probabilities. The results of one computer run of 10,000 tosses of a simulated fair coin are shown in Feller (1950, p. 84). Each run has its eccentricities. With a modern computer, the present program allows students to look at enough runs so that they can begin to sort out the peculiarities of any one run from the general patterns.
16 In this program, a graphical display plots the heads ratio (cumulative number of heads at each stage divided by number of tosses so far) on a vertical axis extending from p - .1 to p + .1 against number of tosses on the horizontal axis. The interval that has a 95% chance of containing the final heads ratio is shown at the right edge of the graph.
17 The F1 key shows technical details about the resolution of the display; the process actually plotted on the screen cannot be continuous and so is approximated by a discrete one. The F2 key shows additional options (such as monochrome operation, if needed for projection or for use with students with color-deficient vision, and speed controls). Return to the title page from either the F1 or F2 screen by pressing ESC.
18 Begin by taking a coin from your pocket, tossing it 10 times, and recording the results of each toss. The "heads ratio" following each toss is defined as the number of heads so far divided by the number of tosses so far. For example, if your first five tosses resulted in HTTHT the first five values of the heads ratio for your experiment would be 1/1 = 1, 1/2 = .50, 1/3 = .33, 2/4 = .50, and 2/5 = .40.
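As an illustration only, the heads-ratio calculation in the example above can be sketched in a few lines of Python. The function name heads_ratio_path is hypothetical and is not part of the PROBDEMO programs; exact fractions are used so that the values match the hand calculation.

```python
from fractions import Fraction

def heads_ratio_path(tosses):
    """Return the heads ratio after each toss in a string of 'H'/'T' results."""
    path = []
    heads = 0
    for n, toss in enumerate(tosses, start=1):
        if toss == "H":
            heads += 1
        # heads so far divided by tosses so far, kept as an exact fraction
        path.append(Fraction(heads, n))
    return path

ratios = heads_ratio_path("HTTHT")
print([str(r) for r in ratios])  # ['1', '1/2', '1/3', '1/2', '2/5']
```

The fractions 1/3 and 2/5 correspond to the rounded values .33 and .40 given in the text.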
19 Mark graph paper with a vertical scale running from 0 to 1 and a horizontal scale with values 1, 2, 3, ..., 10. Plot the 10 values of the heads ratio for your experiment and connect the dots with line segments. For a small experiment like yours, the heads ratio will likely fluctuate wildly. [Note: The graphing could also be done at the board in lecture, using results generated by students tossing coins.]
20 The Law of Large Numbers (LLN) says that, for larger experiments with many tosses, you are very likely to observe that the heads ratio becomes stable at a value near .5 (for a fair coin). In this sense, the heads ratio converges to P(Heads) in the long run. It would be very tedious to perform and graph an experiment of this type that is large enough for you to be able to see the heads ratio actually begin to settle down, but a simple computer program allows you to simulate larger experiments with little trouble.
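For readers without access to the DOS programs, a larger experiment of this kind might be simulated along the following lines. This is a minimal sketch in Python, not the code of the program described in this paper (whose source is not available); the function name, the seed, and the choice of checkpoints are all assumptions made for illustration.

```python
import random

def simulate_heads_ratio(n_tosses, p_heads, rng):
    """Simulate n_tosses coin tosses; return the heads ratio after each toss."""
    heads = 0
    path = []
    for n in range(1, n_tosses + 1):
        if rng.random() < p_heads:   # toss comes up heads with probability p_heads
            heads += 1
        path.append(heads / n)
    return path

rng = random.Random(1994)  # arbitrary seed, chosen only so the runs can be repeated
checkpoints = [10, 100, 1000, 4000]
for run in range(1, 4):
    path = simulate_heads_ratio(4000, 0.5, rng)
    # heads ratio at a few stages of the run
    print(run, [round(path[n - 1], 3) for n in checkpoints])
```

Printing the ratio at a few checkpoints for several runs gives a rough numerical counterpart of the graphs: early values differ noticeably from run to run, while the values at 4000 tosses tend to lie close to .5.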
21 Type PROBDEMO at the DOS prompt. Then press ENTER to get to the "Main Menu," where you should select item 1.
22 You will begin by simulating an experiment in which a fair coin is tossed 4000 times. Notice that the program is already set for n = 4000 as you begin. Press 5 to get P(Heads) = .5. The resulting graph shows values of n from 1 to 4000 along the horizontal axis. The vertical axis runs only from .4 to .6 with a horizontal line running across the graph at p = .5. (The disadvantage of the "exploded" vertical scale, rather than one that runs from 0 to 1, is that some of the early values of the heads ratio will run off the graph. The advantage is that you get a magnified view on screen of what is going on as the number of tosses becomes large.)
23 Run the simulation several times, selecting p = .5 each time. Notice that even though the graphs all begin differently at the left-hand side, they all tend to look alike at the right-hand side as the heads ratio begins to converge to .5. There is a 95% chance that any one run will end within the small interval shown at the right-hand edge. (If you and your classmates do many runs, however, some of you are bound to see examples of runs that do not settle down so nicely even after 4000 tosses.)
24 The heads-ratio path for n = 4000 tosses is a bit smudgy because there are not actually 4000 pixels (distinct plotting positions) available across the screen of your monitor; some of the points must be plotted on top of others. You can get a clearer picture of the path if you press H (for halve) several times until n = 500. Note that the "target interval" at the right-hand edge of the screen is larger for this smaller experiment. Do one or two runs with p = 0.5.
25 Next, try running the 500-toss experiment with a coin heavily biased in favor of tails, p = .1. (Press 1 to start the run.) You should actually be able to see the resulting long runs of tails. The graph looks as if it may have been "written" by a left-handed person. (Occasional heads produce sharp up-swings, runs of tails give long slow declines.) Also try p = .9 (press 9) to see a ("right-handed") graph with long runs of heads. For p = .1 and p = .9, notice that the target intervals are smaller. There is less variability with such outrageously biased coins, and so their long-run behavior is more predictable. (Also notice that the vertical scale is automatically changed to put the line representing the target value of p across the center of the screen.)
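The effect of n and p on the width of the target interval can be checked numerically. The sketch below assumes the interval is computed from the usual normal approximation, p plus or minus 1.96 times the square root of p(1 - p)/n; the paper does not state how the program actually computes it, so both the formula and the function name target_interval are assumptions.

```python
import math

def target_interval(p_heads, n_tosses, z=1.96):
    """Approximate 95% interval for the final heads ratio (normal approximation)."""
    half_width = z * math.sqrt(p_heads * (1 - p_heads) / n_tosses)
    return p_heads - half_width, p_heads + half_width

for p, n in [(0.5, 4000), (0.5, 500), (0.1, 500), (0.9, 500)]:
    low, high = target_interval(p, n)
    print(f"p = {p}, n = {n}: {low:.3f} to {high:.3f}")
```

Under this assumption the half-width is about .015 for p = .5 with n = 4000, about .044 for p = .5 with n = 500, and about .026 for p = .1 or p = .9 with n = 500, in line with the larger interval seen for smaller experiments and the smaller interval seen for heavily biased coins.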
26 If you have time, you may also want to try one or more very large experiments with n = 32,000. (Press D, for double, enough times to achieve this setting.) On some computers this simulation may run a bit slowly, but notice how well the heads ratio settles down when n is very large.
1. Now that you have seen the LLN in operation, what do you suppose is the "mechanism" by which it works? Suppose that by chance the experiment begins with a very large number of heads, so that the heads ratio starts off much larger than .5. Does the coin somehow realize that it "needs to come up tails" more often for the rest of the run in order to compensate for this early "misbehavior"? In probability terminology this would mean that later tosses of the coin would not be independent of earlier ones. Suppose that a fair-coin experiment started out with ten heads in a row. What effect would this unusual initial run of heads have on the heads ratio after 4000 tosses?
2. Slot machines in most large U.S. casinos are effectively regulated by governmental gaming commissions so that the result at each pull of the handle is independent of the previous results. What do you think of the gambler who leaves a friend to guard a slot machine into which he has unsuccessfully put a lot of money while he goes to get more change, on the grounds that it must be "about ready to pay off big"?
3. What would the graph of the heads ratio look like for a coin that is so biased that it always comes up heads (perhaps a two-headed coin)? Does this help you to understand why a coin that is very heavily biased towards heads, say p = P(Heads) = .9, gives a graph that tends to settle down more quickly than for a fair coin with p = .5?
4. In a lab where students run the LLN program on 20 computers simultaneously, why would you not be surprised to see one case in which the heads-ratio path does not end inside the 95% interval shown at the right-hand side of the screen?
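Instructors may find it useful to have the arithmetic behind questions 1 and 4 at hand. The short Python sketch below is one way to check it. It assumes that later tosses are independent of earlier ones and that each run misses its interval with probability exactly .05, which is an idealization.

```python
# Question 1: ten heads in a row, followed by 3990 further independent fair tosses.
n_total = 4000
early_heads = 10
expected_later_heads = 0.5 * (n_total - early_heads)  # no "compensation" by the coin
expected_ratio = (early_heads + expected_later_heads) / n_total
print(expected_ratio)  # 0.50125

# Question 4: chance that at least one of 20 independent runs ends outside
# its 95% interval.
n_runs = 20
p_at_least_one_miss = 1 - 0.95 ** n_runs
print(round(p_at_least_one_miss, 3))  # 0.642
```

The early run of heads is swamped rather than compensated for: it shifts the expected final ratio by only .00125. With 20 computers, a miss somewhere in the room is more likely than not.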
27 I usually have students explore the LLN program during the first week of class in order to illustrate and reinforce the long-run view of probability (which I discuss among others). Later, in calculus-prerequisite courses where we actually get around to proving the LLN, I have them take a second look at the program, with some talk about an epsilon-neighborhood (or an "allowable" band of values) around the target value.
28 In intermediate classes it is possible to use the program in discussions of level-crossing probabilities (e.g., the probability that the heads ratio switches from above the .5 line to below it), even though it is not designed to count crossings.
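Since the program does not count crossings, an instructor who wants numerical results for such a discussion could count them separately. The following Python sketch shows one possible convention: a crossing is counted each time the heads ratio appears strictly on the opposite side of the line from where it last was, and touching the line without passing through it is not counted. The function name and this definition are assumptions for illustration, not features of the program.

```python
def count_crossings(tosses, p_heads=0.5):
    """Count how often the heads ratio moves from one side of p_heads to the other."""
    crossings = 0
    last_side = 0   # +1 above the line, -1 below, 0 if not yet off the line
    heads = 0
    for n, toss in enumerate(tosses, start=1):
        heads += (toss == "H")
        diff = heads - p_heads * n        # same sign as (heads ratio - p_heads)
        side = (diff > 0) - (diff < 0)
        if side != 0:
            if last_side != 0 and side != last_side:
                crossings += 1
            last_side = side
    return crossings

print(count_crossings("HTTHT"))   # 1
print(count_crossings("HTTHHH"))  # 2
```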
29 A demonstration of the LLN using Minitab is shown in the Appendix for the benefit of those using Minitab on computers that will not run my program.
30 The second program envisions a Poisson process taking place over time in a rectangular domain. In one graph, points corresponding to Poisson events are plotted in a rectangle as they occur. The effect is as if the rectangle were marked on the exposed surface of an umbrella just as it is beginning to rain, and we watch raindrops appear at random within the rectangle.
31 A second graph on the screen plots the cumulative count to date against time, with the linear mean function shown in the background. Here one might imagine a piece of uranium ore and a geiger counter. The horizontal axis is marked in seconds, the vertical axis in cumulative counts. The "staircase" graph shows, for each point in time, how many radioactive particles have already been observed.
32 For either graph, the expected number of events at completion is fixed at 50. Roughly 95% of the runs will yield total counts between 35 and 65.
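The coverage of the range 35 to 65 can be checked directly from the Poisson distribution with mean 50. The Python sketch below is an illustration only; it uses the exact Poisson probabilities, whereas the program itself works with a discrete approximation, so the two need not agree to many decimal places. The function name poisson_pmf is hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability P(X = k), computed on the log scale for stability."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

mean_count = 50
# probability that the total count falls between 35 and 65, inclusive
prob = sum(poisson_pmf(k, mean_count) for k in range(35, 66))
print(round(prob, 3))
```

The printed value can be compared with the "roughly 95%" quoted above; the standard deviation of the count is the square root of 50, or about 7.1, so 35 and 65 lie a little more than two standard deviations from the mean.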
33 F1 shows a brief technical discussion about how a continuous time process is approximated on a computer with a discrete clock and graphics. By pressing the F2 key one may see a list of options. These include monochrome operation and speed control. In addition, the user may elect to show only the rectangular domain or only the count vs. time plot, or may split the process into the sum of a "Red" and a "Green" process in which each event has probability p of being "Red" (chosen to be .1, .2, ..., or .9). Return to the title page from either screen by pressing ESC.
34 Pass a black felt pen and a blank overhead projection slide with a cardboard border around the class. Ask each student to look at the rectangle and make a dot at a random location within the rectangle. (For classes larger than about 50, use only enough rows of seats to get about 50 dots; for smaller classes, pass the slide around more than once to get about 50 dots.) Show the results to the class.
35 This experiment seldom results in anything like a random placement of dots within the rectangle. Each student is influenced by the dots already made and most students tend to put their dots in sparse territory. Explain that if each dot were truly placed at random then previous results would not be taken into account in its placement. "Random" patterns generated by humans are bound to suffer from lack of randomness as each person places dots according to her own biases. [Note: The drawing of dots within a rectangle could, of course, also be done by individual students in lab, but it is more difficult there for the instructor to point out how the results differ from what one would expect of random ones.]
36 Type PROBDEMO at the DOS prompt and press ENTER to get to the "Main Menu," where you should select item 2. For your first run of the program press S (Spatial plot only) to simplify the display on the screen; you should see the letter S in the lower right-hand corner of the screen. Then press ENTER to begin the program.
37 This experiment differs from the one you did in class with the pen and overhead slide because not only the placement of the dots is random, but the number of dots is also random. The average number of dots is 50, but you should not be surprised to get as few as 35 or as many as 65, and more extreme results are certainly possible.
38 Try several runs of the experiment looking only at the Spatial Plot. Notice that random placement of the dots tends to result in clumping and in blank territories.
39 Now return to the title page by pressing N at the end of a run. This time do not press S before starting the program with ENTER. The stairstep pattern in the "Time Plot" shows the accumulated number of events up to the current time. The diagonal line in the background shows the "expected" position of the staircase at each point in time.
40 Now you will see how the events also occur randomly over time. Sometimes several events will occur almost together; sometimes relatively long periods of time will elapse between events. The total number of time units for this program is always 1000.
41 Random processes like this one are called Poisson processes after the French mathematician who first developed the mathematical theory behind them. Roughly speaking, the rules are as follows:
42 If you have time, return to the title page and press the number 5 before you press ENTER. This splits the Poisson process into two separate processes, called Red and Green. Since you pressed 5 (for .5) it is as if a fair coin is tossed as each event occurs to determine whether it is Red or Green. It is possible to prove that the Red and Green processes are both Poisson, as is their sum. (The same is true if the coin is not fair; you may wish to experiment by pressing numbers other than 5.)
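The claim about the Red process can be explored by simulation. The Python sketch below uses the discrete approximation described in the Appendix (1000 time units, with probability .05 of an event in each) and colors each event Red with probability p. It is an illustration under those assumptions, not the program's own code, and the names are hypothetical. For a Poisson count the mean and the variance are equal, so with p = .5 both should come out near 25.

```python
import random

def split_counts(p_red, rng, n_steps=1000, p_event=0.05):
    """One run: return (red, green) event counts under the discrete approximation."""
    red = green = 0
    for _ in range(n_steps):
        if rng.random() < p_event:      # an event occurs in this time unit
            if rng.random() < p_red:    # the "coin toss" assigns its color
                red += 1
            else:
                green += 1
    return red, green

rng = random.Random(2)  # arbitrary seed for repeatability
runs = [split_counts(0.5, rng) for _ in range(500)]
reds = [r for r, _ in runs]
mean_red = sum(reds) / len(reds)
var_red = sum((r - mean_red) ** 2 for r in reds) / (len(reds) - 1)
print(round(mean_red, 2), round(var_red, 2))
```

Agreement of the sample mean and variance is only a rough check, not a proof that the Red counts are Poisson, and the discrete approximation makes the variance slightly smaller than the mean.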
43 The Poisson process has been used to describe bomb hits on London during World War II, accidents happening in the center of a large city, telephone calls being placed in a uniformly settled suburban area, birds building nests in a marshland, and atoms decaying in a radioactive material.
44 I usually show the Spatial Plot of the Poisson program to students during the first week of an introductory course in order to build intuition for what randomness and independence mean. Later, after the Poisson distribution is discussed, we often return to the program for a closer look at what it illustrates.
45 The option of splitting the process into Red and Green ones can illustrate either (a) the compounding of a Poisson process by the toss of a (possibly unfair) coin, as shown above, or (b) that the sum of two independent Poisson processes is Poisson. This option of the program is probably most effective in an intermediate-level course in probability or stochastic processes, but more elementary students seldom seem to have a problem understanding the idea intuitively.
46 Programs of the type presented here can be used to help build intuition about randomness and independence for students at the most elementary level, and to illustrate certain important probability theorems and models for students at a somewhat higher level. The effective use of these programs requires appropriate introduction before they are used, guidance (either in written or oral form) during use, and thought-provoking questions after use to help fix the ideas in mind. These programs can be used for lecture demonstrations, but they are more effective if students can use them interactively for exploration and comparison with results obtained by other students in a computer lab.
Development of the programs provided in conjunction with this paper was partially supported by NSF Grant USE 91-50433 and by California State University, Hayward.
Computer programs were compiled using Microsoft QuickBasic (Version 4.0); the utility BRUN40.EXE, which must be present to run the programs, is property of Microsoft Corporation and is used with permission.
For users who have access to Minitab on a platform other than PC DOS machines, it is possible to capture the spirit of Part 1, if not all of its flexibility and ease of use. The following Minitab commands, based on Minitab Version 7, show the required steps [annotations in brackets]:
MTB > set c1
DATA> 1:2000
DATA> end                  [c1 contains tosses so far.]
MTB > rand 2000 c2;
SUBC> bernoulli .5.        [c2 contains 0s and 1s.]
MTB > parsums c2 c3        [c3 contains total number of 1s so far.]
MTB > let c3 = c3/c1       [c3 now contains Heads Ratio.]
MTB > name c1 'Tosses'
MTB > name c3 'H-Ratio'
MTB > gplot;
SUBC> line 0 2 c3 c1.      [Solid red line connects points.]
The result is similar to that of running Part 1 except that the vertical scale includes all observed values of the heads-ratio (it usually goes from 0 to 1). Of course, P(Heads) can be chosen in the Bernoulli subcommand to be something other than .5. The number of tosses is taken to be 2000 to keep from exceeding the capacity of Minitab Version 7 on the PC I used to test this Minitab demonstration.
[Technical notes: (1) Other platforms and versions of Minitab may have different available Minitab worksheet sizes and so may accommodate larger or smaller simulations. (2) In Windows versions of Minitab, the gplot command is obsolete. Appropriate plots may be selected from the menu or typed on the command line using the new plot command in which arguments are separated by a *.]
These Minitab commands are straightforward and can be understood even by most elementary students with just a bit of orientation as to how Minitab works. A stored program with the steps above, perhaps embellished by the addition of opportunities for user input of number of tosses and P(Heads), could be executed repeatedly to allow multiple simulations in reasonably rapid succession.
It is somewhat more difficult and less intuitively satisfying to imitate Part 2 using Minitab. For one thing, it is necessary to admit explicitly that the continuous Poisson process is approximated in the discrete domain of computer graphics by a grid of 180 x 360 pixels and that the 1000 units of time must be treated discretely, with a probability of .05 of an event at each one--a binomial approximation to the Poisson. (This is explained on screen F1 of my Part 2.)
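For comparison, the discrete approximation just described can be sketched in Python. The grid size (x from 1 to 360, y from 1 to 180), the 1000 time units, and the event probability .05 are taken from this Appendix; the function name, the seed, and the output format are assumptions made for illustration.

```python
import random

def simulate_poisson_demo(rng, n_steps=1000, p_event=0.05, width=360, height=180):
    """Return a list of (time, x, y) events for one run of the discrete approximation."""
    events = []
    for t in range(1, n_steps + 1):
        if rng.random() < p_event:        # at most one event per time unit
            x = rng.randint(1, width)     # uniformly chosen pixel column
            y = rng.randint(1, height)    # uniformly chosen pixel row
            events.append((t, x, y))
    return events

rng = random.Random(50)  # arbitrary seed for repeatability
events = simulate_poisson_demo(rng)
print(len(events), "events; expected", 1000 * 0.05)

# Cumulative count compared with the linear mean function at a few times
for t in (250, 500, 750, 1000):
    count = sum(1 for (time, _, _) in events if time <= t)
    print(t, count, 0.05 * t)
```

The (x, y) pairs correspond to the Spatial Plot, and the cumulative counts correspond to the staircase of the Time Plot.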
The Time Plot can be imitated in Minitab as follows:
MTB > set c1
DATA> 1:1000
DATA> end                  [c1 contains 1000 time steps.]
MTB > rand 1000 c2;
SUBC> bernoulli .05.       [c2 contains 0s and 1s.]
MTB > parsums c2 c3        [c3 contains event count to date.]
MTB > set c4
DATA> 0 50
DATA> end                  [c4 and c5 contain endpoints
MTB > set c5                of linear mean function.]
DATA> 0 1000
DATA> end
MTB > gplot;
SUBC> line 0 2 c4 c5;      [Plots mean line in red.]
SUBC> line 0 3 c3 c1.      [Plots cum. count against time.]
There is no control of the time it takes to do the plotting, so there is no sense of a developing process, only a view of the final result. Furthermore, it is not feasible to have the Time Plot and the Spatial Plot on screen at the same time.
The Spatial Plot can be imitated in Minitab by the following commands, in addition to those given above:
MTB > rand 1000 c6;
SUBC> integer 1 180.       [Y coord's of candidate points.]
MTB > rand 1000 c7;
SUBC> integer 1 360.       [X coord's of candidate points.]
MTB > let c6 = c6 * c2     [Non-event candidate points set
MTB > let c7 = c7 * c2      to plot at origin.]
MTB > set c8
DATA> 0 0 180 180 -1
DATA> end                  [c8 and c9 contain coordinates of
MTB > set c9                box border, including a somewhat
DATA> -1 361 361 0 0        unsuccessful attempt to obscure
DATA> end                   non-event points now at (0,0).]
MTB > gplot c6 c7;         [Event points plot as white Xs.]
SUBC> line 0 2 c8 c9.      [Region border plots in red.]
Both parts of this Minitab sequence should be run from a stored program, because there is not much pedagogical value, especially for elementary students, in typing the instructions.
While the Minitab imitation of Part 1 is almost as good as my program, the Minitab imitation of Part 2 is more limited and difficult to use successfully. The Minitab command sequences have not been class tested nearly as extensively as the DOS programs discussed in the main body of this article.
Feller, W. (1950), An Introduction to Probability Theory and its Applications (2nd ed.), New York: John Wiley & Sons.
Snell, J. L. (1988), Introduction to Probability, New York: Random House.
Bruce E. Trumbo
Department of Statistics
California State University, Hayward
Hayward, CA 94542
Download Programs to a Local File
To unpack the files BRUN40.EXE, PROBDEMO.EXE, PDMEN.EXE, PDLLN.EXE, and PDPPR.EXE, type prob01 at the DOS prompt.