Teaching Bits: A Resource for Teachers of Statistics

Topics for Discussion from Current Newspapers and Journals

William P. Peterson
Middlebury College

Journal of Statistics Education Volume 11, Number 3 (2003), jse.amstat.org/v11n3/peterson.html

Copyright © 2003 by the American Statistical Association, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent.


"The Truth About Polygraphs."

by Charles P. Pierce, Boston Globe Magazine, August 3, 2003.

In 2003, the National Academy of Sciences released a report entitled The Polygraph and Lie Detection. As explained in the executive summary, the goal was specifically "to conduct a scientific review of the research on polygraph examinations that pertains to their validity and reliability, in particular for personnel security screening." The report concludes that the use of polygraphs is more problematic in screening job applicants than in questioning criminal suspects, because in the screening setting there is no specific incident under investigation.

In light of this, the Globe Magazine asks why government agencies are still relying on polygraphs in hiring decisions. In 1988, the Employee Polygraph Protection Act banned such use of polygraphs by private businesses. In spite of this, 62% of the country's police departments use polygraphs to screen applicants, up from 19% four decades ago. The article begins with a long anecdote about a candidate who was rejected by both the DEA and the FBI, apparently on the basis of shaky polygraph results concerning his drug use as an adolescent.

The 9/11 terrorist attacks convinced many people that we must take every measure possible to screen public safety officials. According to statistician Stephen Fienberg, who headed the NAS panel, "A crisis mentality permeated everything, and it still does. The mystique of this machine always has overpowered the science of it, and now it does especially." Furthermore, the article points out that the infamous spies Aldrich Ames and Robert Hanssen both were able to pass lie detector tests.

This is a long essay, containing many other interesting details about the invention of the lie detector and its historical uses. Here is one more quote, this one from Richard Nixon, who had the following to say about his use of polygraphs to screen staffers: "I know they'll scare the hell out of people."


"Never Bitten, Twice Shy: The Real Dangers of Summer."

by David Ropeik and Nigel Holmes (opinion/editorial), The New York Times, August 9, 2003.

Ropeik is an expert on risk, and Holmes is a graphic designer. In this piece, they combine forces to show the way that our emotional fears can influence our perception of risk. They state, "When asked in the abstract about the term 'risk,' Americans correctly tend to talk in terms of statistical probability. Yet when they are faced with specific threats, emotion overrules logic pretty quickly - we fear the unlikely and are relatively unconcerned about the truly dangerous."

The authors provide data for a collection of familiar summertime risks (for example, skin cancer, fireworks, and West Nile virus). For each, they present the risk of injury, the risk of death, and a "fear index," measured by the number of news stories written about the topic last summer. Here are the data in table form.

Risk                              Odds of injury requiring   Odds of dying       Fear: newspaper articles
                                  medical treatment                              written last summer
---------------------------------------------------------------------------------------------------------
Skin cancer                       1 in 200                   1 in 29,500         102
Food poisoning                    1 in 800                   1 in 55,600         257
Bicycles                          1 in 1,700                 1 in 578,000        233
Lawn mowers                       1 in 5,300                 not available       53
Heat exposure                     not available              1 in 950,000        229
Children falling out of windows   1 in 12,800                1 in 2.4 million    89
Lyme disease                      1 in 18,100                not available       47
Fireworks                         1 in 32,000                1 in 7.2 million    59
Amusement parks                   1 in 34,800                1 in 71.3 million   101
Snake bites                       1 in 25,300                1 in 19.3 million   109
Drowning (while boating)          1 in 64,500                1 in 400,000        1,688
West Nile virus                   1 in 68,500                1 in 1 million      2,240
Shark attacks                     1 in 6 million             1 in 578 million    276

The data are accompanied by a graphic, which uses stylized icons to represent each risk category and outcome: medical treatment is represented by a bed, death by a coffin, and news stories by a newspaper. While this presentation has some visual appeal, it unfortunately has several logical problems. The intention was to show that the fear index is not indicative of the actual risks. The graphic presents the above information along a line, implying a linear relationship that doesn't actually exist. The scale runs from "more risk, less fear" on the left to "less risk, more fear" on the right, suggesting a negative association. But the stylized line is plotted with a positive slope, which seems to undermine the intended message. Furthermore, that message itself is problematic. In an actual scatterplot of fear index vs. death risk, drowning and West Nile virus do show up as high-fear, comparatively low-risk points, but they are outliers. The overall correlation coefficient between death risk and fear index is about -0.2; without the drowning and West Nile points it is about +0.1.
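
Instructors who want students to check these correlation figures can work directly from the table above. The sketch below is an illustration rather than the Times' own calculation: each death risk is read as 1/odds, rows with missing death odds are dropped, and since the results depend on these choices the computed values will only roughly match the quoted figures.

import numpy as np

# (risk, odds of dying expressed as "1 in N", fear index), taken from the table;
# rows whose death odds are "not available" are omitted
rows = [
    ("Skin cancer",                      29_500,      102),
    ("Food poisoning",                   55_600,      257),
    ("Bicycles",                         578_000,     233),
    ("Heat exposure",                    950_000,     229),
    ("Children falling out of windows",  2_400_000,   89),
    ("Fireworks",                        7_200_000,   59),
    ("Amusement parks",                  71_300_000,  101),
    ("Snake bites",                      19_300_000,  109),
    ("Drowning (while boating)",         400_000,     1_688),
    ("West Nile virus",                  1_000_000,   2_240),
    ("Shark attacks",                    578_000_000, 276),
]

death_risk = np.array([1.0 / n for _, n, _ in rows])
fear = np.array([f for _, _, f in rows], dtype=float)

print("all rows:", np.corrcoef(death_risk, fear)[0, 1])

# Drop the two high-fear outliers, drowning and West Nile virus
keep = np.array([name not in ("Drowning (while boating)", "West Nile virus")
                 for name, _, _ in rows])
print("without outliers:", np.corrcoef(death_risk[keep], fear[keep])[0, 1])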


"How Many in the Dark? Evidently Not 50 Million."

by Mike McIntire, New York Times, August 17, 2003, p. A29.

After last summer's major blackout in the Northeast, it was widely reported that 50 million people had lost power. That figure originated in a news release from the North American Electric Reliability Council (NERC). However, according to NERC spokesperson Ellen Vancko, the figure actually represented the total population of the affected area. Many of those people still had power. In the New York region, which represents 18 million people, the New York Independent System Operator reported that about 20 percent of available electricity remained on.

Getting a good estimate is complicated by the fact that power companies organize their accounts by bill-paying customers, which are typically households or businesses as opposed to individual users. Data from the largest companies in the affected area show that at least 10.5 million customers lost power. Finding out how many people that represents is another matter.


"Time to Hit the Panic Button? How We Decide What's Risky."

by Joel Achenbach, National Geographic [Online], September 2003.

Here is another article about risks. It begins with a remarkable story about a man who sat calmly reading the newspaper in a Washington, DC park on the fateful morning of 9/11, even as the Pentagon was on fire and evacuations were underway. This unflappable fellow had apparently sized up the situation and concluded that the attack was over!

The article goes on to discuss some of the famous psychological studies that have shown how our intuitive responses to risk often prevail over analytical reasoning. In one experiment, subjects could win a dollar by drawing a red jelly bean from one of two bowls. The first had 100 beans, 7 of which were red; the second had 10 beans, only one of which was red. A majority of subjects preferred the first, even knowing that the odds were worse, because on some emotional level the chances seemed better. Professionals do not necessarily fare better. When asked whether to release a patient from a mental hospital, clinicians were more likely to consent when the patient's chance of becoming violent was given as 20% than when the same risk was described by saying that 20 out of 100 such patients become violent.

The online article contains pointers to related resources. One is the University of Delaware's Disaster Research Center, which maintains an extensive set of links to research groups and news outlets that track disasters. Achenbach's bottom line advice: "Analyze your situation. Crunch the numbers. But don't be so logical you forget to run for your life."


"Talk of the Town: Making the Grade."

by Malcolm Gladwell, New Yorker, September 15, 2003, pp. 31-34.

Gladwell finds the "No Child Left Behind" law to be reminiscent of the industrial-efficiency mindset from a century ago. The logic runs that once the correct standards are set, educational productivity will follow. However, the article warns that school-to-school comparisons can be tricky. North Carolina, for example, has a program that annually recognizes the twenty-five most improved schools, as measured by average test scores. Smaller schools appear to be overrepresented on this list; the article notes that the higher variability in their averages provides at least a partial explanation for this. California is cited as another example. That state ranks schools on a 1000-point Academic Performance Index (A.P.I.), which is used to award state grants. For towns, changes on the order of a point or two translate into thousands of dollars in state aid, and can even affect real estate prices. But the article reports that "the average margin of error on the A.P.I. is something like twenty points, and for a small school it can be as much as fifty points."
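
The small-school effect is easy to demonstrate in class with a short simulation. The sketch below uses hypothetical enrollments and a hypothetical student-level score scale (neither is taken from the article): every school is assigned a true improvement of zero, yet the "most improved" list tends to be dominated by small schools, simply because averages over fewer students are more variable.

import numpy as np

rng = np.random.default_rng(0)
n_schools = 1000
enrollments = rng.integers(30, 1500, size=n_schools)   # hypothetical school sizes

# Every school's true improvement is zero; the observed change in a school's
# average score is pure sampling noise, with standard deviation proportional to
# 1/sqrt(enrollment) (difference of two independent yearly averages, student sd = 15)
student_sd = 15.0
observed_change = rng.normal(0.0, student_sd * np.sqrt(2.0 / enrollments))

top25 = np.argsort(observed_change)[-25:]               # the 25 "most improved" schools
print("median enrollment, all schools:", np.median(enrollments))
print("median enrollment, top 25:     ", np.median(enrollments[top25]))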

"No Child Left Behind" lets individual states choose their own tests for measuring proficiency. This opens the door for underperforming states to choose more lenient standards. Even with an agreed-upon test, defining proficiency is problematic. The article describes possible three methods: "contrasting groups," bookmark" and Jaeger-Mills. In the contrasting groups method, a large sample of teachers are asked to identify students they consider proficient; those students are tested, and their scores define the range for proficiency. In the bookmark method, educators rank questions in order of difficulty, and identify the dividing line between proficient and nonproficient. The Jaeger-Mills method has educators assign quantitative difficulty measures to test questions, and identify a passing score. The article cites a Kentucky study in which all three methods were tried on the state's middle school students. The contrasting groups analysis found 22.7% of the students to be proficient, compared with 61% by the bookmark method and 10.5% by Jaeger-Mills.


"A 7-Game World Series is Unusually Common."

by Kenneth Chang, New York Times, October 22, 2003, p. 6.

"Are 7-game World Series More Common Than Expected?"

by Ben Stein, Inside Science News Service [Online], October 17, 2003 (updated October 20, 2003).

These articles came out during the 2003 World Series. They reported that among the 50 World Series from 1952 to 2002 (there was no Series in 1994), 48% went the full seven games. Is this more than we should expect? The Times says that a "simple statistical calculation" gives the probability distribution for the length of a series under a fair coin tossing model for the outcomes of the games. That distribution is shown below, along with the observed lengths of the last 50 Series.

Length   Probability   Observed
4        1/8           8
5        1/4           8
6        5/16          10
7        5/16          24

The original Inside Science News Service article quotes Harvard statistician Carl Morris, who says, "There is only a 1% chance that at least 24 of the last 50 World Series would reach seven games if these simple probabilities were correct." The 20 October update takes another look, now considering all 94 Series from 1905-2002 played under the best-of-seven format. Only 11 of the 44 earlier Series went to a seventh game, giving a total of 35 of the 94 Series played from 1905 to 2002. According to Professor Morris, "The imbalance was really created between 1952 and 1977, when 15 out of 25 series went to their maximum length."
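
The fair-coin distribution and the one-percent figure quoted by Morris are straightforward to verify. Here is a short sketch of that calculation, assuming each game is an independent fair coin toss and that series lengths are independent from year to year.

from math import comb
from scipy.stats import binom

# P(series lasts exactly k games) = 2 * C(k-1, 3) * (1/2)**k, for k = 4, ..., 7
for k in range(4, 8):
    print(k, "games:", 2 * comb(k - 1, 3) * 0.5 ** k)

# Chance that at least 24 of 50 independent series go seven games
# when P(seven games) = 5/16
p7 = 2 * comb(6, 3) * 0.5 ** 7   # = 5/16
print("P(24 or more of 50):", binom.sf(23, 50, p7))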

The Inside Science News Service Web site provides a number of useful links for people who want to take a closer look, including Major League Baseball's history of World Series Outcomes from 1903-present and a University of Illinois site illustrating the derivation of the coin tossing model.


"Oh, No: It's a Girl! Do Daughters Dause Divorce?"

by Steven E. Landsburg, Slate, posted Thursday, October 2, 2003.

"Maybe Parents Don't Like Boys Better."

by Steven E. Landsburg, Slate, posted Tuesday, October 14, 2003.

"It's a Girl! (Will the Economy Suffer?)"

by David Leonhardt, New York Times, October 26, 2003, Sect. 3, p. 1.

Steven Landsburg's column "Everyday Economics: How the dismal science applies to your life" appears regularly in the online magazine Slate. In the pair of columns here, he reports on a recent study by economists Gordon Dahl of the University of Rochester and Enrico Moretti of UCLA. Looking at census data for each decade back to the 1940s, Dahl and Moretti discovered that couples with female children were more likely to divorce than those with male children. Landsburg's articles explore a number of potential explanations for this effect, which he classifies as falling into one of two broad categories. On one hand, having boys may contribute to the overall happiness of the family, making it more likely for couples to stay married. On the other hand, it may be that boys fare much worse than girls after a divorce, so there is a higher cost associated with breaking up. You can listen to an interview with Landsburg from NPR's "Day to Day" program, October 9, 2003.

The New York Times article gives more details on the findings:

Over the last 60 years, parents with an only child that was a girl were 6 percent more likely to split up than parents of a single boy. The gap rose to 8 percent for parents of two girls versus those of two boys, 10 percent for families with three children of the same sex and 13 percent for four. Every year, more than 10,000 American divorces appear to stem partly from the number of girls in the family.

The article notes that technology increasingly offers ways for couples to influence the sex of their children. A continued preference for boys could eventually unbalance the population's sex ratio. On a more positive note, the effect found by Dahl and Moretti seems to be diminishing over time. In the 1940s, couples with a single child were 8% more likely to divorce if the child was a girl than if it was a boy. By 2000, that figure had dropped to 5%.


"Air Aces Show Fame is not Fair."

by Jenny Hogan, New Scientist, October 18, 2003.

"How to be Famous."

by Robin Oakley-Hill (letter to the editor), New Scientist, November 10, 2003.

Mikhail Simkin and Vwani Roychowdhury, two electrical engineering professors at UCLA, examined the records of Germany's WWI flying aces. The goal was to see whether an ace's fame, as measured by the number of Google hits on his name, is proportional to the number of planes he actually shot down. Simkin and Roychowdhury feel that such proportionality would reflect an equitable distribution of fame. Instead, they found that the actual relationship looks exponential rather than linear.

Arguably the most famous flying ace is Manfred von Richthofen, the Red Baron. The 80 planes he shot down represent only 1.6% of the total number downed by German aviators. However, his name drew 4720 Google hits, which is 27% of the 17,674 total hits obtained for all 393 German aces identified by Simkin and Roychowdhury. For more details, you can download a pdf version of their paper, "Theory of Aces: Fame by Chance or Merit?"
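
A classroom version of this comparison might contrast a proportional model of fame with an exponential one. The sketch below uses synthetic victory and hit counts (placeholders rather than the aces data, which are available in the paper): under proportional fame the ratio of hits to victories should be roughly constant across aces, while under exponential fame log(hits) should be roughly linear in victories.

import numpy as np

rng = np.random.default_rng(1)
victories = rng.integers(5, 81, size=100)                        # synthetic victory counts
hits = 30 * np.exp(0.08 * victories + rng.normal(0, 0.5, 100))   # synthetic "Google hits"

# Proportional fame: hits / victories should be roughly constant
low, high = victories < 20, victories > 60
print("hits per victory (few victories): ", np.mean(hits[low] / victories[low]))
print("hits per victory (many victories):", np.mean(hits[high] / victories[high]))

# Exponential fame: log(hits) roughly linear in victories
slope, intercept = np.polyfit(victories, np.log(hits), 1)
print("log(hits) vs victories: slope = %.3f, intercept = %.2f" % (slope, intercept))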

The letter to the editor, however, argues that this whole analysis is misguided, and shows "the science of statistics going where it cannot really reach." According to Oakley-Hill, fame reflects glamour as well as achievement. In the case of Richthofen, being an actual baron and assembling a colorful "flying circus" were central to building his reputation.


William P. Peterson
Department of Mathematics and Computer Science
Middlebury College
Middlebury, VT 05753-6145
USA
wpeterson@middlebury.edu

