William P. Peterson

Middlebury College

Journal of Statistics Education Volume 9, Number 2 (2001)

Copyright © 2001 by the American Statistical Association, all rights reserved.

This text may be freely shared among individuals, but it may not be republished in any medium without express
written consent.

*Savvy Traveler*, March 2, 2001

savvy.mpr.org/show/ta/2001/20010302.shtml

The *Savvy Traveler* web site is the online companion
to a public radio program of the same name. The present
story discusses a February 21 air traffic safety report
issued by the National Transportation Safety Board (NTSB).
Full results are available at the NTSB web site

www.ntsb.gov/publictn/2001/SR0101.htm

This study found that the survival rate for accidents involving Part 121 carriers (large scheduled airliners) from 1983 through 2000 was 95.7 percent. For serious Part 121 accidents (those involving fire, serious injury, and either substantial aircraft damage or complete destruction), the survival rate was 56 percent. The report suggests that the public perception of survivability may be substantially lower than the 95.7 percent survival rate found by the study.

The *Savvy Traveler* presents a link to a web site, maintained
by Paul Bailey, which interactively provides estimates of the
chance that a flight you are about to take will suffer a crash.

Bailey says that he developed this application because he was tired of the media's preoccupation with airline accidents, and he wanted to show people how safe air travel really is.

by Keith Bradsher, *The New York Times*, February 14, 2001, p. C7.

Here is more bad news about the Ford Explorer. A study by the National Highway Traffic Safety Administration examined death rates for car drivers involved in a collision during the period from 1991 to 1997, based on the type of vehicle they collided with. The death rate was 10 per 1000 for crashes with a Ford Explorer, compared to only 5 to 7 per 1000 for collisions with other SUVs, such as the Jeep Grand Cherokee, Toyota 4Runner, and Chevrolet Blazer. In crashes with other cars, the driver death rate was 0.6 per 1000. A Ford spokesperson dismissed the findings, saying that the large error ranges in the study made the comparisons meaningless.

The article itself presents some (unfortunately murky) discussion of the error ranges, noting that

"limited numbers of crashes in the database for each model created a fairly wide range of errors in the calculations. For the Explorer, for example, there was a 95 percent chance that the true death rate of car drivers was 7 to 13 per 1000 crashes. The error range meant that it is statistically possible, although unlikely, that one or more of the other midsize sport utilities was deadlier than the Explorer."

Hans Joksch, the University of Michigan researcher who led the study, is quoted as saying that although "error ranges may be wide for individual models, the errors in comparing models are likely to be smaller." Pondering exactly what these statements mean might be an interesting class discussion topic.
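The quoted interval can be roughly reproduced with a standard normal-approximation confidence interval for a proportion. The crash and death counts below are hypothetical (the article reports only the rate and the interval, not the sample sizes); a sample of about 4,200 crashes at 10 deaths per 1,000 happens to give an interval close to the one quoted.

```python
import math

# Hypothetical counts (the article gives only the rate and the interval):
# assume roughly 4,200 recorded Explorer crashes with 42 car-driver deaths,
# a rate of 10 per 1,000, and compute a normal-approximation 95% interval.
n, deaths = 4200, 42
p = deaths / n
se = math.sqrt(p * (1 - p) / n)              # standard error of a proportion
lo, hi = p - 1.96 * se, p + 1.96 * se        # 95% confidence limits
print(f"95% interval: {1000*lo:.0f} to {1000*hi:.0f} deaths per 1,000 crashes")
```

With these assumed counts the interval comes out to about 7 to 13 deaths per 1,000 crashes, matching the range quoted in the article.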

The study concluded that more research was needed to distinguish how factors such as vehicle weight, stiffness, and height contribute to damaging other vehicles in a crash. Nevertheless, the article suggests that the higher death rate for the Explorer may result from its front bumper being higher than that of many cars. It turns out that Ford has redesigned this feature for the 2002 Explorer.

by John Allen Paulos, "Who's Counting," ABCNEWS.com, February 1, 2001.

abcnews.go.com/sections/scitech/WhosCounting/whoscounting010330.html

John R. Lott is a senior research scholar at the Yale Law
School. His 1998 book *More Guns, Less Crime* (University of
Chicago Press) applied multiple regression analysis to study
how gun laws relate to crime rates. He concluded that gun
control policies such as waiting periods and background
checks do little to reduce crime, whereas laws permitting
citizens to carry concealed weapons have a significant
deterrent effect. Lott framed his case in economic terms,
explaining that the increased "cost" represented by the
possibility of being shot convinces criminals to pursue other
careers.

Paulos does not object to the formal statistical calculations, but he is unconvinced by Lott's interpretations. Paulos warns against extrapolating from the regression equation to conclude that the trend will continue if the number of people with concealed weapons exceeds the range of present data. Reminding readers that correlation does not imply causation, Paulos points out that consumption of hot chocolate is also associated with less crime, but both are responses to cold weather. Finally, Paulos wonders how safe we would really feel as a society if we knew that almost everyone was carrying a concealed weapon.

by Haya El Nasser and Paul Overberg, *USA Today*, March 15, 2001, p. 3A.

*USA Today* has developed a number called the Diversity Index,
which it computes from Census data to measure the racial and
ethnic diversity of the population. In 1990, the index was
40; in 2000 it was 49. The article interprets this to mean
there is "a 49% chance that two individuals are different."
*USA Today* attributes the increase over the last
decade to the growth of the Hispanic population.
Unfortunately there is not much detail on how the index is
actually calculated.

We found a more complete discussion of the index on a web site entitled "Reporting Census 2000: A Guide for Journalists," which is maintained by Professor Stephen K. Doig of the Cronkite School of Journalism at Arizona State University.

cronkite.pp.asu.edu/census/index.htm

This site contains a wealth of online tools and references for working with Census data, as well as discussions of current issues. In particular, the diversity story can be found at the following link.

cronkite.pp.asu.edu/census/diversity.htm

We learn here that the index was introduced by *USA Today* after
the 1990 Census ("Analysis Puts a Number on Population Mix," *USA
Today*, April 11, 1991, p. 10A). The equation used in 1990
was:

1 - (W^2 + B^2 + AmInd^2 + API^2) * (H^2 + NonH^2)

The formula gives the probability that two randomly chosen individuals are different, by computing one minus the probability that they are the same. The variables "W", "B", "AmInd", and "API" are the probabilities of white, black, American Indian, and Asian/Pacific Islander, computed as the respective proportions of the total population. Similarly, "H" is the probability of Hispanic, and "NonH" = 1 - H. The first sum of squares computes the probability that two people are of the same race, the second computes the probability that they are either both Hispanic or both not. However, multiplying the two sums treats Hispanic ethnicity as if it were independent of race, which is a questionable assumption. There is one last modification. The 1990 Census included an "other races" category. This was handled by scaling the race probabilities W, B, AmInd and API to sum to 1, which amounts to applying the known categories proportionally to the unreported cases.

Census 2000 presented two changes. First, the API group was split into two subcategories: Asian (A) and Native Hawaiian/Other Pacific Islander (PI). This is easily accommodated by moving from four to five terms in the race factor. The added complication is that the Census now allows respondents to report more than one race. This was handled by "defaulting to diversity" -- that is, always declaring any pair involving a multi-race respondent to be a non-match. Operationally, this means that there is no term for the multi-race response. Thus the 2000 index is:

1 - (W^2 + B^2 + AmInd^2 + A^2 + PI^2) * (H^2 + NonH^2).

The web site presents some sample calculations to show that if only a small percentage of the population declares itself multiracial (5 percent in the example), then the diversity index would not be much different if a multi-race pair were declared to be a match rather than a non-match.
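The two versions of the formula are easy to put into code. The following sketch uses made-up population shares (not actual Census figures) to illustrate the web site's point: with about 5 percent of the population reporting more than one race, treating a multi-race pair as a match rather than a non-match barely moves the index.

```python
def diversity_index(race, hispanic, multi=0.0, multi_match=False):
    """USA Today diversity index: 1 - P(two random people match).

    race: proportions for the single-race categories (W, B, AmInd, A, PI),
          as fractions of the total population.
    hispanic: proportion Hispanic.
    multi: proportion reporting more than one race.
    multi_match: if True, count a pair of multi-race respondents as a racial
                 match (adds a multi**2 term); the published index "defaults
                 to diversity" and omits that term.
    """
    same_race = sum(p * p for p in race)
    if multi_match:
        same_race += multi * multi
    same_eth = hispanic**2 + (1 - hispanic)**2
    return 1 - same_race * same_eth

# Hypothetical shares: 70% white, 12% black, 1% American Indian, 8% Asian,
# 1% Pacific Islander, 5% multi-race; 13% Hispanic.
race = [0.70, 0.12, 0.01, 0.08, 0.01]
print(round(diversity_index(race, 0.13), 3))                              # 0.605
print(round(diversity_index(race, 0.13, multi=0.05, multi_match=True), 3))  # 0.603
```

Under these assumed shares the two conventions differ by only a few thousandths, consistent with the sample calculations on the web site.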

by Gina Kolata, *The New York Times*, March 25, 2001, p. 16.

There is a long-standing controversy regarding the value of courses that coach students for the SAT. The present article reports on a new study, which found that the prep courses have little value. This is consistent with claims repeatedly made by the College Board. However, the article says the new study is noteworthy because it uses a large national sample, and also because its author, a Berkeley graduate student named Derek Briggs, has no affiliation with either the College Board or the test prep companies.

Briggs wrote an article describing his study for the
Winter 2001 edition of *Chance* magazine. You
can view the article in pdf format from the "Previous
Featured Articles" links on the magazine's web site.

www.public.iastate.edu/~chance99/homepage.html

Not surprisingly, test preparation companies disagreed
with Briggs' conclusions. A spokesperson for Kaplan Inc.
objected to lumping all of the services into one group. As
he explained to *The New York Times*, "What we've seen
over the last 15 years is this huge increase in weekend
courses and one-day courses. This whole notion of grouping
commercial courses with this broad brush causes a problem
for us."

The article points out the additional concern that this was only an observational study: students were not randomly assigned to coached and non-coached groups.

by Robert J. Barro, *Business Week*, April 9, 2001, p. 20.

Robert Barro is a professor of economics at Harvard and a senior fellow at the Hoover Institution. He maintains a web page of his popular articles, where you can find a link to the present piece.

post.economics.harvard.edu/faculty/barro/popwritings.html

Barro laments the fact that Richard Atkinson, president of the University of California, has proposed eliminating SAT scores as a required part of the college application. (Clarification: Atkinson recommended dropping the aptitude test, the SAT I, but would still require the SAT II tests, which measure mastery of specific subjects.)

Barro argues that the SAT is highly useful as a predictor of college performance as well as of post-graduation wages. To back up the first claim, he analyzed data from the Education Department's National Postsecondary Student Aid Study (NPSAS), which reports every three years on students' grade point averages, admission test scores (including the SAT), and other family and school variables. Barro reports that he used the NPSAS studies for 1990, 1993, and 1996, which provided 33,000 observations, to study how well the admission tests predict college performance. He writes:

In this sample, admissions-test scores strongly predict college grades, though much of the individual variation in grades remains unexplained. Taking into account many other factors (including college attended, race and gender variables, and parental income and education), the t-statistic -- a measure of how closely two variables move together -- for the admissions test is 60. In comparison, researchers customarily regard a result as significant if this statistic exceeds 2. Therefore, admissions tests have strong predictive power for college grades. They are as good for senior as for freshman grades.

Although Barro does not explain just what t-test he is carrying out, this presumably represents a test of significance for a regression coefficient. (Would this be obvious to his audience in *Business Week*?)
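For readers unfamiliar with the terminology: in a regression, the t-statistic for a coefficient is the estimated coefficient divided by its standard error. The sketch below illustrates the computation on simulated data (not Barro's), where a hypothetical test score is built to predict a hypothetical GPA.

```python
import numpy as np

# Simulated illustration: with a genuine linear relationship and a large
# sample, the slope's t-statistic lands far above the usual cutoff of 2.
rng = np.random.default_rng(0)
n = 1000
score = rng.normal(1000, 150, n)                  # hypothetical test scores
gpa = 1.0 + 0.002 * score + rng.normal(0, 0.4, n)  # hypothetical grades

X = np.column_stack([np.ones(n), score])           # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, gpa, rcond=None)     # least-squares fit
resid = gpa - X @ beta
sigma2 = resid @ resid / (n - 2)                   # residual variance estimate
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])  # std. error of the slope
t_stat = beta[1] / se
print(f"slope t-statistic: {t_stat:.1f}")
```

The point of the exercise is only that a t-statistic of 60, as Barro reports, is enormous by the customary "exceeds 2" standard.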

Barro discusses the results of his study for women and minorities. He finds that the SAT's power for predicting college grades is not as high for women as for men, and concludes that women may have legitimate concerns about sex bias in SAT-based admissions policies. For minorities, however, he finds that this is not the case. He states that average college grades for black and Hispanic students are lower than those for white students with comparable SAT scores.

Needless to say, there has been a large amount of research done on the predictive power of SAT I tests. The College Board maintains a Research Notes page, containing links to recent reports

www.collegeboard.org/research/html/rn_indx.html

There you can find further discussion on all of these issues.

by Sara Robinson, *The New York Times*, April 10, 2001, p. D5.

The story describes a deceptively simple-sounding probability puzzle, which is stated as follows.

Three players enter a room and a red or blue hat is placed on each person's head. The color of each hat is determined [independently] by a coin toss... Each player can see the other players' hats but not his own.

No communication of any sort is allowed, except for an initial strategy session before the game begins. Once they have had a chance to look at the other hats, the players must simultaneously guess the color of their own hats or pass. The group [wins] if at least one player guesses correctly and no players guess incorrectly.

The same game can be played with any number of players. The general problem is to find a group strategy that maximizes its chances of winning... The naive strategy would be to have one group member guess and the others pass. This has probability 1/2 of success. Of course, it would not be a very interesting puzzle if this were the best possible plan! The article defers the solution to give readers a chance to find a better strategy (don't read the next paragraph if you want to try).

The following strategy achieves a probability of 3/4 of success.

Each player looks at the hats of the other two players. If they are different colors, he passes. If they are the same color, he guesses that his hat is the other color.

To see how this works, observe that the strategy succeeds whenever two people have the same color hat and the third is different, since this last person will be the only one to guess, and he will be correct. The strategy fails in the complementary case when all three have the same color hat, since all three players will guess incorrectly. We now have a simple counting problem: two out of eight possible assignments give all three players the same color hat, so the strategy wins with probability 6/8 = 3/4.
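Since there are only eight equally likely hat assignments, the 3/4 claim can be checked by brute force. The short enumeration below (the function name is ours) confirms that the strategy wins in exactly six of the eight cases.

```python
from itertools import product

def play(hats):
    """Apply the strategy to one assignment of hats (0 = red, 1 = blue).
    Returns True if the group wins: someone guesses, and no guess is wrong."""
    guesses = []
    for i, own in enumerate(hats):
        others = [h for j, h in enumerate(hats) if j != i]
        if others[0] == others[1]:           # sees two matching hats:
            guesses.append((i, 1 - others[0]))  # guess the opposite color
        # otherwise the player passes
    return bool(guesses) and all(g == hats[i] for i, g in guesses)

wins = sum(play(hats) for hats in product((0, 1), repeat=3))
print(wins, "of 8 assignments won")   # prints: 6 of 8 assignments won
```

Exactly the two monochrome assignments lose, giving the claimed probability of 6/8 = 3/4.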

The article goes on to explain that the problem has been solved for larger groups in the special case where the number of players is one less than a power of 2. The solution was suggested by coding theory--specifically by the Hamming code. Credit for discovering the problem is given to Dr. Todd Ebert of the University of California at Irvine. Much more information about the puzzle is available on Dr. Ebert's web site.

www.ics.uci.edu/~ebert/teaching/spring2001/ics151/puzzles.html

by Fox Butterfield, *The New York Times*, April 20, 2001, p. A10.

This article describes the findings of a study of all homicide cases in North Carolina for the period 1993 to 1997. Among cases where the death penalty was possible, rates for receiving the death penalty are broken down by the race of the victim and defendant. (Actually, only the categories "white" and "nonwhite" are used, but the article states that nonwhite in North Carolina means mostly African-American and some Hispanic.) The study found that people convicted of killing a white person are more likely to receive the death penalty than those convicted of killing a person who isn't white. The relevant information is summarized below (W = white, NW = not white).

Defendant   Victim   Number of cases   % receiving the death penalty

NW          W                    284                            11.6
W           W                    541                             6.1
NW          NW                   616                             4.7
W           NW                    80                             5.0

These findings are consistent with earlier studies showing that the race of the victim affects sentencing in capital cases. The article quotes David Baldus, a professor of law at the University of Iowa, as saying "the significance of the new study was that racial disparities found in the South by studies more than two decades ago still existed."
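As a classroom exercise, the counts implied by the table support a rough two-proportion comparison of white-victim versus nonwhite-victim cases. The death-penalty counts below are reconstructed from the published percentages, so the result is only approximate.

```python
import math

# Approximate counts recovered from the table's percentages.
cases_w  = 284 + 541                        # cases with a white victim
cases_nw = 616 + 80                         # cases with a nonwhite victim
dp_w  = round(0.116 * 284) + round(0.061 * 541)   # death sentences, white victim
dp_nw = round(0.047 * 616) + round(0.050 * 80)    # death sentences, nonwhite victim

p_w, p_nw = dp_w / cases_w, dp_nw / cases_nw
p_pool = (dp_w + dp_nw) / (cases_w + cases_nw)    # pooled rate under H0
se = math.sqrt(p_pool * (1 - p_pool) * (1 / cases_w + 1 / cases_nw))
z = (p_w - p_nw) / se                             # two-proportion z-statistic
print(f"rates: {p_w:.3f} vs {p_nw:.3f}, z = {z:.2f}")
```

The z-statistic comes out above 2, consistent with the study's conclusion that the victim's race is associated with the sentencing rate.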

The new study considered factors that might lead to these different rates. It turned out that when the victim was black, prosecutors were more willing to plea bargain, allowing defendants to plead guilty in return for a lesser sentence.

William P. Peterson

Department of Mathematics and Computer Science

Middlebury College

Middlebury, VT 05753-6145

USA
