Statistical Literacy as a Goal for Introductory Statistics Courses

Deborah J. Rumsey
The Ohio State University

Journal of Statistics Education Volume 10, Number 3 (2002), jse.amstat.org/v10n3/rumsey2.html

Copyright © 2002 by Deborah J. Rumsey, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.


Key Words: Introductory statistics; Statistical literacy.

Abstract

In this paper, I will define statistical literacy (what it is and what it is not) and discuss how we can promote it in our introductory statistics courses, both in terms of teaching philosophy and curricular issues. I will discuss the important elements that comprise statistical literacy, and provide examples of how I promote each element in my courses. I will stress the importance of and ways to move beyond the "what" of statistics to the "how" and "why" of statistics in order to accomplish the goals of promoting good citizenship and preparing skilled research scientists.

1. Introduction: Goals for our Introductory Statistics Courses

Many statistics educators agree that any introductory statistics course should raise students’ awareness of data in everyday life and prepare them for a career in today’s “age of information.” To achieve this objective, in my view, we must work toward two overarching goals for our introductory statistics courses. First, we want our students to be good “statistical citizens,” understanding statistics well enough to be able to consume the information that they are inundated with on a daily basis, think critically about it, and make good decisions based on that information. Some researchers call this “statistical literacy.”

The second goal for our introductory statistics courses, and one that in my opinion is often underemphasized, is to develop research scientist skills in our students. We must be sure to promote the use of the scientific method in all of our students: the ability to identify questions, collect evidence (data), discover and apply tools to interpret the data, and communicate and exchange results. While it is true that many of our students may never conduct a scientific study of their own, it is hard to imagine a student in today’s society who will never encounter data or statistical results over the course of a career. Statistics is involved in every aspect of the scientific method. It should also be the case that statistics presented in our daily lives are based on either the proper or improper use of the scientific method.

I believe that for students to achieve both of these overarching goals, they need to understand and use statistical ideas at many different levels. To begin, they need a certain level of competence, or understanding, of the basic ideas, terms, and language of statistics. But being a good statistical citizen and research scientist requires more than this; it requires that the student be able to explain, decide, judge, evaluate, and make decisions about the information. These require additional skills in statistical reasoning and thinking, but the foundation for these skills should first be developed at the statistical literacy level.

In order to understand where statistics fits into achieving our goals, consider what I call the “chain of statistical information.” Think of the latest statistic that you have encountered on the television. What process did this information go through in order to get to you? First, the information was produced, or generated, by a researcher (for example, a medical doctor, a professor, the U.S. government, or an independent company). Then the information was picked up by someone, deciphered, and communicated to the intended audience (often by members of the media). Finally, the information was consumed, or received, by the audience (the general public). This extends the ideas presented by Gal (2000), who identified students as (mainly) consumers and (at times) producers of information.

Regardless of where a person is involved in the chain of statistical information, there will be a need for a basic understanding of the concepts and language, a level of reasoning (the abilities to question, compare, and explain) and a level of statistical thinking (applying the ideas to new problems and identifying questions of your own).

In this paper, I will define statistical literacy (what it is and what it is not) and discuss how we can promote it in our introductory statistics courses, both in terms of teaching philosophy and curricular issues. I will discuss the important elements that comprise statistical literacy, and provide examples of how I promote each element in my courses. I will stress the importance of and ways to move beyond the “what” of statistics to the “how” and “why” of statistics in order to accomplish the goals of promoting good citizenship and preparing skilled research scientists.

2. What is Statistical Literacy? Identifying the Learning Outcomes

The statistics education literature includes several definitions of statistical literacy. Some of these definitions are given by the following:

A review of the literature tells that many statistics educators, as well as national councils and education boards, have formulated various lists containing the basic requirements, or learning objectives, for someone who is statistically “literate” or “competent.”

Watson (1997) identifies three stages as components of the "ultimate aim" of development of statistical literacy:

  1. the basic understanding of statistical terminology,

  2. the understanding of statistical language and concepts embedded in a context of wider social discussion, and

  3. the development of a questioning attitude which can apply more sophisticated concepts to contradict claims that are made without proper statistical foundation.

As Moore (1998a) stated, “What statistical ideas will educated people who are not specialists require in the twenty-first century? That is the issue of statistical literacy. What specific concepts and skills will be needed in the context of specific jobs? That is the issue of statistical competence.”

Gal (2000) has identified characteristics of a scientific study that a consumer of information should be able to discuss at a basic level:

Note that a discussion of each of these items begins by understanding the terminology and identifying each characteristic within the context of the problem. At the next level, the student would be asked to describe the results of the study by interpreting them. Students may be asked to produce data on a similar study. Then they might be asked to evaluate the study (which involves critical thinking, and questioning the study at every phase). Finally, the student may be asked to communicate this information to peers. Some of these tasks require basic statistical competence, and others require higher order knowledge skills, such as statistical reasoning and thinking.

Additional lists of learning outcomes for statistical literacy are provided by Cobb (1992), Moore (1998a, 1998b), and Garfield (1999), among others. Each list seems to encompass two different types of learning outcomes for our students: being able to function as an educated member of society in this age of information, and having a basic foundational understanding of statistical terms, ideas, and techniques.

A review of the many papers published and presented on this topic reveals that the phrase “statistical literacy” is not defined consistently. In light of the above discussion involving the overarching goals for our students, it is clear that while all of these definitions apply to the goals, the use of the phrase “statistical literacy” is too broad. I will attempt to clarify the issues by omitting the phrase “statistical literacy” from my discussion, and instead I will use two distinct phrases to refer to the two distinct learning outcomes that we have discussed. “Statistical competence” refers to the basic knowledge that underlies statistical reasoning and thinking, and “statistical citizenship” refers to the ultimate goal of developing the ability to function as an educated person in today’s age of information. Statistical citizenship may very well require high order statistical reasoning and thinking.

3. Statistical Competence: What it is, and what it is not!

Basic statistical competence, as it is defined above, involves the following components:

  1. data awareness,

  2. an understanding of certain basic statistical concepts and terminology,

  3. knowledge of the basics of collecting data and generating descriptive statistics,

  4. basic interpretation skills (the ability to describe what the results mean in the context of the problem), and

  5. basic communication skills (being able to explain the results to someone else).

3.1 Promoting Data Awareness

Data awareness provides a motivation for students to want to learn statistics. The important aspects of data awareness include the following:

  1. data are a part of everyday life and are an important component of all aspects of the working world,

  2. data are often misused, leading to misinformation, and

  3. decisions made based on data can have a strong impact on our lives.

Resources for promoting data awareness include Chance News (Snell and Finn 1992, www.dartmouth.edu/~chance/chance_news/news.html), the Data and Story Library (DASL, lib.stat.cmu.edu/DASL), journal articles, and examples that you or your students find in the media. Look for examples in which non-statisticians explain or debate statistical ideas, or examples that contain misleading statistics. Pay special attention to examples that highlight the impact of the decision that was made (for example, the decision to make airbags powerful enough to protect the “average man” ended up killing children and some women). Students seem to give more credibility to examples gleaned from the real world; these examples can serve as a powerful motivator for students to ask more questions. Remember, what is important and interesting to us as teachers may not be interesting and important to our students. The best way to find out what is interesting to students is to ask them to provide their own ideas, and then to save those ideas for future use.

We can promote data awareness by always providing a relevant context for the ideas that we present in class. This holds throughout the course, not just during the first few days. In my opinion, we should always present data in a context so that students can see why it was collected, and what the researcher wants to know about it -- even if the questions we ask of our students don’t necessarily require full data analysis and interpretation. After all, if our students will be expected to work with statistics in their jobs, shouldn’t they gain insight into the entire process of the scientific method, rather than just the second half of it -- data analysis? To discourage students from becoming too skeptical about statistics, it is also important to provide examples where statistics are used correctly, not only to show the bad examples with which we are all so familiar.

To assess data awareness, I typically ask students to answer questions of an open-ended nature:

  1. Give an example of how you might encounter statistics in your chosen major.

  2. Describe three ways in which statistics are presented to you through the media.

  3. Identify the source of this study, discuss how statistics were used to answer the question, and describe the impact of the study on the general public.

In my experience, students typically don’t come into the class knowing why they need to know the subject of statistics. My advice is to get them on your side early and keep them there with relevant, interesting examples.

3.2 Promoting the Understanding of the Basics

We are all familiar with the popular For Dummies ... series of books geared to educate the “common man” about any given topic, from Web page design to choosing wines for dinner. If someone were to write such a book about statistics, what would be included? I assume that a good book about statistics for the “person on the street” would include the basics. What exactly are the basics of statistics? Most lists are similar to the ones given by Cobb (1992), Moore (1998a, 1998b), Garfield (1999), Snell (1999), and Gal (2000), among others.

Which specific content and methods should appear on an introductory statistics syllabus is a topic that has received much attention in the literature; it is not something I will attempt to resolve in this paper. However, I would like for us to rethink what it means for a student to understand any statistical idea.

What does it mean to understand a statistical idea? Basically, understanding a statistical idea means to be able to relate the concept within a nonstatistical subject matter; to explain what the concept means, to use it in a sentence or within a broader problem, and to answer questions about it. These skills are critical to developing a student’s ability to interpret and carry out research involving data. How should a student demonstrate understanding? There are several misconceptions that statistics education researchers are trying to dispel. Here are some of them.

Other concepts that are hard to understand, partially because of the way they are presented in a traditional introductory statistics textbook, include the Central Limit Theorem, sampling distributions, spread or variability in the population (as opposed to variability in estimates), critical values of z, and standard error. In my opinion, textbooks often miss the opportunity to connect big ideas with common threads; rather they present the theory, then discuss the calculation, and apply it to an example at the end. I’d like to see a nice example in the beginning, followed by the statistical concept and how it connects to the big picture, and then add math and calculations (only) where they are useful in terms of understanding and practicing the statistics. If we take the ideas and present them using more relevant and usable language for our students, connecting the big ideas with common threads, I believe students will be able to incorporate those ideas more readily into their own knowledge base.

In defense of formulas and calculations, they do play an important role in a student being statistically competent. However, they should not be the focal point, or the endpoint of their knowledge.

How can we promote statistical understanding in our students? I believe the HOW we do statistics must be motivated by WHY we should do the statistics and WHAT we are trying to do with the statistics. Promoting the HOW at the expense of the WHY is a mistake. Examples of this mistake include students who spend enormous amounts of time calculating a correlation coefficient, when a simple division of the X-Y plane into four quadrants using the mean of X and the mean of Y can provide a simple intuitive understanding of what the correlation coefficient measures. The calculations seem to become an obstacle to understanding.
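The quadrant idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data (hours studied and exam scores for eight students) and the function names are hypothetical. Points in the upper-right and lower-left quadrants push the correlation coefficient up, and points in the other two quadrants pull it down.

```python
from math import sqrt
from statistics import mean


def quadrant_counts(xs, ys):
    """Count points by quadrant type, using x = mean(x) and y = mean(y) as axes."""
    mx, my = mean(xs), mean(ys)
    same_sign = 0      # upper-right or lower-left quadrant
    opposite_sign = 0  # upper-left or lower-right quadrant
    for x, y in zip(xs, ys):
        product = (x - mx) * (y - my)
        if product > 0:
            same_sign += 1
        elif product < 0:
            opposite_sign += 1
    return same_sign, opposite_sign


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, for illustration only
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 60, 70, 72, 80, 85]

same_sign, opposite_sign = quadrant_counts(hours, scores)
r = pearson_r(hours, scores)
print(f"same-sign quadrants: {same_sign}, opposite-sign quadrants: {opposite_sign}")
print(f"correlation coefficient: {r:.2f}")
```

When nearly all points fall in the same-sign quadrants, as in this made-up data set, the correlation is strongly positive; a roughly even split between the two quadrant types would correspond to a correlation near zero.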

Here are some ideas that I have to promote understanding among our students. First, help students learn proper techniques and tools, when they are used, and why they are used. Present them with definitions that they can take hold of, rather than struggle against. Emphasize big ideas and common threads, and not the teaching and testing of knowledge as it falls within certain chapters of a book. Keep cross-referencing and comparing ideas to highlight connections. Be realistic about the language used in class, and be willing to give up some “technicalities” for broader understanding. For example, using plus or minus “2” standard errors instead of 1.96 has meant the world to my students, and to me; what little precision I have lost has been made up in intuitive understanding of the big idea of margin of error. And finally, be willing to let go of the little side trails to save the big picture; for example, does spending a whole day learning how to calculate “n choose k” really help students understand a binomial model for binary data?
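To see how little precision is lost by using 2 in place of 1.96, consider the following minimal sketch. The poll (52% of 1,000 respondents) and the function name are hypothetical, and the standard error formula is the usual one for a sample proportion.

```python
from math import sqrt


def margin_of_error(p_hat, n, multiplier):
    """Margin of error for a sample proportion: multiplier times standard error."""
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    return multiplier * standard_error


# Hypothetical poll, for illustration only
p_hat, n = 0.52, 1000

exact = margin_of_error(p_hat, n, 1.96)
rounded = margin_of_error(p_hat, n, 2)

print(f"margin of error with 1.96: {exact:.4f}")
print(f"margin of error with 2:    {rounded:.4f}")
print(f"difference:                {rounded - exact:.4f}")
```

For this made-up poll the two margins differ by less than a tenth of a percentage point, which is the kind of trade-off described above.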

I feel that we should distribute statistical terms and ideas on a “what you need to know, when you need to know it” basis, and make sure students “need to know.” Just because a textbook author includes an idea in a book doesn’t mean you have to teach it. Be selective. Some ideas, such as “standard error,” can take you a long way in a course, and are worth spending a great deal of time developing. Other terms, like “precision,” “accuracy,” “reliability,” “bias,” and “consistency” all sound the same to students. In my opinion, splitting hairs about these terms will only create confusion and frustration. My advice is to choose the most important ideas, and stick to them. Big ideas and common threads have taken me a long way in my courses. David Moore taught me to consider that if you are not going to use an idea later, then why spend a lot of time on it now? This has become a teaching theme for me (the first things to go were permutations and combinations).

We can assess student understanding with questions that ask them, for example, to identify ("what is the population?" or "what is the sample?"), to explain what statistical technique should be used, to explain how to do it, or to think about what happens if we change the numbers. We can also give students many opportunities to explain and discuss statistical ideas with each other, and watch closely while they do this. I am convinced that students who are most likely to be good statistical citizens and research scientists are those who are successful at incorporating statistical concepts, terms, and ideas into their own language.

To extend this idea of understanding to collaboration, I assign student teams on the first day of class, and I rotate teams randomly several times during the semester. Over the semester, I see a language emerge in the class that is correct, but unlike the language I use as a teacher to explain ideas. This simulates a true collaborative, team-based, student-driven environment, much like the environment that many students will encounter in the workplace. If I lectured all the time, think of how much my students (and I, myself) would be missing!

We have to be aware that sometimes students’ misunderstandings of a basic idea can cause big problems down the road. My advice is to not assume too much. For example, after discussing histograms in the traditional way, along with the concepts of descriptive statistics that help us discuss shape, center, and spread, I asked students to choose (from three possibilities) a picture of a data set that had the least amount of variability (I gave them no numbers). The options were a bell-shaped curve, a flat histogram, and a histogram where all the values were located in two bars that were very close together. Almost 3/4 of my students chose the uniform distribution. I was astounded!

When I asked them to explain, they said the data showed no change from left to right in the uniform distribution. I then realized they had the wrong concept of a histogram, and that it was probably based on their experiences with the media -- which mostly show line graphs (values tracked over time). In that case a flat graph does indicate lack of variability. Now when I hear teachers talk about how students “hit a wall” when it comes to the Central Limit Theorem, I wonder, could it be that they don’t see the sequence of ever tightening distributions as an indicator of less and less variability?
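The three pictures can be compared numerically with a short sketch. The bar heights below are hypothetical (the students saw no numbers); each data set has 18 values on a 1-to-9 scale, and the standard deviation is used as one possible measure of variability.

```python
from statistics import pstdev


def expand(bar_heights):
    """Turn {value: bar height} into the list of raw data the histogram summarizes."""
    return [value for value, height in bar_heights.items() for _ in range(height)]


# Hypothetical bar heights, for illustration only
bell_shaped = expand({2: 1, 3: 2, 4: 3, 5: 6, 6: 3, 7: 2, 8: 1})
flat = expand({value: 2 for value in range(1, 10)})
two_close_bars = expand({4: 9, 5: 9})

spreads = {
    "bell-shaped": pstdev(bell_shaped),
    "flat": pstdev(flat),
    "two close bars": pstdev(two_close_bars),
}

# Print from least to most variable
for name, spread in sorted(spreads.items(), key=lambda item: item[1]):
    print(f"{name:15s} standard deviation = {spread:.2f}")
```

With these made-up heights, the flat histogram has the largest standard deviation of the three, not the smallest: equal bar heights mean the values are spread evenly across the whole range.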

3.3 Data production and results

When we give students opportunities to produce their own data and find basic statistical results, I think we are helping them to gain ownership of their own learning. We also promote their skills to take charge of a problem involving data, much like they will have to do in the workplace. If they are also given the opportunity to define their own research questions, then producing their own data will greatly increase their motivation to “do some statistics” in order to see what is going on, and to help answer their research question. This may also help them to discover or determine methods and techniques on their own. For example, I gave my students the Mark McGwire/Sammy Sosa home run data set, and asked students to first write a question, and then organize the data in order to answer that question. It’s amazing how many interesting data displays I got. It’s also amazing how much you don’t have to tell students.

When I teach a specific technique, I try to develop the big ideas that underlie the technique, and then use it to generalize to other situations. For example, once students see how a confidence interval is created to estimate a population mean in a general way, they can quickly see how it applies to a proportion, to two samples, and even to small samples, with small variations.
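One way to picture this generalization is a single "estimate plus or minus a multiple of the standard error" recipe that is reused across settings. The sketch below is illustrative only: the data, the function names, and the default multiplier of 2 are assumptions, and a small sample would call for a somewhat larger multiplier.

```python
from math import sqrt
from statistics import mean, stdev


def confidence_interval(estimate, standard_error, multiplier=2):
    """General recipe: estimate plus or minus multiplier * standard error."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


def se_mean(data):
    """Standard error of a sample mean."""
    return stdev(data) / sqrt(len(data))


def se_proportion(p_hat, n):
    """Standard error of a sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical commute times in minutes, for illustration only
commute = [22, 30, 25, 28, 35, 27, 31, 24, 29]
mean_interval = confidence_interval(mean(commute), se_mean(commute))

# Hypothetical survey: 50 of 100 respondents say yes
prop_interval = confidence_interval(0.5, se_proportion(0.5, 100))

print("interval for the mean:      ", mean_interval)
print("interval for the proportion:", prop_interval)
```

Only the estimate and its standard error change from one setting to the next; the structure of the interval stays the same.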

I can’t stress enough the importance of putting problems and tasks within a context. Mallows (1998) calls this the “Zeroth Problem,” the problem that comes before the data were ever collected. Students are so much more motivated by research questions. Statistics are the tools that help them to answer the questions. Once they realize that, they appreciate statistics so much more.

3.4 Interpretation at a Basic Level

Given results (statistics, graphs, computer output, tables, or raw data) can a student explain in his or her own words what the results mean? Again, a research question will provide a proper context. When I ask students to give the conclusion of a hypothesis test, I stress that they must tell me what the decision was (reject H0 or do not reject H0), and why they decided that (p-value and test statistic). This is called the “statistical conclusion.” But this is not the final answer. They also need to explain what it means in the context of the original research question (since we rejected H0, we conclude that there is a home field advantage in Major League baseball, for example). We call this the “research conclusion.” The ability to interpret statistical information and draw proper conclusions is critical in the workplace, and those who are good at it will be more able to advance and be successful in their positions. Moreover, this is the truly fun part of statistics for students -- seeing it used to answer questions in which they are most interested. Students don’t have to love statistics for statistics’ sake; they can come to love statistics for what it can do -- help them to understand their world.

How can we assess the ability of students to interpret results? I think the best thing we can do is give them opportunities to interpret their own results, using their own data. I have had the best success with this; it gives students ownership, and allows them to really focus on what information they want to tell us, and what it means to them. Data ownership goes a long way here. I also try to provide situations in which teams of students work together, addressing different pieces of a bigger problem within a common context. I think this helps students to simulate a collaborative work environment that is focused, yet offers opportunities for individual choices.

On exams and homework, I like to ask questions that specifically focus on interpretation (not on the whole process of conducting the hypothesis test, for example). As an illustration, suppose a researcher wants to know if there is any difference in average grade point average between male and female students. His test statistic, based on 100 male and 100 female students, is 3.28. The p-value is less than 0.0001. What is your conclusion?
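The two-part answer expected here (a statistical conclusion followed by a research conclusion) can be sketched as a small function. The function name, the wording of the conclusions, and the 0.05 significance level are assumptions made for illustration; the test statistic and the p-value bound are taken as given from the question.

```python
def conclusions(test_statistic, p_value, alpha, claim_if_rejected, claim_if_not_rejected):
    """Return (statistical conclusion, research conclusion) for a test of H0."""
    reject = p_value < alpha
    decision = "reject H0" if reject else "do not reject H0"
    statistical = (
        f"{decision} (test statistic {test_statistic}, "
        f"p-value {p_value} compared with significance level {alpha})"
    )
    research = claim_if_rejected if reject else claim_if_not_rejected
    return statistical, research


# The question reports the p-value only as "less than 0.0001",
# so that bound is passed in; alpha = 0.05 is an assumed significance level.
statistical, research = conclusions(
    test_statistic=3.28,
    p_value=0.0001,
    alpha=0.05,
    claim_if_rejected=(
        "the data give evidence of a difference in average grade point "
        "average between male and female students"
    ),
    claim_if_not_rejected=(
        "the data do not give evidence of a difference in average grade "
        "point average between male and female students"
    ),
)
print("Statistical conclusion:", statistical)
print("Research conclusion:   ", research)
```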

I also notice that when students create statistical results themselves, they seem to develop a better understanding of how to interpret the results. For example, when covering Chi-Square tests as a tool for determining whether two qualitative variables are associated, I used to find that students had difficulty reading the two-way tables that I gave them. Students were unable to discern which direction the conditional probabilities were in; they seemed to get rows and columns mixed up. I wondered if they would do better if they created their own tables from the raw data, instead of trying to interpret someone else’s table the first time around. This worked. In one class of 50 students, not one of them missed the interpretation when they organized the tables themselves. From here, they went on to understand how a given table can be interpreted.
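Building a two-way table from raw data, and reading conditional proportions in both directions, might look like the following sketch. The raw data (100 hypothetical respondents classified by gender and by a yes/no opinion) and all names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical raw data: one (gender, opinion) pair per respondent
raw = (
    [("female", "yes")] * 30
    + [("female", "no")] * 20
    + [("male", "yes")] * 15
    + [("male", "no")] * 35
)

cell_counts = Counter(raw)                              # the two-way table
row_totals = Counter(gender for gender, _ in raw)       # totals by gender
col_totals = Counter(opinion for _, opinion in raw)     # totals by opinion

# Conditioning on the row: proportion giving each opinion WITHIN a gender
opinion_given_gender = {
    cell: count / row_totals[cell[0]] for cell, count in cell_counts.items()
}

# Conditioning on the column: proportion of each gender WITHIN an opinion
gender_given_opinion = {
    cell: count / col_totals[cell[1]] for cell, count in cell_counts.items()
}

print("P(yes | female) =", opinion_given_gender[("female", "yes")])
print("P(female | yes) =", round(gender_given_opinion[("female", "yes")], 3))
```

The same cell of the table gives two different proportions depending on whether the row total or the column total is used as the denominator, which is the direction students tended to mix up.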

Interpretation should also involve some basic ability to assess the correctness of the technique used. For example, a bar chart shows how often certain numbers are chosen in a state lottery. The graph makes it appear that the number “2” is chosen much more often than the others. Why is this? The answer is because the scale for the graph starts at 200 and goes in increments of 5. If I ask a student to interpret this and he says that 2 is chosen most often, he would be wrong. Moore (1997) provides many excellent examples that teach “data sense.”
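The effect of a vertical scale that starts at 200 can be quantified with a short sketch. The lottery counts below are hypothetical; only the baseline of 200 comes from the example above.

```python
# Hypothetical counts of how often each number was chosen, for illustration only
times_chosen = {"1": 212, "2": 228, "3": 209}
axis_start = 200  # the vertical scale starts here rather than at zero


def drawn_height(count, baseline):
    """Height of the bar as it appears on a chart whose axis starts at baseline."""
    return count - baseline


apparent_ratio = (
    drawn_height(times_chosen["2"], axis_start)
    / drawn_height(times_chosen["1"], axis_start)
)
actual_ratio = times_chosen["2"] / times_chosen["1"]

print(f"bar for '2' looks {apparent_ratio:.2f} times as tall as the bar for '1'")
print(f"but '2' was chosen only {actual_ratio:.2f} times as often")
```

With these made-up counts, the bar for “2” appears more than twice as tall as its neighbor even though the underlying counts differ by less than 10 percent.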

3.5 Basic Communication Skills

Basic statistical communication skills involve reading, writing, demonstrating, and exchanging statistical information. While interpretation demonstrates a student’s own understanding of the statistical ideas, communication involves passing the information on to another person in a way that they will understand it. This is an entirely different skill. Among recruiters who hire new graduates, one of the top criteria they look for is the ability to communicate their ideas to others. Certainly this is a skill worth developing in our students when it comes to statistical information.

Communication involves being able to decipher something from one “language,” “style,” “notation,” or “wording” to another. The key to developing good communication skills in our students is to expose them to different styles. I use team teaching and team learning in my introductory classes, and it really helps students to see another viewpoint. They also see that teachers might not always be in perfect agreement. I believe that this is healthy for students.

Exposure to alternative notations, symbols, and definitions is essential for good statistical citizenship, in my opinion. We can teach students that “our” notation for the sample mean is “x-bar,” while others may call this “mu-hat.” This helps students broaden their understanding, and not retain such a narrow view of what we have taught them. How many times has a student accused you of “tricking them” by changing the numbers, the letter of the variable used, or the context of the problem? How many times have you felt bound by a textbook to use its notation? Broadened exposure will help minimize these problems.

I also stress communication skills through simulations of real life experiences. Examples include the following:

  1. Write a letter to the editor explaining why a graph showing the number of crimes for 1997 versus 1987 should have also included the population size for each year.

  2. Suppose your friend is a journalist trying to figure out why two polls regarding public opinion on campaign finance reform do not agree, because she has to submit an article on it. What would you tell her?

  3. Organize and conduct a debate over whether or not city funds should be spent to hire more police officers. Use statistics to back up your points.

  4. You are working at a factory and are convinced that the company is losing money by keeping a large inventory of parts that won’t be needed for several years, if the warranty information is correct. How do you make your point to your manager?

  5. Find someone who has never had a statistics class, and explain the idea of margin of error to them.

  6. Conduct a mock TV interview where you are asked to answer questions from the public regarding polls -- how they are conducted, why “no one ever calls me to participate,” why you only need about 1500 people to be reasonably accurate, and why we sometimes get conflicting results from different polls.
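For the first exercise in the list above, the argument a student might make in the letter can be sketched numerically. All figures are hypothetical: the crime counts and population sizes are invented to show how a raw count and a per-capita rate can move in opposite directions.

```python
def rate_per_thousand(crimes, population):
    """Number of crimes per 1,000 residents."""
    return crimes * 1000 / population


# Hypothetical figures for one city, for illustration only
crimes = {1987: 4000, 1997: 5000}
population = {1987: 80_000, 1997: 125_000}

rates = {year: rate_per_thousand(crimes[year], population[year]) for year in crimes}

for year in sorted(rates):
    print(f"{year}: {crimes[year]} crimes, {rates[year]:.1f} per 1,000 residents")
```

In this made-up city the number of crimes rises by 25 percent over the decade while the rate per 1,000 residents falls, which is why a graph of counts alone can mislead.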

4. Conclusions: Statistical Competence Alone?

The goals of our introductory statistics courses are two-fold. First we want to promote and develop good statistical citizenship, and we also want to produce good research scientists (on whatever level students will be involved). In order to do that, we begin by developing a basic foundation of knowledge of statistical concepts and ideas, which I call statistical competence. Statistical competence promotes and develops skills in data awareness, production, understanding, interpretation, and communication.

Is this all that a good statistical citizen or research scientist needs to successfully function in today’s age of information? In my opinion, the answer is no. It is a good beginning, but it is not the end. Once students have a basic functional knowledge, they need the ability to question, to inquire, to probe, to compare and contrast, to explain, and to evaluate at a higher level. They might know what a matched pairs experiment is, and that it reduces variability, but they need to be able to explain HOW this results in a more powerful hypothesis test, and WHY we are able to draw a cause-effect relationship in some cases, and not in others. They also need to be able to think on their own, to identify their own questions, and come up with their own solutions using statistics. And that requires statistical reasoning and thinking. However, it is important to note that statistical competence is a requirement for statistical reasoning and thinking. If you haven’t got the basic ideas down, you won’t be able to build upon them.
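The HOW behind the matched pairs example can be illustrated with a small sketch that addresses only the variability and power part of the question, not the cause-effect part. The before-and-after measurements on six subjects are hypothetical, and the unpaired analysis is shown only for comparison, as if the two columns had come from independent groups.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements on the same six subjects
before = [140, 152, 138, 160, 147, 155]
after = [135, 149, 132, 156, 141, 151]
n = len(before)

# Matched pairs analysis: work with the within-subject differences
differences = [b - a for b, a in zip(before, after)]
paired_se = stdev(differences) / sqrt(n)
t_paired = mean(differences) / paired_se

# For comparison: treat the two columns as independent samples
unpaired_se = sqrt(stdev(before) ** 2 / n + stdev(after) ** 2 / n)
t_unpaired = (mean(before) - mean(after)) / unpaired_se

print(f"paired:   standard error = {paired_se:.2f}, t = {t_paired:.2f}")
print(f"unpaired: standard error = {unpaired_se:.2f}, t = {t_unpaired:.2f}")
```

The average difference is the same in both analyses, but pairing removes the subject-to-subject variability from the standard error, so the same difference yields a much larger test statistic.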

Finally, I do not want to imply that in a statistics course, all the statistical competence is built first, then all the reasoning, then all the thinking. I think that it is important to always present a statistical problem in a relevant context with a legitimate, and relevant, research question. I think that as students learn more, they will ask more involved questions, and we can revisit and reinforce the scientific method over and over again throughout the course. Each time they go through the process, they will reinforce their understanding of terms and concepts, and their reasoning and thinking skills.


References

Chance, B. L. (1997), “Experiences with Authentic Assessment Techniques in an Introductory Statistics Course,” Journal of Statistics Education [Online], 5(3). (jse.amstat.org/v5n3/chance.html)

Cobb, G. (1992), “Teaching Statistics,” in Heeding the Call for Change: Suggestions for Curricular Action, ed. L. A. Steen, Washington, DC: Mathematical Association of America, 3-43.

Gal, I. (ed.), (2000), Adult Numeracy Development: Theory, Research, Practice, Cresskill, NJ: Hampton Press.

Garfield, J. (1999), “Thinking about Statistical Reasoning, Thinking, and Literacy,” Paper presented at First Annual Roundtable on Statistical Thinking, Reasoning, and Literacy (STRL-1).

Mallows, C. (1998), “1997 Fisher Memorial Lecture: The Zeroth Problem,” The American Statistician, 52, 1-9.

Moore, D. S. (1997), Statistics: Concepts and Controversies (4th ed.), New York: W. H. Freeman and Company.

----- (1998a), “Shaping Statistics for Success in the 21st Century: A Panel Discussion,” Kansas State University Technical Report II-98-1.

----- (1998b), “Statistics Among the Liberal Arts,” Journal of the American Statistical Association, 93, 1253-1259.

Snell, L., and Finn, J. (1992), “A Course Called Chance,” Chance, 5, 12-16.

Snell, L. (1999), “Using Chance Media to Promote Statistical Literacy,” Paper presented at the 1999 Joint Statistical Meetings, Dallas, TX.

Utts, J. (1996), Seeing Through Statistics, Belmont, CA: Duxbury Press.

Watson, J. (1997), “Assessing Statistical Thinking Using the Media,” in The Assessment Challenge in Statistics Education, eds. I. Gal and J. Garfield, Amsterdam: IOS Press and International Statistical Institute.


Deborah J. Rumsey
Department of Mathematics
The Ohio State University
Columbus, OH 43210
USA
rumsey@math.ohio-state.edu

