Wider Shoes for Wider Feet?

Mary C. Meyer
University of Georgia

Journal of Statistics Education Volume 14, Number 1 (2006), jse.amstat.org/v14n1/datasets.meyer.html

Copyright © 2006 by Mary C. Meyer, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the authors and advance notification of the editor.

Key Words:.

Abstract

From a very young age, shoes for boys tend to be wider than shoes for girls. Is this because boys have wider feet, or because it is assumed that girls are willing to sacrifice comfort for fashion, even in elementary school? To assess the former, a statistician measures kids’ feet.

1. Narrow Shoes for Girls

When my daughter was in fourth grade, I took her shopping for dress shoes. I was disappointed in the quality of girls’ shoes at every store in the mall. The shoes for boys were sturdy and had plenty of room in the toes. On the other hand, shoes for girls were flimsy, narrow, and had pointed toes. In spite of the better construction for boys, the costs of the shoes were similar! For children the same age, boys had shoes they could run around in, while girls’ shoes were clearly for style and not comfort.

Upon complaining about this state of affairs, I was told by sales representatives in two stores that boys actually had wider feet than girls, so needed wider shoes. Being very skeptical, I thought I would test this claim. The data I collected have subsequently been used in my statistics classes at several levels from introductory statistics to linear models.

2. Collecting Data

Shortly after the shopping trip, my daughter’s teacher sent home requests to parents, asking if we could schedule a day to come and tell the children about what we do for our jobs. In the fourth grade classroom I talked about “trying to find out about stuff using numbers.” After some simple examples, I posed the problem about feet. Most of the girls were convinced that their feet were not narrower, and most of the boys didn’t seem to have an opinion about it. They all agreed measuring foot widths was in order.

I asked the children to design a study to see if boys have wider feet than girls. Of course we could not measure only foot width. I asked the class, “what if the boys in this class are, on average, bigger than the girls, and wear bigger shoe sizes?” That way, the average width for the boys’ feet might be bigger than that for the girls, but maybe not for a given shoe size. After some discussion, we all agreed to measure foot lengths as well, because shoes are fit according to length.

Should we collect any other data about the kids? One of the students suggested that age of the child should be important, because older kids have bigger feet. After some lengthy discussion, the kids decided that both birth month and year should be recorded, because “some kids were nine and a half, and others were only nine.”

Then we turned to the problem of the actual measurements. The instrument used to measure feet was constructed to resemble the familiar foot-sizing device often seen at shoe stores. It was cut out of cardboard and had a ruler glued to the surface. A block of wood was fastened to the end, behind the ruler, for the children to place their heels against. The length and width were to be measured in centimeters. But should we measure right feet, left feet, or both? The kids decided, “measure the longer foot” and I went along with that. Someone wanted also to record whether the left or right foot was longer, which led to some discussion about whether kids who were right handed had right feet that were bigger. Feeling that we were straying from the topic, I suggested that we also record whether each child was right or left handed, so we could get on with the measuring.

Everyone was willing to step up to the measuring device, and hold very still. After yet more discussion of how much weight should be on the foot measured, the kids agreed to stand as evenly as possible on both feet. The width of the foot was the widest measurement perpendicular to the length. After we went through all the feet in my daughter’s class, we invited another class to get measured as well, so we ended up with measurements for 39 fourth graders.

3. Analysis

We want to answer the question, “do boys have wider feet than girls” (at least for fourth graders). A naive first step is to simply compare widths by gender. Figure 1 shows a substantial difference in mean widths that looks statistically significant. The summary statistics are shown in Table 1.

Figure 1. Widths of kids’ feet, for boys and girls. The horizontal lines mark the average width for each group.

Table 1: Summary Statistics for widths of boys’ and girls’ feet.

	Boys	Girls
mean (cm)	9.190	8.784
standard deviation (cm)	0.4518	0.4936
sample size	20	19

If we let $\mu_B$ represent the average foot width for fourth-grade boys, and $\mu_G$ represent the average foot width for fourth-grade girls, we can write the appropriate hypotheses as

$H_0: \mu_B = \mu_G$, versus
$H_a: \mu_B > \mu_G$.

A two-sample t-test with samples assumed to be independent can be performed using the summary statistics. Note that the pooled sample standard deviation is $S_p = 0.4725$ and the t-statistic is 2.68. The area to the right of 2.68, under a t-density with 37 degrees of freedom, is p = 0.0055. We reject $H_0$ at $\alpha = 0.01$, and conclude that boys do indeed have wider feet than girls.
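The arithmetic behind this test can be checked from Table 1 alone. The following is a minimal sketch in Python using only the standard library; the function names are illustrative and are not part of the original analysis, and the tail area is approximated by numerical integration rather than taken from a statistics package. Because the inputs are the rounded values printed in Table 1, the last digit of a result may differ slightly from the text.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t-test computed from summary statistics.

    Returns (pooled standard deviation, t statistic, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    sp = math.sqrt(pooled_var)
    t = (mean1 - mean2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp, t, df

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on the t density.

    The density is integrated from 0 to t and subtracted from 0.5.
    `steps` must be even.
    """
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    return 0.5 - total * h / 3

# Summary statistics from Table 1 (boys first, then girls).
sp, t_stat, df = pooled_t_from_summary(9.190, 0.4518, 20, 8.784, 0.4936, 19)
p_value = t_upper_tail(t_stat, df)
print(f"pooled sd = {sp:.4f}, t = {t_stat:.2f}, df = {df}, one-sided p = {p_value:.4f}")
```

The one-sided p-value is the upper tail because the alternative hypothesis is that boys' feet are wider.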

Of course, as the fourth graders figured out with some prompting, it could be that the boys are actually just larger than the girls, on average. In fact, this seemed to be the case just from glancing over the group. We need to control for foot length in the model.

To get an idea of the relationship between foot width and foot length for boys and girls, we examine the scatter plot shown in Figure 2. The boys’ measurements are represented as circles and the girls’ as triangles. We see right away that the points in the upper right (larger measurements of both width and length) tend to be for boys, while the points in the lower left tend to be for girls. The question of “do boys have wider feet than girls” does not have so clear an answer once foot length is considered.

Figure 2. Widths of kids’ feet, plotted against length, for boys and girls.

If we assume that foot width increases linearly with length over the range of these data, then we can fit a standard analysis of covariance model to the data. The model can be written as

$y_i = \beta_0 + \beta_1 x_i + \beta_2 d_i + \varepsilon_i$,

where

$y_i$ is the foot width in centimeters for the $i$th child;

$x_i$ is the foot length in centimeters for the $i$th child;

$d_i$ is a dummy variable, with $d_i = 1$ if the $i$th child is a boy and $d_i = 0$ if the $i$th child is a girl;

$\varepsilon_i$ is the random variation associated with the $i$th measurement.

We assume that the errors have mean zero, and are independent and normally distributed with equal variances. Note that for girls, we have $d_i = 0$, so that the equation representing the relationship is $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$. For boys, the equation representing the relationship is $y_i = (\beta_0 + \beta_2) + \beta_1 x_i + \varepsilon_i$. We interpret the parameters as

The intercept $\beta_0$ is the expected foot width for girls when the foot length is zero; not a useful interpretation!

The slope $\beta_1$ represents the expected number of centimeters increase in foot width, for every one centimeter increase in foot length, for 4th grade children.

The parameter $\beta_2$ represents the average difference in mean foot width for boys and girls, for a given length of foot.

The hypotheses of interest are then

$H_0: \beta_2 = 0$, versus
$H_a: \beta_2 > 0$.
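To make the structure of the analysis of covariance model concrete, here is a minimal sketch of a least-squares fit with an intercept, a foot-length term, and a boy indicator, written in Python with only the standard library. The function name `fit_ancova` is hypothetical, and the six data points are made up purely to check the arithmetic: they are generated exactly from a width of 4.0 + 0.2 × length + 0.25 × boy, with no noise, and are not the children's measurements.

```python
def fit_ancova(lengths, is_boy, widths):
    """Least-squares estimates (b0, b1, b2) for width = b0 + b1*length + b2*boy."""
    rows = [(1.0, float(x), float(d)) for x, d in zip(lengths, is_boy)]
    # Normal equations (X'X) b = X'y, stored as an augmented 3x4 matrix.
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * y for r, y in zip(rows, widths))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))

# Made-up, noise-free illustration data (NOT the kidsfeet measurements).
lengths = [22.0, 23.5, 24.0, 25.5, 26.0, 27.0]
is_boy = [0, 1, 0, 1, 0, 1]
widths = [4.0 + 0.2 * x + 0.25 * d for x, d in zip(lengths, is_boy)]

b0, b1, b2 = fit_ancova(lengths, is_boy, widths)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}, boy effect = {b2:.3f}")
```

Because the two groups share one slope, the fitted lines for boys and girls are parallel, and the coefficient on the indicator is the vertical gap between them.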

Table 2: Regression results for ANCOVA model.

Source	SS	df	MS	F-stat	p-value
Model	4.535	2	2.267	15.31	<0.0001
Error	5.333	36	0.148
Total	9.868	38

Parameter	estimate	std err	t-stat	p-value
Intercept	3.851	1.113	3.46	0.0014
Foot Length	0.221	0.0496	4.45	<0.0001
Boy	0.233	0.129	1.80	0.0806

The regression results are shown in Table 2. The R² for the model is 4.535/9.868 = 0.459, so that 45.9% of the variation in foot width is explained by the linear relationship with foot length and gender. The foot length is a highly significant predictor of foot width, but the gender variable is of borderline significance. Remembering that we had a one-sided test, we divide the two-sided p-value by 2, to get 0.0403. This causes us to reject the null hypothesis when $\alpha = 0.05$, but for $\alpha = 0.01$ we conclude that we do not have enough information to reject the hypothesis that boys’ and girls’ feet have the same width on average, for a given length of foot. The least-squares fit to the data is shown superimposed on the scatter plot in Figure 3. The higher line represents the relationship between foot width and length for boys, and the lower line represents the relationship for girls.
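The derived quantities in this paragraph follow directly from the entries of Table 2. The short Python sketch below recomputes them; the variable names are illustrative only, and since the inputs are the rounded values printed in the table, the last digit of a result may differ slightly from the text.

```python
# Values copied from Table 2 (rounded as printed in the table).
ss_model, df_model = 4.535, 2
ss_error, df_error = 5.333, 36
ss_total = 9.868

ms_model = ss_model / df_model
ms_error = ss_error / df_error   # estimate of the model variance
f_stat = ms_model / ms_error
r_squared = ss_model / ss_total

# t statistic for the Boy coefficient: estimate divided by its standard error.
t_boy = 0.233 / 0.129

# The table reports a two-sided p-value. The estimate is positive, which is the
# direction named by the alternative, so the one-sided p-value is half of it.
p_two_sided = 0.0806
p_one_sided = p_two_sided / 2

print(f"F = {f_stat:.2f}, R^2 = {r_squared:.3f}")
print(f"t (Boy) = {t_boy:.2f}, one-sided p = {p_one_sided:.4f}")
print("reject at 0.05:", p_one_sided < 0.05, "| reject at 0.01:", p_one_sided < 0.01)
```

Halving the two-sided p-value is only valid here because the estimated coefficient has the sign specified by the alternative hypothesis.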

Figure 3. Widths of kids’ feet, plotted against length, for boys and girls, with least squares regression function estimates superimposed.

The estimate of the model variance is 0.148; using this we can estimate how much foot widths for boys or girls vary around the mean width for a given foot length. The model standard deviation is the square root of 0.148, or about 0.385 centimeters. We expect about 95% of foot widths to be within two model standard deviations of the mean width; this range spans about 1.54 centimeters in total. Finally, a normal probability plot shows no deviations from the assumed error distribution, and other residual plots similarly support the model assumptions.
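As a quick check of this range, the Python sketch below converts the model variance to a standard deviation and compares the estimated boy-girl difference from Table 2 with it. The variable names are illustrative, and the inputs are the rounded table values.

```python
import math

model_variance = 0.148                 # mean squared error from Table 2
model_sd = math.sqrt(model_variance)   # model standard deviation, in centimeters

# About 95% of widths fall within two standard deviations on either side of
# the mean, so the full range is four standard deviations wide.
range_width = 4 * model_sd

# Estimated Boy coefficient from Table 2, as a fraction of one model sd.
boy_effect = 0.233
effect_in_sd_units = boy_effect / model_sd

print(f"model sd = {model_sd:.3f} cm, 95% range width = {range_width:.2f} cm")
print(f"boy effect = {effect_in_sd_units:.2f} model standard deviations")
```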

The results surprised me, as I was expecting not to be able to reject the null hypothesis at $\alpha = 0.05$. The power of the ANCOVA test is large, about 0.9 if = 0.25. Note that the estimated average difference in mean width between boys’ and girls’ feet, for a given length, is about 2.3 millimeters. The difference in actual shoe widths (measured at local shoe stores) can be seen to be almost half a centimeter, for sizes in the appropriate range. The difference in the average of measured foot widths, while perhaps of statistical significance, may not be of practical significance, considering that the difference is well within the estimate of the model standard deviation. The variation of foot widths within gender is more substantial than the variation between genders.

4. Discussion

The analyses presented here are appropriate for an introductory regression course. The instructor should discuss the sampling scheme, as this is clearly a convenience sample. Is there any reason to think that the sample might not be representative of the population of interest? Can conclusions be drawn about the population of all fourth-graders? To what extent does the result about fourth-graders apply to the general population of children?

The data set can be used in higher-level data analysis classes to make a point about keeping the purpose of the study in mind when the model is chosen. This is a good example of a case where an observational study is entirely appropriate, and possible confounders are irrelevant.

For my advanced classes, including linear models and consulting, I tell the students the story, including the purpose of the study and the motivation for collecting the data, and give them the data set with no clues about the hypotheses to be tested. Typically, the students do a sophisticated variable selection routine, using all variables including whether the child is right or left handed. They discover, for example, that age is a significant predictor of foot width. They present the “best model” in terms of minimizing some criterion such as AIC. They discuss covariates and confounding, two issues I tend to emphasize as very important in data analysis.

However, this more sophisticated analysis does not answer the purpose of the study! When selecting a shoe size for a child, the length of the foot is measured. No one asks how old the child is (except perhaps in the interests of polite conversation), or whether the child is right or left handed. We should not build a model using these variables, because the only issue concerns foot width, foot length, and gender. We don’t care about possible confounding factors, because we do not wish to make a cause and effect conclusion about feet. We simply want to know, are shoe manufacturers justified in their decision to make boys’ shoes wider than girls’ shoes, for the same length feet. Even though age is a significant predictor of foot width, it should not be included in the model.

In conclusion, we estimate that the mean foot width for fourth-graders, for a given foot length, is about 2.3 millimeters larger for boys, and this difference is of borderline statistical significance. It would be interesting to see if a repeat study would again reject the one-sided null hypothesis at $\alpha = 0.05$.

Shoe size charts for men and women can be found on the web at www.bravesurf.com/knowledge/shoe_sizing.htm#D.

For example, a woman’s size 9.5 corresponds to a foot length of 25.4 centimeters, with standard width (B) of 8.6 centimeters, while a man’s size 8 is for the same foot length but a standard width (D) of 9.7 centimeters. Determining whether this difference in adult shoe widths is justified by physiological differences between men and women would be a nice exercise for a statistics class of about 40 students. Perhaps an ANCOVA model would be appropriate for adult feet as well. The exercise of collecting data, formulating hypotheses, and doing the analyses provides a useful synthesis of various topics covered in a statistics classroom.

5. Getting the Data

The file kidsfeet.dat.txt contains the data on the 39 fourth graders. The file kidsfeet.txt is a documentation file that contains a brief description of the dataset.

Mary C. Meyer
Department of Statistics
University of Georgia
Athens, GA
U.S.A.
mmeyer@stat.uga.edu