David Martin
Davidson College
Journal of Statistics Education Volume 16, Number 3 (2008), ww2.amstat.org/publications/jse/v16n3/martin.html
Copyright © 2008 by David Martin, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.
Key Words: Joint influence; Multicollinearity; F-test confidence region; t-test confidence interval.
This note presents a spreadsheet tool that allows teachers the opportunity to guide students towards answering, on their own, questions related to the multiple regression F-test, the t-tests, and multicollinearity. The note demonstrates approaches for using the spreadsheet that might be appropriate for three different levels of statistics classes, so teachers can select the context that is most appropriate for their particular needs. The spreadsheet tool is linked to this article, and materials are provided in the appendices for teachers to use as handouts, homework questions, and answer keys.
This note was inspired by a question that commonly arises when teaching multiple regression analysis: "How does multicollinearity differ from the case of the two independent variables jointly influencing the dependent variable?" The purpose of this note is to present a spreadsheet tool that allows teachers the opportunity to guide students towards answering this question on their own, as well as others related to the multiple regression F-test, the t-tests, and multicollinearity. The note demonstrates approaches for using the spreadsheet that might be appropriate for three different levels of statistics classes, so teachers can select the context that is most appropriate for their particular needs.
The materials for Statistics Level 1 describe the basic F-test and the t-test confidence intervals for multiple regression with two independent variables and how multicollinearity affects those test results. The materials for Statistics Level 2 include those elements and add problems that ask the students to calculate the areas of those confidence regions so that students can develop a better appreciation for how multicollinearity affects those tests differently. Statistics Level 3 augments those materials with Bonferroni-type corrections of significance levels.
The teaching context assumed here is that of a lab session directed by a teaching assistant that follows a lecture on the relevant material by the course instructor. However, the materials are easily adaptable for a variety of circumstances ranging from in-class discussion to homework problems worked by the students independently. The material in the text of the note might best be viewed as the lecture material that precedes the lab session. A lab handout, including homework questions and a chart for students to complete, as well as suggested answers for that chart and for those homework questions, are in the appendices.
The linked spreadsheet contains eight worksheets. The worksheet Basic can be adapted easily for any teaching goal involving the Ftest and the ttests in this regression context; it was used to develop all of the materials discussed here. The other worksheets within that spreadsheet are those mentioned in the appendices for student use.
Consider the case in which two independent variables (x_{1i} and x_{2i}) are assumed to linearly affect the dependent variable (y_{i}) as given by the population regression model below, with observations i = 1...n and error term ε_{i}.

(1) y_{i} = β_{0} + β_{1}x_{1i} + β_{2}x_{2i} + ε_{i}.
The sample predicted value of the dependent variable is given in Equation (2), where the b_{j}s are least squares estimators of the coefficients above:

(2) ŷ_{i} = b_{0} + b_{1}x_{1i} + b_{2}x_{2i}.
Since the coefficient of determination is often presented before the students see the tests of slope coefficient significance but is related to the overall F-test, a teacher can use it to foreshadow how the slope coefficients and multicollinearity can affect the evaluation of a sample regression model. The coefficient of determination can be expressed in the form given in Equation (3), where s_{1}^{2}, s_{2}^{2}, and s_{y}^{2} are the variances of the three variables and r is the correlation coefficient between the two independent variables.

(3) R^{2} = (b_{1}^{2}s_{1}^{2} + 2rb_{1}b_{2}s_{1}s_{2} + b_{2}^{2}s_{2}^{2}) / s_{y}^{2}
First, by having the students assume that the correlation between the two independent variables equals zero, the teacher will help the students see that the model fits the data better as one or both of the slope coefficients increase in absolute value away from zero. Second, by assuming that the two slope coefficients are small in absolute value, the teacher can point out that the sample model also can fit well if the correlation coefficient is large enough. The important emphasis for this discussion is that these are possibly two different outcomes; a good model might result from having large slope coefficients or from a large correlation coefficient.
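Students can also experiment with this tradeoff outside the spreadsheet. The sketch below assumes the decomposition in Equation (3), with s1, s2, and sy the sample standard deviations; the function name and the illustrative values are hypothetical, not part of the article's spreadsheet:

```python
def r_squared(b1, b2, s1, s2, sy, r):
    """Equation (3): R^2 from the slope estimates, the standard deviations
    of x1, x2, and y, and the correlation r between x1 and x2."""
    return (b1**2 * s1**2 + 2 * r * b1 * b2 * s1 * s2 + b2**2 * s2**2) / sy**2

# With r = 0, the fit improves as either slope moves away from zero.
print(r_squared(0.5, 0.5, 1.0, 1.0, 1.0, 0.0))  # 0.5
# With small slopes, a large r can still raise R^2 through the cross term:
print(r_squared(0.2, 0.2, 1.0, 1.0, 1.0, 0.9))  # larger than the r = 0 value
```

The second call isolates the point of the lecture discussion: the cross term 2rb_{1}b_{2}s_{1}s_{2} can do the work that large slope coefficients would otherwise do.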
The hypotheses for the overall F-test, the level of significance, and the F-test statistic are given below (s_{e}^{2} is the sample error term variance).

(4) H_{0}: β_{1} = β_{2} = 0 versus H_{a}: at least one β_{j} ≠ 0, tested at significance level α

(5) F = (n − 1)(b_{1}^{2}s_{1}^{2} + 2rb_{1}b_{2}s_{1}s_{2} + b_{2}^{2}s_{2}^{2}) / (2s_{e}^{2})
Using Equations (4) and (5), followed by the same substitution used in the formula for the coefficient of determination, the rarely taught confidence region for the F-test is:

(6) (n − 1)(b_{1}^{2}s_{1}^{2} + 2rb_{1}b_{2}s_{1}s_{2} + b_{2}^{2}s_{2}^{2}) / (2s_{e}^{2}) ≤ F_{α; 2, n−3}.
Equation (6) determines an ellipse in (b_{1}, b_{2}) space centered at the value of zero for both b_{1} and b_{2}. The interior of this "F-test ellipse" is the region of b_{1} and b_{2} values for which the analyst concludes that the two slope coefficients are jointly insignificant. The important points about this confidence region are that it allows the students to see how different regression results can be significant or not and that it will allow the students to learn how multicollinearity affects the test results. Figure 1^{1} demonstrates this elliptical confidence region in (b_{1}, b_{2}) space^{2}.
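For instructors who want a programmatic analogue of the spreadsheet's ellipse test, here is a minimal sketch assuming the quadratic form (n − 1)(b1²s1² + 2r·b1·b2·s1·s2 + b2²s2²)/(2s_e²) for Equation (6); the function name and sample inputs are mine:

```python
from scipy import stats

def jointly_insignificant(b1, b2, s1, s2, se, n, r, alpha=0.10):
    """True when (b1, b2) falls inside the F-test ellipse of Equation (6),
    i.e. the F-test fails to reject H0: beta1 = beta2 = 0 at level alpha."""
    q = (n - 1) * (b1**2 * s1**2 + 2 * r * b1 * b2 * s1 * s2
                   + b2**2 * s2**2) / (2 * se**2)
    return q <= stats.f.ppf(1 - alpha, dfn=2, dfd=n - 3)

# The origin is always inside the ellipse; distant slope pairs are not.
print(jointly_insignificant(0.0, 0.0, 1.0, 1.0, 1.0, n=30, r=0.0))  # True
print(jointly_insignificant(1.0, 1.0, 1.0, 1.0, 1.0, n=30, r=0.0))  # False
```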
The lab materials associated with this student level are in Appendix A. Early in the lab session, the students will have the opportunity to manipulate the standard deviations of x_{1} and x_{2} and the regression standard error to see how the F-test ellipse changes. The first homework question asks the students to examine how changing the level of significance changes the size of the ellipse. And later, the students will have the opportunity to change the correlation coefficient to see how multicollinearity changes the size and tilt of the F-test ellipse (as demonstrated in Figures 3–6, which are discussed below).
The hypothesis tests and confidence intervals for the two individual slope coefficient t-tests are presented in Equations (7) and (8). The notation for the significance level allows for the possibility that the analyst might use one level of significance (α) for the F-test and a lower level of significance (α^{*}) for the t-tests.

(7) H_{0j}: β_{j} = 0 versus H_{aj}: β_{j} ≠ 0, with test statistic t_{j} = b_{j}/se(b_{j}), where se(b_{j}) = s_{e} / (s_{j}√((n − 1)(1 − r^{2})))

(8a) −t_{α*/2; n−3}·se(b_{1}) ≤ b_{1} ≤ t_{α*/2; n−3}·se(b_{1})

(8b) −t_{α*/2; n−3}·se(b_{2}) ≤ b_{2} ≤ t_{α*/2; n−3}·se(b_{2})
The two t-test confidence intervals form a rectangle; the "t-test rectangle" and its interior form the region of b_{1} and b_{2} values for which the analyst must conclude that both slope coefficients are individually insignificant. Figure 2 overlays the F-test ellipse and the t-test rectangle. The labels (a)–(e) are explained below.
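The rectangle's dimensions can be sketched the same way. The code below assumes the standard two-regressor OLS standard error se(b_j) = s_e/(s_j·√((n − 1)(1 − r²))) used in Equations (8a) and (8b); names and inputs are illustrative:

```python
from math import sqrt
from scipy import stats

def t_rectangle_halfwidths(s1, s2, se, n, r, alpha_star=0.10):
    """Half-widths of the t-test rectangle from Equations (8a) and (8b);
    a slope estimate inside the half-width is individually insignificant."""
    t_crit = stats.t.ppf(1 - alpha_star / 2, n - 3)
    scale = se / sqrt((n - 1) * (1 - r**2))
    return t_crit * scale / s1, t_crit * scale / s2

# Multicollinearity widens the rectangle: compare r = 0 with r = 0.9.
print(t_rectangle_halfwidths(1.0, 1.0, 1.0, n=30, r=0.0))
print(t_rectangle_halfwidths(1.0, 1.0, 1.0, n=30, r=0.9))  # wider
```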
As with the F-test ellipse, early in the lab materials presented in Appendix A the students will have the opportunity to manipulate the standard deviations of x_{1} and x_{2} and the regression standard error to see how the t-test rectangle changes. The first homework question asks the students to examine how changing the level of significance changes the size of the rectangle.
Subsequent homework questions ask the students to change the correlation coefficient between the two independent variables so they can learn how multicollinearity affects the F-test results and the t-test results. Figure 3 through Figure 6 preview how the spreadsheet presents these impacts. Students often have a good sense about how the t-test rectangle changes with multicollinearity and can readily understand that the F-test ellipse increases in size, although the exact calculations of those areas are deferred until the Statistics Level 2 material.
Instructors do need to explain the tilt in the F-test ellipse because the direction of the tilt is the opposite of what students tend to expect: it is the opposite of the sign of the correlation coefficient. To help the students, the teacher can remind the students of two concepts and then bring those concepts together to explain the tilt. First, the students should remember that the regions outside the F-test ellipse represent regions where the two slope coefficients are jointly significant. Second, if the two independent variables are positively (negatively) correlated, then they likely affect the dependent variable similarly (differently) and so the slope coefficients would have the same sign (different signs). So, by tilting the F-test ellipse negatively (positively), positive (negative) multicollinearity makes it more likely that the two slope coefficients are jointly significant when they have the same sign (different signs). Note that in Figure 4 and Figure 6 (the two cases in which the correlation coefficient is positive), the areas outside of the F-test ellipse are mainly in the northeast and southwest quadrants, where the slope coefficients have the same sign. Similarly, in Figure 3 and Figure 5 (the two cases in which the correlation coefficient is negative), the areas outside the F-test ellipse are mainly in the northwest and southeast quadrants, where the slope coefficients have different signs.
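The tilt direction can also be confirmed numerically. For standardized variables the ellipse's quadratic form is proportional to b1² + 2r·b1·b2 + b2², so the major axis is the eigenvector of [[1, r], [r, 1]] with the smallest eigenvalue. A quick check (the standardization is my simplifying assumption, not the spreadsheet's setup):

```python
import numpy as np

r = 0.6                               # any positive correlation works here
Q = np.array([[1.0, r], [r, 1.0]])    # quadratic form of the ellipse
eigvals, eigvecs = np.linalg.eigh(Q)  # eigh returns ascending eigenvalues
major_axis = eigvecs[:, 0]            # smallest eigenvalue -> longest axis
tilt = major_axis[1] / major_axis[0]
print(tilt)  # approximately -1.0: a negative tilt, opposite the sign of r
```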
The final concept covered in the Statistics Level 1 material relates to the results of the F-test and the two t-tests. Geary and Leser (1968) (followed shortly by Duchan (1969)) noted that there are six possible combinations of outcomes for the three tests. The labels (a)–(e) in Figure 2 correspond to the first five outcomes, while the sixth arises only with multicollinearity and is discussed below.
a. The F-test allows the analyst to conclude that the slope coefficients are jointly significant and the t-tests allow the analyst to conclude that both slope coefficients are individually significant.
b. The F-test forces the analyst to conclude that the slope coefficients are jointly insignificant and the t-tests force the analyst to conclude that both slope coefficients are individually insignificant.
c. The F-test allows the analyst to conclude that the slope coefficients are jointly significant and the t-tests allow the analyst to conclude that one of the slope coefficients is individually significant.
d. The F-test forces the analyst to conclude that the slope coefficients are jointly insignificant and, yet, the t-tests allow the analyst to conclude that one of the slope coefficients is individually significant.
e. The F-test allows the analyst to conclude that the slope coefficients are jointly significant and the t-tests force the analyst to conclude that both slope coefficients are individually insignificant.
f. The F-test forces the analyst to conclude that the slope coefficients are jointly insignificant and, yet, the t-tests allow the analyst to conclude that both slope coefficients are individually significant.
The appropriate areas for outcome (e) are small (outside the F-test ellipse but inside the t-test rectangle), so the arrow in Figure 2 points to one such region. This result is exactly the case that motivated this note; it is one in which the two variables work together to have a statistically significant joint influence on the dependent variable even if neither variable’s marginal influence is significant. It is important to emphasize at this point that Figure 2 is drawn assuming that the correlation coefficient between the two independent variables equals zero, so this outcome is not a result of multicollinearity. This outcome can occur with multicollinearity, but unlike outcome (f), multicollinearity is not a prerequisite.
Further, we can also see in Figure 3 through Figure 6 that the areas corresponding to Geary and Leser’s outcome (e), the region in which we conclude that the slope coefficients are jointly significant but individually insignificant, increase in size as the correlation coefficient increases in absolute value away from zero. So, the initial student question that motivated this paper does make sense. It can be difficult to distinguish between two independent variables jointly influencing the dependent variable and two independent variables being multicollinear. Clearly, the teacher needs to emphasize that the context of the analysis will be the driving force in distinguishing between the two possibilities in any specific case.
The change in the shape of the F-test ellipse due to multicollinearity allows for the possibility that the two slope coefficients can be insignificant in the F-test but significant in both of the two individual t-tests (Geary and Leser’s outcome (f)). This possibility occurs at the extreme ends of the F-test ellipse (because the stretching of the ellipse is more obvious at higher correlation coefficient values, this possibility is denoted by the arrows only in Figure 6). However, obtaining these types of results is unlikely because, in the context of Figure 6, that outcome would require the two slope coefficients to have different signs when the variables are positively correlated (and the same signs when the variables are negatively correlated, as in Figure 5).
The general concept that the area of the F-test ellipse increases with multicollinearity is easy to motivate, but many students are not familiar with calculating the areas of ellipses. So, if an instructor wants to skip that topic (as well as the Bonferroni-type corrections), the materials for Statistics Level 1 are sufficient. However, if teachers want their students to take the next step of using the spreadsheet to calculate the areas of the F-test ellipse and the t-test rectangle, then the materials for Statistics Level 2 are the appropriate materials to use.
The materials for Statistics Level 2 repeat everything in Statistics Level 1 and augment them with spreadsheet calculations of the areas of the F-test ellipse and the t-test rectangle. The spreadsheet has the area calculations discussed below already programmed into the appropriate cells, so teachers don’t need to worry about their students’ programming accuracy.
Given a general ellipse

Ab_{1}^{2} + Bb_{1}b_{2} + Cb_{2}^{2} = 1,

the area of the ellipse is:

Area = 2π / √(4AC − B^{2}).
Thus, the area of the F-test ellipse in Equation (6) is:

(9) Area_{F} = 2πs_{e}^{2}F_{α; 2, n−3} / ((n − 1)s_{1}s_{2}√(1 − r^{2}))
Clearly, this area increases as the correlation coefficient increases in absolute value.
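As a quick numeric check of that claim, the sketch below assumes the area expression 2π·s_e²·F/((n − 1)·s1·s2·√(1 − r²)) derived above; the function name and inputs are illustrative:

```python
from math import pi, sqrt
from scipy import stats

def f_ellipse_area(s1, s2, se, n, r, alpha=0.10):
    """Area of the F-test ellipse; the sqrt(1 - r^2) in the denominator
    makes the area grow as |r| grows."""
    f_crit = stats.f.ppf(1 - alpha, 2, n - 3)
    return 2 * pi * se**2 * f_crit / ((n - 1) * s1 * s2 * sqrt(1 - r**2))

print(f_ellipse_area(1.0, 1.0, 1.0, n=30, r=0.0))
print(f_ellipse_area(1.0, 1.0, 1.0, n=30, r=0.9))  # larger
```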
Reworking Equations (8a) and (8b) allows the area of the t-test rectangle to be defined as in Equation (10) below.

(10) Area_{t} = 4t_{α*/2; n−3}^{2}s_{e}^{2} / ((n − 1)s_{1}s_{2}(1 − r^{2}))
Students readily understand that the area of the t-test rectangle increases as the correlation coefficient increases in absolute value.
The homework materials ask the students to record the sizes of both the F-test ellipse and the t-test rectangle. The students will learn, as demonstrated in Table 1, that multicollinearity affects the t-test rectangle more than it does the F-test ellipse.

Table 1

             Area of F-ellipse   Area of t-rectangle   Ratio of Areas (F/t)
r =  0.00         0.150                0.111                  1.343
r = -0.30         0.157                0.122                  1.281
r =  0.60         0.187                0.174                  1.074
r = -0.90         0.343                0.586                  0.585
r =  0.95         0.479                1.143                  0.419
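The pattern in Table 1 follows from the two area formulas: in the ratio every scale term cancels, leaving (π/2)·(F/t²)·√(1 − r²), which falls as |r| rises. A sketch (the sample size below is hypothetical; the spreadsheet's actual inputs are not reproduced here):

```python
from math import pi, sqrt
from scipy import stats

def area_ratio(n, r, alpha=0.10, alpha_star=0.10):
    """Ratio of the F-ellipse area to the t-rectangle area. Every scale
    term (s_e, s_1, s_2) cancels, so the ratio depends only on n, the
    significance levels, and sqrt(1 - r^2)."""
    f_crit = stats.f.ppf(1 - alpha, 2, n - 3)
    t_crit = stats.t.ppf(1 - alpha_star / 2, n - 3)
    return pi * f_crit * sqrt(1 - r**2) / (2 * t_crit**2)

for r in (0.0, 0.3, 0.6, 0.9, 0.95):
    print(round(area_ratio(50, r), 3))  # strictly decreasing in |r|
```

With a moderate sample size, the r = 0 ratio lands near the 1.343 of Table 1, and scaling by √(1 − r²) reproduces the declining pattern of the table's last column.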
The final concept to consider is the possibility of using Bonferroni-type corrections. The context here assumes that the analyst has chosen a significance level for the F-test (α in this note) and then will reduce the significance level of the t-tests to a more appropriate level (α^{*} in this note). One ad-hoc version of the Bonferroni correction to the significance level of the t-tests is developed by beginning with the probability of the union of rejecting the two t-test null hypotheses, as noted below.
P(reject H_{01} ∪ reject H_{02}) = P(reject H_{01}) + P(reject H_{02}) − P(reject H_{01} ∩ reject H_{02})
Next, if we assume that

P(reject H_{01}) = P(reject H_{02}) = α^{*},
then we see:

P(reject H_{01} ∪ reject H_{02}) ≤ α^{*} + α^{*} = 2α^{*}.
That inequality leads to the definition of one ad-hoc Bonferroni correction for this context, as given in Equation (11).

(11) α^{*} = α / 2
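The cap implied by Equation (11) is easy to verify by simulation; the setup below (two independent t-tests of true nulls, approximated by uniform p-values) is an illustrative assumption, not the spreadsheet's calculation:

```python
import numpy as np

def bonferroni_alpha_star(alpha, m=2):
    """Equation (11) with m = 2: test each slope at alpha/m so that the
    chance of rejecting at least one true null is capped at alpha."""
    return alpha / m

rng = np.random.default_rng(0)
p1, p2 = rng.uniform(size=(2, 200_000))  # p-values under two true nulls
a_star = bonferroni_alpha_star(0.10)     # 0.05
family_rate = np.mean((p1 < a_star) | (p2 < a_star))
print(a_star, family_rate)               # family_rate stays below 0.10
```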
The instructional materials in Appendix C begin with a pattern very similar to the pattern of the materials presented for Statistics Level 1 and Statistics Level 2. The specific questions are a bit different because this assignment has the students work with a 10% significance level for the F-test and a 5% significance level for the two t-tests rather than working with only a 10% significance level for all three tests.
The last bit of these instructional materials takes a different tack on the Bonferroni-type corrections, using a question that students naturally raise. A teacher might ask it as, "What level of significance for the t-tests would eliminate Geary and Leser’s problematic outcomes (d) and (f)?" A student might ask it as, "Is there a level of significance that moves the t-test rectangle to be tangent to the F-test ellipse?" The natural follow-up question is, "Does that level of significance change with the correlation between the two independent variables?"
The answer, as demonstrated in the Answer Key in Appendix C, is that the appropriate level of significance is derived from the relationship:

(12) t_{α*/2; n−3}^{2} = 2F_{α; 2, n−3}.
Since the correlation coefficient does not appear in that relationship, multicollinearity is irrelevant here. The exercises first ask the students to solve Equation (12) for the implied level of significance and then to incorporate those results into their analysis.
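Equation (12) can be solved for the implied t-test level directly; r never enters the calculation. The helper below is a sketch (the function name is mine, not the spreadsheet's):

```python
from math import sqrt
from scipy import stats

def tangency_alpha_star(alpha, n):
    """Solve Equation (12), t_{a*/2; n-3}^2 = 2 F_{alpha; 2, n-3}, for the
    t-test level a* that makes the rectangle tangent to the ellipse."""
    f_crit = stats.f.ppf(1 - alpha, 2, n - 3)
    return 2 * stats.t.sf(sqrt(2 * f_crit), n - 3)

a_star = tangency_alpha_star(0.10, n=30)
print(a_star)  # a stricter level than the 10% used for the F-test
```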
In the introductory applied statistics course I teach, the first course in a two-course sequence, I am always rushing through the multiple regression material at the end of the semester. I found that in teaching through the Statistics Level 2 material I was trying to cover too much material at a busy time in the semester. That is why I created the rather arbitrary separation between the Statistics Level 1 materials and the Statistics Level 2 materials. At the time of this writing, I plan to use the Statistics Level 1 materials in the statistics course when I teach it next. The Statistics Level 2 materials were appropriate for the second course in the sequence, basic econometrics, as multiple regression analysis is that course’s prime focus.
The textbook I use for the basic econometrics course does not discuss Bonferroni-type corrections, which is one of the two reasons why I will not use the Statistics Level 3 materials for it. The other reason follows from the point that the course introduces students to reading applied economics journal articles, and the articles that are readable for students at that level rarely mention Bonferroni-type corrections. So, the topic doesn’t seem very important for this level of student. However, it was a couple of students in my advanced econometrics course who asked the question about shifting the t-test rectangle to match the F-test ellipse and thereby motivated the presentation of that topic in the Statistics Level 3 materials. So, I will use the Statistics Level 3 materials for the advanced econometrics class.
Lab Materials
Begin with the worksheet Statistics 1 Handout (which follows and is included in the linked spreadsheet). Your TA will begin by describing various components of the spreadsheet.
· The information in columns A–C presents abbreviated versions of the results from Excel’s:
o Descriptive statistics,
o Correlation analysis, and
o Regression analysis.
· The information in columns A–C is used in the "Input for Data Analysis" section.
· The analyst selects the significance level. For this case, the F-test significance level (cell ‘Statistics 1’!G12) simply uses the t-test significance level (cell ‘Statistics 1’!G11), although that equality could be changed.
· The graph presents the F-test ellipse (b_{1} and b_{2} values outside the ellipse allow the analyst to reject the null hypothesis for the F-test), the critical values for the t-test of β_{1} (b_{1} values to the left and right allow the analyst to reject the null hypothesis), and the critical values for the t-test of β_{2} (b_{2} values above and below allow the analyst to reject the null hypothesis).
As your TA increases the standard deviation of each independent variable and the sample size, in each case making the statistical point that we have more information about the relationship between the variables, notice how the F-test ellipse and the t-test rectangle are affected. You should be able to predict the effect of each change in advance; if not, please be certain to ask why the change led to that effect. Similarly, as your TA increases the standard error of the estimate, which says that our regression model is less accurate, you should be able to predict how the t-test rectangle and the F-test ellipse change.
Return the elements back to their original values. Next your TA will change the values of the pseudo regression results in columns P, Q, and R to match those on the worksheet on the next page. Enter the values into your spreadsheet at the same time; your figure should match the TA’s exactly.
Remember that there are six possible combinations of F-test and t-test results.
1. F-statistic is significant and both slope coefficient t-statistics are significant.
2. F-statistic is insignificant and both slope coefficient t-statistics are insignificant.
3. F-statistic is significant and only one of the two slope coefficient t-statistics is significant.
4. F-statistic is insignificant and only one of the two slope coefficient t-statistics is significant.
5. F-statistic is significant and both slope coefficient t-statistics are insignificant.
6. F-statistic is insignificant and both slope coefficient t-statistics are significant.
The fifth possibility is the case in which the two independent variables work together to help explain the dependent variable even though neither variable has a significant marginal impact. The sixth possibility occurs only when the two independent variables are correlated with each other, and so you will not see that possibility until you get to one of the multicollinearity situations.
Your TA will work through the worksheet with the case of the significance level equaling 5% and the correlation coefficient for the two independent variables equaling 0.0. For this particular case, you should see that the first possibility occurs twice and the fourth possibility occurs twice.
Homework assignment
Change the significance level to 10%.
1. Complete the second column in the worksheet.
2. Use the first two columns (both with r=0) to explain how the worksheet demonstrates the impact of increasing the level of significance.
3. In the second column (α=10% and r=0), which estimation results demonstrate the concept of variables working together although individually having insignificant marginal influences? Is this a result of multicollinearity?
4. What is unexpected about the test results for Data 2 (α=10% and r=0) and for Data 4 (α=10% and r=0)? Do these outcomes result from multicollinearity?
5. Complete the remainder of the worksheet.
6. Importantly, multicollinearity increases (in absolute value) the critical value of the slope coefficient (the value of b_{j} that allows you to reject the null hypothesis).
a. Demonstrate that point mathematically.
b. Demonstrate that point using the estimation results above.
7. Multicollinearity also increases the size of the F-test ellipse and tilts it. Which estimation results demonstrate the impacts of those effects?
8. In the third, fourth, fifth, and sixth columns (r ≠ 0), which estimation results demonstrate the concept of variables working together although individually having insignificant marginal influences? Do these outcomes result from multicollinearity?
9. What is unexpected about the test results for Data 11 (α=10% and r=+.6)?
Statistics 1 Handout
Write an "R" when the null hypothesis is rejected.

|                       |        | sig = 5% |         Use a 10% significance level            |
| Regression Estimates  |        |  r = 0   | r = 0 | r = -.3 | r = -.9 | r = +.6 | r = +.95 |
| Data 1 (0.10, 0.00)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 2 (0.19, 0.00)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 3 (0.24, 0.00)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 4 (0.10, 0.18)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 5 (0.10, 0.22)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 6 (0.10, 0.10)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 7 (0.16, 0.16)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 8 (0.22, 0.22)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 9 (0.10, -0.10)  | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 10 (0.16, -0.16) | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 11 (0.22, -0.22) | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
Answer Key
Write an "R" when the null hypothesis is rejected.

|                       |        | sig = 5% |         Use a 10% significance level            |
| Regression Estimates  |        |  r = 0   | r = 0 | r = -.3 | r = -.9 | r = +.6 | r = +.95 |
| Data 1 (0.10, 0.00)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 2 (0.19, 0.00)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |   R   |    R    |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 3 (0.24, 0.00)   | H_{0}  |          |   R   |    R    |    R    |    R    |    R     |
|                       | H_{01} |    R     |   R   |    R    |         |    R    |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 4 (0.10, 0.18)   | H_{0}  |          |       |         |         |    R    |    R     |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |   R   |    R    |         |         |          |
| Data 5 (0.10, 0.22)   | H_{0}  |          |   R   |         |         |    R    |    R     |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |    R     |   R   |    R    |         |    R    |          |
| Data 6 (0.10, 0.10)   | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 7 (0.16, 0.16)   | H_{0}  |          |   R   |         |         |    R    |    R     |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 8 (0.22, 0.22)   | H_{0}  |    R     |   R   |    R    |         |    R    |    R     |
|                       | H_{01} |    R     |   R   |    R    |         |    R    |          |
|                       | H_{02} |    R     |   R   |    R    |         |    R    |          |
| Data 9 (0.10, -0.10)  | H_{0}  |          |       |         |         |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 10 (0.16, -0.16) | H_{0}  |          |   R   |    R    |    R    |         |          |
|                       | H_{01} |          |       |         |         |         |          |
|                       | H_{02} |          |       |         |         |         |          |
| Data 11 (0.22, -0.22) | H_{0}  |    R     |   R   |    R    |    R    |         |          |
|                       | H_{01} |    R     |   R   |    R    |         |    R    |          |
|                       | H_{02} |    R     |   R   |    R    |         |    R    |          |
For the TA:
· As the standard deviation of either variable increases, which signifies that the analyst has more information about the linear relationship:
o The F-test ellipse compresses along that variable’s slope coefficient axis. Such compression makes it easier to reject the null hypothesis for the F-test.
o The t-test rectangle also compresses along that variable’s slope coefficient axis. Such compression makes it easier to reject the null hypothesis for the t-test.
· As the sample size increases:
o The F-test ellipse compresses in both dimensions, reflecting the increased information available to the analyst.
o The t-test rectangle similarly compresses in both dimensions.
· As the standard error of the regression increases:
o The F-test ellipse expands in both dimensions, corresponding to the increased inaccuracy of the regression fit.
o The t-test rectangle similarly expands in both dimensions.
Change the significance level to 10%.
2. Use the first two columns (both with r=0) to explain how the worksheet demonstrates the impact of increasing the level of significance.
Data 2 … with the higher level of significance the t-statistic for b_{1} becomes significant.
Data 3 … with the higher level of significance the F-statistic becomes significant.
Data 4 … with the higher level of significance the t-statistic for b_{2} becomes significant.
Data 5 … with the higher level of significance the F-statistic becomes significant.
Data 7 … with the higher level of significance the F-statistic becomes significant.
Data 10 … with the higher level of significance the F-statistic becomes significant.
3. In the second column (α=10% and r=0), which estimation results demonstrate the concept of variables working together although individually having insignificant marginal influences? Is this a result of multicollinearity?
Data 7
Data 10
No, this is not a result of multicollinearity as r=0.
4. What is unexpected about the test results for Data 2 (α=10% and r=0) and for Data 4 (α=10% and r=0)? Do these outcomes result from multicollinearity?
· In both cases, one t-statistic is significant (so one independent variable has a significant marginal influence) but the F-statistic is insignificant.
· This is not a result of multicollinearity as r=0.
6. Importantly, multicollinearity increases (in absolute value) the critical value of the slope coefficient (the value of b_{j} that allows you to reject the null hypothesis).
a. Demonstrate that point mathematically.
As an example:

se(b_{1}) = s_{e} / (s_{1}√((n − 1)(1 − r^{2})))

So, as r increases from zero to one, the standard error of the slope coefficient increases and the critical value increases.
b. Demonstrate that point using the estimation results above.
Data 2 … the t-statistic for b_{1} becomes insignificant as r increases in absolute value.
Data 3 … the t-statistic for b_{1} becomes insignificant as r increases in absolute value.
Data 4 … the t-statistic for b_{2} becomes insignificant as r increases in absolute value.
Data 5 … the t-statistic for b_{2} becomes insignificant as r increases in absolute value.
Data 8 … the t-statistics for both b_{1} and b_{2} become insignificant as r increases in absolute value.
Data 11 … the t-statistics for both b_{1} and b_{2} become insignificant as r increases in absolute value.
7. Multicollinearity also increases the size of the F-test ellipse and tilts it. Which estimation results demonstrate changes in the F-test ellipse? What are those changes and what are the impacts of those changes on the F-statistic?
Data 4 … The F-test ellipse tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 5 … The F-test ellipse becomes larger as r increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 7 … The F-test ellipse becomes larger as r increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 8 … The F-test ellipse becomes larger as r increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 10 … The F-test ellipse tilts positively with negative r, so the F-statistic stays significant with negative r values. Because the F-test ellipse tilts negatively with a positive r, the F-statistic becomes insignificant with positive r values.
Data 11 … The F-test ellipse tilts positively with negative r, so the F-statistic stays significant with negative r values. Because the F-test ellipse tilts negatively with a positive r, the F-statistic becomes insignificant with positive r values.
8. In the third, fourth, fifth, and sixth columns (r ≠ 0), which estimation results demonstrate the concept of variables working together although individually having insignificant marginal influences? Do these outcomes result from multicollinearity?
With r=0 (so multicollinearity is not a factor) as well as at higher r-values:
Data 7 and Data 10
At high correlation coefficients, so multicollinearity is a factor:
Data 3, Data 4, Data 5, Data 8, Data 11
9. What is unexpected about the test results for Data 11 (α=10% and r=+.6)?
Both the t-statistic for b_{1} and the t-statistic for b_{2} are significant, indicating that both variables have significant individual marginal influences, but the F-statistic is insignificant. So, jointly, the two variables offset each other. This result is because of the multicollinearity.
Lab Materials
Begin with the worksheet Statistics 2 Handout (which follows and is included in the linked spreadsheet). Your TA will begin by describing various components of the spreadsheet.
· The information in columns A–C presents abbreviated versions of the results from Excel’s:
o Descriptive statistics,
o Correlation analysis, and
o Regression analysis.
· The information in columns A–C is used in the "Input for Data Analysis" section.
· The analyst selects the significance level. For this case, the F-test significance level (cell ‘Statistics 2’!G12) simply uses the t-test significance level (cell ‘Statistics 2’!G11), although that equality could be changed.
· The graph presents the F-test ellipse (b_{1} and b_{2} values outside the ellipse allow the analyst to reject the null hypothesis for the F-test), the critical values for the t-test of β_{1} (b_{1} values to the left and right allow the analyst to reject the null hypothesis), and the critical values for the t-test of β_{2} (b_{2} values above and below allow the analyst to reject the null hypothesis).
As your TA increases the standard deviation of each independent variable and the sample size, in each case making the statistical point that we have more information about the relationship between the variables, notice how the Ftest ellipse and the ttest rectangle are affected. You should be able to predict the affect of each change in advance; if not please be certain to ask why the change lead to the affect. Similarly, as your TA increases the standard error of the estimate, which says that our regression model is less accurate, you should be able to predict how the ttest rectangle and the Ftest ellipse change.
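These predictions all follow from the standard-error formula for a slope coefficient in the two-regressor model. A quick sketch of the scaling, using the spreadsheet's default values of s_e = s_x = 1 and n = 100 (the function name is mine):

```python
import math

def slope_se(s_e=1.0, s_x=1.0, n=100, r=0.0):
    """Standard error of one slope estimate with two regressors correlated r."""
    return s_e / (s_x * math.sqrt(n - 1) * math.sqrt(1 - r**2))

base = slope_se()
print(slope_se(s_x=2.0) / base)  # doubling s_x halves s_b: rectangle and ellipse compress
print(slope_se(n=397) / base)    # quadrupling n - 1 halves s_b in both dimensions
print(slope_se(s_e=2.0) / base)  # doubling s_e doubles s_b: both regions expand
```

Because the t-rectangle half-widths (t_c × s_b) and the F-ellipse semi-axes both scale with s_b, each change moves the two regions in the same direction, which is what the TA demonstration shows.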
Return the elements to their original values. Next your TA will change the values of the pseudo regression results in columns P, Q, and R to match those on the worksheet on the next page. Enter the values into your spreadsheet at the same time; your figure should match the TA's exactly.
Remember that there are six possible combinations of F-test and t-test results.
1. The F-statistic is significant and both slope coefficient t-statistics are significant.
2. The F-statistic is insignificant and both slope coefficient t-statistics are insignificant.
3. The F-statistic is significant and only one of the two slope coefficient t-statistics is significant.
4. The F-statistic is insignificant and only one of the two slope coefficient t-statistics is significant.
5. The F-statistic is significant and both slope coefficient t-statistics are insignificant.
6. The F-statistic is insignificant and both slope coefficient t-statistics are significant.
The fifth possibility is the case in which the two independent variables work together to help explain the dependent variable even though neither variable has a significant marginal impact. The sixth possibility occurs only when the two independent variables are correlated with each other, so you will not see that possibility until you get to one of the multicollinearity situations.
Your TA will work through the worksheet with the case of the significance level equaling 5% and the correlation coefficient for the two independent variables equaling 0.0. For this particular case, you should see that the first possibility occurs twice and the fourth possibility occurs twice.
Also, notice that the spreadsheet calculates the area of the F-test ellipse and the t-test rectangle. The general formulas for those areas are:

Area of F-ellipse = 2π F_c s_b1 s_b2 √(1 − r²)

Area of t-rectangle = (2 t_c s_b1)(2 t_c s_b2) = 4 t_c² s_b1 s_b2

where F_c and t_c are the critical values and s_b1 and s_b2 are the standard errors of the two slope coefficients. In this particular case, the areas equal:

F-area = 0.196
t-area = 0.159
Ratio of F-area to t-area = 1.232

Given the formulas for the two areas (and remembering that the standard errors of the slope coefficients grow with |r|), it should be clear that both increase as the correlation between the two independent variables increases in absolute value. The question you will answer in this lab is, which area increases faster?
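A short sketch of the two areas as functions of r, hard-coding the critical values the spreadsheet reports for this case (t_c ≈ 1.985 and F_c ≈ 3.09 at a 5% significance level with 97 degrees of freedom); the area formulas are F-ellipse = 2π F_c s_b1 s_b2 √(1 − r²) and t-rectangle = 4 t_c² s_b1 s_b2:

```python
import math

T_C = 1.985  # two-tailed 5% critical t, 97 df (spreadsheet value)
F_C = 3.09   # 5% critical F(2, 97) (spreadsheet value)

def areas(r, s_e=1.0, s_x=1.0, n=100):
    """Areas of the F-test ellipse and t-test rectangle at regressor correlation r."""
    s_b = s_e / (s_x * math.sqrt(n - 1) * math.sqrt(1 - r**2))  # both slopes share s_b here
    f_area = 2 * math.pi * F_C * s_b**2 * math.sqrt(1 - r**2)
    t_area = 4 * T_C**2 * s_b**2
    return f_area, t_area

for r in (0.0, 0.3, 0.6, 0.9):
    f_a, t_a = areas(r)
    print(f"r={r:.1f}  F-area={f_a:.3f}  t-area={t_a:.3f}  ratio={f_a / t_a:.3f}")
```

At r = 0 this reproduces the 0.196, 0.159, and 1.232 above; the ratio column tracks how the two areas compare as |r| grows.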
Homework assignment
Change the significance level to 10%.
1. Complete the second column in the worksheet.
2. Use the first two columns (both with r=0) to explain how the worksheet demonstrates the impact of increasing the level of significance.
3. In the second column (α=10% and r=0), which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Is this a result of multicollinearity?
4. What is unexpected about the test results for Data 2 (α=10% and r=0) and for Data 4 (α=10% and r=0)? Do these outcomes result from multicollinearity?
5. Complete the remainder of the worksheet.
6. Importantly, multicollinearity increases (in absolute value) the critical value of the slope coefficient (the value of b that allows you to reject the null hypothesis).
a. Demonstrate that point mathematically.
b. Demonstrate that point using the estimation results above.
7. Multicollinearity also increases the size of the F-test ellipse and tilts it. Which estimation results demonstrate the impacts of those effects?
8. In the third, fourth, fifth, and sixth columns (r≠0), which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Do these outcomes result from multicollinearity?
9. As the correlation coefficient between the two independent variables increases in absolute value, which area increases faster: the F-test ellipse or the t-test rectangle?
10. What is unexpected about the test results for Data 11 (α=10% and r=+.6)?
Statistics 2 Handout
Write an "R" when the null hypothesis is rejected. The first column uses a 5% significance level; the remaining columns use a 10% significance level.

| Regression Estimates | Null hypothesis | r=0 (sig=5%) | r=0 | r=-.3 | r=-.9 | r=+.6 | r=+.95 |
|---|---|---|---|---|---|---|---|
| Data 1 (0.10, 0.00) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 2 (0.19, 0.00) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 3 (0.24, 0.00) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 4 (0.10, 0.18) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 5 (0.10, 0.22) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 6 (0.10, 0.10) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 7 (0.16, 0.16) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 8 (0.22, 0.22) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 9 (0.10, -0.10) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 10 (0.16, -0.16) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 11 (0.22, -0.22) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Area of F-ellipse = | | | | | | | |
| Area of t-rectangle = | | | | | | | |
| Ratio of F-area to t-area = | | | | | | | |
Answer Key
Write an "R" when the null hypothesis is rejected. The first column uses a 5% significance level; the remaining columns use a 10% significance level.

| Regression Estimates | Null hypothesis | r=0 (sig=5%) | r=0 | r=-.3 | r=-.9 | r=+.6 | r=+.95 |
|---|---|---|---|---|---|---|---|
| Data 1 (0.10, 0.00) | H0 | – | – | – | – | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 2 (0.19, 0.00) | H0 | – | – | – | – | – | – |
| | H01 | – | R | R | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 3 (0.24, 0.00) | H0 | – | R | R | R | R | R |
| | H01 | R | R | R | – | R | – |
| | H02 | – | – | – | – | – | – |
| Data 4 (0.10, 0.18) | H0 | – | – | – | – | R | R |
| | H01 | – | – | – | – | – | – |
| | H02 | – | R | R | – | – | – |
| Data 5 (0.10, 0.22) | H0 | – | R | – | – | R | R |
| | H01 | – | – | – | – | – | – |
| | H02 | R | R | R | – | R | – |
| Data 6 (0.10, 0.10) | H0 | – | – | – | – | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 7 (0.16, 0.16) | H0 | – | R | – | – | R | R |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 8 (0.22, 0.22) | H0 | R | R | R | – | R | R |
| | H01 | R | R | R | – | R | – |
| | H02 | R | R | R | – | R | – |
| Data 9 (0.10, -0.10) | H0 | – | – | – | – | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 10 (0.16, -0.16) | H0 | – | R | R | R | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 11 (0.22, -0.22) | H0 | R | R | R | R | – | – |
| | H01 | R | R | R | – | R | – |
| | H02 | R | R | R | – | R | – |
| Area of F-ellipse = | | 0.196 | 0.150 | 0.157 | 0.343 | 0.187 | 0.479 |
| Area of t-rectangle = | | 0.159 | 0.111 | 0.122 | 0.586 | 0.174 | 1.143 |
| Ratio of F-area to t-area = | | 1.232 | 1.343 | 1.281 | 0.585 | 1.074 | 0.419 |
For the TA:
· As the standard deviation of either variable increases, which signifies that the analyst has more information about the linear relationship:
o The F-test ellipse compresses along that variable's slope coefficient axis. Such compression makes it easier to reject the null hypothesis for the F-test.
o The t-test rectangle also compresses along that variable's slope coefficient axis. Such compression makes it easier to reject the null hypothesis for the t-test.
· As the sample size increases:
o The F-test ellipse compresses in both dimensions, reflecting the increased information available to the analyst.
o The t-test rectangle similarly compresses in both dimensions.
· As the standard error of the regression increases:
o The F-test ellipse expands in both dimensions, corresponding to the increased inaccuracy of the regression fit.
o The t-test rectangle similarly expands in both dimensions.
Change the significance level to 10%.
2. Use the first two columns (both with r=0) to explain how the worksheet demonstrates the impact of increasing the level of significance.
The areas of the F-test ellipse and the t-test rectangle decrease, reflecting the smaller critical values that are consistent with the larger significance level.
Data 2 … with the higher level of significance, the t-statistic for b1 becomes significant.
Data 3 … with the higher level of significance, the F-statistic becomes significant.
Data 4 … with the higher level of significance, the t-statistic for b2 becomes significant.
Data 5 … with the higher level of significance, the F-statistic becomes significant.
Data 7 … with the higher level of significance, the F-statistic becomes significant.
Data 10 … with the higher level of significance, the F-statistic becomes significant.
3. In the second column (α=10% and r=0), which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Is this a result of multicollinearity?
Data 7
Data 10
No, this is not a result of multicollinearity, as r=0.
4. What is unexpected about the test results for Data 2 (α=10% and r=0) and for Data 4 (α=10% and r=0)? Do these outcomes result from multicollinearity?
· In both cases, one t-statistic is significant (so one independent variable has a significant marginal influence) but the F-statistic is insignificant.
· This is not a result of multicollinearity, as r=0.
6. Importantly, multicollinearity increases (in absolute value) the critical value of the slope coefficient (the value of b that allows you to reject the null hypothesis).
a. Demonstrate that point mathematically.
As an example:

s_b1 = s_e / (s_x1 √(n−1) √(1−r²))

So, as r increases from zero to one in absolute value, the standard error of the slope coefficient increases and the critical value (t_c × s_b1) increases with it.
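The inflation of the standard error with r can be tabulated directly. A numeric sketch using the spreadsheet's defaults (s_e = s_x = 1, n = 100) and its 10% two-tailed critical value t_c ≈ 1.661 for 97 degrees of freedom:

```python
import math

T_C = 1.661  # two-tailed 10% critical t, 97 df (spreadsheet value)

def slope_se(r, s_e=1.0, s_x=1.0, n=100):
    """Standard error of a slope estimate when the two regressors correlate r."""
    return s_e / (s_x * math.sqrt(n - 1) * math.sqrt(1 - r**2))

for r in (0.0, 0.3, 0.6, 0.9, 0.95):
    s_b = slope_se(r)
    print(f"|r|={r:.2f}  s_b={s_b:.3f}  critical |b| = {T_C * s_b:.3f}")
```

This is why, for example, Data 3's slope of 0.24 is significant at r = 0 (critical value ≈ 0.167) but not at |r| = 0.9 (critical value ≈ 0.383).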
b. Demonstrate that point using the estimation results above.
Data 2 … the t-statistic for b1 becomes insignificant as r increases in absolute value.
Data 3 … the t-statistic for b1 becomes insignificant as r increases in absolute value.
Data 4 … the t-statistic for b2 becomes insignificant as r increases in absolute value.
Data 5 … the t-statistic for b2 becomes insignificant as r increases in absolute value.
Data 8 … the t-statistics for both b1 and b2 become insignificant as r increases in absolute value.
Data 11 … the t-statistics for both b1 and b2 become insignificant as r increases in absolute value.
7. Multicollinearity also increases the size of the F-test ellipse and tilts it. Which estimation results demonstrate changes in the F-test ellipse? What are those changes, and what are the impacts of those changes on the F-statistic?
Data 4 … The F-test ellipse tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 5 … The F-test ellipse becomes larger as |r| increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 7 … The F-test ellipse becomes larger as |r| increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 8 … The F-test ellipse becomes larger as |r| increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 10 … The F-test ellipse tilts positively with a negative r, so the F-statistic stays significant at negative r values. Because the F-test ellipse tilts negatively with a positive r, the F-statistic becomes insignificant at positive r values.
Data 11 … The F-test ellipse tilts positively with a negative r, so the F-statistic stays significant at negative r values. Because the F-test ellipse tilts negatively with a positive r, the F-statistic becomes insignificant at positive r values.
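The tilt effect can be checked numerically. A sketch assuming the textbook F statistic for two standardized regressors (the function is mine; the spreadsheet's defaults s_e = s_x = 1, n = 100, and its 10% critical value F_c ≈ 2.363 are hard-coded), contrasting the same-sign Data 7 with the opposite-sign Data 10:

```python
import math

F_C = 2.363  # 10% critical F(2, 97) (spreadsheet value)

def f_stat(b1, b2, r, s_e=1.0, s_x=1.0, n=100):
    """F statistic for H0: beta1 = beta2 = 0, two regressors correlated r."""
    s_b = s_e / (s_x * math.sqrt(n - 1) * math.sqrt(1 - r**2))
    u1, u2 = b1 / s_b, b2 / s_b  # standardized slopes
    # The 2*r cross term is what tilts the acceptance ellipse.
    return (u1**2 + 2 * r * u1 * u2 + u2**2) / (2 * (1 - r**2))

for r in (-0.9, 0.0, 0.6):
    print(f"r={r:+.1f}  Data 7 F={f_stat(0.16, 0.16, r):.2f}"
          f"  Data 10 F={f_stat(0.16, -0.16, r):.2f}  (F_c={F_C})")
```

Data 7 (same-sign slopes) loses significance at r = −0.9 and gains it at r = +0.6, while Data 10 (opposite-sign slopes) does the reverse, matching the answer key.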
8. In the third, fourth, fifth, and sixth columns (r≠0), which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Do these outcomes result from multicollinearity?
With r=0 (so multicollinearity is not a factor) as well as at higher r-values:
Data 7 and Data 10
At high correlation coefficients, so multicollinearity is a factor:
Data 3, Data 4, Data 5, Data 8, Data 11
9. As the correlation coefficient between the two independent variables increases in absolute value, which area increases faster: the F-test ellipse or the t-test rectangle?
The area of the t-test rectangle increases faster than the area of the F-test ellipse.
10. What is unexpected about the test results for Data 11 (α=10% and r=+.6)?
Both the t-statistic for b1 and the t-statistic for b2 are significant, indicating that both variables have significant individual marginal influences, but the F-statistic is insignificant. So, jointly, the two variables offset each other. This result is because of the multicollinearity.
Lab Materials
Consider the regression model with two independent variables:

y = β0 + β1 x1 + β2 x2 + ε

Further, the F-test examines H0: β1 = β2 = 0,
and the t-tests examine H01: β1 = 0 and H02: β2 = 0.

As noted in class, one ad hoc Bonferroni correction to the significance level of the t-tests is:

α_t = α_F / 2
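A sketch of the logic behind halving the level: if the two t-tests were independent, testing each at α/2 would keep the chance of at least one false rejection just under the α used for the F-test (the two t-tests are generally not independent, so this is only a rough bound):

```python
alpha_f = 0.10
alpha_t = alpha_f / 2  # the ad hoc Bonferroni correction

# Familywise error rate if the two t-tests were independent:
fwer = 1 - (1 - alpha_t)**2
print(round(fwer, 4))  # 0.0975, just under the 10% F-test level
```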
Also as noted in class, Geary and Leser[3] (followed shortly by Duchan[4]) compared the outcomes of the overall F-test and the slope coefficient t-tests for the case of a regression model with two independent variables. Assuming that the analyst used the same significance level for all three tests, the six possible outcomes were:
1. The F-statistic is significant and both slope coefficient t-statistics are significant.
2. The F-statistic is insignificant and both slope coefficient t-statistics are insignificant.
3. The F-statistic is significant and only one of the two slope coefficient t-statistics is significant.
4. The F-statistic is insignificant and only one of the two slope coefficient t-statistics is significant.
5. The F-statistic is significant and both slope coefficient t-statistics are insignificant.
6. The F-statistic is insignificant and both slope coefficient t-statistics are significant.
Instead of doing a Bonferroni-type correction, the analyst might ask the question, "What level of significance for the t-tests would eliminate the problematic 4th and 6th outcomes?" In other words, is there a level of significance that moves the t-test rectangle to be tangent to the F-test ellipse? Further, does that level of significance change with the correlation between the two independent variables?
Homework assignment
A. In the worksheet Statistics 3 Handout (which follows and is included in the linked spreadsheet), set the significance level for the F-test to 10% and, as an ad hoc Bonferroni correction, set the significance level for the t-tests to 5%.
1. Complete the first two columns in the handout.
2. Use the first two columns (both with r=0) to explain how the worksheet demonstrates the impact of the ad hoc Bonferroni correction.
3. In the first column, which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Is this a result of multicollinearity?
4. In the first column, what is unexpected about the test results for Data 2 and for Data 4? Do these outcomes result from multicollinearity?
5. Complete the remainder of the handout.
6. Importantly, multicollinearity increases (in absolute value) the critical value of the slope coefficient (the value of b that allows you to reject the null hypothesis).
a. Demonstrate that point mathematically.
b. Demonstrate that point using the estimation results above.
7. Multicollinearity also increases the size of the F-test ellipse and tilts it. Which estimation results demonstrate the impacts of those effects?
8. In the third, fourth, fifth, and sixth columns (r≠0), which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Do these outcomes result from multicollinearity?
9. As the correlation coefficient between the two independent variables increases in absolute value, which area increases faster: the F-test ellipse or the t-test rectangle?
10. This question suggests one advantage to using some type of Bonferroni correction. Change the significance level for the t-tests to 10% (which equals the significance level for the F-test) and change the correlation coefficient between the independent variables to +0.6. In this case, what is unexpected about the test results for Data 11, and how does the Bonferroni correction overcome this problem?
B. Solve for the significance level that moves the t-test rectangle to be tangent to the F-test ellipse.
11. Show your derivation of the appropriate t-statistic.
12. Does that t-statistic change with the correlation between the two independent variables?
13. In the worksheet Statistics 4, insert your formula into cell 'Statistics 4'!G13. The cell 'Statistics 4'!G14 calculates the significance level associated with that t-statistic. What is it when the correlation coefficient equals zero?
14. Change the correlation coefficient value from 0.0 to −0.3 to +0.6 to −0.9 to +0.95 (which mirrors your previous work). Observe how the t-test rectangle, the F-test ellipse, and the t-test significance level change. What is the t-test significance level in each case?
Statistics 3 Handout
Write an "R" when the null hypothesis is rejected. The first column uses a 10% significance level for all three tests; the remaining columns use a 10% significance level for the F-test and 5% for the t-tests.

| Regression Estimates | Null hypothesis | r=0 (sig=10%) | r=0 | r=-.3 | r=-.9 | r=+.6 | r=+.95 |
|---|---|---|---|---|---|---|---|
| Data 1 (0.10, 0.00) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 2 (0.19, 0.00) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 3 (0.24, 0.00) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 4 (0.10, 0.18) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 5 (0.10, 0.22) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 6 (0.10, 0.10) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 7 (0.16, 0.16) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 8 (0.22, 0.22) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 9 (0.10, -0.10) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 10 (0.16, -0.16) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Data 11 (0.22, -0.22) | H0 | | | | | | |
| | H01 | | | | | | |
| | H02 | | | | | | |
| Area of F-ellipse = | | | | | | | |
| Area of t-rectangle = | | | | | | | |
| Ratio of F-area to t-area = | | | | | | | |
Answer Key
Write an "R" when the null hypothesis is rejected. The first column uses a 10% significance level for all three tests; the remaining columns use a 10% significance level for the F-test and 5% for the t-tests.

| Regression Estimates | Null hypothesis | r=0 (sig=10%) | r=0 | r=-.3 | r=-.9 | r=+.6 | r=+.95 |
|---|---|---|---|---|---|---|---|
| Data 1 (0.10, 0.00) | H0 | – | – | – | – | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 2 (0.19, 0.00) | H0 | – | – | – | – | – | – |
| | H01 | R | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 3 (0.24, 0.00) | H0 | R | R | R | R | R | R |
| | H01 | R | R | R | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 4 (0.10, 0.18) | H0 | – | – | – | – | R | R |
| | H01 | – | – | – | – | – | – |
| | H02 | R | – | – | – | – | – |
| Data 5 (0.10, 0.22) | H0 | R | R | – | – | R | R |
| | H01 | – | – | – | – | – | – |
| | H02 | R | R | R | – | – | – |
| Data 6 (0.10, 0.10) | H0 | – | – | – | – | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 7 (0.16, 0.16) | H0 | R | R | – | – | R | R |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 8 (0.22, 0.22) | H0 | R | R | R | – | R | R |
| | H01 | R | R | R | – | – | – |
| | H02 | R | R | R | – | – | – |
| Data 9 (0.10, -0.10) | H0 | – | – | – | – | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 10 (0.16, -0.16) | H0 | R | R | R | R | – | – |
| | H01 | – | – | – | – | – | – |
| | H02 | – | – | – | – | – | – |
| Data 11 (0.22, -0.22) | H0 | R | R | R | R | – | – |
| | H01 | R | R | R | – | – | – |
| | H02 | R | R | R | – | – | – |
| Area of F-ellipse = | | 0.150 | 0.150 | 0.157 | 0.343 | 0.187 | 0.479 |
| Area of t-rectangle = | | 0.111 | 0.159 | 0.175 | 0.838 | 0.249 | 1.632 |
| Ratio of F-area to t-area = | | 1.343 | 0.940 | 0.898 | 0.410 | 0.752 | 0.294 |
A. In the worksheet Statistics 3, set the significance level for the F-test to 10% and, as an ad hoc Bonferroni correction, set the significance level for the t-tests to 5%.
2. Use the first two columns (both with r=0) to explain how the worksheet demonstrates the impact of the ad hoc Bonferroni correction.
The area of the t-test rectangle increases, reflecting the larger critical values that are consistent with the smaller significance level.
Data 2 … with the lower level of significance, the t-statistic for b1 becomes insignificant.
Data 4 … with the lower level of significance, the t-statistic for b2 becomes insignificant.
3. In the first column, which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Is this a result of multicollinearity?
Data 7
Data 10
No, this is not a result of multicollinearity, as r=0.
4. In the first column, what is unexpected about the test results for Data 2 and for Data 4? Do these outcomes result from multicollinearity?
· In both cases, one t-statistic is significant (so one independent variable has a significant marginal influence) but the F-statistic is insignificant.
· This is not a result of multicollinearity, as r=0.
6. Importantly, multicollinearity increases (in absolute value) the critical value of the slope coefficient (the value of b that allows you to reject the null hypothesis).
a. Demonstrate that point mathematically.
As an example:

s_b1 = s_e / (s_x1 √(n−1) √(1−r²))

So, as r increases from zero to one in absolute value, the standard error of the slope coefficient increases and the critical value (t_c × s_b1) increases with it.
b. Demonstrate that point using the estimation results above.
Data 3 … the t-statistic for b1 becomes insignificant as r increases in absolute value.
Data 5 … the t-statistic for b2 becomes insignificant as r increases in absolute value.
Data 8 … the t-statistics for both b1 and b2 become insignificant as r increases in absolute value.
Data 11 … the t-statistics for both b1 and b2 become insignificant as r increases in absolute value.
7. Multicollinearity also increases the size of the F-test ellipse and tilts it. Which estimation results demonstrate changes in the F-test ellipse? What are those changes, and what are the impacts of those changes on the F-statistic?
Data 4 … The F-test ellipse tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 5 … The F-test ellipse becomes larger as |r| increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 7 … The F-test ellipse becomes larger as |r| increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 8 … The F-test ellipse becomes larger as |r| increases, so the F-statistic changes from significant to insignificant. But the F-test ellipse also tilts negatively with a positive r, so the F-statistic becomes significant at large positive r values.
Data 10 … The F-test ellipse tilts positively with a negative r, so the F-statistic stays significant at negative r values. Because the F-test ellipse tilts negatively with a positive r, the F-statistic becomes insignificant at positive r values.
Data 11 … The F-test ellipse tilts positively with a negative r, so the F-statistic stays significant at negative r values. Because the F-test ellipse tilts negatively with a positive r, the F-statistic becomes insignificant at positive r values.
8. In the third, fourth, fifth, and sixth columns (r≠0), which estimation results demonstrate the concept of variables working together although individually they have insignificant marginal influences? Do these outcomes result from multicollinearity?
With r=0 (so multicollinearity is not a factor) as well as at higher r-values:
Data 7 and Data 10
At high correlation coefficients, so multicollinearity is a factor:
Data 3, Data 4, Data 5, Data 8, Data 11
9. As the correlation coefficient between the two independent variables increases in absolute value, which area increases faster: the F-test ellipse or the t-test rectangle?
The area of the t-test rectangle increases faster than the area of the F-test ellipse.
10. This question suggests one advantage to using some type of Bonferroni correction. Change the significance level for the t-tests to 10% (which equals the significance level for the F-test) and change the correlation coefficient between the independent variables to +0.6. In this case, what is unexpected about the test results for Data 11, and how does the ad hoc Bonferroni correction overcome this problem?
With all of the significance levels equal to 10%, (a) both the t-statistic for b1 and the t-statistic for b2 are significant, indicating that both variables have significant individual marginal influences, while (b) the F-statistic is insignificant. This result is a bit problematic unless you conclude that the two variables offset each other; it happens because of the multicollinearity. The ad hoc Bonferroni correction overcomes this problem by making it more difficult to conclude that the two slope coefficient estimates are individually significant.
B. Solve for the significance level that moves the t-test rectangle to be tangent to the F-test ellipse.
11. Show your derivation of the appropriate t-statistic.
The critical values for the F-test and the t-test are F_c and t_c. The F-test ellipse is:

(1/(1−r²)) [ (b1/s_b1)² + 2r (b1/s_b1)(b2/s_b2) + (b2/s_b2)² ] = 2 F_c

To find the value of b1 at which the F-test ellipse is tangent to the t-test rectangle, we maximize b1 over the F-test ellipse with respect to b2; at the tangency point, we substitute the critical value of b1 (t* s_b1) in for b1. The maximization gives b1,max = √(2 F_c) s_b1, so:

t* = √(2 F_c)

12. Does that t-statistic change with the correlation between the two independent variables?
No; the correlation coefficient does not appear in the formula for the t-statistic.
13. In the worksheet Statistics 4, insert your formula into cell 'Statistics 4'!G13. The cell 'Statistics 4'!G14 calculates the significance level associated with that t-statistic. What is it when the correlation coefficient equals zero?

t* = √(2 F_c) = √(2 × 2.363) = 2.174. So, α* = 3.23%.
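That number can be checked in a few lines, hard-coding the spreadsheet's 10% critical F(2, 97) ≈ 2.363:

```python
import math

F_C = 2.363  # 10% critical F(2, 97) (spreadsheet value)

# Tangency condition: the t critical value that makes the rectangle
# touch the F ellipse is sqrt(2 * F_c), independent of r.
t_star = math.sqrt(2 * F_C)
print(round(t_star, 3))  # 2.174
```

A two-tailed t of 2.174 with 97 degrees of freedom corresponds to the article's α* = 3.23%, and since r never enters the calculation, α* is the same in every column.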
14. Change the correlation coefficient value from 0.0 to −0.3 to +0.6 to −0.9 to +0.95 (which mirrors your previous work). Observe how the t-test rectangle, the F-test ellipse, and the t-test significance level change. What is the t-test significance level in each case?
· Both the t-test rectangle and the F-test ellipse increase in size as the correlation coefficient increases in absolute value, and the F-test ellipse tilts as it did before. The difference is that now the t-test rectangle is larger than it was when the significance level was 5% for the t-tests.
· The t-test significance level does not change with the correlation coefficient, so it stays at 3.23%.
This paper benefitted substantially from comments given by Nancy Haskell, anonymous reviewers, and the Editor of JSE, Dr. William Notz. Dr. Notz also suggested incorporation of the material related to Bonferroni-type corrections. The usual disclaimer applies.
^{1} The spreadsheet tends to draw an incomplete ellipse because it is designed to accommodate a very wide range of values for the inputs. The ellipse can be completed easily in any specific example by adjusting cells 'Basic'!B20:C21. Similar adjustments can be made to the other worksheets if necessary.
^{2} The spreadsheet uses standard deviations of x_1 and x_2 and a regression standard error (s_e) equal to one, a sample size of one hundred, and a ten percent significance level for all three tests; as will be noted in the student exercises below, those values can be changed. Figure 1 is also drawn assuming that the correlation between the two independent variables equals zero, another value that the students will be able to change easily.
^{3} Geary, R.C., and C.E.V. Leser. February 1968. Significance Tests in Multiple Regression. The American Statistician, 22(1): 20-21.
^{4} Duchan, Alan I. June 1969. A Relationship Between the F and t Statistics and the Simple Correlation Coefficients in Classical Least Squares Regression. The American Statistician, 23(3): 27-28.
David Martin
Professor, Department of Economics
Davidson College
P.O. Box 6988
Davidson, NC 28035-6988
U.S.A.
DaMartin@Davidson.edu