NAME: Evaluating Aptness of a Regression Model
TYPE: Full population of data (all software projects completed by the AT&T data center from 1986 through 1991).
SIZE: 104 observations, 5 variables
DESCRIPTIVE ABSTRACT:
Values for function point count, actual work hours, operating system, database management system, and programming language are recorded for 104 software projects from AT&T. Function point counts are often used to help predict the number of work hours that will be required to complete a proposed software project. These data can be used to demonstrate the development and evaluation of a linear regression model. In particular, this is an excellent data set for demonstrating violations of the standard assumptions of linear regression and how to address those violations.
SOURCE:
The data were generously provided by Linda Hughes and Mary Dale of AT&T.
VARIABLE DESCRIPTIONS:
Columns Variable
1-4 Function Point Count
9-13 Work Hours
17 Operating System:
(0) Unix
(1) MVS
25 Database Management System:
(1) IDMS
(2) IMS
(3) INFORMIX
(4) INGRES
(5) Other
33 Language:
(1) COBOL
(2) PLI
(3) C
(4) Other
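The column layout above can be read with simple fixed-width slicing. The following is a minimal Python sketch, assuming each record is one text line laid out exactly as described, with 1-based inclusive column positions. The function name parse_record, the dictionary keys, and the sample line are hypothetical and are not taken from the data file.

```python
# Labels for the operating system codes listed in the variable descriptions.
OS_LABELS = {0: "Unix", 1: "MVS"}


def parse_record(line):
    """Parse one fixed-width record into a dictionary.

    Column positions in the documentation are 1-based and inclusive,
    so columns 1-4 become the Python slice [0:4], and so on.
    """
    return {
        "function_points": int(line[0:4]),   # columns 1-4
        "work_hours": float(line[8:13]),     # columns 9-13
        "os": int(line[16]),                 # column 17
        "dbms": int(line[24]),               # column 25
        "language": int(line[32]),           # column 33
    }


# Hypothetical sample line built to match the layout (not a real record).
sample = " 100" + " " * 4 + "  850" + " " * 3 + "1" + " " * 7 + "2" + " " * 7 + "1"
record = parse_record(sample)
print(record, OS_LABELS[record["os"]])
```

Work hours are read as a float here as a precaution, in case the file contains fractional values; if the values are all whole numbers, int would serve equally well.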
STORY BEHIND THE DATA:
Function points are a standard metric used for estimating the size of software development projects (International Function Point Users Group, 2005). As the number of function points for a proposed software project increases, so does the estimate of the development effort required to produce the software. Regression models based on function points are an important tool used in the management of software development projects.
PEDAGOGICAL NOTES:
The data set is particularly interesting in that it violates most of the assumptions required of a linear regression model. After a natural log transformation of the data, the violations are corrected. Prediction intervals can also be used to show the practical limits of the final model's predictive accuracy.
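One way to illustrate the log transformation and the use of prediction intervals is sketched below in Python. It makes several assumptions that are not stated in this document: that the natural log is applied to both function points and work hours, that a simple one-predictor model is fitted, and that a normal quantile is an acceptable stand-in for the exact t quantile at this sample size. The data are synthetic and generated only for illustration; they are not the AT&T observations. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def fit_log_log(fp, hours):
    """Least-squares fit of ln(hours) on ln(function points)."""
    x = [math.log(v) for v in fp]
    y = [math.log(v) for v in hours]
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error on the log scale
    return {"intercept": intercept, "slope": slope, "s": s,
            "n": n, "x_bar": x_bar, "sxx": sxx}


def prediction_interval(fit, fp_new, level=0.95):
    """Approximate prediction interval for work hours, back-transformed.

    Uses a normal quantile in place of the t quantile (an approximation).
    """
    x0 = math.log(fp_new)
    centre = fit["intercept"] + fit["slope"] * x0
    se = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.exp(centre - z * se), math.exp(centre), math.exp(centre + z * se)


# Synthetic data for illustration only (NOT the AT&T data).
random.seed(0)
fp = [random.randint(50, 2000) for _ in range(104)]
hours = [math.exp(2.0 + 1.0 * math.log(v) + random.gauss(0, 0.5)) for v in fp]

fit = fit_log_log(fp, hours)
low, mid, high = prediction_interval(fit, 500)
print(f"slope={fit['slope']:.3f}  interval for 500 FP: {low:.0f} to {high:.0f} hours")
```

Because the interval is computed on the log scale and then exponentiated, it is asymmetric around the central value on the original scale, and the central value estimates the median rather than the mean of work hours. The width of the back-transformed interval is what makes the practical limits of the model's accuracy visible to students.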
REFERENCES:
International Function Point Users Group (2005). “About Function Point Analysis”, http://www.ifpug.org/about/about.htm.
SUBMITTED BY:
Jack E. Matson and Brian R. Huguenard
Department of Decision Sciences and Management
Tennessee Technological University
Johnson Hall 306
1105 North Peachtree Street
Cookeville, TN 38505-0001
Phone: (931) 372-3793
FAX: (931) 372-6249
JEMatson@tntech.edu