Journal of Statistics Education v.2, n.1 (1994)
Copyright (c) 1994 by Margaret Mackisack, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.
Key Words: Linear models; Data analysis; Experimental design; Problem-based learning.
This paper describes a situation where systematic use is being made of data collected by students as part of a class project and advocates the wider use of such projects. The immediate learning benefits to the students involved in carrying out projects have been widely canvassed recently, and this paper reports some experiences with a particular type of project. Advantage is also taken of these projects as a source of material for problem-based learning in applied statistics at all levels, and some specific reasons for the potential importance of such material are advanced.
1 The author was inspired by Hunter (1976) to use experimentation by students as an activity in a subject including design of experiments. This has been productive far beyond the original expectation, and this paper aims to describe the way experiment projects have been implemented as part of our course, and some of the many uses to which these experiments have been put over the last four years.
2 This paper briefly describes the background of the students and the course in which these projects have been embedded, describes the learning objectives and some apparent direct outcomes for the students involved, and suggests solutions to some of the logistic questions which always seem to loom very large when such activities are proposed. Almost more important than these direct outcomes are the uses to which the projects can be put to fill a current void in the material for teaching the design and analysis of experiments. The nature of this gap, and the role of student experiments, will be explained.
3 The experiment project is carried out by second year undergraduate students, typically 18 years old, mainly enrolled in a Bachelor of Applied Science degree specialising in mathematics. Walpole and Myers (1993, Chapters 9-13) is a required text, with extra material specifically on the topic of designing experiments from Mead (1990) and Box, Hunter and Hunter (1978). Graduates from this degree program will hope to take up positions as statisticians in government or industry; in this country postgraduate training is not a general requirement for such employment, although four-year Honours degrees are preferred.
4 The students in this subject are mathematics majors, some of whom will become practising statisticians. For many this will be the only time in their careers that they are fully responsible for the design and all of the execution, as well as the analysis, of an experiment. The most important learning objective of the activity is that they should have this experience and reflect upon it. It is hoped that this experience will make them aware that practical issues involved in scientific experimentation have an impact on the quality of the eventual statistical analysis. At this stage awareness is all that is aimed for.
5 The second learning objective for the project is that students apply specific techniques that they have studied in this subject, namely designing simple factorial experiments, applying randomisation, graphical and tabular data summary, formal analysis of variance and examination of residuals. This is a secondary objective in the sense that practice at these activities could just as easily be obtained without having to go through the actual data generation process. However, some of the design-specific issues are rarely raised by textbook problems, and in this undergraduate program, as in most, it is not possible to expose students to live consulting problems. The project is also an easier environment in which to first practise design skills, because students choose a context which is familiar to them, avoiding the need to deal with a new subject area.
6 The project is one component of the overall structure of the subject, and is complementary to other planned activities such as data analysis practical classes using computer software in networked laboratories, class data collections, and mathematical statistical exercises of a more traditional type. These exercises are smaller and more specifically focused on individual curriculum items, while the project requires synthesis of knowledge and techniques from the whole of the student's prior statistical training.
7 There is an increasing demand from professional groups and employers for the University community to teach explicitly what are described as generic skills, including problem solving, working cooperatively in groups, and non-technical written and verbal presentation of results. To develop these skills group project work is being included in mathematical subjects which previously required almost entirely individual, technical output. The third learning objective of the project is therefore that the students practise group problem solving, and verbal and written communication about data collection and analysis to a non-technical audience. The experiment provides them with a data set in a context which they understand in great detail. They can therefore be expected to report on their work in a much more complete and polished manner than if they were merely presented with a data set to analyse, collected by someone else, about which there would inevitably be many unanswerable questions. The project does not attempt to give students experience at all the skills that are desirable for a statistical consultant: specifically, questioning, elucidation of information about an unfamiliar subject area and research objectives are deliberately excluded to simplify the task.
8 In considering the uses of experiments carried out by students, it is necessary to keep in mind what they are not useful for. The aim of the experiments is not to discover dramatic or unexpected new science. Suitable topics for investigation are therefore based on activities that the students already carry out, such as sports, or areas in which they have expertise or interest for other reasons. The `scientific' objective of the experiment is therefore to quantify some aspect of the student's knowledge about an everyday event. The most successful projects deal with effects that are already known or strongly suspected to be present, in a context where the student has a good understanding of the sources of variation affecting the response. The `scientific' question provides a context within which the students can apply and develop their understanding of design and analysis; the assessment of the project is ultimately based on the completeness with which the processes are carried out, and not on the significance of the results. The specific instructions issued to the students are at Appendix A.
9 When the experiment project is described to other teachers of statistics, the reaction is frequently that it is a good idea but that in their circumstances it would not be possible. The issues that appear most daunting relate to the amount of instructor and marking time that such an activity would require, and the difficulty of grading the resulting reports equitably, since there is no `right' answer and all the questions are different. These issues are explored at length by Cobb (1993); an outline of the author's organising strategy follows. Between 25 and 45 students enrol in this subject each year; the project has now been part of the subject for six semesters. The workload derived from the project can be divided into consultation and marking, both of which are done by a single instructor (the author).
10 The experiment is proposed in the ninth week of a 14-week semester, and class presentations are held in week 13. Discussion of the students' proposed experiments is a fairly instructor-intensive process for a couple of weeks, but as this is the period in which the design elements are being covered in lectures it is fairly straightforward to tie students' problems to the class material. Students work in groups of two or three on weekly homework as well as the experiment project. This reduces the weekly marking load to between 15 and 20 pieces, with the same number of projects at the end of semester.
11 The experiment project itself is assessed based on a class presentation, in which all group members must take part, and a written report. Since a substantial part of the benefit of the experiment is expected to derive from the actual activity of designing and carrying it out, a group which carries out an experiment gets a passing grade, irrespective of the quality of the analysis of the results. The class presentations are not formally graded. Instead, each group has to do a presentation as a precondition for their written report to be accepted, and they receive feedback particularly about the description of procedures and analysis of their results which they can incorporate in the written version if they wish. Marks are allocated for the written report. The strategy adopted is to deduct marks for missing obvious elements of the analysis, poor design which should have been eliminated by consultation before the data were collected, or extremely disorganised work. A single grade is given to the whole group. They also receive a page of comments indicating where the written presentation could have been improved or commending particularly good work.
12 Individual groups almost invariably learn enough during the subject to design and carry out at least a 2^3 experiment and analyse it in Minitab, sometimes with quite a bit of hand-holding. Feedback from the instructor, inter-group discussion over the four weeks concerned, the class presentations, and reviews of final reports at the end all raise a number of conceptual issues, described below. The points are generally difficult to learn from standard exercises; students who are able to recite a definition or assumption that applies in general frequently find the application to their specific experiment highly enlightening.
13 Students often fail to understand the distinction between a randomised experimental study, where specified treatments are imposed on experimental units, and an observational study, where the units may be described by some categorical variables and tabulated in the same way as the results of an experiment but lack the intervention characteristic of experimentation (Samuels 1991, p. 267). For example, a regular question is whether a group can investigate the effect of some intervention on the stock market. The answer must be that, in the absence of millionaire resources, they can't experiment in a way that will affect the market, and that at any rate in this country it would probably be illegal to do so if they did have the resources. What they actually have in mind is simply collecting data that are already available, not actually experimenting. The verbal distinction is frequently not kept in textbooks, where `experiment' is used in the scientific sense, as a collective word synonymous with all sorts of scientific investigations, rather than in the more technical sense of a statistically designed experiment. The distinction is an important one for a statistician to learn to make because of its consequences for the type of conclusion that can be drawn from the ensuing data (Moore and McCabe 1989, p. 258), and it is emphasised in the project.
14 Mead (1990, p. 7) points out that there is often lack of clarity about the difference between block variables (intrinsic to experimental units) and factors being imposed in the experiment. Student groups particularly wish to include the differences between themselves in their investigations, and they come to realise that while they can randomly assign a set of behaviours to each other, they cannot randomly assign `being Darryl' to anyone other than Darryl.
15 The sort of everyday activities that are the focus of most experiments are not usually approached with a view to minimising unwanted variability. Sports are a particularly attractive class of activities to experiment with, and it is sometimes difficult to observe slight effects of experimental factors due to the variability of individuals' repeated efforts at ostensibly the same task. For example, even professional cricketers will hit balls bowled (apparently) the same way differently, and for different distances, so the influence of different weights of cricket bat on the distance that a ball is hit is likely to be drowned in noise variation, particularly when the bat is held by an amateur. Students sometimes become frustrated by the problems associated with trying to minimise noise variation, but they develop a better understanding of the importance of inherent variability.
16 Students frequently don't associate the assumption that the residuals from fitting the linear model have a homoscedastic Gaussian distribution with the definition of the response variable. Counting numbers of ants attacking sandwiches, counting the number of baskets shot out of a fixed number of attempts, counting numbers of pins knocked down bowling, and observing whether or not a goal is kicked at football are among the problematic types of data students have proposed collecting. Students show some amazement that the abstract idea of `normal distribution of errors' could actually be related to data collection in such a concrete way. At the design stage students are encouraged to define continuous response variables; at the analysis stage they are then encouraged to consider how far the response actually appears to follow a Gaussian distribution and to carry out transformations to improve the residual distribution. At this stage, the formal Box-Cox procedure is not taught, and the use of transformations is presented as an exploratory technique.
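A small calculation can make the link between response definition and the constant-variance assumption concrete. The sketch below assumes, purely for illustration, that each attempt at a basket is independent with a fixed success probability, so that the count of successes is binomial; the number of attempts and the probabilities are hypothetical, not taken from any student project.

```python
def binomial_variance(n_attempts, p_success):
    """Variance of the number of successes in n independent attempts."""
    return n_attempts * p_success * (1 - p_success)

N_ATTEMPTS = 20  # hypothetical number of shots per experimental run

# Hypothetical success probabilities under different treatments.
variances = {p: binomial_variance(N_ATTEMPTS, p) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}

for p, var in variances.items():
    print(f"success probability {p:.1f}: variance of count = {var:.2f}")
```

Under these assumptions the variance of the count depends on the treatment mean, so a treatment that changes the success rate also changes the spread, which is exactly what the homoscedasticity assumption rules out.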
17 If students recognise that particular variables not included in their design are possibly going to influence the results of their experiment, they are encouraged to add these variables to their collection. They can do a post hoc check against the residuals after the analysis to see if there is any systematic effect evident. Sometimes this will be due to a factor which could have been incorporated into the design but which was not recognised beforehand, such as when students swimming lengths of a pool did not realise that starting from the deep or shallow end would influence the time for a lap. On other occasions such variables may be outside the experimenters' control, such as weather, order, or a retailer's sale program.
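The post hoc check described above might be sketched as follows. The data, the field names (`treatment`, `time`, `start_end`) and the two-treatment layout are all hypothetical, invented only to show the mechanics: residuals are taken about the treatment-cell means and then averaged within each level of the unplanned variable.

```python
from collections import defaultdict
from statistics import mean

def cell_mean_residuals(runs):
    """Residuals from the cell-means model: response minus its treatment mean."""
    cells = defaultdict(list)
    for run in runs:
        cells[run["treatment"]].append(run["time"])
    cell_means = {t: mean(ys) for t, ys in cells.items()}
    return [run["time"] - cell_means[run["treatment"]] for run in runs]

def mean_residual_by(runs, residuals, covariate):
    """Average residual within each level of a variable left out of the design."""
    groups = defaultdict(list)
    for run, r in zip(runs, residuals):
        groups[run[covariate]].append(r)
    return {level: mean(rs) for level, rs in groups.items()}

# Hypothetical lap times (seconds); not data from any student report.
runs = [
    {"treatment": "flippers", "start_end": "deep", "time": 30.0},
    {"treatment": "flippers", "start_end": "shallow", "time": 32.0},
    {"treatment": "flippers", "start_end": "deep", "time": 30.5},
    {"treatment": "flippers", "start_end": "shallow", "time": 31.5},
    {"treatment": "no flippers", "start_end": "deep", "time": 40.0},
    {"treatment": "no flippers", "start_end": "shallow", "time": 42.0},
    {"treatment": "no flippers", "start_end": "deep", "time": 39.5},
    {"treatment": "no flippers", "start_end": "shallow", "time": 42.5},
]

residuals = cell_mean_residuals(runs)
by_end = mean_residual_by(runs, residuals, "start_end")
print(by_end)
```

A clear difference between the average residuals for the two starting ends, as in these invented figures, would suggest that the omitted variable deserves a place in a revised design.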
18 Sometimes the data collection doesn't work the way it was planned. A group of students trying to discover the influence of refrigeration on the life of roses discovered that the life was extended well beyond the end of the semester, which was interesting but difficult to deal with in the bounds of the project, since analysing censored data is well outside the scope of the subject concerned. Golf balls are hit into the rough, paper aeroplanes fly over the building parapet, darts fall to the floor instead of sticking into the board, and the students have to decide what to do about it and later defend their decision before their peers.
19 The students react extremely positively to the experiments. They listen with interest to each other's presentations, and put considerable effort into their own written and verbal reports. While the content may appear trivial to adults, these eighteen-year-old students do not treat the activity frivolously. The course has consistently received high evaluations (6 or over out of 7) for the last three years, and most of the students continue into the third year of the statistics program.
20 The students carrying out the experiments benefit from the experience, and usually this is the end of the story. However, a single group experiment also provides a documented, easily understood data set which can be used in teaching data analysis and problem solving in classes at all levels. Out of every dozen student experiments three or four will be creative, successful and completely documented, and can be adopted for use in class the following year. Students are invited to sign a copyright clearance when they hand in their written reports to allow their work to be used in this way.
21 Now that computer packages are available to analyse linear model problems of considerable generality, it is no longer relevant to teach how to compute sums of squares for every accessible design, so the question emerges, what does one teach in place of this? The mathematics of the general linear model, least squares estimation, and the related distribution theory provides the general framework, and that is clearly important as the structure within which the statistical analysis resides. The importance of numerical linear algebra in solving the associated computational problems suggests that students should be exposed to some of this also, but these are all areas of mathematical theory. While it is clearly relevant to teach this material to the mathematics undergraduates, this is not going to develop their understanding of statistical issues.
22 Mead (1990) in his preface to Design of Experiments says that, in the light of the new computing resources which are available for data analysis, ``The fundamental concepts (of experimental design) now require re-examination and re-interpretation outside the limits implied by classical mathematical theory so that the full range of design possibilities may be considered.'' For the same reason, the teaching of fundamental concepts is being re-assessed. This is an evolutionary process; it takes different forms depending on the interests and skills of different instructors, and is not confined to the area of designed experiments. The author is at present particularly concerned with enhancing students' skills in deciding what is the appropriate analysis for a data set (recognising when they do and don't know what should be done); fully interpreting the results of analysis that they have performed (knowing the meaning of what they have done); and investigating anomalies in the data revealed by the analysis or diagnostics (knowing when what they have done may need reconsideration). The projects have provided a unique and invaluable teaching resource which satisfies a need not filled by current textbook material for teaching these skills. An indication of the range of approaches being taken by other tertiary instructors can be found in Cobb (1993).
23 Textbook authors have made substantial efforts in recent years to provide more realistic data sets for analysis, including some background information about the intention of the data collection and frequently references to either the original data or to articles explaining the background. For several reasons these do not sufficiently address the skills described in the previous paragraph, nor is it realistic to expect them to do so.
24 Specifically, textbooks almost never include sufficient intimate detail about the actual mechanics of the data collection, which can have a radical effect on the appropriate analysis (Mead 1990, p. 107). Even the original research papers frequently do not present this information. Textbook problems also usually present only information about the design and response, hardly ever associated covariates (unless it is a chapter on analysis of covariance, which seems itself to be a moribund concept). Limiting the amount of information provided is desirable in exercises designed for practising a specific analysis, and to protect beginners from the often disorganised overflow of information available about real problems, but it restricts the extent of possible analysis and makes it difficult to think properly about, and usefully interpret, the results presented.
25 Finally, textbook problems (and archival data sets generally) commonly involve, or claim to involve, real science. On the one hand, this is highly desirable for motivation, in that it validates for the students that the statistical techniques that they are studying are used in real and important investigations. It is a substantial improvement over the anonymous treatments A, B and C. However, the disadvantage is that the students can almost never know enough about the subject matter to know what the answers mean. Sometimes this is reflected in the fact that although the context is more elaborate, the questions asked in some textbooks have not changed over twenty years. We are still asking students to carry out manipulative exercises and apply mathematical techniques rather than to think about the meaning of what they are doing.
26 Books on applied statistics or applied regression analysis, such as Weisberg (1980) or Cox and Snell (1981) are simply at too high a level for the undergraduate population that we have to serve. Their data sets, while presented with more of the elaboration which one would like to see, are limited in number, and again tend to involve issues beyond the immediate grasp of undergraduates. This is also a disadvantage in trying to use published research as a teaching resource, even when papers with sufficiently simple designs, and actual data, can be located. This is not such an issue when teaching graduate students, or students for whom statistics is a tool to be applied to their substantive discipline from which data are taken. Data collected in simple contexts are frequently dismissed as ``toy'' data, but if the students are interested in, understand, and learn from the context, its lack of importance to the instructor seems irrelevant.
27 Graduate students can acquire some of these skills by being involved in statistical consulting, which also develops other skills such as listening and use of probing questions to properly understand someone else's ideas. However, this generally involves some form of internship, and much more close personal supervision than is possible for large classes of undergraduates, who also are only learning the techniques for the first time. What is needed is some surrogate for this consulting involvement, and the student experiments which, by their simplicity, allow other students to become engaged with the material, fill this role. The projects also allow the student the complete experience of the data collection activity, an important objective (see section 2.2), which in a consultant role they only experience at second hand.
28 Carrying out the project is an exercise in applying general principles to a specific task. Particular contexts and examples that can be used to introduce general principles are an extremely valuable resource; student data can be used to give students experience of tasks such as critiquing designs, giving detailed interpretation of results, exploring anomalous observations, questioning to elicit detailed information about somebody else's problem, and deciding what analysis is appropriate for a particular data set. The nature of the student experiments makes them much easier to use for these purposes than textbook or archival data.
29 Student experimenters can be monitored to ensure that their design is documented in detail so that the exact procedures used to collect the data are known unambiguously. The class presentation frequently shows up the fact that details have been omitted, and helps make the final report more complete. The requirement that the randomisation of treatments to experimental units be explicitly displayed is important in making sure that the units and treatments are clearly defined. Since students are also encouraged to collect apparently relevant extra information, particularly time order of collection of data and variables that emerge during the data collection as possibly affecting the response, relationships can be considered which were not taken into account in the design of the experiment. When such information is available, and the mechanics of the data collection involve processes well known to everyone, one can ask for critiques of the design, and suggestions for improving the experiment.
30 The student data sets are not only useful for learning about design. They also provide material that can be used for more in-depth analyses than are often possible with textbook or archival data. When a great deal of background information can be supplied it may also be possible to explore in more detail the causes of anomalous observations, enabling a much fuller treatment of questions about outliers and influential points when the students can understand the process probably generating the anomaly. When considerable detail is available it is also possible to present the procedures and data and leave it to the students to determine what analysis should be performed. This helps to avoid the `this is section 10.8 so it must be a split plot' approach to analysing experiments, and is particularly useful for examination questions. The presence of covariates in many data collections facilitates exploration of more detailed models than the original design.
31 Successful experiments usually aim to quantify some simple obvious effects which are fairly strongly suspected to be present. As a consequence the data tend to have simple relationships present, particularly when the experimental techniques are well designed. The skipping experiment investigated the time to skip rope 100 times forward and backward, on left or right foot, or both feet together. It shows clearly the difference between skipping on one foot vs both feet, or left foot vs right, and the reasons for the observed differences are clear to the students, most of whom will have done the same thing at some stage in their lives even if they did not quantify the differences in this structured way. A paper plane experiment on the effect of using different weights of paper for different designs of paper aeroplane showed that a heavier paper gave longer glides with a less `folded' plane, while a lighter paper was better with a more complicated folded shape. One can ask for detailed interpretation of results for such experiments, and frequently there is expertise in the class so that the mechanisms underlying effects can be explained. For example, a student who worked for a pizza delivery company explained the management reward system that led to a complicated pattern of interaction between factors affecting the time to deliver pizza.
32 It is difficult to overemphasise how much difference it makes to students' ability to interpret results of experiments for them to be about material which they closely understand. Once they have seen these comprehensible data sets they become very frustrated with the lack of information provided by textbooks, and constantly ask for more information so that they can decide what the problem really means. That is, they are trying to think about what they are doing, and that is by no means necessarily the way statistics subjects are approached. For this reason it is not being suggested that the author's data be used elsewhere, although they could be made available, because their virtues depend very much on detail local to this city: brand names of beer or chocolate bars, popular sports and recreation activities, local trade identities and employment regulations.
33 Neat, unambiguous and easily understood examples are a boon to the lecturer presenting data in class to exemplify new concepts being taught. For example, when teaching about interaction and additive models one would like a two- or three-way interaction that is clear, obvious, and interpretable; textbook problems are not classified in this way, and one may have to plough through endless analyses to find something satisfactory.
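To indicate what a `clear' interaction looks like numerically, the following sketch estimates main effects and interactions for a 2^3 factorial by the usual contrast method (each effect is the mean response at the high level of the contrast minus the mean at the low level). The factor labels A, B, C and the responses are hypothetical, constructed so that the only non-zero effects are A, C and the A x B interaction; they are not data from any student experiment.

```python
import itertools
import math

def factorial_effects(responses):
    """Estimate effects in a two-level factorial.

    `responses` maps a tuple of coded levels (-1 or +1 per factor)
    to the mean response for that treatment combination.
    """
    k = len(next(iter(responses)))
    half = len(responses) / 2
    effects = {}
    for size in range(1, k + 1):
        for subset in itertools.combinations(range(k), size):
            # Sign of each run in the contrast is the product of its coded levels.
            contrast = sum(math.prod(levels[i] for i in subset) * y
                           for levels, y in responses.items())
            effects[subset] = contrast / half
    return effects

# Hypothetical cell means for factors (A, B, C), in coded units.
responses = {
    (-1, -1, -1): 20, (-1, -1, +1): 22,
    (-1, +1, -1): 14, (-1, +1, +1): 16,
    (+1, -1, -1): 18, (+1, -1, +1): 20,
    (+1, +1, -1): 24, (+1, +1, +1): 26,
}

effects = factorial_effects(responses)
labels = "ABC"
for subset, value in effects.items():
    print("x".join(labels[i] for i in subset), value)
```

In these invented data the effect of A reverses sign depending on the level of B, which is the kind of pattern that is easy to discuss when the class understands the context.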
34 For examinations based on data analyses, one needs a continuous supply of `new' data sets. A sequence that introduces last year's projects into the examination for this year's students and then into class examples for the following year's students has proved very productive.
35 In the following year, the students who carry out the experiment projects study a subject where more complex design and analysis techniques are taught. They are encouraged to reflect on their own experience for examples of concepts such as random vs fixed effects; they are led to re-analyse their data incorporating covariates, or using suitable procedures for analysing non-Gaussian data. In some cases they recognise that the design that they actually carried out was not quite as simple as they had thought.
36 Favourite experiments, with student appeal and interesting results, include the pizza experiment, the skipping experiment, the swimming experiment (mentioned above), the squash ball experiment, the golf experiment, and the paper-plane experiment. A complete list of successes is presented at Appendix B. In this section some extracts from reports are given that highlight the richness of this material relative to textbook problems. Data for the experiments referred to are given in Appendix C. (A supplementary file containing descriptions and data for six additional experiments can be obtained by following instructions given at the end of this article.)
37 The background and procedure description from the student report on the pizza experiment displays the sort of detail and thought about the design that makes this exercise so useful.
38 ``As I am a big pizza lover, I had much pleasure in involving pizza in my experiment. I became curious to find out the time it took for a pizza to be delivered to the front door of my house. I was interested to see how, by varying whether I ordered thick or thin crust, whether Coke was ordered with the pizza and whether garlic bread was ordered with the pizza, the response would be affected.
39 ``Because of my current financial status and limitation of time, I decided to have only two replicates, just to get a reasonable estimate of the variance. To decrease my financial burden I managed a deal with the manager of the pizza shop. I managed to get the pickup special, delivered to my house, which was the cheapest and smallest pizza made. I tried to repeat the experiment in as nearly as possible identical conditions to reduce `noise'.
40 ``I ordered the pizza from the same shop, being Domino's Pizza. To be consistent I ordered a Supreme pizza each time at approximately the same time of day. The response was measured from the time I closed the telephone to the time the pizza was delivered to the front door of my house.
41 ``I wrote each of the eight treatments on a piece of paper twice, put them all into a hat, mixed them up, and took them out one at a time to allocate the order in which each treatment was done.
42 ``As well as the response and treatment for each pizza delivery the actual hour of delivery was recorded, also the order in which the treatments were done and whether the driver was male or female.''
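The hat-draw randomisation described in the report can be imitated in software. The sketch below is one possible way of doing so, not the student's procedure: the factor names and level labels are our own shorthand for the three factors in the report, and the seed is arbitrary and is fixed only so that the illustration is reproducible.

```python
import itertools
import random

# Three two-level factors from the pizza report; the labels are ours.
FACTORS = {
    "crust": ("thin", "thick"),
    "coke": ("no", "yes"),
    "garlic_bread": ("no", "yes"),
}

def randomised_run_order(factors, replicates, seed=None):
    """Return every treatment combination `replicates` times, in random order."""
    names = list(factors)
    treatments = [dict(zip(names, levels))
                  for levels in itertools.product(*factors.values())]
    # One slip of paper per treatment per replicate, all in the same hat.
    slips = [dict(t) for t in treatments for _ in range(replicates)]
    rng = random.Random(seed)
    rng.shuffle(slips)  # draw the slips out one at a time
    return slips

run_order = randomised_run_order(FACTORS, replicates=2, seed=1994)
for run, treatment in enumerate(run_order, start=1):
    print(run, treatment)
```

This is a completely randomised order over all sixteen runs, matching the single hat in the report; randomising separately within each replicate would be a different (blocked) design.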
43 As well as the complete background, this experiment shows interesting interaction between factors and some interesting features in the residual plots, including the possible use of covariate information to investigate unplanned influences on the response.
44 Another favourite experiment was inspired by the subtropical climate of our institution, which makes swimming a popular activity to experiment with in late Spring. The students who carried out one swimming experiment used a friend to actually do the swimming, measuring her time to swim one lap as the response, and included as additional information in their report a note from the swimmer explaining her view of the experience:
45 ``The first thing I remember is trying to concentrate on a set rhythm more than anything else as I was looking to keep the lap times as consistent as possible. On the first lap I shot out of the gate and suddenly realised that I would be doing a number of laps and I needed to concentrate on rhythm rather than speed. As the laps continued I slowly began to realise that I was getting more tired. At about lap 20 I took a break for a couple of minutes as I felt that I had reached the stage where it would begin to affect the lap times.
46 ``When I was swimming without the goggles I found it more difficult to swim straight and would occasionally bump into things (lane rope, people). It mostly felt as though I was going slower when swimming from the shallow to the deep end. After taking the flippers off, the first few kick beats felt funny.''
47 Armed with this information, and the data about order of runs and which end of the pool they were made from, the students can be asked to consider revising the design of the experiment, taking the end of pool into account and possibly breaking the runs up into blocks to minimise the tiredness effect. The role of serial correlation can be investigated, as can the effect of incorporating time as an extra covariate.
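One rough way to begin such an investigation is sketched below in Python, using the swimming data from the appendix. It removes the treatment means for the eight shirt/goggles/flippers combinations, places the residuals in run order, and then reports their lag-1 autocorrelation, their trend against run order, and the gap between the two ends of the pool. The helper names are hypothetical, and these are quick diagnostics under the assumption that treatment means are an adequate first fit, not a replacement for a model that includes end of pool and run order explicitly.

```python
from statistics import mean

# Appendix swimming data: (order, time, shirt, goggles, flippers, end),
# with the 0/1 codes copied as printed in the appendix.
SWIM = [
    (5, 16.55, 1, 1, 1, 0), (12, 17.22, 1, 1, 1, 1), (18, 17.70, 1, 1, 1, 1),
    (9, 21.53, 1, 1, 0, 0), (14, 22.49, 1, 1, 0, 1), (17, 22.50, 1, 1, 0, 0),
    (7, 17.77, 1, 0, 1, 0), (11, 17.43, 1, 0, 1, 0), (21, 18.70, 1, 0, 1, 0),
    (2, 23.78, 1, 0, 0, 1), (19, 24.29, 1, 0, 0, 0), (22, 24.89, 1, 0, 0, 1),
    (4, 16.14, 0, 1, 1, 1), (20, 16.39, 0, 1, 1, 1), (24, 16.40, 0, 1, 1, 1),
    (1, 19.97, 0, 1, 0, 0), (3, 19.95, 0, 1, 0, 0), (6, 20.32, 0, 1, 0, 1),
    (10, 16.85, 0, 0, 1, 1), (13, 17.80, 0, 0, 1, 0), (15, 16.81, 0, 0, 1, 0),
    (8, 22.63, 0, 0, 0, 1), (16, 22.81, 0, 0, 0, 1), (23, 22.31, 0, 0, 0, 0),
]


def cell_mean_residuals(rows):
    """Return (order, end, residual) after removing each treatment mean."""
    cells = {}
    for order, time, shirt, goggles, flippers, end in rows:
        cells.setdefault((shirt, goggles, flippers), []).append(time)
    means = {key: mean(times) for key, times in cells.items()}
    return [
        (order, end, time - means[(shirt, goggles, flippers)])
        for order, time, shirt, goggles, flippers, end in rows
    ]


def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation of a sequence taken in time order."""
    m = mean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    return num / sum((v - m) ** 2 for v in values)


def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sum((x - mx) ** 2 for x in xs)


by_order = sorted(cell_mean_residuals(SWIM))  # run order is unique, so this sorts by it
orders = [o for o, _, _ in by_order]
res_in_order = [r for _, _, r in by_order]

flipper_effect = (
    mean(t for _, t, _, _, f, _ in SWIM if f == 1)
    - mean(t for _, t, _, _, f, _ in SWIM if f == 0)
)
r1 = lag1_autocorrelation(res_in_order)
order_slope = slope(orders, res_in_order)
end_gap = (
    mean(r for _, e, r in by_order if e == 1)
    - mean(r for _, e, r in by_order if e == 0)
)

print(f"flippers effect (coded 1 minus coded 0): {flipper_effect:.4f}")
print(f"lag-1 autocorrelation of residuals:      {r1:.3f}")
print(f"residual trend per run:                  {order_slope:.4f}")
print(f"mean residual, end coded 1 minus 0:      {end_gap:.3f}")
```

A clear trend in the residuals against run order would be consistent with the tiredness the swimmer describes, and a non-zero gap between ends would support including end of pool as a blocking factor in a revised design.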
48 Designed experiments are of course not the only type of data that are desirable for a problem-oriented approach to teaching. Sample surveys, observations of point processes, queueing systems, and collections of data for regression modelling are obvious exercises which many teachers already use to reinforce concepts introduced. However, these are frequently done on a throwaway basis, being seen as having principally motivational benefits solely for the students carrying them out. With a little more time and elaboration they can also provide resources for future use, directed at specific learning objectives.
49 The data sets collected by the second year mathematics students here are being used by colleagues of the author for teaching both `service' and mainstream statistics students, and could be made available to others if there were interest. The principal intention of this paper, however, is to suggest the value of having your own students conduct experiments to generate data from which you, and they, can benefit. The specific details of local culture, and the fact that the context is immediately accessible to students, are among the most valuable aspects of such data. The entire context of local students' activity is not readily transferable to other locations. At the same time it seems unlikely that textbooks, at least in the classic form, will fill this perceived gap. We need not wait for large grant-funded projects to produce new curricular resources; given a little encouragement and a logistic framework our students will do it for us.
Thanks to all my second year students for producing data, in particular to Bill Afantenou for the Pizza Experiment, Kim Horsfall, Sue Hall and Simone Golik for the Swimming Experiment, and to Ruth Hubbard, Helen MacGillivray, and three anonymous referees for constructive criticism of drafts. This work was finalised while the author was a visiting fellow at the Australian National University.
Box, G.E.P., Hunter, J.S. and Hunter, W.G. (1978), Statistics for Experimenters, New York: John Wiley.
Cobb, G.W. (1993), ``Reconsidering Statistics Education: A National Science Foundation Conference,'' Journal of Statistics Education, v.1, n.1.
Cox, D.R. and Snell, E.J. (1981), Applied Statistics: Principles and Examples, London: Chapman and Hall.
Hunter, W.G. (1976), ``Some Ideas About Teaching Design of Experiments, With 2^5 Examples of Experiments Conducted by Students,'' American Statistician, 31, p. 12.
Mead, R. (1990), Design of Experiments, Cambridge: Cambridge University Press.
Moore, D.S. and McCabe, G.P. (1989), Introduction to the Practice of Statistics, New York: Freeman.
Samuels, M.L. (1991), Statistics for the Life Sciences, San Francisco: Dellen.
Walpole, R.E. and Myers, R.H. (1993), Probability and Statistics for Engineers and Scientists (5th ed.), New York: Macmillan.
Weisberg, S. (1980), Applied Linear Regression, New York: John Wiley.
Requirement: The class assessment for the experimental design part of MAB648/MAB748 will be partly based on an experiment that you design, carry out, analyse the results of, and report on. The experiment can be about anything that you choose.
THE FINAL WRITTEN REPORT ON YOUR PROJECT IS TO BE HANDED IN BY 5PM ON FRIDAY 30 OCTOBER. CLASS PRESENTATIONS WILL BE HELD IN LECTURE AND TUTORIAL PERIODS IN WEEK 13 (ON 22 AND 23 OCTOBER), DETAILS TO BE ARRANGED.
The attached list contains outlines of experiments that have been carried out by students in this course in previous semesters, and also experiments done by students in an experimental design course at a US university, to give you an idea of the sort of thing you might choose. Remember, you have to carry out the experiment and collect the results as well as do the analysis in five weeks, so don't choose something which is going to take a long time to accumulate data. An experiment which makes use of some activity which you would perform anyway, or some skill that you have, is usually the most satisfactory.
You should investigate the effect of varying several factors, each at only two levels, preferably using homogeneous experimental units so that no blocking is needed. It is strongly recommended that you discuss your design with other groups, or with your lecturer, because explaining to someone else what you plan to do is an excellent way to become aware of aspects that you have overlooked.
Your group must actually carry out the experiment yourselves, and yourselves cause the factors influencing the outcomes to vary, do appropriate randomisation of the allocation of treatments to experimental units, etc. Collecting together data which has been obtained by somebody else, even if it appears to fit into the framework of your experimental design, is not acceptable.
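The randomisation can be done with slips of paper drawn from a hat, as in the Pizza Experiment, or with a few lines of code. The Python sketch below is one possible way to list every treatment of a two-level factorial, repeat each treatment for the chosen number of replicates, and shuffle the run order. The function name, the factor names, and the seed are illustrative assumptions, not part of the course requirements.

```python
import random
from itertools import product


def randomised_run_order(factors, replicates=2, seed=None):
    """Return a shuffled list of runs for a replicated two-level factorial.

    Each run is a dict mapping factor name to its coded level (0 or 1).
    """
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    treatments = list(product((0, 1), repeat=len(factors)))
    runs = treatments * replicates  # every treatment written down `replicates` times
    rng.shuffle(runs)               # the electronic equivalent of the hat
    return [dict(zip(factors, levels)) for levels in runs]


plan = randomised_run_order(["Crust", "Coke", "G.Bread"], replicates=2, seed=1994)
for order, run in enumerate(plan, start=1):
    print(order, run)
```

Recording the seed alongside the plan lets the run order be reproduced and checked later, which is harder to do with a hat.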
Examples are available of good reports on experiments from previous classes for you to inspect. Your analysis should include appropriate background about your design, plots and tables of results, significant facts that emerge from the analysis, and checks on model assumptions. Your report should say or show all the things you decided, did or discovered along the way, including the simple ones (and particularly the things which, with hindsight, seem obvious!).
Each group will give a short presentation on their experiment during class time in week 13. For this you will need to prepare at most three overhead transparencies, summarising (i) the design and background of your experiment, and the question you were aiming to answer; (ii) the data you collected, and important results of the analysis; and (iii) your conclusions, including results of diagnostic checks and recommendations for future experimenters. Each group member will be required to speak about some aspect of the experiment.
After the class presentations, you will receive feedback which you can then use to improve your final written report. This may be hand written or word-processed, but in any case should be organised and presented in a lucid and orderly manner with appropriate diagrams and graphs included.
The project is worth 10% of the course assessment. In order to get half marks you must design and carry out the experiment, and give a class presentation about it. In order to get the other half of the marks you must design your experiment well, analyse it correctly and write a well-organised report. You do NOT have to have statistically significant results, although the chances are that they will be if you design your experiment well. Considerable weight will be given to evidence that you have actually been thinking about what you are doing, and why, and what it all means, and how you could improve upon it.
Response                              Factors

distance golf ball hit on full        tee height, ball age, club type
squash ball rebound distance          ball type, ball age, ball temperature
distance football kicked              kick style, angle to goal, angle of ball
number of pins down                   bowling ball weight, spin, position for throw
water evaporation                     surface area, sun or shade, volume at start
car acceleration                      time to gear change, road surface, car type (?)
number of baskets thrown              ball size, distance from net
letter delivery time                  post code, mailbox or post office, town (?)
speed running                         distance, terrain, time of day
distance paper aeroplane flew         design, paper weight, angle
longjump distance                     runup distance, shoes on or off, L or R foot
time to swim 25m                      shirt on or off, flippers, goggles
number of meat ants on sandwich       butter, filling, bread type
time to skip 100                      L, R or both feet, forward or back, rope weight
time to deliver pizza                 crust thickness, garlic bread, Coke
time to boil water                    amount of water, lid on or off, size of pan
virus scan time                       RAM cache, program size, operating system (?)
reaction time to grab rod             width of rod, L or R hand, eyes open/shut
number of pull-ups achieved           time of day, warm-up, taking inosine
height rise of scones                 amounts of butter, flour, milk
time to read short passage of text    font, size, upper/lower case
number of tennis serves in            direction of gaze, stance
time to be served at supermarket      items, location, time of day
height off ground jumped              type of shoe, run-up, starting foot
time to run 5km                       chocolate/banana, coffee/beer, time before run (?)
distance cricket ball hit             weight of bat, wearing gloves (?)
life of rosebuds                      stem length, aspirin in water, refrigeration
number of corns popped/100            pot diameter, oil or margarine
casting distance                      rod length, line weight, sinker weight
(a) Pizza Experiment Data

Data Codes:
  Pizza Crust    Thin/Thick   0/1
  Coke           Yes/No       0/1
  Garlic Bread   Yes/No       0/1

Order   Crust   Coke   G.Bread   Driver   Hour    Time to Deliver
  1       0      1       1         M      20.87         14
  2       1      1       0         M      20.78         21
  3       0      0       0         M      20.75         18
  4       0      0       1         F      20.60         17
  5       1      0       0         M      20.70         19
  6       1      0       1         M      20.95         17
  7       0      1       0         F      21.08         19
  8       0      0       0         M      20.68         20
  9       0      1       0         F      20.62         16
 10       1      1       1         M      20.98         19
 11       0      0       1         M      20.78         18
 12       1      1       0         M      20.90         22
 13       1      0       1         M      20.97         19
 14       0      1       1         F      20.37         16
 15       1      0       0         M      20.52         20
 16       1      1       1         M      20.70         18

(b) Swimming Experiment Data

Data Codes:
  Wearing flippers      Yes/No   1/0
  Wearing goggles       Yes/No   1/0
  Wearing shirt         Yes/No   1/0
  Start from deep end   Yes/No   1/0

Order   Time    Shirt   Goggles   Flippers   End
  5     16.55     1        1         1        0
 12     17.22     1        1         1        1
 18     17.70     1        1         1        1
  9     21.53     1        1         0        0
 14     22.49     1        1         0        1
 17     22.50     1        1         0        0
  7     17.77     1        0         1        0
 11     17.43     1        0         1        0
 21     18.70     1        0         1        0
  2     23.78     1        0         0        1
 19     24.29     1        0         0        0
 22     24.89     1        0         0        1
  4     16.14     0        1         1        1
 20     16.39     0        1         1        1
 24     16.40     0        1         1        1
  1     19.97     0        1         0        0
  3     19.95     0        1         0        0
  6     20.32     0        1         0        1
 10     16.85     0        0         1        1
 13     17.80     0        0         1        0
 15     16.81     0        0         1        0
  8     22.63     0        0         0        1
 16     22.81     0        0         0        1
 23     22.31     0        0         0        0
A supplementary file contains descriptions and data for six additional student experiments.