NAME: Modeling Rare Baseball Events -- Are They Memoryless?

TYPE: Census

SIZE:
no-hitters file: 206 observations, 7 variables
cycles file: 225 observations, 6 variables
triple plays file: 511 observations, 16 variables

DESCRIPTIVE ABSTRACT:
Three sets of rare baseball events (pitching a no-hit game, hitting for the cycle, and turning a triple play) offer excellent examples of events whose occurrence may be modeled as Poisson processes. That is, the time at which one of these events occurs does not affect the waiting time until the next occurrence. We modeled occurrences of these three events in Major League Baseball using data from 1901 through 2004, with a refinement for six commonly accepted baseball eras within this period. Model assessment was done primarily with goodness-of-fit analyses on inter-arrival data.

DATA SOURCES:
www.Baseball-Almanac.com/feats/feats16d.shtml
www.Retrosheet.org
tripleplays.sabr.org/tp_sum.htm

DATASETS LAYOUT: Spreadsheet in MS Excel

cycles.xls:

Variable             Description
Year                 Major League Baseball season
Date                 Day and month
Player               Player who hit for the cycle
Game                 Game, that Year, in which the cycle was hit
Inter-arrival Time   Games since the previous cycle (possibly in a prior Year)
Games in Season      Total number of games in that Year

PEDAGOGICAL NOTES:
Each game in which one of our events occurred was treated as the first game of its day, unless it was the second game of a double-header, in which case it was treated as the last game of the day. The inter-arrival data are modeled with an exponential distribution, and goodness-of-fit tests (chi-squared and Anderson-Darling) are conducted. The data are examined for exponentiality both as a whole and in subsets.

REFERENCES:
D’Agostino, R. B. and Stephens, M. A. (1986), Goodness-of-Fit Techniques, New York, NY: Marcel Dekker, Inc.

Devore, J. L. (1995), Probability and Statistics for Engineering and the Sciences, Fourth Edition, New York, NY: Duxbury Press.

Siwoff, S. (2004), The Book of Baseball Records, 2004 Edition, New York, NY: Seymour Siwoff, Elias Sports Bureau, Inc.

Potthoff, R. F. and Whittinghill, M. (1966), “Testing for Homogeneity: II. The Poisson Distribution”, Biometrika, Volume 53(1/2), pp. 183-190.

Internet site for no-hit game definition: mlb.mlb.com/NASApp/mlb/mlb/official_info/official_rules/foreword.jsp, accessed 11 Feb 2006.

Internet site for baseball eras: www.netshrine.com/era.html, accessed 10 Feb 2006.

SUBMITTED BY:
Michael Huber
Department of Mathematical Sciences
Muhlenberg College
Allentown, PA 18104 U.S.A.
huber@muhlenberg.edu

Andrew Glen
Department of Mathematical Sciences
United States Military Academy
West Point, NY 10996 U.S.A.
andrew.glen@usma.edu
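
ILLUSTRATIVE SKETCH (not part of the original submission):
The pedagogical notes describe fitting an exponential distribution to inter-arrival times and checking the fit with chi-squared and Anderson-Darling tests. The Python sketch below shows one way such a check might be set up. It is an illustration under stated assumptions, not the authors' procedure. The function names are hypothetical, the equal-probability binning is an assumed choice, the demonstration data are simulated rather than taken from the spreadsheets, and inter-arrival counts in games are treated as continuous and strictly positive. The Anderson-Darling function returns only the statistic, which would still need to be compared against published critical values for the exponential case with an estimated mean.

```python
import math
import random


def fit_exponential_mean(gaps):
    """Maximum-likelihood estimate of the exponential mean (the sample mean)."""
    return sum(gaps) / len(gaps)


def chi_squared_gof(gaps, n_bins=10):
    """Chi-squared test of a fitted exponential using equal-probability bins.

    Returns (statistic, degrees_of_freedom, p_value). n_bins must be even so
    that dof = n_bins - 2 is even and the p-value has a simple closed form.
    """
    if n_bins < 4 or n_bins % 2:
        raise ValueError("n_bins must be an even number >= 4")
    n = len(gaps)
    mean = fit_exponential_mean(gaps)
    observed = [0] * n_bins
    for g in gaps:
        cdf = 1.0 - math.exp(-g / mean)  # fitted exponential CDF at g
        observed[min(int(n_bins * cdf), n_bins - 1)] += 1
    expected = n / n_bins  # same expected count in every bin
    stat = sum((o - expected) ** 2 / expected for o in observed)
    dof = n_bins - 2  # bins - 1, minus one estimated parameter
    half, m = stat / 2.0, dof // 2
    # For even dof: P(chi2_dof > stat) = sum_{j < m} exp(-half) * half**j / j!
    p_value = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(m))
    return stat, dof, p_value


def anderson_darling_statistic(gaps):
    """Anderson-Darling A^2 against the fitted exponential (gaps must be > 0)."""
    n = len(gaps)
    mean = fit_exponential_mean(gaps)
    xs = sorted(gaps)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        log_cdf = math.log(-math.expm1(-x / mean))  # ln F(x_(i))
        log_sf = -xs[n - i] / mean                  # ln(1 - F(x_(n+1-i)))
        total += (2 * i - 1) * (log_cdf + log_sf)
    return -n - total / n


if __name__ == "__main__":
    # Simulated inter-arrival times only; NOT the values in cycles.xls.
    random.seed(2004)
    simulated = [random.expovariate(1 / 50) for _ in range(225)]
    stat, dof, p = chi_squared_gof(simulated)
    print("chi-squared:", stat, "dof:", dof, "p-value:", p)
    print("Anderson-Darling A^2:", anderson_darling_statistic(simulated))
```

To apply the sketch to the actual data, the Inter-arrival Time column of cycles.xls would be read into a list and passed to both functions, either as a whole or restricted to a subset of Years.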