JSM 2011 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Activity Number:	468
Type:	Contributed
Date/Time:	Wednesday, August 3, 2011 : 8:30 AM to 10:20 AM
Sponsor:	Section on Nonparametric Statistics
Abstract - #301389
Title:	Usage of Distribution Extents in Predictive Modeling
Author(s):	Alex Zolotovitski*+
Companies:	Microsoft Corporation
Address:	1 MSFT Way, Redmond, WA, 98052,
Keywords:	distribution ; web analytics ; entropy ; predictive modeling
Abstract:	Distribution extents of order k for a sample {x_1, x_2, ..., x_n} of a non-negative stochastic variable X are defined as E_k = (sum_i p_i^k)^(1/(1-k)) for k != 1 and E_1 = exp(-sum_i p_i ln(p_i)) for k = 1, where p_i = x_i / sum_j x_j. They are useful measures of the "number of large values" in the sample. They were introduced by L.L. Campbell in 1964 and are a generalization of the inverse Herfindahl-Hirschman Index (HHI), a commonly accepted measure of market concentration in economics, and of Simpson's diversity index used in ecology; they are also closely related to the Shannon-Wiener index and to the Rényi entropy and divergence. In this work we describe general properties of E_k and demonstrate the advantages of using it in the analysis of a web advertising network, whose actors are advertisers, publishers, and users, for three purposes: 1) as cutoff parameters for representing the network as a graph, so that it can be visualized and analyzed with graph-theory methods; 2) as an independent variable in predictive modeling; and 3) as a criterion for optimizing some model parameters.
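
The definition of E_k above can be made concrete with a short Python sketch. This is an illustration of the formula only, not the author's implementation: the function name `distribution_extent` is hypothetical, and dropping zero-valued entries (so that 0 * ln(0) is treated as 0 and E_0 counts the positive values) is an assumption that the abstract does not state.

```python
import math


def distribution_extent(xs, k):
    """Distribution extent E_k of a sample of non-negative values.

    E_k = (sum_i p_i^k)^(1/(1-k)) for k != 1,
    E_1 = exp(-sum_i p_i ln(p_i)),
    with p_i = x_i / sum_j x_j.
    """
    if any(x < 0 for x in xs):
        raise ValueError("all sample values must be non-negative")
    total = sum(xs)
    if total <= 0:
        raise ValueError("the sample must contain at least one positive value")

    # Assumption: zero entries are dropped, so 0 * ln(0) contributes nothing
    # and E_0 equals the number of positive values.
    ps = [x / total for x in xs if x > 0]

    if k == 1:
        # Limit of the general formula as k -> 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in ps))
    return sum(p ** k for p in ps) ** (1.0 / (1.0 - k))


if __name__ == "__main__":
    sample = [3, 1]
    for order in (0, 1, 2):
        print(order, distribution_extent(sample, order))
```

For a sample of n equal values, every order k gives E_k = n, which matches the reading of E_k as a "number of large values". For k = 2 the result is the inverse of sum_i p_i^2, that is, the inverse HHI mentioned in the abstract.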

The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

For information, contact jsm@amstat.org or phone (888) 231-3473.