JSM 2011 Online Program

The views expressed here are those of the individual authors and not necessarily those of the JSM sponsors, their officers, or their staff.

Abstract Details

Activity Number: 468
Type: Contributed
Date/Time: Wednesday, August 3, 2011 : 8:30 AM to 10:20 AM
Sponsor: Section on Nonparametric Statistics
Abstract - #301389
Title: Usage of Distribution Extents in Predictive Modeling
Author(s): Alex Zolotovitski*+
Companies: Microsoft Corporation
Address: 1 MSFT Way, Redmond, WA, 98052
Keywords: distribution ; web analytics ; entropy ; predictive modeling
Abstract:

Distribution extents of order k for a sample {x_1, x_2, ..., x_n} of a non-negative stochastic variable X are defined as

E_k = (sum_i p_i^k)^(1/(1-k)) if k != 1,
E_1 = exp(-sum_i p_i ln(p_i)) if k = 1,

where p_i = x_i / sum_j x_j,

and are useful measures of "a number of large values" in the sample.
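As an illustration of the definition above, the following is a minimal Python sketch, not the author's implementation. The function name distribution_extent is hypothetical, and dropping zero values before computing the shares p_i (so that 0 ln 0 is treated as 0) is an assumption of this sketch.

```python
import math


def distribution_extent(xs, k):
    """Distribution extent E_k of a sample of non-negative values.

    Hypothetical helper for illustration only. Zero values are dropped
    (an assumption), which treats 0 * ln(0) as 0 in the k = 1 case.
    """
    if any(x < 0 for x in xs):
        raise ValueError("all values must be non-negative")
    total = sum(xs)
    if total <= 0:
        raise ValueError("at least one value must be positive")

    # Shares p_i = x_i / sum(x_j), restricted to positive values.
    ps = [x / total for x in xs if x > 0]

    if k == 1:
        # E_1 = exp(-sum(p_i ln(p_i)))
        return math.exp(-sum(p * math.log(p) for p in ps))

    # E_k = (sum(p_i^k))^(1/(1-k)) for k != 1
    return sum(p ** k for p in ps) ** (1.0 / (1.0 - k))
```

For n equal values the sketch returns n for every order k, and for a sample with a single non-zero value it returns 1, which matches the reading of E_k as "a number of large values". For the sample [3, 1] and k = 2 the result is 1.6, between those two extremes.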

They were introduced by L.L. Campbell in 1964, and are a generalization of the inverse Herfindahl-Hirschman Index (HHI), a commonly accepted measure of market concentration in economics, and of Simpson's diversity index used in ecology. They are also closely related to the Shannon-Wiener Index and to the Rényi entropy and divergence.
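These relationships can be written out explicitly. The block below is a sketch using the standard textbook definitions of the Rényi entropy H_k and of the HHI with shares expressed as fractions; the symbols H_k and HHI do not appear in the definition above and are introduced here only for illustration.

```latex
% Rényi entropy of order k, and E_k as its exponential
H_k = \frac{1}{1-k} \ln \sum_i p_i^k ,
\qquad
E_k = \exp(H_k) = \Bigl( \sum_i p_i^k \Bigr)^{1/(1-k)} .

% k = 2: inverse HHI (equivalently, the inverse Simpson index)
E_2 = \frac{1}{\sum_i p_i^2} = \frac{1}{\mathrm{HHI}} .

% k -> 1: exponential of the Shannon entropy
E_1 = \lim_{k \to 1} E_k = \exp\Bigl( -\sum_i p_i \ln p_i \Bigr) .
```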

In this work we describe general properties of E_k and demonstrate the advantages of using it in the analysis of a web advertising network, where the actors are advertisers, publishers, and users, for three purposes: 1) as a cut-off parameter for representing the network as a graph, both to visualize the network and to apply graph theory methods; 2) as an independent variable in predictive modeling; and 3) as a criterion for optimizing some model parameters.


The address information is for the authors that have a + after their name.
Authors who are presenting talks have a * after their name.

