Online Program

Friday, May 18
Computing Science
Distinguished Colleagues of Edward Wegman: Modeling and Data Science
Fri, May 18, 5:15 PM - 6:15 PM
Grand Ballroom D
 

The Revival of Statistical Ranking Methods in The High Technology and Big Data Era: Some Recent Developments (304682)

*Michael G. Schimek, Medical University of Graz 

Keywords: ranking data, top-k lists, indirect inference, bootstrap, n<<p

Most methods for the statistical analysis of rank data date from the pre- or early computer age. Such methods have the advantage of being invariant to transformation and normalization as long as the relative orderings are preserved. Moreover, they are robust to outliers, although some information is inevitably lost compared to metric approaches. The data challenges posed by new technologies in the 21st century have revived interest in rank data. Such data arise, for instance, in genomic sequencing, in Web analytics, in personalized marketing, or in fusion technologies. For convenience, metric-scale data are often transformed into rankings. Pre- or early computer age statistical methods were designed for n>>p problems, comprising a reasonable but never huge sample size n and, at the same time, a rather small number of parameters p. But now we have to cope with n<<p problems.
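The invariance and robustness properties mentioned above can be illustrated with a minimal sketch. The scores, the helper name `ranks`, and the choice of transformations are assumptions made for illustration only; they do not come from the talk.

```python
import math
import statistics


def ranks(values):
    """Return 1-based ranks (1 = largest value); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


# Hypothetical metric-scale scores for five objects.
scores = [2.5, 140.0, 0.3, 17.0, 5.1]

# A strictly increasing transformation (log) preserves the ordering.
log_scores = [math.log(s) for s in scores]

# Normalization (centering and positive rescaling) also preserves it.
mean = statistics.fmean(scores)
sd = statistics.stdev(scores)
standardized = [(s - mean) / sd for s in scores]

# Inflating the largest score into an extreme outlier leaves the ranks unchanged.
with_outlier = [2.5, 1.0e6, 0.3, 17.0, 5.1]

base = ranks(scores)
print(base)
print(base == ranks(log_scores) == ranks(standardized) == ranks(with_outlier))
```

The ranks stay the same in all four cases, which is also why information is lost: the size of the gap between 17.0 and 140.0 is no longer visible once only the ordering is kept.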

Let us assume a set of distinct objects of arbitrary size ordered by different ranking mechanisms, resulting in multiple ranked lists. However, only a comparatively small subset of top-ranked objects is informative, characterized by a strong overlap of rank positions across lists. A recent approach for the identification of those top-ranked objects is described [Hall and Schimek, 2012, JASA 107, 661-672]. An innovative graphical tool for the visualization of the obtained truncated lists and of their overlap characteristics is considered [Schimek et al., 2015, SAGMB 14, 311-316]. In many applications the underlying signals or decision processes that informed the observed rankings are unknown. Therefore, signal reconstruction from rankings is a rewarding task. A novel indirect inference distribution function approach is introduced [Svendova and Schimek, 2017, CSDA 115, 122-135]. Bootstrap standard errors can be derived for the generic signal estimates of the objects. The obtained estimates can be easily converted into consolidated ranks.
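The last two steps, bootstrap standard errors for per-object estimates and their conversion into consolidated ranks, can be sketched generically as follows. This is not the indirect inference approach of Svendova and Schimek: as a stand-in, the per-object estimate here is simply the mean rank across lists, and the bootstrap resamples whole lists with replacement. The data and all function names are hypothetical.

```python
import random
import statistics


def mean_ranks(rank_lists):
    """Stand-in estimate: mean rank of each object across the lists."""
    n_objects = len(rank_lists[0])
    return [statistics.fmean(lst[j] for lst in rank_lists) for j in range(n_objects)]


def bootstrap_se(rank_lists, n_boot=500, seed=0):
    """Bootstrap standard error of each mean rank, resampling lists."""
    rng = random.Random(seed)
    n_lists = len(rank_lists)
    replicates = []
    for _ in range(n_boot):
        sample = [rank_lists[rng.randrange(n_lists)] for _ in range(n_lists)]
        replicates.append(mean_ranks(sample))
    # One standard deviation per object, taken over the bootstrap replicates.
    return [statistics.stdev(column) for column in zip(*replicates)]


def consolidated_ranks(estimates):
    """Convert estimates to ranks (1 = smallest mean rank); assumes no ties."""
    order = sorted(range(len(estimates)), key=lambda j: estimates[j])
    result = [0] * len(estimates)
    for position, j in enumerate(order, start=1):
        result[j] = position
    return result


# Hypothetical input: four rankers, five objects; entry j is the rank of object j.
rank_lists = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 5, 4],
    [2, 1, 3, 4, 5],
    [1, 2, 4, 3, 5],
]

estimates = mean_ranks(rank_lists)
standard_errors = bootstrap_se(rank_lists)
final_ranks = consolidated_ranks(estimates)
print(estimates, standard_errors, final_ranks)
```

With only four lists the bootstrap standard errors are crude; the sketch is meant to show the order of operations (estimate, resample, rank), not a recommended sample size.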