Abstract Details

Activity Number: 629 - New Developments in Nonparametric and Semiparametric Statistics
Type: Contributed
Date/Time: Thursday, August 2, 2018 : 8:30 AM to 10:20 AM
Sponsor: Section on Nonparametric Statistics
Abstract #329606 Presentation
Title: Measuring Lexical Dispersion in Corpus Linguistics
Author(s): Brent Burch* and Jesse Egbert and Douglas Biber
Companies: Northern Arizona University and Northern Arizona University and Northern Arizona University
Keywords: Gries' DPnorm; Juilland's D; Word frequency lists
Abstract:

The frequency and the dispersion of a word are both measures of its importance in a collection of texts, or corpus. Lexical dispersion, in particular, is a statistic that measures how evenly a word is distributed across the parts of a corpus. Dispersion can be measured in different ways, and the authors compare three approaches, discussing both formulaic and interpretative issues. A simulation study and an example using words from the British National Corpus indicate that the preferred index is the one constructed from the differences between all possible pairs of the word's frequencies in the parts of the corpus.
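
The abstract gives no formulas, so the following Python sketch is only an illustration of how three such indices are commonly defined in the corpus-linguistics literature. It is not taken from the presentation. The function names, the toy counts, and the exact definitions are assumptions: Juilland's D as one minus the coefficient of variation divided by sqrt(n - 1), Gries' DPnorm as DP divided by one minus the smallest expected proportion, and a pairwise-difference index as one minus the mean absolute pairwise difference divided by twice the mean. Note that the scales run in opposite directions: for Juilland's D and the pairwise index, 1 means perfectly even, whereas for DPnorm, 0 means perfectly even.

```python
from itertools import combinations
from math import sqrt


def _rates(counts, sizes):
    # Normalized frequency of the word in each corpus part.
    return [c / s for c, s in zip(counts, sizes)]


def juilland_d(counts, sizes):
    # Assumed form: D = 1 - CV / sqrt(n - 1), with the population
    # standard deviation (divisor n) used in the coefficient of variation.
    rates = _rates(counts, sizes)
    n = len(rates)
    mean = sum(rates) / n
    sd = sqrt(sum((r - mean) ** 2 for r in rates) / n)
    return 1 - (sd / mean) / sqrt(n - 1)


def dp_norm(counts, sizes):
    # Assumed form: DP = 0.5 * sum |observed - expected| over parts,
    # normalized by 1 - min(expected) so the maximum is 1.
    total_count = sum(counts)
    total_size = sum(sizes)
    observed = [c / total_count for c in counts]
    expected = [s / total_size for s in sizes]
    dp = 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))
    return dp / (1 - min(expected))


def pairwise_index(counts, sizes):
    # Assumed form: 1 - (mean absolute difference over all pairs of
    # parts) / (2 * mean rate).
    rates = _rates(counts, sizes)
    n = len(rates)
    mean = sum(rates) / n
    n_pairs = n * (n - 1) / 2
    mean_abs_diff = sum(abs(a - b) for a, b in combinations(rates, 2)) / n_pairs
    return 1 - mean_abs_diff / (2 * mean)


if __name__ == "__main__":
    # Hypothetical toy data: four equal-sized parts of 1000 tokens each.
    sizes = [1000, 1000, 1000, 1000]
    for label, counts in [("even", [10, 10, 10, 10]),
                          ("skewed", [25, 10, 5, 0]),
                          ("concentrated", [40, 0, 0, 0])]:
        print(label,
              round(juilland_d(counts, sizes), 3),
              round(dp_norm(counts, sizes), 3),
              round(pairwise_index(counts, sizes), 3))
```

All three functions require at least two parts and at least one occurrence of the word. With equal-sized parts, the two limiting cases behave as expected under these definitions: an evenly spread word scores 1, 0, and 1 respectively, and a word confined to a single part scores 0, 1, and 0.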


Authors who are presenting talks have a * after their name.

Back to the full JSM 2018 program