Activity Number: 629 - New Developments in Nonparametric and Semiparametric Statistics
Type: Contributed
Date/Time: Thursday, August 2, 2018 : 8:30 AM to 10:20 AM
Sponsor: Section on Nonparametric Statistics
Abstract #329606 | Presentation
Title: Measuring Lexical Dispersion in Corpus Linguistics
Author(s): Brent Burch* and Jesse Egbert and Douglas Biber
Companies: Northern Arizona University and Northern Arizona University and Northern Arizona University
Keywords: Gries' DPnorm; Juilland's D; Word frequency lists
Abstract:
A word's frequency and its dispersion are both measures of the word's importance in a collection of texts, or corpus. In particular, lexical dispersion is a statistic that measures how evenly a word is distributed across the parts of a corpus. Dispersion can be measured in different ways, and the authors compare three approaches. Both formulaic and interpretative issues pertaining to dispersion are discussed. A simulation study and an example using words from the British National Corpus indicate that the preferred index is the one constructed from the differences between every possible pair of the word's frequencies in the parts of a corpus.
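To make the comparison concrete, the following is a minimal Python sketch of three dispersion indices, not the authors' implementation. Juilland's D and Gries' DPnorm are named in the keywords and are written here in their commonly cited forms. The abstract does not give a formula for the pairwise-difference index, so the hypothetical `pairwise_index` below assumes one plausible normalization: one minus the mean absolute difference over all pairs of per-part rates, divided by twice the mean rate. The function names, the `counts` and `sizes` arguments, and the toy data are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def rates(counts, sizes):
    """Per-part relative frequency: occurrences divided by part size."""
    return [c / s for c, s in zip(counts, sizes)]


def juilland_d(counts, sizes):
    """Juilland's D = 1 - V / sqrt(n - 1), with V the coefficient of
    variation (population sd / mean) of the per-part rates.
    Assumes n >= 2 parts and at least one occurrence of the word."""
    r = rates(counts, sizes)
    n = len(r)
    mean = sum(r) / n
    sd = sqrt(sum((x - mean) ** 2 for x in r) / n)
    return 1 - (sd / mean) / sqrt(n - 1)


def gries_dp_norm(counts, sizes):
    """Gries' DPnorm = DP / (1 - min expected share), where
    DP = 0.5 * sum |observed share - expected share|.
    Note the orientation: 0 means perfectly even, 1 maximally uneven."""
    total_count = sum(counts)
    total_size = sum(sizes)
    observed = [c / total_count for c in counts]
    expected = [s / total_size for s in sizes]
    dp = 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))
    return dp / (1 - min(expected))


def pairwise_index(counts, sizes):
    """Hypothetical pairwise-difference index (normalization is an
    assumption): 1 - (mean |r_i - r_j| over all pairs) / (2 * mean rate).
    Assumes n >= 2 parts and at least one occurrence of the word."""
    r = rates(counts, sizes)
    n = len(r)
    mean = sum(r) / n
    n_pairs = n * (n - 1) / 2
    mean_abs_diff = sum(abs(a - b) for a, b in combinations(r, 2)) / n_pairs
    return 1 - mean_abs_diff / (2 * mean)


# Toy data (made up for illustration): four equal-sized corpus parts.
sizes = [1000, 1000, 1000, 1000]
even = [10, 10, 10, 10]          # word spread evenly across parts
concentrated = [40, 0, 0, 0]     # all occurrences in a single part

for label, counts in (("even", even), ("concentrated", concentrated)):
    print(label,
          round(juilland_d(counts, sizes), 3),
          round(gries_dp_norm(counts, sizes), 3),
          round(pairwise_index(counts, sizes), 3))
```

Under these assumptions, Juilland's D and the pairwise index run from 0 (all occurrences in one part) to 1 (perfectly even), while DPnorm runs in the opposite direction, which is one of the interpretative differences a comparison has to account for.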
Authors who are presenting talks have a * after their name.