All Times EDT

Abstract Details

Activity Number: 145 - Statistics of Social Media
Type: Invited
Date/Time: Tuesday, August 4, 2020 : 10:00 AM to 11:50 AM
Sponsor: Section on Statistical Computing
Abstract #308134
Title: What Authors Reveal of Themselves in Internet Discussions?
Author(s): Juha Alho*
Companies: University of Helsinki
Keywords: social media; data quality; emoticons; correspondence analysis; profiling

We present analyses of a general-topic discussion forum that has been in operation in Finland since 2001. Posts to the forum may be prompted by external events, or they may be endemic in nature. Beyond the biases one expects from self-selected participation, cyber attacks and data management errors occur, and these influence the statistical characteristics of the data. Yet the data reveal stable "rhythms" of posting, for example by hour of day, weekday, and topic area.

The posts have been parsed and lemmatized, so that elements such as emoticons [for example :) ;) :( ] can be located. Some authors use them to add emotional metatext. We show how posts with emoticons differ from the bulk of the posts in terms of their rhythms. Correspondence analysis is used to study the association between emoticons and topics. The stability of the associations over time is demonstrated, and time trends are studied.
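As a rough illustration of the steps described above, the following minimal sketch locates emoticons in post text, tabulates them by topic, and computes the standardized residuals of the resulting emoticon-by-topic table. Up to a constant factor, these residuals form the matrix that correspondence analysis decomposes. The pattern, the function names, and the example posts are all hypothetical and are not taken from the study; the actual parsing and analysis pipeline is not described in the abstract.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical pattern: matches only the three emoticons named in the abstract.
EMOTICON_RE = re.compile(r":\)|;\)|:\(")


def find_emoticons(text):
    """Return every emoticon matched in a post, in order of appearance."""
    return EMOTICON_RE.findall(text)


def contingency(posts):
    """Count emoticon occurrences per (emoticon, topic) pair.

    `posts` is an iterable of (topic, text) tuples.
    """
    counts = Counter()
    for topic, text in posts:
        for emo in find_emoticons(text):
            counts[(emo, topic)] += 1
    return counts


def standardized_residuals(counts):
    """Return (observed - expected) / sqrt(expected) for every cell.

    Expected counts assume independence of emoticon and topic. Dividing
    each value by sqrt(total) gives the matrix whose singular value
    decomposition yields the correspondence analysis axes.
    """
    total = sum(counts.values())
    row, col = Counter(), Counter()
    for (emo, topic), n in counts.items():
        row[emo] += n
        col[topic] += n
    resid = {}
    for emo in row:
        for topic in col:
            expected = row[emo] * col[topic] / total
            observed = counts.get((emo, topic), 0)
            resid[(emo, topic)] = (observed - expected) / sqrt(expected)
    return resid


if __name__ == "__main__":
    # Invented example posts, for illustration only.
    posts = [
        ("Sports", "Great match :) :)"),
        ("Sports", "We lost :("),
        ("Politics", "Sure ;)"),
        ("Politics", "Right :("),
    ]
    for cell, value in sorted(standardized_residuals(contingency(posts)).items()):
        print(cell, round(value, 3))
```

A positive residual marks an emoticon that appears in a topic more often than independence would predict, and a negative one marks an emoticon that appears less often. A full correspondence analysis would go on to decompose this matrix, which is left out here to keep the sketch within the standard library.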

Emoticon use provides indirect evidence of the linguistic background of the otherwise anonymous authors. Examples from Sports-related posts indicate how a closer reading of the posts can complement the statistical analyses.

Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program