All Times EDT

Abstract Details

Activity Number: 74 - Text Analysis in Machine Learning and Statistical Models
Type: Contributed
Date/Time: Monday, August 3, 2020 : 10:00 AM to 2:00 PM
Sponsor: Section on Statistics in Defense and National Security
Abstract #313525
Title: Naive Dictionary on Musical Corpora: From Knowledge Representation to Pattern Recognition
Author(s): Qiuyi Wu* and Ernest Fokoue
Companies: University of Rochester and Rochester Institute of Technology and SAMSI
Keywords: text analysis; topic modeling; music analysis; muselets; Naive Dictionary
Abstract:

In this paper, we propose and develop the novel idea of treating sheet music as literary documents in the sense of traditional text analytics, so as to benefit fully from the large body of existing research in statistical text mining and topic modelling. Specifically, we represent any given piece of music as a collection of "musical words" of various lengths, which we call "muselets". Because the idea is new and forming a complete dictionary of muselets is therefore extremely difficult, the present paper focuses on a simpler, albeit naive, version of the ultimate dictionary, which we refer to as a Naive Dictionary because all of its words have the same length. We construct a Naive Dictionary from a corpus of African American, Chinese, Japanese and Arabic music, on which we perform both topic modelling and pattern recognition.
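
As a rough illustration of the fixed-length idea, the sketch below cuts a sequence of note names into equal-length words and counts them per piece, giving the kind of document-term matrix a topic model could consume. It is not the authors' procedure: the note encoding, the word length of 4, the non-overlapping segmentation, and the function names `to_muselets`, `build_dictionary`, and `bag_of_muselets` are all assumptions made for this example, and the two toy pieces are invented.

```python
from collections import Counter

def to_muselets(notes, length=4):
    """Split a note sequence into consecutive, non-overlapping words of equal length.

    Assumption: a trailing fragment shorter than `length` is dropped.
    """
    usable = len(notes) - len(notes) % length
    return ["-".join(notes[i:i + length]) for i in range(0, usable, length)]

def build_dictionary(corpus, length=4):
    """Map each distinct fixed-length word found in the corpus to an integer index."""
    vocab = sorted({w for piece in corpus for w in to_muselets(piece, length)})
    return {word: idx for idx, word in enumerate(vocab)}

def bag_of_muselets(notes, dictionary, length=4):
    """Return the count vector of one piece over the dictionary's words."""
    counts = Counter(to_muselets(notes, length))
    return [counts.get(word, 0) for word in dictionary]

# Hypothetical toy pieces, encoded as note names (not real data).
piece_a = ["C4", "E4", "G4", "C5", "C4", "E4", "G4", "C5", "A4"]
piece_b = ["D4", "F4", "A4", "D5", "C4", "E4", "G4", "C5"]
corpus = [piece_a, piece_b]

dictionary = build_dictionary(corpus)
matrix = [bag_of_muselets(piece, dictionary) for piece in corpus]
print(dictionary)
print(matrix)
```

Each row of `matrix` plays the role of one document and each column one dictionary word, so the result could in principle be passed to any standard topic-modelling or classification routine.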


Authors who are presenting talks have a * after their name.

Back to the full JSM 2020 program