Temporal event clustering in speech versus music
Abstract
Both speech and music can be organized as hierarchical, nested groupings of units. In speech, for instance, phonemes can group to form syllables, which group to form words, which group to form sentences, and so on. In music, notes can group to form phrases, which group to form chord progressions, which group to form verses, and so on. We present a new method for extracting events (amplitude peaks in Hilbert envelopes of filter banks) from speech and music recordings, and for quantifying the degree of nesting in temporal clusters of events across timescales (using Allan Factor analysis). We apply this method to monologue recordings of speech (TED talks) and to solo musical performances of similar lengths. We find that both types of recordings exhibit nested clustering, revealing similar organizational principles, but that clustering is more pronounced at shorter timescales (milliseconds) for speech and at longer timescales (seconds and above) for music.
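The pipeline described in the abstract can be sketched in two steps: extract events as peaks of a Hilbert amplitude envelope, then measure clustering across timescales with the Allan Factor, AF(T) = E[(N_{i+1} − N_i)²] / (2·E[N]), where N_i counts events in consecutive windows of duration T. The sketch below is illustrative only; the function names, parameters, and the single-band envelope (the authors use filter banks) are assumptions, not the authors' code.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks

def extract_events(signal, fs, min_separation_s=0.01):
    """Return event times in seconds: peaks of the Hilbert amplitude envelope.
    min_separation_s is a hypothetical de-duplication threshold."""
    envelope = np.abs(hilbert(signal))          # analytic-signal magnitude
    distance = max(1, int(min_separation_s * fs))
    peaks, _ = find_peaks(envelope, distance=distance)
    return peaks / fs

def allan_factor(event_times, T, duration):
    """Allan Factor at timescale T:
    AF(T) = mean((N_{i+1} - N_i)^2) / (2 * mean(N_i)),
    where N_i are event counts in consecutive windows of length T."""
    n_windows = int(duration // T)
    counts, _ = np.histogram(event_times, bins=n_windows,
                             range=(0, n_windows * T))
    diffs = np.diff(counts)
    return np.mean(diffs ** 2) / (2 * np.mean(counts))
```

For a homogeneous Poisson process, AF(T) stays near 1 at all timescales; nested clustering of the kind reported here shows up as AF(T) growing with T, with the growth concentrated at different timescales for speech versus music.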