Music acoustic features: Do machine predictions correspond with human judgments?
Poster presented at NeuroMusic 18, McMaster University, Hamilton, Canada.
Presenter Name: Maya Flannery
School/Affiliation: McMaster University
Co-Authors: Matthew Woolhouse
Abstract
Researchers’ methods of music description and classification have long been criticized. Musical genre, for example, maintains little consistency between category definitions and has consequently been called intrinsically ambiguous. Recent methods have approached music classification differently: in terms of the structural and expressive musical cues used by composers and performers. This approach allows for more consistently defined music characteristics and, as a result, for stimuli to be reliably produced and manipulated in experiments. The present study investigated the effectiveness and potential benefits of such an approach. First, a number of machine learning algorithms were trained to predict levels of six musical features (i.e., articulation, dynamic, register, tempo, texture, and timbre) from the output of a music information retrieval tool named Essentia. We refer to these features as Music Acoustic Features (MAFs). Optimal algorithms were then used to predict levels of MAFs in 44 real-world musical excerpts. Finally, in a listening task, participants (N = 43) provided ratings for the same six MAFs and excerpts. The results of each method, machine predictions and human judgments, were then compared for their consistency. Significant correlations were found between the levels of MAFs predicted by both methods. The procedure outlined here showed that MAFs can be reliably produced and manipulated, effectively measured within audio stimuli, and are readily perceived by listeners. MAFs can thus be effectively applied in music research as a reliable way to develop experiments with well-defined musical stimuli. Furthermore, since MAFs can be identified within existing audio, MAFs can enrich previous research by clarifying ambiguous results with clear and consistent descriptions of music.
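The comparison step described above, setting machine-predicted MAF levels against listener ratings for the same excerpts, can be illustrated with a minimal sketch. The abstract does not state which correlation statistic was used, how ratings were aggregated across participants, or how the data were stored, so the choice of Pearson's r, the averaging of ratings per excerpt, the function names, and the dictionary layout are all assumptions made for illustration. The numbers are placeholders, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mean_ratings(ratings_by_participant):
    """Average each excerpt's rating across participants.

    Input is one list of per-excerpt ratings per participant
    (an assumed layout, not one described in the abstract).
    """
    n = len(ratings_by_participant)
    return [sum(column) / n for column in zip(*ratings_by_participant)]


def compare_methods(machine, human):
    """Correlate machine predictions with mean human ratings, per feature.

    machine: {feature: [predicted level per excerpt]}
    human:   {feature: [[rating per excerpt] for each participant]}
    """
    return {
        feature: pearson(levels, mean_ratings(human[feature]))
        for feature, levels in machine.items()
    }


# Placeholder values only: one feature, four excerpts, two participants.
machine_levels = {"tempo": [1.0, 2.0, 3.0, 4.0]}
human_ratings = {"tempo": [[1, 2, 3, 5], [1, 2, 4, 4]]}

for feature, r in compare_methods(machine_levels, human_ratings).items():
    print(f"{feature}: r = {r:.3f}")
```

In a full analysis the same function would be given all six MAFs across the 44 excerpts, and a rank-based statistic such as Spearman's rho might be preferable if the ratings are ordinal.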