Paper Detail

Towards an 'Everything Corpus': A Framework and Guidelines for the Curation of More Comprehensive Multimodal Music Data

Paper ID: https://openalex.org/W44100915032025

Citations: 2

core

Source

Transactions of the International Society for Music Information Retrieval

Slug: tismir

Abstract

Music information retrieval (MIR) is increasingly concerned with properly managing the complexity of musical data and with curating high-quality multimodal datasets for use in a variety of computational tasks. This article presents (1) a conceptual framework for how practitioners interested in MIR, from musicians to scientists, can understand the multitude of modalities that constitute musical data and (2) a set of proposed guidelines for MIR researchers to consider when setting out to curate comprehensive, well-targeted, durable, and ethically sourced multimodal datasets. For (1), we identify 12 different themes of musical data, divided into three sequential phases that are further subdivided into five narrow focus areas: (i) 'before' the music (leading to), (ii) the 'actual' music (itself and around it), and (iii) 'after' the music (uses of and responses to). For (2), we identify 17 specific quantitative, qualitative, and ethical criteria, informed by this conceptual framework and by practices observed in existing multimodal datasets, for the eventual construction of an 'Everything Corpus' for MIR research.

Authors

  • Mark Gotham
  • Brian Bemman
  • Igor Vatolkin

Topics

  • Music and Audio Processing
  • Diverse Musicological Studies
  • Natural Language Processing Techniques
