  • Open Access
  • Article

MIRTracks: A Large-Scale Multi-Dimensional Multi-Track Music Dataset

  • Yuehan Lee *
  • Yi Qin *

Received: 02 Sep 2025 | Revised: 09 Oct 2025 | Accepted: 27 Oct 2025 | Published: 12 Nov 2025

Abstract

This paper presents MIRTracks, a large-scale dataset of 240 h of royalty-free multi-track audio, designed to address the limitations of traditional music source separation datasets, namely single-dimensional annotations and missing semantic information. By combining multi-dimensional musical annotations with a semi-automated annotation pipeline, MIRTracks provides high-quality semantic annotations across the rock, electronic, and pop genres. Experiments show that fine-tuning a small-scale model on the dataset improves beat detection accuracy from 66.2% to 80.1%, reaching 91.0% of the performance of large-scale models.
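As a quick sanity check of the reported figures (a back-of-the-envelope sketch, assuming "91.0% of the performance" denotes the ratio of the fine-tuned small model's beat detection accuracy to that of the large-scale reference model, which the abstract does not state explicitly), the implied large-model accuracy is:

% Assumption: 91.0% is an accuracy ratio, not an independently reported score.
\[
\frac{80.1\%}{A_{\text{large}}} \approx 0.910
\;\Longrightarrow\;
A_{\text{large}} \approx \frac{80.1\%}{0.910} \approx 88.0\%,
\]
an absolute gain of \(80.1\% - 66.2\% = 13.9\) percentage points over the non-fine-tuned small model.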

How to Cite
Lee, Y.; Qin, Y. MIRTracks: A Large-Scale Multi-Dimensional Multi-Track Music Dataset. Transactions on Artificial Intelligence 2025, 1 (1), 282–290. https://doi.org/10.53941/tai.2025.100019.
Copyright & License
Copyright (c) 2025 by the authors.