Research Article
- Publisher: The Acoustical Society of Korea
- Publisher (Ko): 한국음향학회
- Journal Title: The Journal of the Acoustical Society of Korea
- Journal Title (Ko): 한국음향학회지
- Volume: 44
- No: 5
- Pages: 516-523
- Received Date: 2025-08-06
- Revised Date: 2025-08-29
- Accepted Date: 2025-08-30
- DOI: https://doi.org/10.7776/ASK.2025.44.5.516


