- Publisher: The Acoustical Society of Korea
- Publisher (Ko): 한국음향학회
- Journal Title: The Journal of the Acoustical Society of Korea
- Journal Title (Ko): 한국음향학회지
- Volume: 45
- No: 3
- Pages: 313-322
- Received Date: 2026-03-23
- Accepted Date: 2026-05-06
- DOI: https://doi.org/10.7776/ASK.2026.45.3.313


