2025 Vol. 44, Issue 3

Research Article

31 May 2025. pp. 270-280
References
1

F. Schmid, P. Primus, T. Heittola, A. Mesaros, I. Martín-Morató, K. Koutini, and G. Widmer, "Data-efficient low-complexity acoustic scene classification in the DCASE 2024 challenge," arXiv preprint, arXiv:2405.10018 (2024).

2

N. Turpault, R. Serizel, A. P. Shah, and J. Salamon, "Sound event detection in domestic environments with weakly labeled data and soundscape synthesis," Proc. DCASE, 253-257 (2019).

10.33682/006b-jx26
3

N. Harada, D. Niizumi, Y. Ohishi, D. Takeuchi, and M. Yasuda, "First-shot anomaly sound detection for machine condition monitoring: A domain generalization baseline," Proc. EUSIPCO, 191-195 (2023).

10.23919/EUSIPCO58844.2023.10289721
4

S. Cheng, C. Wang, K. Yue, R. Li, F. Shen, W. Shuai, W. Li, and L. Dai, "Automated sleep apnea detection in snoring signal using long short-term memory neural networks," Biomed. Signal Process. Control. 71, 103238 (2022).

10.1016/j.bspc.2021.103238
5

S. K. Ghosh, R. N. Ponnalagu, R. K. Tripathy, G. Panda, and R. B. Pachori, "Automated heart sound activity detection from PCG signal using time-frequency-domain deep neural network," IEEE Trans. Instrum. Meas. 71, 1-10 (2022).

10.1109/TIM.2022.3192257
6

Y. Cai, S. Li, and X. Shao, "Leveraging self-supervised audio representations for data-efficient acoustic scene classification," Proc. DCASE, 21-25 (2024).

7

D. Nadrchal, A. Rostamza, and P. Schilcher, "Data-efficient acoustic scene classification with pre-training, Bayesian ensemble averaging, and extensive augmentations," Proc. DCASE, 91-95 (2024).

8

D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange, and M. D. Plumbley, "Detection and classification of acoustic scenes and events," IEEE Trans. Multimedia. 17, 1733-1746 (2015).

10.1109/TMM.2015.2428998
9

H. Eghbal-Zadeh, B. Lehner, M. Dorfer, and G. Widmer, "CP-JKU submissions for DCASE-2016: A hybrid approach using binaural i-vectors and deep convolutional neural networks," DCASE, Tech. Rep., 2016.

10

K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint, arXiv:1409.1556 (2014).

11

Y. Sakashita and M. Aono, "Acoustic scene classification by ensemble of spectrograms based on adaptive temporal divisions," DCASE, Tech. Rep., 2018.

12

C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," Proc. AAAI, 4278-4284 (2017).

10.1609/aaai.v31i1.11231
13

H. Chen, Z. Liu, Z. Liu, P. Zhang, and Y. Yan, "Integrating the data augmentation scheme with various classifiers for acoustic scene modeling," arXiv preprint, arXiv:1907.06639 (2019).

14

K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," Proc. CVPR, 770-778 (2016).

10.1109/CVPR.2016.90
15

S. Suh, S. Park, Y. Jeong, and T. Lee, "Designing acoustic scene classification models with CNN variants," DCASE, Tech. Rep., 2020.

16

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention is all you need," Proc. NIPS, 5998-6008 (2017).

17

T. Heittola, A. Mesaros, and T. Virtanen, "Acoustic scene classification in DCASE 2020 challenge: Generalization across devices and low complexity solutions," Proc. DCASE, 56-60 (2020).

18

A. Mesaros, T. Heittola, and T. Virtanen, "A multi-device dataset for urban acoustic scene classification," arXiv preprint, arXiv:1807.09840 (2018).

19

G. Dekkers, S. Lauwereins, B. Thoen, M. W. Adhana, H. Brouckxon, B. van den Bergh, T. van Waterschoot, B. Vanrumste, M. Verhelst, and P. Karsmakers, "The SINS database for detection of daily activities in a home environment using an acoustic sensor network," Proc. DCASE, 1-5 (2017).

20

J. Salamon, C. Jacoby, and J. P. Bello, "A dataset and taxonomy for urban sound research," Proc. ACM, 1041-1044 (2014).

10.1145/2647868.2655045
21

J.-W. Jung, H.-S. Heo, H.-J. Shim, and H.-J. Yu, "Knowledge distillation in acoustic scene classification," IEEE Access. 8, 166870-166879 (2020).

10.1109/ACCESS.2020.3021711
22

S. Takeyama, T. Komatsu, K. Miyazaki, M. Togami, and S. Ono, "Robust acoustic scene classification to multiple devices using maximum classifier discrepancy and knowledge distillation," Proc. EUSIPCO, 36-40 (2021).

10.23919/Eusipco47968.2020.9287734
23

A. M. Tripathi and O. J. Pandey, "Divide and distill: New outlooks on knowledge distillation for environmental sound classification," IEEE Trans. Audio Speech Lang. Process. 31, 1100-1113 (2023).

10.1109/TASLP.2023.3244507
24

H. Dinkel, Y. Wang, Z. Yan, J. Zhang, and Y. Wang, "CED: Consistent ensemble distillation for audio tagging," Proc. IEEE ICASSP, 291-295 (2024).

10.1109/ICASSP48485.2024.10446348
25

B. Han, W. Huang, Z. Chen, A. Jiang, P. Fan, C. Lu, Z. Lv, J. Liu, W.-Q. Zhang, and Y. Qian, "Data-efficient low-complexity acoustic scene classification via distilling and progressive pruning," arXiv preprint, arXiv:2410.20775 (2024).

26

Y. Cai, S. Li, and X. Shao, "Leveraging self-supervised audio representations for data-efficient acoustic scene classification," Proc. DCASE, 21-25 (2024).

27

W. Chen, Y. Liang, Z. Ma, Z. Zheng, and X. Chen, "EAT: Self-supervised pre-training with efficient audio transformer," arXiv preprint, arXiv:2401.03497 (2024).

10.24963/ijcai.2024/421 PMC11597076
28

S. Abdulatif, R. Cao, and B. Yang, "CMGAN: Conformer-based metric-GAN for monaural speech enhancement," IEEE/ACM Trans. Audio Speech Lang. Process. 32, 2477-2493 (2024).

10.1109/TASLP.2024.3393718
29

Y.-X. Lu, Y. Ai, and Z.-H. Ling, "MP-SENet: A speech enhancement model with parallel denoising of magnitude and phase spectra," Proc. Interspeech, 3834-3838 (2023).

30

A. Pandey and D. Wang, "Densely connected neural network with dilated convolutions for real-time speech enhancement in the time domain," Proc. IEEE ICASSP, 6629-6633 (2020).

10.1109/ICASSP40776.2020.9054536
31

G. Dekkers, L. Vuegen, T. van Waterschoot, B. Vanrumste, and P. Karsmakers, "DCASE 2018 challenge - Task 5: Monitoring of domestic activities based on multi-channel acoustics," arXiv preprint, arXiv:1807.11246 (2018).

32

I. Loshchilov and F. Hutter, "Decoupled weight decay regularization," arXiv preprint, arXiv:1711.05101 (2019).

33

T. Iqbal, Y. Cao, A. Bailey, M. D. Plumbley, and W. Wang, "ARCA23K: An audio dataset for investigating open-set label noise," Proc. DCASE, 201-205 (2021).

34

E. Fonseca, X. Favory, J. Pons, F. Font, and X. Serra, "FSD50K: An open dataset of human-labeled sound events," IEEE/ACM Trans. Audio Speech Lang. Process. 30, 829-852 (2022).

10.1109/TASLP.2021.3133208
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 44
  • No: 3
  • Pages: 270-280
  • Received Date: 2025-03-24
  • Revised Date: 2025-04-22
  • Accepted Date: 2025-04-23