2025, Vol. 44, Issue 5

Research Article

30 September 2025, pp. 540-547
References
1. P. C. Loizou, Speech Enhancement: Theory and Practice (CRC Press, Boca Raton, 2013), pp. 1-10. doi: 10.1201/b14529-1
2. Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Trans. Acoust., Speech, Signal Process. 32, 1109-1121 (1984). doi: 10.1109/TASSP.1984.1164453
3. Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, “A regression approach to speech enhancement based on deep neural networks,” IEEE/ACM Trans. Audio, Speech, Lang. Process. 23, 7-19 (2015). doi: 10.1109/TASLP.2014.2364452
4. K. Tan and D. Wang, “A convolutional recurrent neural network for real-time speech enhancement,” Proc. Interspeech, 3229-3233 (2018). doi: 10.21437/Interspeech.2018-1405
5. C. Valentini-Botinhao, X. Wang, S. Takaki, and J. Yamagishi, “Investigating RNN-based speech enhancement methods for noise-robust text-to-speech,” Proc. SSW9, 145-150 (2016). doi: 10.21437/SSW.2016-24
6. Y. Hu, Y. Liu, S. Lv, M. Xing, S. Zhang, Y. Fu, J. Wu, B. Zhang, and L. Xie, “DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement,” Proc. Interspeech, 2472-2476 (2020). doi: 10.21437/Interspeech.2020-2537
7. S. Pascual, A. Bonafonte, and J. Serrà, “SEGAN: Speech enhancement generative adversarial network,” Proc. Interspeech, 3642-3646 (2017). doi: 10.21437/Interspeech.2017-1428
8. Y. Luo and N. Mesgarani, “Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation,” IEEE/ACM Trans. Audio, Speech, Lang. Process. 27, 1256-1266 (2019). doi: 10.1109/TASLP.2019.2915167
9. H. Schröter, A. N. Escalante-B., T. Rosenkranz, and A. Maier, “DeepFilterNet2: Towards real-time speech enhancement on embedded devices for full-band audio,” arXiv: 2205.05474 (2022). doi: 10.1109/IWAENC53105.2022.9914782
10. W. Tai, Y. Lei, F. Zhou, G. Trajcevski, and T. Zhong, “DOSE: Diffusion dropout with adaptive prior for speech enhancement,” Proc. NeurIPS, 1-22 (2023).
11. J.-M. Valin, “A hybrid DSP/deep learning approach to real-time full-band speech enhancement,” Proc. MMSP, 1-5 (2018). doi: 10.1109/MMSP.2018.8547084
12. A. Li, C. Zheng, L. Zhang, and X. Li, “Glance and gaze: A collaborative learning framework for single-channel speech enhancement,” Appl. Acoust. 187, 108499 (2022). doi: 10.1016/j.apacoust.2021.108499
13. G. Zhang, L. Yu, C. Wang, and J. Wei, “Multi-scale temporal frequency convolutional network with axial attention for speech enhancement,” Proc. ICASSP, 9122-9126 (2022). doi: 10.1109/ICASSP43922.2022.9746610
14. H. Dubey, V. Gopal, R. Cutler, A. Aazami, S. Matusevych, S. Braun, S. E. Eskimez, M. Thakker, T. Yoshioka, H. Gamper, and R. Aichner, “ICASSP 2022 deep noise suppression challenge,” Proc. ICASSP, 9271-9275 (2022). doi: 10.1109/ICASSP43922.2022.9747230
15. C. K. A. Reddy, V. Gopal, and R. Cutler, “DNSMOS P.835: A non-intrusive perceptual objective speech quality metric to evaluate noise suppressors,” Proc. ICASSP, 721-725 (2022). doi: 10.1109/ICASSP43922.2022.9746108
16. J. Ha, S. Kwak, and S. Jung, “KsponSpeech: Korean spontaneous speech corpus for automatic speech recognition,” Appl. Sci. 10, 6936 (2020). doi: 10.3390/app10196936
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 44
  • No.: 5
  • Pages: 540-547
  • Received Date: 2025-08-24
  • Accepted Date: 2025-09-08