P. C. Loizou, Speech Enhancement: Theory and Practice (CRC Press, Boca Raton, 2013), pp. 1-10. https://doi.org/10.1201/b14529-1
Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Trans. Acoust., Speech, Signal Process. 32, 1109-1121 (1984). https://doi.org/10.1109/TASSP.1984.1164453
Y. Xu, J. Du, L.-R. Dai, and C.-H. Lee, “A regression approach to speech enhancement based on deep neural networks,” IEEE/ACM Trans. Audio, Speech, Lang. Process. 23, 7-19 (2015). https://doi.org/10.1109/TASLP.2014.2364452
K. Tan and D. Wang, “A convolutional recurrent neural network for real-time speech enhancement,” Proc. Interspeech, 3229-3233 (2018). https://doi.org/10.21437/Interspeech.2018-1405
C. Valentini-Botinhao, X. Wang, S. Takaki, and J. Yamagishi, “Investigating RNN-based speech enhancement methods for noise-robust text-to-speech,” Proc. SSW9, 145-150 (2016). https://doi.org/10.21437/SSW.2016-24
Y. Hu, Y. Liu, S. Lv, M. Xing, S. Zhang, Y. Fu, J. Wu, B. Zhang, and L. Xie, “DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement,” Proc. Interspeech, 2472-2476 (2020). https://doi.org/10.21437/Interspeech.2020-2537
S. Pascual, A. Bonafonte, and J. Serra, “SEGAN: Speech enhancement generative adversarial network,” Proc. Interspeech, 3642-3646 (2017). https://doi.org/10.21437/Interspeech.2017-1428
Y. Luo and N. Mesgarani, “Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation,” IEEE/ACM Trans. Audio, Speech, Lang. Process. 27, 1256-1266 (2019). https://doi.org/10.1109/TASLP.2019.2915167
H. Schröter, A. N. Gomez, and T. Gerkmann, “DeepFilterNet2: Towards real-time speech enhancement on embedded devices for full-band audio,” arXiv:2205.05474 (2022). https://doi.org/10.1109/IWAENC53105.2022.9914782
W. Tai, Y. Lei, F. Zhou, G. Trajcevski, and T. Zhong, “DOSE: Diffusion dropout with adaptive prior for speech enhancement,” Proc. NeurIPS, 1-22 (2023).
J.-M. Valin, “A hybrid DSP/deep learning approach to real-time full-band speech enhancement,” Proc. MMSP, 1-5 (2018). https://doi.org/10.1109/MMSP.2018.8547084
A. Li, C. Zheng, L. Zhang, and X. Li, “Glance and gaze: A collaborative learning framework for single-channel speech enhancement,” Appl. Acoust. 187, 108499 (2022). https://doi.org/10.1016/j.apacoust.2021.108499
G. Zhang, L. Yu, C. Wang, and J. Wei, “Multi-scale temporal frequency convolutional network with axial attention for speech enhancement,” Proc. ICASSP, 9122-9126 (2022). https://doi.org/10.1109/ICASSP43922.2022.9746610
H. Dubey, V. Gopal, R. Cutler, A. Aazami, S. Matusevych, S. Braun, S. E. Eskimez, M. Thakker, T. Yoshioka, H. Gamper, and R. Aichner, “ICASSP 2022 deep noise suppression challenge,” Proc. ICASSP, 9271-9275 (2022). https://doi.org/10.1109/ICASSP43922.2022.9747230
10.1109/ICASSP43922.2022.9747230- Publisher :The Acoustical Society of Korea
- Publisher(Ko) :한국음향학회
- Journal Title :The Journal of the Acoustical Society of Korea
- Journal Title(Ko) :한국음향학회지
- Volume : 44
- No :5
- Pages :540-547
- Received Date : 2025-08-24
- Accepted Date : 2025-09-08
- DOI :https://doi.org/10.7776/ASK.2025.44.5.540