Research Article

31 March 2024, pp. 253-259
References
1. P. C. Loizou, Speech Enhancement: Theory and Practice, 2nd ed. (CRC Press, Inc., Boca Raton, 2013), pp. 1-768. 10.1201/b14529
2. S. Pascual, A. Bonafonte, and J. Serra, "SEGAN: Speech enhancement generative adversarial network," arXiv preprint arXiv:1703.09452 (2017). 10.21437/Interspeech.2017-1428
3. H. Dubey, A. Aazami, V. Gopal, B. Naderi, S. Braun, R. Cutler, A. Ju, M. Zohourian, M. Tang, H. Gamper, M. Golestaneh, and R. Aichner, "ICASSP 2023 deep speech enhancement challenge," arXiv preprint arXiv:2303.11510 (2023).
4. S. Hwang, S. W. Park, and Y. Park, "Performance comparison evaluation of real and complex networks for deep neural network-based speech enhancement in the frequency domain" (in Korean), J. Acoust. Soc. Kr. 41, 30-37 (2022).
5. S. A. Nossier, J. Wall, M. Moniri, C. Glackin, and N. Cannings, "Mapping and masking targets comparison using different deep learning based speech enhancement architectures," Proc. IEEE IJCNN, 1-8 (2020). 10.1109/IJCNN48605.2020.9206623
6. H. S. Choi, J. H. Kim, J. Huh, A. Kim, J. W. Ha, and K. Lee, "Phase-aware speech enhancement with deep complex U-Net," arXiv preprint arXiv:1903.03107 (2019).
7. X. Qin, Z. Zhang, C. Huang, M. Dehghan, O. R. Zaiane, and M. Jagersand, "U2-Net: Going deeper with nested U-structure for salient object detection," Pattern Recognition 106, 107404 (2020). 10.1016/j.patcog.2020.107404
8. S. Hwang, S. W. Park, and Y. Park, "Monoaural speech enhancement using a nested U-Net with two-level skip connections," Proc. Interspeech, 191-195 (2022). 10.21437/Interspeech.2022-10025
9. R. Cao, S. Abdulatif, and B. Yang, "CMGAN: Conformer-based metric GAN for speech enhancement," arXiv preprint arXiv:2203.15149 (2022). 10.36227/techrxiv.21187846.v2
10. Z. Zhang, S. Xu, X. Zhuang, L. Zhou, H. Li, and M. Wang, "Two-stage UNet with multi-axis gated multilayer perceptron for monaural noisy-reverberant speech enhancement," Proc. IEEE ICASSP, 1-5 (2023). 10.1109/ICASSP49357.2023.10095657
11. Y. Hu, Y. Liu, S. Lv, M. Xing, S. Zhang, Y. Fu, J. Wu, B. Zhang, and L. Xie, "DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement," arXiv preprint arXiv:2008.00264 (2020). 10.21437/Interspeech.2020-2537
12. S. Hwang, J. Byun, and Y.-C. Park, "Performance comparison evaluation of speech enhancement using various loss functions" (in Korean), J. Acoust. Soc. Kr. 40, 176-182 (2021).
13. C. Valentini-Botinhao, X. Wang, S. Takaki, and J. Yamagishi, "Investigating RNN-based speech enhancement methods for noise-robust Text-to-Speech," Proc. SSW, 146-152 (2016). 10.21437/SSW.2016-24
14. Y. Hu and P. C. Loizou, "Evaluation of objective measures for speech enhancement," Proc. Interspeech, 1447-1450 (2006). 10.21437/Interspeech.2006-84
15. A. Li, C. Zheng, L. Zhang, and X. Li, "Glance and gaze: A collaborative learning framework for single-channel speech enhancement," Appl. Acoust. 187, 108499 (2022). 10.1016/j.apacoust.2021.108499
16. A. Defossez, G. Synnaeve, and Y. Adi, "Real time speech enhancement in the waveform domain," arXiv preprint arXiv:2006.12847 (2020). 10.21437/Interspeech.2020-2409
17. A. Li, W. Liu, C. Zheng, C. Fan, and X. Li, "Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement," IEEE/ACM Trans. Audio, Speech, Lang. Process. 29, 1829-1843 (2021). 10.1109/TASLP.2021.3079813
18. S. Zhao, B. Ma, K. N. Watcharasupat, and W. S. Gan, "FRCRN: Boosting feature representation using frequency recurrence for monaural speech enhancement," Proc. IEEE ICASSP, 9281-9285 (2022). 10.1109/ICASSP43922.2022.9747578
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Korean): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Korean): 한국음향학회지
  • Volume: 43
  • No.: 2
  • Pages: 253-259
  • Received Date: 2024-01-22
  • Accepted Date: 2024-02-07