
2023, Vol. 42, Issue 6

Research Article

30 November 2023. pp. 544-551
References
1. J. Lim and A. Oppenheim, "All-pole modeling of degraded speech," IEEE Trans. Acoust. Speech Signal Process. 26, 197-210 (1978). 10.1109/TASSP.1978.1163086
2. S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoust. Speech Signal Process. 27, 113-120 (1979). 10.1109/TASSP.1979.1163209
3. K. Tan and D. Wang, "A convolutional recurrent neural network for real-time speech enhancement," Proc. Interspeech, 3229-3233 (2018). 10.21437/Interspeech.2018-1405
4. Y. Hu, Y. Liu, S. Lv, M. Xing, S. Zhang, Y. Fu, J. Wu, B. Zhang, and L. Xie, "DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement," Proc. Interspeech, 2472-2476 (2020). 10.21437/Interspeech.2020-2537
5. H. S. Choi, J. H. Kim, J. Huh, A. Kim, J. W. Ha, and K. Lee, "Phase-aware speech enhancement with deep complex U-Net," Proc. ICLR, 1-20 (2019).
6. D. Wang and J. Chen, "Supervised speech separation based on deep learning: An overview," IEEE/ACM Trans. Audio, Speech, Language Process. 26, 1702-1726 (2018). 10.1109/TASLP.2018.2842159
7. K. Paliwal, K. Wójcicki, and B. Shannon, "The importance of phase in speech enhancement," Speech Commun. 53, 465-494 (2011). 10.1016/j.specom.2010.12.003
8. Y. Wang and D. L. Wang, "A deep neural network for time-domain signal reconstruction," Proc. ICASSP, 4390-4394 (2015). 10.1109/ICASSP.2015.7178800
9. A. Li, C. Zheng, C. Fan, R. Peng, and X. Li, "A recursive network with dynamic attention for monaural speech enhancement," Proc. Interspeech, 2422-2426 (2020). 10.21437/Interspeech.2020-1513
10. Y. Koizumi, K. Yatabe, M. Delcroix, Y. Masuyama, and D. Takeuchi, "Speech enhancement using self-adaptation and multi-head self-attention," Proc. ICASSP, 181-185 (2020). 10.1109/ICASSP40776.2020.9053214
11. Q. Zhang, Q. Song, Z. Ni, A. Nicolson, and H. Li, "Time-frequency attention for monaural speech enhancement," Proc. ICASSP, 7852-7856 (2022).
12. O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz, B. Glocker, and D. Rueckert, "Attention U-net: Learning where to look for the pancreas," Proc. MIDL, 1-10 (2018).
13. Y. Luo and N. Mesgarani, "Conv-TasNet: Surpassing ideal time-frequency magnitude masking for speech separation," IEEE/ACM Trans. Audio, Speech, Language Process. 27, 1256-1266 (2019). 10.1109/TASLP.2019.2915167
14. J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, "Acoustic-phonetic continuous speech corpus CD-ROM NIST speech disc 1-1.1," DARPA TIMIT, NIST Interagency/Internal Rep. (NISTIR) 4930, 1993. 10.6028/NIST.IR.4930
15. E. Vincent, R. Gribonval, and C. Févotte, "Performance measurement in blind audio source separation," IEEE Trans. Audio, Speech, Language Process. 14, 1462-1469 (2006). 10.1109/TSA.2005.858005
16. A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, "Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs," Proc. ICASSP, 749-752 (2001).
17. C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "A short-time objective intelligibility measure for time-frequency weighted noisy speech," Proc. ICASSP, 4214-4217 (2010). 10.1109/ICASSP.2010.5495701
18. O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," Proc. MICCAI, 234-241 (2015). 10.1007/978-3-319-24574-4_28
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 42
  • No.: 6
  • Pages: 544-551
  • Received Date: 2023-08-08
  • Accepted Date: 2023-09-08