
2022, Vol. 41, Issue 1

Research Article

January 2022, pp. 30-37
References
1. A. Narayanan and D. Wang, "Ideal ratio mask estimation using deep neural networks for robust speech recognition," Proc. IEEE ICASSP, 7092-7096 (2013). 10.1109/ICASSP.2013.6639038
2. T. Gerkmann, M. Krawczyk-Becker, and J. Le Roux, "Phase processing for single-channel speech enhancement: History and recent advances," IEEE Signal Process. Mag. 32, 55-66 (2015). 10.1109/MSP.2014.2369251
3. H.-S. Choi, J.-H. Kim, J. Huh, A. Kim, J.-W. Ha, and K. Lee, "Phase-aware speech enhancement with deep complex U-Net," Proc. ICLR (2019).
4. S. A. Nossier, J. Wall, M. Moniri, C. Glackin, and N. Cannings, "Mapping and masking targets comparison using different deep learning based speech enhancement architectures," Proc. IJCNN, 1-8 (2020). 10.1109/IJCNN48605.2020.9206623
5. K. Paliwal, K. Wójcicki, and B. Shannon, "The importance of phase in speech enhancement," Speech Commun. 53, 465-494 (2011). 10.1016/j.specom.2010.12.003
6. K. Tan and D. Wang, "Complex spectral mapping with a convolutional recurrent network for monaural speech enhancement," Proc. IEEE ICASSP, 6865-6869 (2019). 10.1109/ICASSP.2019.8682834
7. Y. Hu, Y. Liu, S. Lv, M. Xing, S. Zhang, Y. Fu, J. Wu, B. Zhang, and L. Xie, "DCCRN: Deep complex convolution recurrent network for phase-aware speech enhancement," Proc. Interspeech, 2472-2476 (2020). 10.21437/Interspeech.2020-2537
8. S. Santurkar, D. Tsipras, A. Ilyas, and A. Madry, "How does batch normalization help optimization?," Proc. NeurIPS, 1-11 (2018).
9. C. K. Reddy, V. Gopal, R. Cutler, E. Beyrami, R. Cheng, H. Dubey, S. Matusevych, R. Aichner, A. Aazami, S. Braun, and J. Gehrke, "The INTERSPEECH 2020 deep noise suppression challenge: Datasets, subjective testing framework, and challenge results," arXiv preprint arXiv:2005.13981 (2020). 10.21437/Interspeech.2020-3038
10. J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, N. L. Dahlgren, and V. Zue, "TIMIT acoustic-phonetic continuous speech corpus," Linguistic Data Consortium (1993).
11. A. Varga and H. J. M. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Commun. 12, 247-251 (1993). 10.1016/0167-6393(93)90095-3
12. E. Vincent, J. Barker, S. Watanabe, J. Le Roux, F. Nesta, and M. Matassoni, "The second 'CHiME' speech separation and recognition challenge: Datasets, tasks and baselines," Proc. IEEE ICASSP, 126-130 (2013). 10.1109/ICASSP.2013.6637622
13. J. Barker, R. Marxer, E. Vincent, and S. Watanabe, "The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines," Proc. IEEE ASRU, 504-511 (2015). 10.1109/ASRU.2015.7404837
14. A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, "Perceptual evaluation of speech quality (PESQ)-A new method for speech quality assessment of telephone networks and codecs," Proc. IEEE ICASSP, 749-752 (2001).
15. C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "An algorithm for intelligibility prediction of time-frequency weighted noisy speech," IEEE Trans. Audio, Speech, Lang. Process. 19, 2125-2136 (2011). 10.1109/TASL.2011.2114881
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 41
  • No.: 1
  • Pages: 30-37
  • Received Date: 2021-12-03
  • Revised Date: 2021-12-27
  • Accepted Date: 2022-01-04