Research Article
Y. H. Tu, J. Du, and C. H. Lee, "2d-to-2d mask estimation for speech enhancement based on fully convolutional neural network," Proc. IEEE ICASSP, 6664-6668 (2020). https://doi.org/10.1109/ICASSP40776.2020.9054615
Y. Xu, J. Du, and C. H. Lee, "A regression approach to speech enhancement based on deep neural networks," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. 23, 7-19 (2014). https://doi.org/10.1109/TASLP.2014.2364452
J. Jung and W. Kim, "A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum" (in Korean), J. Acoust. Soc. Kr. 41, 38-44 (2022).
P. J. Huber, Breakthroughs in Statistics: Methodology and Distribution (Springer, New York, 1992), pp. 492-518. https://doi.org/10.1007/978-1-4612-4380-9_35
A. B. Owen, "A robust hybrid of lasso and ridge regression," Contemp. Math. 443.7, 59-72 (2007). https://doi.org/10.1090/conm/443/08555
H. S. Choi, J. H. Kim, J. Huh, A. Kim, J. W. Ha, and K. Lee, "Phase-aware speech enhancement with deep complex u-net," Proc. ICLR, 1-20 (2019).
O. Oktay, J. Schlemper, L. Le Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz, B. Glocker, and D. Rueckert, "Attention u-net: Learning where to look for the pancreas," arXiv preprint arXiv:1804.03999 (2018).
C. Trabelsi, O. Bilaniuk, Y. Zhang, D. Serdyuk, S. Subramanian, J. F. Santos, S. Mehri, N. Rostamzadeh, Y. Bengio, and C. J. Pal, "Deep complex networks," arXiv preprint arXiv:1705.09792 (2017).
P. Ochieng, "Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis," Artif. Intell. Rev. 56, 3651-3703 (2023). https://doi.org/10.1007/s10462-023-10612-2
CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, https://doi.org/10.7488/ds/1994, (Last viewed January 21, 2025).
J. Thiemann, N. Ito, and E. Vincent, "DEMAND: a collection of multi-channel recordings of acoustic noise in diverse environments," Proc. Meetings on Acoustics, 19, 035081 (2013).
C. Valentini-Botinhao, Noisy Speech Database for Training Speech Enhancement Algorithms and TTS models, https://doi.org/10.7488/ds/2117, (Last viewed January 15, 2025).
E. Vincent, R. Gribonval, and C. Fevotte, "Performance measurement in blind audio source separation," IEEE Trans. Audio, Speech, and Lang. Process. 14, 1462-1469 (2006). https://doi.org/10.1109/TSA.2005.858005
- Publisher :The Acoustical Society of Korea
- Publisher(Ko) :한국음향학회
- Journal Title :The Journal of the Acoustical Society of Korea
- Journal Title(Ko) :한국음향학회지
- Volume : 44
- No :1
- Pages :66-73
- Received Date : 2024-11-14
- Accepted Date : 2024-12-31
- DOI :https://doi.org/10.7776/ASK.2025.44.1.066