Research Article

31 January 2025, pp. 66-73
References
1. Y. H. Tu, J. Du, and C. H. Lee, "2D-to-2D mask estimation for speech enhancement based on fully convolutional neural network," Proc. IEEE ICASSP, 6664-6668 (2020). doi:10.1109/ICASSP40776.2020.9054615
2. Y. Xu, J. Du, and C. H. Lee, "A regression approach to speech enhancement based on deep neural networks," IEEE/ACM Trans. Audio, Speech, Lang. Process. 23, 7-19 (2015). doi:10.1109/TASLP.2014.2364452
3. J. Jung and W. Kim, "A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum" (in Korean), J. Acoust. Soc. Kr. 41, 38-44 (2022).
4. P. J. Huber, "Robust estimation of a location parameter," in Breakthroughs in Statistics: Methodology and Distribution (Springer, New York, 1992), pp. 492-518. doi:10.1007/978-1-4612-4380-9_35
5. A. B. Owen, "A robust hybrid of lasso and ridge regression," Contemp. Math. 443, 59-72 (2007). doi:10.1090/conm/443/08555
6. H. S. Choi, J. H. Kim, J. Huh, A. Kim, J. W. Ha, and K. Lee, "Phase-aware speech enhancement with deep complex U-Net," Proc. ICLR, 1-20 (2019).
7. O. Oktay, J. Schlemper, L. Le Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz, B. Glocker, and D. Rueckert, "Attention U-Net: Learning where to look for the pancreas," arXiv preprint arXiv:1804.03999 (2018).
8. C. Trabelsi, O. Bilaniuk, Y. Zhang, D. Serdyuk, S. Subramanian, J. F. Santos, S. Mehri, N. Rostamzadeh, Y. Bengio, and C. J. Pal, "Deep complex networks," arXiv preprint arXiv:1705.09792 (2017).
9. P. Ochieng, "Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis," Artif. Intell. Rev. 56, 3651-3703 (2023). doi:10.1007/s10462-023-10612-2
10. CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit, https://doi.org/10.7488/ds/1994 (Last viewed January 21, 2025).
11. J. Thiemann, N. Ito, and E. Vincent, "DEMAND: a collection of multi-channel recordings of acoustic noise in diverse environments," Proc. Meetings on Acoustics, 19, 035081 (2013).
12. C. Valentini-Botinhao, Noisy Speech Database for Training Speech Enhancement Algorithms and TTS models, https://doi.org/10.7488/ds/2117 (Last viewed January 15, 2025).
13. E. Vincent, R. Gribonval, and C. Févotte, "Performance measurement in blind audio source separation," IEEE Trans. Audio, Speech, and Lang. Process. 14, 1462-1469 (2006). doi:10.1109/TSA.2005.858005
14. A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, "Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs," Proc. IEEE ICASSP, 749-752 (2001). doi:10.1109/ICASSP.2001.941023
15. C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "A short-time objective intelligibility measure for time-frequency weighted noisy speech," Proc. IEEE ICASSP, 4214-4217 (2010). doi:10.1109/ICASSP.2010.5495701
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 44
  • No.: 1
  • Pages: 66-73
  • Received Date: 2024-11-14
  • Accepted Date: 2024-12-31