2025 Vol.44, Issue 1

Research Article

31 January 2025. pp. 58-65
References
1

J. Lim and A. Oppenheim, "All-pole modeling of degraded speech," IEEE Trans. Acoustics, Speech, and Signal Process. 26, 197-210 (1978).

10.1109/TASSP.1978.1163086
2

S. Boll, "Suppression of acoustic noise in speech using spectral subtraction," IEEE Trans. Acoustics, Speech, and Signal Process. 27, 113-120 (1979).

10.1109/TASSP.1979.1163209
3

J.-h. Jung and W. Kim, "A study on loss combination in time and frequency for effective speech enhancement based on complex-valued spectrum" (in Korean), J. Acoust. Soc. Kr. 41, 38-44 (2022).

4

Z. Huang, S. Watanabe, S. W. Yang, P. García, and S. Khudanpur, "Investigating self-supervised learning for speech enhancement and separation," Proc. IEEE ICASSP, 6837-6841 (2022).

10.1109/ICASSP43922.2022.9746303
5

K.-H. Hung, S.-w. Fu, H.-h. Tseng, H.-T. Chiang, Y. Tsao, and C.-W. Lin, "Boosting self-supervised embeddings for speech enhancement," Proc. Interspeech, 186-190 (2022).

10.21437/Interspeech.2022-10002
6

O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," Proc. MICCAI, 234-241 (2015).

10.1007/978-3-319-24574-4_28
7

A. Baevski, H. Zhou, A. Mohamed, and M. Auli, "Wav2vec 2.0: a framework for self-supervised learning of speech representations," Proc. 34th Int. Conf. NeurIPS, 12449-12460 (2020).

8

C. Valentini-Botinhao, X. Wang, S. Takaki, and J. Yamagishi, "Investigating RNN-based speech enhancement methods for noise-robust text-to-speech," Proc. 9th ISCA Speech Synthesis Workshop, 146-152 (2016).

10.21437/SSW.2016-24
9

J. Thiemann, N. Ito, and E. Vincent, "The diverse environments multi-channel acoustic noise database: A database of multichannel environmental noise recordings," J. Acoust. Soc. Am. 133, 3591-3591 (2013).

10.1121/1.4806631
10

E. Vincent, R. Gribonval, and C. Fevotte, "Performance measurement in blind audio source separation," IEEE Trans. Audio, Speech, and Lang. Process. 14, 1462-1469 (2006).

10.1109/TSA.2005.858005
11

A. W. Rix, J. G. Beerends, M. P. Hollier, and A. P. Hekstra, "Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs," IEEE ICASSP, 749-752 (2001).

10.1109/ICASSP.2001.941023
12

C. H. Taal, R. C. Hendriks, R. Heusdens, and J. Jensen, "A short-time objective intelligibility measure for time-frequency weighted noisy speech," IEEE ICASSP, 4214-4217 (2010).

10.1109/ICASSP.2010.5495701
13

I. Cohen, "Noise spectrum estimation in adverse environments: Improved minima controlled recursive averaging," IEEE Trans. Speech and Audio Process. 11, 466-475 (2003).

10.1109/TSA.2003.811544
14

S. Pascual, A. Bonafonte, and J. Serra, "SEGAN: Speech enhancement generative adversarial network," Proc. Interspeech, 3642-3646 (2017).

10.21437/Interspeech.2017-1428
15

M. N. Ali, A. Brutti, and D. Falavigna, "Speech enhancement using dilated wave-u-net: an experimental analysis," Proc. 27th FRUCT, 3-9 (2020).

10.23919/FRUCT49677.2020.9211072
16

W. Jiang, F. Wen, Y. Zhang, and K. Yu, "UnSE: Unsupervised speech enhancement using optimal transport," Proc. Interspeech, 4029-4033 (2023).

10.21437/Interspeech.2023-378
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 44
  • No: 1
  • Pages: 58-65
  • Received Date: 2024-11-14
  • Revised Date: 2024-12-27
  • Accepted Date: 2024-12-31