2023 Vol. 42, Issue 4

Research Article

31 July 2023. pp. 357-363
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 42
  • No.: 4
  • Pages: 357-363
  • Received Date: 2023-05-09
  • Accepted Date: 2023-07-18