2021 Vol. 40, Issue 5

Research Article

30 September 2021. pp. 439-451
References
1
S. Zhao, X. Xiao, Z. Zhang, T. N. T. Nguyen, X. Zhong, B. Ren, L. Wang, L. J. Douglas, E. Chng, and H. Li, "Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction," Proc. IEEE Workshop on ASRU. 460-467 (2015). 10.1109/ASRU.2015.7404831
2
Y. Tachioka, T. Narita, I. Miura, T. Uramoto, N. Monta, S. Uenohara, K. Furuya, S. Watanabe, and J. L. Roux, "Coupled initialization of multi-channel non-negative matrix factorization based on spatial and spectral information," Proc. 2017 INTERSPEECH, 2461-2465 (2017). 10.21437/Interspeech.2017-61
3
D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, "Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization," IEEE Trans. on Audio, Speech, and Lang. Process. 24, 1626-1641 (2016). 10.1109/TASLP.2016.2577880
4
T. V. d. Bogaert, S. Doclo, J. Wouters, and M. Moonen, "Speech enhancement with multichannel Wiener filter techniques in multimicrophone binaural hearing aids," J. Acoust. Soc. Am. 125, 360-371 (2009). 10.1121/1.3023069, PMID 19173423
5
E. A. Habets, J. Benesty, S. Gannot, and I. Cohen, "The MVDR beamformer for speech enhancement," Proc. Speech Processing in Modern Communication, 225-254 (2010). 10.1007/978-3-642-11130-3_9
6
E. Warsitz and R. Haeb-Umbach, "Blind acoustic beamforming based on generalized eigenvalue decomposition," IEEE Trans. on Audio, Speech, and Lang. Process. 15, 1529-1539 (2007). 10.1109/TASL.2007.898454
7
S. Gannot and I. Cohen, "Speech enhancement based on the general transfer function GSC and postfiltering," IEEE Trans. on Speech and Audio Process. 12, 561-571 (2004). 10.1109/TSA.2004.834599
8
J. Heymann, L. Drude, A. Chinaev, and R. Haeb-Umbach, "BLSTM supported GEV beamformer front-end for the 3rd CHiME challenge," Proc. IEEE Workshop on ASRU. 444-451 (2015). 10.1109/ASRU.2015.7404829
9
C. Deng, H. Song, Y. Zhang, Y. Sha, and X. Li, "DNN-based mask estimation integrating spectral and spatial features for robust beamforming," Proc. ICASSP. 4647-4651 (2020). 10.1109/ICASSP40776.2020.9054239
10
Y. Liu, A. Ganguly, K. Kamath, and T. Kristjansson, "Neural network based time-frequency masking and steering vector estimation for two-channel MVDR beamforming," Proc. ICASSP. 6717-6721 (2018). 10.1109/ICASSP.2018.8462069
11
N. Shankar, G. S. Bhat, and I. M. Panahi, "Real-time dual-channel speech enhancement by VAD assisted MVDR beamformer for hearing aid applications using smartphone," Proc. 42nd Annual Int. Conf. of the IEEE EMBC. 952-955 (2020). 10.1109/EMBC44109.2020.9175212, PMID 33018142, PMC7545265
12
Y. Zhou, Y. Chen, Y. Ma, and H. Liu, "A real-time dual-microphone speech enhancement algorithm assisted by bone conduction sensor," Sensors, 20, 5050 (2020). 10.3390/s20185050, PMID 32899533, PMC7571026
13
T. Higuchi, N. Ito, S. Araki, T. Yoshioka, M. Delcroix, and T. Nakatani, "Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR," IEEE Trans. on Audio, Speech, and Lang. Process. 25, 780-793 (2017). 10.1109/TASLP.2017.2665341
14
J. Barker, R. Marxer, E. Vincent, and S. Watanabe, "The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines," Proc. 2015 IEEE Workshop on ASRU. 504-511 (2015). 10.1109/ASRU.2015.7404837, PMID 26035872
15
Z. Rafii, A. Liutkus, F. R. Stöter, S. I. Mimilakis, and R. Bittner, MUSDB18 - a corpus for music separation (2017).
16
J. Heymann, L. Drude, and R. Haeb-Umbach, "Neural network based spectral mask estimation for acoustic beamforming," Proc. IEEE ICASSP. 196-200 (2016). 10.1109/ICASSP.2016.7471664
17
E. Warsitz and R. Haeb-Umbach, "Blind acoustic beamforming based on generalized eigenvalue decomposition," IEEE Trans. on Audio, Speech, and Lang. Process. 15, 1529-1539 (2007). 10.1109/TASL.2007.898454
18
J. S. Lim and A. V. Oppenheim, "Enhancement and bandwidth compression of noisy speech," Proc. IEEE. 1586-1604 (1979). 10.1109/PROC.1979.11540
19
D. Gala, A. Vasoya, and V. M. Misra, "Speech enhancement combining spectral subtraction and beamforming techniques for microphone array," Proc. the Int. Conf. and Workshop on Emerging Trends in Technology, 163-166 (2010). 10.1145/1741906.1741938
20
Y. Takahashi, Y. Uemura, H. Saruwatari, K. Shikano, and K. Kondo, "Structure selection algorithm for less musical-noise generation in integration systems of beamforming and spectral subtraction," Proc. 2009 IEEE/SP 15th Workshop on Statistical Signal Processing, 701-704 (2009). 10.1109/SSP.2009.5278480, PMID 19336245
21
S. Karimian-Azari and T. H. Falk, "Modulation spectrum based beamforming for speech enhancement," Proc. 2017 IEEE WASPAA. 91-95 (2017). 10.1109/WASPAA.2017.8170001
22
H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, T. Nishikawa, and K. Shikano, "Blind source separation combining independent component analysis and beamforming," EURASIP J. Advances in Signal Processing, 2003, 569270 (2003). 10.1155/S1110865703305104
23
Google WebRTC, https://webrtc.org/, (Last viewed September 1, 2021).
24
Google Web Speech API, https://wicg.github.io/speech-api/, (Last viewed September 1, 2021).
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 40
  • No: 5
  • Pages: 439-451
  • Received Date: 2021. 06. 10
  • Accepted Date: 2021. 08. 09