2021 Vol.40, Issue 5

Research Article

30 September 2021. pp. 439-451
S. Zhao, X. Xiao, Z. Zhang, T. N. T. Nguyen, X. Zhong, B. Ren, L. Wang, L. J. Douglas, E. Chng, and H. Li, "Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction," Proc. IEEE Workshop on ASRU. 460-467 (2015). 10.1109/ASRU.2015.7404831
Y. Tachioka, T. Narita, I. Miura, T. Uramoto, N. Monta, S. Uenohara, K. Furuya, S. Watanabe, and J. L. Roux, "Coupled initialization of multi-channel non-negative matrix factorization based on spatial and spectral information," Proc. 2017 INTERSPEECH, 2461-2465 (2017). 10.21437/Interspeech.2017-61
D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, "Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization," IEEE Trans. on Audio, Speech, and Lang. Process. 24, 1626-1641 (2016). 10.1109/TASLP.2016.2577880
T. V. d. Bogaert, S. Doclo, J. Wouters, and M. Moonen, "Speech enhancement with multichannel Wiener filter techniques in multimicrophone binaural hearing aids," J. Acoust. Soc. Am. 125, 360-371 (2009). 10.1121/1.3023069, PMID 19173423
E. A. Habets, J. Benesty, S. Gannot, and I. Cohen, "The MVDR beamformer for speech enhancement," Proc. Speech Processing in Modern Communication, 225-254 (2010). 10.1007/978-3-642-11130-3_9
E. Warsitz and R. Haeb-Umbach, "Blind acoustic beamforming based on generalized eigenvalue decomposition," IEEE Trans. on Audio, Speech, and Lang. Process. 15, 1529-1539 (2007). 10.1109/TASL.2007.898454
S. Gannot and I. Cohen, "Speech enhancement based on the general transfer function GSC and postfiltering," IEEE Trans. on Speech and Audio Process. 12, 561-571 (2004). 10.1109/TSA.2004.834599
J. Heymann, L. Drude, A. Chinaev, and R. Haeb-Umbach, "BLSTM supported GEV beamformer front-end for the 3rd CHiME challenge," Proc. IEEE Workshop on ASRU. 444-451 (2015). 10.1109/ASRU.2015.7404829
C. Deng, H. Song, Y. Zhang, Y. Sha, and X. Li, "DNN-based mask estimation integrating spectral and spatial features for robust beamforming," Proc. ICASSP. 4647-4651 (2020). 10.1109/ICASSP40776.2020.9054239
Y. Liu, A. Ganguly, K. Kamath, and T. Kristjansson, "Neural network based time-frequency masking and steering vector estimation for two-channel MVDR beamforming," Proc. ICASSP. 6717-6721 (2018). 10.1109/ICASSP.2018.8462069
N. Shankar, G. S. Bhat, and I. M. Panahi, "Real-time dual-channel speech enhancement by VAD assisted MVDR beamformer for hearing aid applications using smartphone," Proc. 42nd Annual Int. Conf. of the IEEE EMBC. 952-955 (2020). 10.1109/EMBC44109.2020.9175212, PMID 33018142, PMC7545265
Y. Zhou, Y. Chen, Y. Ma, and H. Liu, "A real-time dual-microphone speech enhancement algorithm assisted by bone conduction sensor," Sensors, 20, 5050 (2020). 10.3390/s20185050, PMID 32899533, PMC7571026
T. Higuchi, N. Ito, S. Araki, T. Yoshioka, M. Delcroix, and T. Nakatani, "Online MVDR beamformer based on complex Gaussian mixture model with spatial prior for noise robust ASR," IEEE Trans. on Audio, Speech, and Lang. Process. 25, 780-793 (2017). 10.1109/TASLP.2017.2665341
J. Barker, R. Marxer, E. Vincent, and S. Watanabe, "The third 'CHiME' speech separation and recognition challenge: Dataset, task and baselines," Proc. 2015 IEEE Workshop on ASRU. 504-511 (2015). 10.1109/ASRU.2015.7404837, PMID 26035872
Z. Rafii, A. Liutkus, F. R. Stöter, S. I. Mimilakis, and R. Bittner, MUSDB18 - a corpus for music separation (2017).
J. Heymann, L. Drude, and R. Haeb-Umbach, "Neural network based spectral mask estimation for acoustic beamforming," Proc. IEEE ICASSP. 196-200 (2016). 10.1109/ICASSP.2016.7471664
J. S. Lim and A. V. Oppenheim, "Enhancement and bandwidth compression of noisy speech," Proc. IEEE. 1586-1604 (1979). 10.1109/PROC.1979.11540
D. Gala, A. Vasoya, and V. M. Misra, "Speech enhancement combining spectral subtraction and beamforming techniques for microphone array," Proc. the Int. Conf. and Workshop on Emerging Trends in Technology, 163-166 (2010). 10.1145/1741906.1741938
Y. Takahashi, Y. Uemura, H. Saruwatari, K. Shikano, and K. Kondo, "Structure selection algorithm for less musical-noise generation in integration systems of beamforming and spectral subtraction," Proc. 2009 IEEE/SP 15th Workshop on Statistical Signal Processing, 701-704 (2009). 10.1109/SSP.2009.5278480, PMID 19336245
S. Karimian-Azari and T. H. Falk, "Modulation spectrum based beamforming for speech enhancement," Proc. 2017 IEEE WASPAA. 91-95 (2017). 10.1109/WASPAA.2017.8170001
H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, T. Nishikawa, and K. Shikano, "Blind source separation combining independent component analysis and beamforming," EURASIP J. Advances in Signal Processing, 2003, 569270 (2003). 10.1155/S1110865703305104
Google WebRTC, (Last viewed September 1, 2021).
Google Web Speech API, api/, (Last viewed September 1, 2021).
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 40
  • No: 5
  • Pages: 439-451
  • Received Date: 2021. 06. 10
  • Accepted Date: 2021. 08. 09