Research Article

30 September 2021. pp. 466-472
Abstract
References
1
A. G. Adam, S. S. Kajarekar, and H. Hermansky, "A new speaker change detection method for two-speaker segmentation," Proc. ICASSP. 3908-3911 (2002). 10.1109/ICASSP.2002.5745511
2
L. Bullock, H. Bredin, and L. P. Garcia Perera, "Overlap aware diarization: Resegmentation using neural end-to-end overlapped speech detection," Proc. ICASSP. 7114-7118 (2020). 10.1109/ICASSP40776.2020.9053096
3
N. Sajjan, S. Ganesh, N. Sharma, S. Ganapathy, and N. Ryant, "Leveraging LSTM models for overlap detection in multi party meetings," Proc. ICASSP. 5249-5253 (2018). 10.1109/ICASSP.2018.8462548
4
V. Andrei, H. Cucu, and C. Burileanu, "Detecting overlapped speech on short time frames using deep learning," Proc. Interspeech, 1198-1202 (2017). 10.21437/Interspeech.2017-188
5
E. Kazimirova and A. Belyaev, "Automatic detection of multi speaker fragments with high time resolution," Proc. Interspeech, 1338-1392 (2018). 10.21437/Interspeech.2018-1878
6
Z. Ge, A. N. Iyer, S. Cheluvaraja, and A. Ganapathiraju, "Speaker change detection using features through a neural network speaker classifier," Proc. IEEE SAI Intelligent Systems Conference, 1111-1116 (2017). 10.1109/IntelliSys.2017.8324268
7
R. Yin, H. Bredin, and C. Barras, "Speaker change detection in broadcast TV using bidirectional long short term memory networks," Proc. Interspeech, 3827-3831 (2017). 10.21437/Interspeech.2017-65
8
M. Kunesova, M. Hruz, Z. Zajic, and V. Radova, "Detection of overlapping speech for the purposes of speaker diarization," Proc. ICSC. 247-257 (2019). 10.1007/978-3-030-26061-3_26
9
S. C. Levinson, "Turn-taking in human communication - Origins and implications for language processing," Trends in Cognitive Sciences, 20, 6-14 (2016). 10.1016/j.tics.2015.10.010
10
H. Bredin, "TristouNet: Triplet loss for speaker turn embedding," Proc. ICASSP. 5430-5434 (2017). 10.1109/ICASSP.2017.7953194
11
WebRTC Homepage, http://webrtc.org, (Last viewed November 21, 2020).
12
D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, "X-vectors: Robust DNN embeddings for speaker recognition," Proc. ICASSP. 5329-5333 (2018). 10.1109/ICASSP.2018.8461375
13
J. Park, S. Cha, S. Eun, J. G. Park, and Y.-S. Yun, "Data augmentation and d-vector representation methods for speaker change detection," Proc. ICRACS. 67-71 (2020). 10.1145/3400286.3418270
14
V. Zue, S. Seneff, and J. Glass, "Speech database development at MIT: TIMIT and beyond," Speech Communication, 9, 351-356 (1990). 10.1016/0167-6393(90)90010-7
15
H. Kim, J. Park, S. Cha, K. A. Son, Y.-S. Yun, and J. G. Park, "Framework switching of speaker overlap detection system" (in Korean), J. SW Assessment and Valuation, 17, 101-113 (2021). 10.29056/jsav.2021.06.13
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 40
  • No: 5
  • Pages: 466-472
  • Received Date: 2021-07-09
  • Accepted Date: 2021-08-26