2021 Vol.40, Issue 2

Research Article

31 March 2021. pp. 148-154
References
1
J.-w. Jung, H.-j. Shim, J.-h. Kim, and H.-J. Yu, "α-feature map scaling for raw waveform speaker verification" (in Korean), J. Acoust. Soc. Kr. 39, 441-446 (2020).
2
D. Snyder, P. Ghahremani, D. Povey, D. Garcia-Romero, Y. Carmiel, and S. Khudanpur, "Deep neural network-based speaker embeddings for end-to-end speaker verification," Proc. IEEE SLT. 165-170 (2016). 10.1109/SLT.2016.7846260
3
D. Snyder, D. Garcia-Romero, D. Povey, and S. Khudanpur, "Deep neural network embeddings for text-independent speaker verification," Proc. Interspeech, 999-1003 (2017). 10.21437/Interspeech.2017-620
4
E. Variani, X. Lei, E. McDermott, I. L. Moreno, and J. Gonzalez-Dominguez, "Deep neural networks for small footprint text-dependent speaker verification," Proc. ICASSP. 4080-4084 (2014). 10.1109/ICASSP.2014.6854363
5
S. Shon, H. Tang, and J. R. Glass, "Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model," Proc. IEEE SLT. 1007-1013 (2018). 10.1109/SLT.2018.8639622
6
N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," JMLR. 15, 1929-1958 (2014).
7
S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," Proc. PMLR. 448-456 (2015).
8
D. S. Park, W. Chan, Y. Zhang, C.-C. Chiu, B. Zoph, E. D. Cubuk, and Q. V. Le, "Specaugment: A simple data augmentation method for automatic speech recognition," arXiv preprint arXiv:1904.08779 (2019). 10.21437/Interspeech.2019-2680
9
T. Inoue, P. Vinayavekhin, S. Wang, D. Wood, N. Greco, and R. Tachibana, "Shuffling and mixing data augmentation for environmental sound classification," Proc. of the DCASE. 109-113 (2019). 10.33682/wgyb-bt40
10
I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT press, Cambridge, 2016), pp. 236-239.
11
D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, "X-vectors: Robust dnn embeddings for speaker recognition," Proc. ICASSP. 5329-5333 (2018). 10.1109/ICASSP.2018.8461375
12
Y. Xu, R. Jia, L. Mou, G. Li, Y. Chen, Y. Lu, and Z. Jin, "Improved relation classification by deep recurrent neural networks with data augmentation," arXiv preprint arXiv:1601.03651 (2016).
13
Z. Wu, S. Wang, Y. Qian, and K. Yu, "Data augmentation using variational autoencoder for embedding based speaker verification," Proc. Interspeech, 1163-1167 (2019). 10.21437/Interspeech.2019-2248
14
L. Perez and J. Wang, "The effectiveness of data augmentation in image classification using deep learning," arXiv preprint arXiv:1712.04621 (2017).
15
J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," Proc. CVPR. 7132-7141 (2018). 10.1109/CVPR.2018.00745
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Ko): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Ko): 한국음향학회지
  • Volume: 40
  • No: 2
  • Pages: 148-154
  • Received Date: 2021-02-15
  • Accepted Date: 2021-03-15