Research Article

31 May 2022, pp. 351-358
References
1. B. Ko, K. Lee, I.-C. Yoo, and D. Yook, "Korean voice conversion experiments using CC-GAN and VAW-GAN" (in Korean), Proc. Speech Communication and Signal Processing, 36, 39 (2019).
2. B. Jang, H. Seo, I.-C. Yoo, and D. Yook, "CycleVAE based many-to-many voice conversion experiments using Korean speech corpus" (in Korean), J. Acoust. Soc. Suppl. 2(s) 40, 79 (2021).
3. I.-C. Yoo, K. Lee, S.-G. Leem, H. Oh, B. Ko, and D. Yook, "Speaker anonymization for personal information protection using voice conversion techniques," IEEE Access, 8, 198637-198645 (2020). 10.1109/ACCESS.2020.3035416
4. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," Proc. NIPS, 2672-2680 (2014).
5. D. Kingma and M. Welling, "Auto-encoding variational Bayes," arXiv:1312.6114 (2013).
6. J. Zhu, T. Park, P. Isola, and A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," Proc. IEEE Int. Conf. Computer Vision, 2242-2251 (2017). 10.1109/ICCV.2017.244
7. T. Kaneko and H. Kameoka, "CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks," Proc. EUSIPCO, 2114-2118 (2018). 10.23919/EUSIPCO.2018.8553236
8. T. Kaneko, H. Kameoka, K. Tanaka, and N. Hojo, "CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion," Proc. IEEE ICASSP, 6820-6824 (2019). 10.1109/ICASSP.2019.8682897
9. T. Kaneko, H. Kameoka, K. Tanaka, and N. Hojo, "CycleGAN-VC3: Examining and improving CycleGAN-VCs for Mel-spectrogram conversion," Proc. Interspeech, 2017-2021 (2020). 10.21437/Interspeech.2020-2280
10. D. Yook, I.-C. Yoo, and S. Yoo, "Voice conversion using conditional CycleGAN," Proc. Int. Conf. CSCI, 1460-1461 (2018). 10.1109/CSCI46756.2018.00290
11. S. Lee, B. Ko, K. Lee, I.-C. Yoo, and D. Yook, "Many-to-many voice conversion using conditional cycle-consistent adversarial networks," Proc. IEEE ICASSP, 6279-6283 (2020). 10.1109/ICASSP40776.2020.9053726
12. H. Kameoka, T. Kaneko, K. Tanaka, and N. Hojo, "StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks," Proc. IEEE Workshop on SLT, 266-273 (2018). 10.1109/SLT.2018.8639535
13. T. Kaneko, H. Kameoka, K. Tanaka, and N. Hojo, "StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion," Proc. Interspeech, 679-683 (2019). 10.21437/Interspeech.2019-2236
14. C. Hsu, H. Hwang, Y. Wu, Y. Tsao, and H. Wang, "Voice conversion from non-parallel corpora using variational autoencoder," Proc. APSIPA, 1-6 (2016). 10.1109/APSIPA.2016.7820786
15. A. Oord and O. Vinyals, "Neural discrete representation learning," Proc. NIPS, 6309-6318 (2017).
16. C. Hsu, H. Hwang, Y. Wu, Y. Tsao, and H. Wang, "Voice conversion from unaligned corpora using variational autoencoding Wasserstein generative adversarial networks," Proc. Interspeech, 3364-3368 (2017). 10.21437/Interspeech.2017-63
17. H. Kameoka, T. Kaneko, K. Tanaka, and N. Hojo, "ACVAE-VC: Non-parallel voice conversion with auxiliary classifier variational autoencoder," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. 27, 1432-1443 (2019). 10.1109/TASLP.2019.2917232
18. P. Tobing, Y. Wu, T. Hayashi, K. Kobayashi, and T. Toda, "Non-parallel voice conversion with cyclic variational autoencoder," Proc. Interspeech, 674-678 (2019). 10.21437/Interspeech.2019-2307
19. D. Yook, S.-G. Leem, K. Lee, and I.-C. Yoo, "Many-to-many voice conversion using cycle-consistent variational autoencoder with multiple decoders," Proc. Odyssey: The Speaker and Language Recognition Workshop, 215-221 (2020). 10.21437/Odyssey.2020-31
20. B. Ko, Many-to-many voice conversion using cycle-consistency for Korean speech (in Korean) (Master's thesis, Korea University, 2020).
21. M. Morise, F. Yokomori, and K. Ozawa, "WORLD: A vocoder-based high-quality speech synthesis system for real-time applications," IEICE Trans. on Information and Systems, 99, 1877-1884 (2016). 10.1587/transinf.2015EDP7457
22. D. Kingma and J. Ba, "Adam: A method for stochastic optimization," Proc. ICLR, 1-13 (2015).
23. T. Toda, A. Black, and K. Tokuda, "Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory," IEEE Trans. on Audio, Speech, and Lang. Process. 15, 2222-2235 (2007). 10.1109/TASL.2007.907344
24. S. Takamichi, T. Toda, A. Black, G. Neubig, S. Sakti, and S. Nakamura, "Postfilters to modify the modulation spectrum for statistical parametric speech synthesis," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. 24, 755-767 (2016). 10.1109/TASLP.2016.2522655
Information
  • Publisher: The Acoustical Society of Korea
  • Publisher (Korean): 한국음향학회
  • Journal Title: The Journal of the Acoustical Society of Korea
  • Journal Title (Korean): 한국음향학회지
  • Volume: 41
  • No.: 3
  • Pages: 351-358
  • Received Date: 2022-03-16
  • Revised Date: 2022-04-29
  • Accepted Date: 2022-05-13