End-to-End Korean Speech Synthesis System Using Reformer Network 


Vol. 46, No. 2, pp. 217-224, Feb. 2021
10.7840/kics.2021.46.2.217


  Abstract

In this paper, we propose an end-to-end Korean speech synthesis system using a Reformer network. Transformer TTS shows high performance among end-to-end speech synthesis models but is memory-inefficient during training. To address this memory inefficiency, the Transformer network was replaced with a Reformer network. In addition, the imbalance between the length of the spectrogram and the length of the Korean text sequence had a negative effect on attention energy estimation. To address this imbalance, two methods were used: extending the text length by repeating text sequence samples, and extending it by adding connection information between samples. Experiments confirmed that the speech synthesis system using the Reformer network can be trained with relatively less memory and can generate natural speech when connection information between text samples is used.
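The abstract does not say how the text sequence is repeated, so the following is only a minimal sketch of the first length-extension idea. It assumes each text symbol is simply repeated a fixed number of times before being fed to the encoder. The function names `repeat_tokens` and `length_ratio`, the `factor` parameter, the jamo input, and the frame count are all hypothetical and chosen for illustration. The connection-information variant is not sketched because the abstract gives no detail on it.

```python
from typing import List, Sequence


def repeat_tokens(tokens: Sequence[str], factor: int) -> List[str]:
    """Repeat every token `factor` times, keeping the original order.

    Assumption: a fixed integer repetition factor. The paper may use a
    different scheme.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [tok for tok in tokens for _ in range(factor)]


def length_ratio(num_frames: int, num_tokens: int) -> float:
    """Spectrogram frames per text token, a rough measure of the imbalance."""
    if num_tokens < 1:
        raise ValueError("num_tokens must be >= 1")
    return num_frames / num_tokens


# Hypothetical input: three jamo symbols and 12 spectrogram frames.
jamo = ["ㅎ", "ㅏ", "ㄴ"]
num_frames = 12

extended = repeat_tokens(jamo, 2)

ratio_before = length_ratio(num_frames, len(jamo))      # 12 / 3
ratio_after = length_ratio(num_frames, len(extended))   # 12 / 6
```

In this toy case, repetition halves the number of spectrogram frames each text position has to cover, which is the kind of length balancing the abstract describes.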


  Cite this article

[IEEE Style]

H. R. Ihm, S. J. Cheon, B. J. Choi, M. C. Kim, N. S. Kim, "End-to-End Korean Speech Synthesis System Using Reformer Network," The Journal of Korean Institute of Communications and Information Sciences, vol. 46, no. 2, pp. 217-224, 2021. DOI: 10.7840/kics.2021.46.2.217.

[ACM Style]

Hyeong Rae Ihm, Sung Jun Cheon, Byoung Jin Choi, Min Chan Kim, and Nam Soo Kim. 2021. End-to-End Korean Speech Synthesis System Using Reformer Network. The Journal of Korean Institute of Communications and Information Sciences, 46, 2, (2021), 217-224. DOI: 10.7840/kics.2021.46.2.217.

[KICS Style]

Hyeong Rae Ihm, Sung Jun Cheon, Byoung Jin Choi, Min Chan Kim, Nam Soo Kim, "End-to-End Korean Speech Synthesis System Using Reformer Network," The Journal of Korean Institute of Communications and Information Sciences, vol. 46, no. 2, pp. 217-224, 2. 2021. (https://doi.org/10.7840/kics.2021.46.2.217)