Original |
|
STRAIGHT (analysis-synthesis) |
|
WaveGlow (analysis-synthesis) |
|
(A) Tacotron 2 + WaveGlow |
|
(B) Transformer (FNN) + WaveGlow |
|
(C) Transformer (Conv1D) + WaveGlow |
|
(D) BLSTM + WaveGlow |
|
(E) BLSTM+Taco2dec + WaveGlow |
|
(F) Proposed (0.2) + WaveGlow |
|
(G) Proposed (0.5) + WaveGlow |
|
(H) Proposed (0.7) + WaveGlow |
|
(I) Proposed (1.0) + WaveGlow |
|
(J) FastSpeech (Default) + WaveGlow |
|
(K) FastSpeech (w/o-DP) + WaveGlow |
|
(L) FastSpeech (Simple) + WaveGlow |
|
Original |
|
STRAIGHT (analysis-synthesis) |
|
WaveGlow (analysis-synthesis) |
|
WaveGlow 256 ch (analysis-synthesis) |
|
Parallel WaveGAN (analysis-synthesis) |
|
BLSTM+Taco2dec (only phoneme) + WaveGlow |
|
Transformer (only phoneme) + WaveGlow |
|
BLSTM + WaveGlow |
|
Tacotron 2 + WaveGlow |
|
BLSTM+Taco2dec + WaveGlow |
|
Transformer + WaveGlow |
|
Transformer + WaveGlow 256 ch |
|
Transformer + Parallel WaveGAN |
|
FastSpeech (Default) + WaveGlow |
|
FastSpeech (w/o-DP) + WaveGlow |
|