| Original | |
| STRAIGHT (analysis-synthesis) | |
| WaveGlow (analysis-synthesis) | |
| (A) Tacotron 2 + WaveGlow | |
| (B) Transformer (FNN) + WaveGlow | |
| (C) Transformer (Conv1D) + WaveGlow | |
| (D) BLSTM + WaveGlow | |
| (E) BLSTM+Taco2dec + WaveGlow | |
| (F) Proposed (0.2) + WaveGlow | |
| (G) Proposed (0.5) + WaveGlow | |
| (H) Proposed (0.7) + WaveGlow | |
| (I) Proposed (1.0) + WaveGlow | |
| (J) FastSpeech (Default) + WaveGlow | |
| (K) FastSpeech (w/o-DP) + WaveGlow | |
| (L) FastSpeech (Simple) + WaveGlow | |
| Original | |
| STRAIGHT (analysis-synthesis) | |
| WaveGlow (analysis-synthesis) | |
| WaveGlow 256 ch (analysis-synthesis) | |
| Parallel WaveGAN (analysis-synthesis) | |
| BLSTM+Taco2dec (only phoneme) + WaveGlow | |
| Transformer (only phoneme) + WaveGlow | |
| BLSTM + WaveGlow | |
| Tacotron 2 + WaveGlow | |
| BLSTM+Taco2dec + WaveGlow | |
| Transformer + WaveGlow | |
| Transformer + WaveGlow 256 ch | |
| Transformer + Parallel WaveGAN | |
| FastSpeech (Default) + WaveGlow | |
| FastSpeech (w/o-DP) + WaveGlow | |