Harmonic-Net: Fundamental frequency and speech rate controllable fast neural vocoder
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, and H. Kawai, "Harmonic-Net: Fundamental frequency and speech-rate controllable fast neural vocoder," IEEE/ACM Trans. Audio, Speech, Lang. Process. (accepted, in press)
Unseen speaker synthesis with multi-speaker model trained using JVS corpus
Normal condition
Male (jvs001)
Original
WORLD
uSFGAN
WaveNet
HiFi-GAN
HiFi-GAN (melspc)
Harmonic-Net
Harmonic-Net+
Female (jvs004)
Original
WORLD
uSFGAN
WaveNet
HiFi-GAN
HiFi-GAN (melspc)
Harmonic-Net
Harmonic-Net+
0.5 x fo condition
Male (jvs001)
WORLD
uSFGAN
HiFi-GAN
Harmonic-Net
Harmonic-Net+
Female (jvs004)
WORLD
uSFGAN
HiFi-GAN
Harmonic-Net
Harmonic-Net+
1.5 x fo condition
Male (jvs001)
WORLD
uSFGAN
HiFi-GAN
Harmonic-Net
Harmonic-Net+
Female (jvs004)
WORLD
uSFGAN
HiFi-GAN
Harmonic-Net
Harmonic-Net+
0.8 x T condition
Male (jvs001)
WORLD
WaveNet
HiFi-GAN
HiFi-GAN (melspc)
Harmonic-Net
Harmonic-Net+
Female (jvs004)
WORLD
WaveNet
HiFi-GAN
HiFi-GAN (melspc)
Harmonic-Net
Harmonic-Net+
1.5 x T condition
Male (jvs001)
WORLD
WaveNet
HiFi-GAN
HiFi-GAN (melspc)
Harmonic-Net
Harmonic-Net+
Female (jvs004)
WORLD
WaveNet
HiFi-GAN
HiFi-GAN (melspc)
Harmonic-Net
Harmonic-Net+
Full-band singing voice synthesis using Tohoku Kiritan corpus