Noise level limited sub-modeling for diffusion probabilistic vocoders
T. Okamoto, T. Toda, Y. Shiga and H. Kawai,
"Noise level limited sub-modeling for diffusion probabilistic vocoders,"
in Proc. ICASSP, June 2021, pp. 6029–6033. [IEEE Xplore]
Audio Samples
Analysis-synthesis condition
Original
WaveGlow
Parallel WaveGAN (PWG)
WaveGrad (50 iterations)
DiffWave (50 iterations)
WaveGrad (25 iterations)
Sub-WaveGrad (25 iterations)
DiffWave (25 iterations)
Sub-DiffWave (25 iterations)
Sub-WaveGrad (6 iterations)
DiffWave (6 iterations)
Sub-DiffWave (6 iterations)
Text-to-speech condition
WaveGlow
Parallel WaveGAN (PWG)
DiffWave (25 iterations)
Sub-DiffWave (25 iterations)
DiffWave (6 iterations)
Sub-DiffWave (6 iterations)
DiffWaveGrad (10 iterations: 3 Sub-DiffWave + 7 Sub-WaveGrad); not included in the paper