What's new

  • 2024/2/15 This site has been released.
  • 2025/2/14 A preprint of the ICASSP 2025 paper has been added.

Proceedings of international conferences and workshops

  1. T. Ogura, T. Okamoto, Y. Ohtani, E. Cooper, T. Toda and H. Kawai,
    "Mora-level prosody prediction for text-to-speech using Japanese BERT without accentual labels,"
    in Proc. ICASSP, Apr. 2025. (accepted, to appear) [Preprint (PDF)] [Demo page]

  2. Y. Ohtani, T. Okamoto, T. Toda and H. Kawai,
    "FIRNet: Fundamental frequency controllable fast neural vocoder with trainable finite impulse response filter,"
    in Proc. ICASSP, Apr. 2024, pp. 10871–10875. [Preprint (PDF)] [Demo page]

  3. T. Okamoto, Y. Ohtani, T. Toda and H. Kawai,
    "ConvNeXt-TTS and ConvNeXt-VC: ConvNeXt-based fast end-to-end sequence-to-sequence text-to-speech and voice conversion,"
    in Proc. ICASSP, Apr. 2024, pp. 12456–12460. [Preprint (PDF)] [Demo page]

  4. T. Okamoto, H. Yamashita, Y. Ohtani, T. Toda and H. Kawai,
    "WaveNeXt: ConvNeXt-based fast neural vocoder without iSTFT layer,"
    in Proc. ASRU, Dec. 2023. [IEEE Xplore] [Preprint (PDF)] [Demo page]

  5. T. Okamoto, T. Toda and H. Kawai,
    "Multi-stream HiFi-GAN with data-driven waveform decomposition,"
    in Proc. ASRU, Dec. 2021, pp. 610–617. [IEEE Xplore] [Preprint (PDF)] [Demo page]

  6. K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, Y. Shiga and H. Kawai,
    "High-intelligibility speech synthesis for dysarthric speakers with LPCNet-based TTS and CycleVAE-based VC,"
    in Proc. ICASSP, June 2021, pp. 7058–7062. [IEEE Xplore] [Preprint (PDF)]

  7. T. Okamoto, T. Toda, Y. Shiga and H. Kawai,
    "Noise level limited sub-modeling for diffusion probabilistic vocoders,"
    in Proc. ICASSP, June 2021, pp. 6029–6033. [IEEE Xplore] [Preprint (PDF)]

  8. T. Okamoto, T. Toda, Y. Shiga and H. Kawai,
    "Transformer-based text-to-speech with weighted forced attention,"
    in Proc. ICASSP, June 2020, pp. 6729–6733. [IEEE Xplore] [Preprint (PDF)]

  9. T. Okamoto, T. Toda, Y. Shiga and H. Kawai,
    "Tacotron-based acoustic model using phoneme alignment for practical neural text-to-speech systems,"
    in Proc. ASRU, Dec. 2019, pp. 214–221. [IEEE Xplore] [Preprint (PDF)]

  10. T. Okamoto, T. Toda, Y. Shiga and H. Kawai,
    "Investigations of real-time Gaussian FFTNet and parallel WaveNet neural vocoders with simple acoustic features,"
    in Proc. ICASSP, May 2019, pp. 7020–7024. [IEEE Xplore] [Preprint (PDF)]

  11. T. Okamoto, T. Toda, Y. Shiga and H. Kawai,
    "Improving FFTNet vocoder with noise shaping and subband approaches,"
    in Proc. SLT, Dec. 2018, pp. 304–311. [IEEE Xplore] [Preprint (PDF)]

  12. T. Okamoto, K. Tachibana, T. Toda, Y. Shiga and H. Kawai,
    "An investigation of subband WaveNet vocoder covering entire audible frequency range with limited acoustic features,"
    in Proc. ICASSP, Apr. 2018, pp. 5654–5658. [IEEE Xplore] [Preprint (PDF)]

  13. T. Okamoto, K. Tachibana, T. Toda, Y. Shiga and H. Kawai,
    "Subband WaveNet with overlapped single-sideband filterbanks,"
    in Proc. ASRU, Dec. 2017, pp. 698–704. [IEEE Xplore] [Preprint (PDF)]