Xugang Lu (卢绪刚)

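Senior Researcher @ National Institute of Information and Communications Technology (NICT)

Currently engaged in speech technology research at NICT, Japan.

Research interests: speech technology, in particular speech recognition and machine learning.

Contact: xugang dot lu at nict dot go dot jp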



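Research Career

2017-present, Doshisha University, Visiting Professor.

2009-present, National Institute of Information and Communications Technology (NICT), Expert Researcher / Senior Researcher.

2008-2009 ATR Spoken Language Communication Research Laboratories, Senior Researcher.

2003-2008 Japan Advanced Institute of Science and Technology (JAIST), Assistant Professor.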

2001-2002 McMaster University, Canada, Postdoc fellow.

1999-2001 Nanyang Technological University, Singapore, Research fellow.

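1999 Institute of Automation, Chinese Academy of Sciences, completed the program in Intelligence Science, Ph.D. (Engineering).

1990-1996 Harbin Institute of Technology, completed the programs in Electrical Engineering and Computer Science, B.S. and M.S.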



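Awards

Chinese Academy of Sciences, President's Award for Excellence, 1999.

National Institute of Information and Communications Technology (NICT), Award for Outstanding Performance, 2015.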



Publications

Journals in recent years

  • N. Kanda, X. Lu, H. Kawai, "Maximum A Posteriori based Decoding for End-to-End Acoustic Models," IEEE Trans. on Audio, Speech, and Language Processing, vol.25, no. 5, pp.1023-1034, 2017.
  • X. Lu, P. Shen, Y. Tsao, H. Kawai, "Regularization of neural network model with distance metric learning for i-vector based spoken language identification," Computer Speech & Language, vol.44, pp. 48-60, 2017.
  • Y. Lai, F. Chen, S. Wang, X. Lu, Y. Tsao, C. Lee, "A Deep Denoising Autoencoder Approach to Improving the Intelligibility of Vocoded Speech in Cochlear Implantation," IEEE Trans. on Biomedical Engineering, DOI: 10.1109/TBME.2016.2613960, 2016.
  • T. Ochiai, S. Matsuda, H. Watanabe, X. Lu, C. Hori, H. Kawai, S. Katagiri, "Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers," IEICE Trans. Info and Sys, Vol. E99.D, No. 10, pp. 2431-2443, 2016.
  • P. Shen, X. Lu, X. Hu, N. Kanda, M. Saiko, C. Hori, H. Kawai, "Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription," Elsevier, Speech Communication, vol.82, pp. 1-13, Sep, 2016.
  • S. Wang, A. Chern, Y. Tsao, J. Hung, X. Lu, Y. Lai, B. Su, "Wavelet speech enhancement based on nonnegative matrix factorization," IEEE Signal Processing Letters, vol. 23, no. 8, pp. 1101-1105, 2016.
  • S. Morita, M. Unoki, X. Lu, M. Akagi, "Robust Voice Activity Detection Based on Concept of Modulation Transfer Function in Noisy Reverberant Environments," Signal Processing Systems, vol. 82, no. 2, pp. 163-173, 2016.
  • Y. Tsao, P. Lin, T. Hu, X. Lu, "Ensemble environment modeling using affine transform group," Elsevier, Speech Communication, vol. 68, pp. 55-68, 2015.
  • Y. Tsao, X. Lu, P. Dixon, T. Hu, S. Matsuda, C. Hori, "Incorporating Local Information of the Acoustic Environments to MAP-based Feature Compensation and Acoustic Model Adaptation," Elsevier, Computer Speech and Language, vol. 28, no. 3, pp. 709-726, 2014.
  • X. Lu, M. Unoki, S. Matsuda, C. Hori, H. Kashioka, "Controlling tradeoff between approximation accuracy and complexity of a smooth function in a reproducing kernel Hilbert space for noise reduction," IEEE Trans. on Signal Processing, vol. 61, no. 3, pp. 601-610, 2013.
  • X. Lu, M. Unoki, S. Nakamura, “Subband temporal modulation envelopes and their normalization for automatic speech recognition in reverberant environments,” Elsevier, Computer Speech and Language, vol. 25, no. 3, pp. 571-584, 2011.
  • X. Lu, J. Dang, “Vowel production manifold: intrinsic factor analysis of vowel articulation,” IEEE Trans. on Audio, Speech, and Language Processing, vol. 18, no. 5, pp. 1053-1062, 2010.
  • X. Lu, S. Matsuda, M. Unoki, S. Nakamura, “Temporal modulation contrast normalization and edge-preserved smoothing for robust speech recognition,” Elsevier, Speech Communication, vol. 52, no. 1, pp. 1-11, 2010.
  • Q. Fang, S. Fujita, X. Lu, J. Dang, “A model-based investigation of activations of tongue muscles in vowel production,” Journal of Acoustical Science and Technology, vol. 30, no. 4, pp. 277-287, 2009.
  • D. Ying, M. Unoki, X. Lu, J. Dang, “Speech enhancement based on noise eigenspace projection,” IEICE, Trans. Inf. and Syst., vol. E92-D, no. 5, pp. 1137-1145, 2009.
  • X. Lu, J. Dang, “An investigation of dependencies between frequency components and speaker characteristics for text independent speaker identification,” Elsevier, Speech Communication, vol. 50, no. 4, pp. 312–322, 2008.
  • G. Wang, X. Lu, J. Dang, H. Bao, J. Kong, "A study of Mandarin Chinese Using X-ray and MRI," Journal of Chinese Phonetics, no. 2, pp. 51-58, 2009.
  • G. Wang, T. Kitamura, X. Lu, J. Dang, J. Kong, “MRI-based study on morphological and acoustic properties of mandarin sustained vowels,” J. Signal Processing, vol. 12, no. 4, pp.311-314, 2008.
  • X. Lu, M. Unoki, M. Akagi, “Comparative evaluation of modulation transfer function based blind restoration of sub-band power envelopes of speech as a front-end processor for automatic speech recognition systems,” Journal of Acoustical Science and Technology, vol. 29, no.6, pp.351-361, 2008.
  • D. Ying, S. Yu, X. Lu, J. Dang, F. Soong, “Robust voice activity detection based on noise eigenspace,” Journal of Acoustical Science and Technology, vol. 28, no. 6, pp. 413-423, 2007.
  • J. Wei, X. Lu, J. Dang, “A model based learning process for modeling coarticulation of human speech,” IEICE Trans. Inf. and Syst., vol. E90-D, no. 10, pp. 1581-1591, 2007.

Conferences in recent years

  • N. Kanda, X. Lu, H. Kawai, "Bayes risk training of CTC acoustic models in maximum a posteriori based decoding framework," ICASSP 2017, New Orleans, Louisiana, USA, 5-9 Mar. 2017.
  • S. Li, X. Lu, S. Sakai, M. Mimura, T. Kawahara, "Semi-supervised ensemble DNN model training," ICASSP 2017, New Orleans, Louisiana, USA, 5-9 Mar. 2017.
  • S. Wang, A. Chern, Y. Tsao, J. Hung, X. Lu, Y. Lai, Bo. Su, "Wavelet speech enhancement based on nonnegative matrix factorization," ICASSP 2017, New Orleans, Louisiana, USA, 5-9 Mar. 2017.
  • X. Wang, X. Lu, H. Kawai, S. Yamamoto, "F0 Contour Analysis Based on Empirical Mode Decomposition for DNN acoustic modeling in Mandarin Speech Recognition," INTERSPEECH, 2016.
  • X. Lu, P. Shen, Y. Tsao, H. Kawai, "Pair-wise Distance Metric Learning of Deep Neural Network Model for Spoken Language Identification," INTERSPEECH, 2016.
  • N. Kanda, X. Lu, H. Kawai, "Maximum a posteriori based decoding for CTC Acoustic Models," INTERSPEECH 2016.
  • N. Kanda, S. Harada, X. Lu, H. Kawai, "Investigation of Semi-supervised Acoustic Model Training based on the Committee of Heterogeneous Neural Networks," INTERSPEECH, 2016.
  • S. Fu, Y. Tsao, X. Lu, "SNR-Aware Convolutional Neural Network Modeling for Speech Enhancement," INTERSPEECH, 2016.
  • P. Shen, X. Lu, L. Liu and H. Kawai, "Local fisher discriminant analysis for spoken language identification," ICASSP, 2016.
  • T. Ochiai, S. Matsuda, H. Watanabe, X. Lu, H. Kawai, S. Katagiri, "Bottleneck Linear Transformation Network Adaptation for Speaker Adaptive Training-based Hybrid DNN-HMM Speech Recognizer," ICASSP, 2016.
  • N. Kanda, M. Tachimori, X. Lu, H. Kawai, "Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling," ASRU, 2015.
  • T. Ochiai, S. Matsuda, H. Watanabe, X. Lu, C. Hori, S. Katagiri, "Speaker adaptive training for deep neural networks embedding linear transformation networks," ICASSP, 2015.
  • X. Lu, P. Shen, Y. Tsao, C. Hori and H. Kawai, "Sparse representation with temporal max-smoothing for acoustic event detection," INTERSPEECH, 2015.
  • S. Li, X. Lu, Y. Akita, T. Kawahara, "Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation," INTERSPEECH, 2015.
  • X. Lu, Y. Tsao, S. Matsuda, C. Hori, "Sparse representation based on a bag of spectral exemplars for acoustic event detection," ICASSP, 2014.
  • T. Ochiai, S. Matsuda, X. Lu, C. Hori, S. Katagiri, "Speaker adaptive training using deep neural networks," ICASSP, 2014.
  • H. Fang, J. Huang, X. Lu, S. Wang, Y. Tsao, "Speech enhancement using segmental nonnegative matrix factorization," ICASSP, 2014.
  • X. Lu, Y. Tsao, S. Matsuda, C. Hori, "Ensemble modeling of denoising autoencoder for speech spectrum restoration," INTERSPEECH, 2014.