Peng Shen (沈鵬)

Researcher @ NICT, Japan

I am a researcher at NICT, Kyoto, Japan, working on automatic speech recognition, deep learning, spoken language identification, speaker recognition, and event detection.
I have four years of full-time system development experience as a system engineer.

Research interests: Automatic speech recognition, signal processing, and machine learning (deep learning) applied to image processing, data analysis, and natural language processing.



Work experience

2014-Present: Researcher at National Institute of Information and Communications Technology (NICT).
2007-2013: MA and Ph.D., Electronics and Information Systems Engineering, Gifu University, Japan.
2004-2007: Information Management Engineer at Lenovo Group, China.



Awards & Certifications

2015.07: Japanese-Language Proficiency Test N1.
2015.04: NICT team award.
2014.12: Ranked 1st among 8 international teams (IWSLT TED ASR evaluation).
2006.09: Lenovo R&D award of excellence.
2003.12: Award of excellence.



Publications

Journals (Peer reviewed)

  • Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai, "Regularization of neural network model with distance metric learning for i-vector based spoken language identification," in Computer Speech & Language, vol. 44, pp. 48–60, July 2017.
  • Peng Shen, Xugang Lu, Xinhui Hu, Naoyuki Kanda, Masahiro Saiko, Chiori Hori, Hisashi Kawai, "Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription," in Speech Communication, vol. 82, pp. 1-13, Sep. 2016.
  • Peng Shen, Satoshi Tamura and Satoru Hayamizu, "Multi-Stream Sparse Representation Features for Noise Robust Audio-Visual Speech Recognition," in Acoustical Science and Technology, vol. 35, no. 1, 2014.
International Conferences (Peer reviewed)

  • Peng Shen, Xugang Lu, Sheng Li, Hisashi Kawai, "Conditional Generative Adversarial Nets Classifier for Spoken Language Identification," in Proc. Interspeech, Stockholm, Sweden, Aug. 20-24, 2017. (Accepted)
  • Peng Shen, Xugang Lu, Hisashi Kawai, "Comparison of Regularization Constraints in Deep Neural Network based Speaker Adaptation," in ISCSLP, Oct. 2016.
  • Peng Shen, Xugang Lu, Hisashi Kawai, "Automatic acoustic segmentation in N-best list rescoring for lecture speech recognition," in ISCSLP, Oct. 2016.
  • Xugang Lu, Peng Shen, Hisashi Kawai, "A Pseudo-task Design in Multi-task Learning Deep Neural Network for Speaker Recognition," in ISCSLP, Oct. 2016.
  • Xugang Lu, Peng Shen, Yu Tsao, and Hisashi Kawai, "Pair-wise Distance Metric Learning of Neural Network Model for Spoken Language Identification," in Proc. Interspeech, Sep. 2016.
  • Peng Shen, Xugang Lu, Lemao Liu and Hisashi Kawai, "Local Fisher discriminant analysis for spoken language identification," in Proc. ICASSP, Mar. 2016.
  • Xugang Lu, Peng Shen, Yu Tsao, Chiori Hori and Hisashi Kawai, "Sparse representation with temporal max-smoothing for acoustic event detection," in Proc. Interspeech, pp. 1176-1180, Sep. 2015.
  • Peng Shen, Xugang Lu, Xinhui Hu, Naoyuki Kanda, Masahiro Saiko, Chiori Hori, "The NICT ASR System for IWSLT 2014," in International Workshop on Spoken Language Translation (IWSLT), Lake Tahoe, USA, pp.113-118, Dec. 2014.
  • Xugang Lu, Yu Tsao, Peng Shen, Chiori Hori, "Spectral Patch Based Sparse Coding for Acoustic Event Detection," in The 9th International Symposium on Chinese Spoken Language Processing (ISCSLP), Singapore, Sep. 2014.
  • Peng Shen, Satoshi Tamura and Satoru Hayamizu, "Audio-visual Interaction in Sparse Representation Features for Noise Robust Audio-visual Speech Recognition," in The 12th International Conference on Auditory-Visual Speech Processing (AVSP), Annecy, France, pp.43-48, Aug. 2013.
  • Peng Shen, Satoshi Tamura and Satoru Hayamizu, "Feature Reconstruction using Sparse Imputation for Noise Robust Audio-Visual Speech Recognition," in Int. Conf. APSIPA ASC, USA, ps.5-sla.18, no.125, pp.1-4, Dec. 2012.
  • Peng Shen, Satoshi Tamura and Satoru Hayamizu, "Evaluation of real-time audio-visual speech recognition," in The 9th International Conference on Auditory-Visual Speech Processing (AVSP), Hakone, Japan, pp.77-80, Oct. 2010.
Domestic Conferences

  • Peng Shen, Xugang Lu, Sheng Li and Hisashi Kawai, "cGAN-classifier: Conditional Generative Adversarial Nets for Classification," in 2017 Acoustical Society of Japan, Sep. 2017.
  • Sheng Li, Xugang Lu, Peng Shen and Hisashi Kawai, "Very deep convolutional residual network acoustic models for Japanese lecture transcription," in 2017 Acoustical Society of Japan, Sep. 2017.
  • Kak Soky, Xugang Lu, Peng Shen, Hiroaki Kato, Hisashi Kawai, Chuon Vanna, Vichet Chea, "Building WFST based Grapheme to Phoneme Conversion for Khmer," in KNLP, 2016.
  • Peng Shen, Xugang Lu and Hisashi Kawai, "Investigation on nonparametric discriminant analysis for language identification," in 2016 Spring Meeting of Acoustical Society of Japan, Mar. 2016.
  • Peng Shen, Xugang Lu, Xinhui Hu, Naoyuki Kanda, Masahiro Saiko, Chiori Hori, "The 2014 NICT Automatic Speech Recognition System," in 2015 Spring Meeting of Acoustical Society of Japan, 1-P-20, March, 2015.
  • Peng Shen, Satoshi Tamura and Satoru Hayamizu, "Feature Reconstruction using Sparse Imputation for Noise Robust Audio-Visual Speech Recognition," in 2012 Autumn Meeting of Acoustical Society of Japan, 3-P-8, pp.217-218, Sep. 2012. (In Japanese)
  • Satoshi Tamura, Peng Shen, Hiroya Okuda, Naoya Ukai, Takuya Kawasaki, Takumi Seko, Satoru Hayamizu, "Recent efforts for high-performance multimodal speech recognition," in Technical Reports. Information Processing Society of Japan, vol.112, no.369, pp.41-46, Dec. 2012. (in Japanese)
  • Peng Shen, Satoshi Tamura and Satoru Hayamizu, "Development of Real-time Audio-Visual Speech Recognition System," in 2010 Spring Meeting of Acoustical Society of Japan, 1-P-27, pp.217-218, March, 2010.
Book Translation

  • NIPS 2016 Tutorial: Generative Adversarial Networks.
      (Translation: In Chinese)
      (Original English Version: arXiv:1701.00160)