Result Details

Recurrent Neural Network based Language Modeling in Meeting Recognition

KOMBRINK, S.; MIKOLOV, T.; KARAFIÁT, M.; BURGET, L. Recurrent Neural Network based Language Modeling in Meeting Recognition. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. no. 8, p. 2877-2880. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
Type
conference paper
Language
English
Authors
Kombrink Stefan, Dipl.-Linguist., DCGM (FIT)
Mikolov Tomáš, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Karafiát Martin, Ing., Ph.D., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Abstract

In this paper we recommend the use of RNN language models as an easy means to improve an existing LVCSR system, either by improving n-gram models using data sampled from an RNN or by performing the proposed rescoring and adaptation postprocessing steps.

Keywords

automatic speech recognition, language modeling, recurrent neural networks, rescoring, adaptation

URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2011/kombrink_interspeech2011_792.pdf
Annotation

We use recurrent neural network (RNN) based language models to improve the BUT English meeting recognizer. On the baseline setup using the original language models, we decrease word error rate (WER) by more than 1% absolute through n-best list rescoring and language model adaptation. When n-gram language models are trained on the same moderately sized data set as the RNN models, improvements are higher, yielding a system that performs comparably to the baseline. A noticeable improvement was observed with unsupervised adaptation of RNN models. Furthermore, we examine the influence of word history on WER and show how to speed up rescoring by caching common prefix strings.
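
The prefix-caching idea mentioned in the annotation can be illustrated with a minimal sketch. This is not the authors' implementation, which this record does not describe. It assumes a language model exposed through a hypothetical `step(state, word)` function that returns the next state and the log-probability of the word. `ToyLM`, `rescore_nbest`, `lm_weight`, and all scores below are illustrative assumptions. Because the state after a prefix depends only on that prefix, hypotheses in an n-best list that share leading words can reuse the cached state and cumulative score instead of recomputing them.

```python
import math
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

# Hypothetical interface: step(state, word) -> (next_state, log P(word | history)).
StepFn = Callable[[Hashable, str], Tuple[Hashable, float]]


def rescore_nbest(
    nbest: Sequence[Tuple[Sequence[str], float]],
    step: StepFn,
    initial_state: Hashable,
    lm_weight: float = 1.0,
) -> List[Tuple[float, Tuple[str, ...]]]:
    """Rescore (words, first_pass_score) pairs, reusing LM work across shared prefixes."""
    # Maps a word prefix to (LM state after the prefix, cumulative LM log-prob).
    cache: Dict[Tuple[str, ...], Tuple[Hashable, float]] = {(): (initial_state, 0.0)}
    scored = []
    for words, first_pass in nbest:
        words = tuple(words)
        # Find the longest prefix that is already cached (the empty prefix always is).
        k = len(words)
        while words[:k] not in cache:
            k -= 1
        state, logp = cache[words[:k]]
        # Run the LM only over the uncached suffix, caching each new prefix.
        for i in range(k, len(words)):
            state, lp = step(state, words[i])
            logp += lp
            cache[words[: i + 1]] = (state, logp)
        scored.append((first_pass + lm_weight * logp, words))
    # Best (highest) combined score first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


class ToyLM:
    """Toy stand-in for an RNN LM; the 'state' is just the previous word."""

    def __init__(self) -> None:
        self.calls = 0  # counts LM evaluations so the saving is visible

    def step(self, state: Hashable, word: str) -> Tuple[Hashable, float]:
        self.calls += 1
        # Arbitrary illustrative probabilities, not taken from the paper.
        logp = math.log(0.5) if word == state else math.log(0.1)
        return word, logp


lm = ToyLM()
nbest = [
    ("we use neural networks".split(), -10.0),
    ("we use neural network".split(), -10.5),
    ("we used neural networks".split(), -11.0),
]
results = rescore_nbest(nbest, lm.step, initial_state="<s>")
# The three hypotheses contain 12 words in total, but only 8 LM steps are run,
# because "we use neural" and "we" are served from the cache.
```

The cache here is keyed on the full word prefix, which is simple but memory-hungry for long n-best lists; a trie over the hypotheses would be a natural alternative under the same assumptions.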

Published
2011
Pages
2877–2880
Journal
Proceedings of Interspeech, vol. 2011, no. 8, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2011
Conference
Interspeech Conference
ISBN
978-1-61839-270-1
Publisher
International Speech Communication Association
Place
Florence
BibTeX
@inproceedings{BUT76441,
  author="Stefan {Kombrink} and Tomáš {Mikolov} and Martin {Karafiát} and Lukáš {Burget}",
  title="Recurrent Neural Network based Language Modeling in Meeting Recognition",
  booktitle="Proceedings of Interspeech 2011",
  year="2011",
  journal="Proceedings of Interspeech",
  volume="2011",
  number="8",
  pages="2877--2880",
  publisher="International Speech Communication Association",
  address="Florence",
  isbn="978-1-61839-270-1",
  issn="1990-9772",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/kombrink_interspeech2011_792.pdf"
}
Projects
Advanced recognition and presentation of multimedia data, BUT, BUT internal projects, FIT-S-11-2, start: 2011-01-01, end: 2013-12-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funds from the Czech state budget (e.g. research plans, research centres), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed
Technologies of speech processing for efficient human-machine communication, TAČR, ALFA Programme of Applied Research and Experimental Development, TA01011328, start: 2011-01-01, end: 2014-12-31, completed
Theory and applications of phoneme posterior estimation in speech processing, GACR, Doctoral grants, GP102/09/P635, start: 2009-01-01, end: 2011-12-31, completed