Faculty of Information Technology, BUT

Publication Details

Variational Approximation of Long-span Language Models for LVCSR

DEORAS Anoop, MIKOLOV Tomáš, KOMBRINK Stefan, KARAFIÁT Martin and KHUDANPUR Sanjeev. Variational Approximation of Long-span Language Models for LVCSR. In: Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011. Praha: IEEE Signal Processing Society, 2011, pp. 5532-5535. ISBN 978-1-4577-0537-3.
Czech title
Variační aproximace jazykových modelů s dlouhým kontextem pro LVCSR
Type
conference paper
Language
English
Authors
Deoras Anoop (JHU)
Mikolov Tomáš, Ing. (DCGM FIT BUT)
Kombrink Stefan, Dipl.-Inf -Ling (DCGM FIT BUT)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Khudanpur Sanjeev (JHU)
Keywords
Recurrent Neural Network, Language Model, Variational Inference
Abstract
We have presented experimental evidence that (n-gram) variational approximations of long-span LMs yield greater accuracy in LVCSR than standard n-gram models estimated from the same training text.
Annotation
Long-span language models that capture syntax and semantics are seldom used in the first pass of large vocabulary continuous speech recognition systems due to the prohibitive search space of sentence hypotheses. Instead, an N-best list of hypotheses is created using tractable n-gram models, and rescored using the long-span models. It is shown in this paper that computationally tractable variational approximations of the long-span models are a better choice than standard n-gram models for first pass decoding. They not only result in a better first pass output, but also produce a lattice with a lower oracle word error rate, and rescoring the N-best list from such lattices with the long-span models requires a smaller N to attain the same accuracy. Empirical results on the WSJ, MIT Lectures, NIST 2007 Meeting Recognition and NIST 2001 Conversational Telephone Recognition data sets are presented to support these claims.
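The two-pass scheme described in the annotation (a first pass produces an N-best list, which a long-span model then rescores) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the hypothesis layout, the `lm_weight` parameter, and the `toy_long_span_logprob` scoring function are all hypothetical stand-ins chosen for illustration, and a real system would call a trained long-span model (for example a recurrent neural network LM) in place of the toy function.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed layout of one first-pass hypothesis:
# (words, acoustic log-score, first-pass n-gram LM log-score)
Hypothesis = Tuple[Sequence[str], float, float]


def rescore_nbest(
    nbest: List[Hypothesis],
    long_span_logprob: Callable[[Sequence[str]], float],
    lm_weight: float = 1.0,
) -> List[Tuple[float, List[str]]]:
    """Replace the first-pass LM score with a long-span LM score and re-rank."""
    rescored = []
    for words, acoustic, _ngram_score in nbest:
        # The n-gram score is discarded; only the long-span score is used.
        total = acoustic + lm_weight * long_span_logprob(words)
        rescored.append((total, list(words)))
    # Highest combined log-score first.
    rescored.sort(key=lambda item: item[0], reverse=True)
    return rescored


def toy_long_span_logprob(words: Sequence[str]) -> float:
    """Hypothetical stand-in for a long-span LM (not a trained model)."""
    score = -1.0 * len(words)  # flat per-word penalty
    for left, right in zip(words, words[1:]):
        if (left, right) == ("recognize", "speech"):
            score += 2.0  # reward one plausible word pair
    return score


nbest = [
    (["wreck", "a", "nice", "beach"], -10.0, -4.0),
    (["recognize", "speech"], -10.5, -5.0),
]
best_score, best_words = rescore_nbest(nbest, toy_long_span_logprob)[0]
print(best_words, best_score)
```

In this toy list the first-pass ranking favors the first hypothesis, while the rescored ranking favors the second. The annotation's claim concerns how large N must be for such rescoring to reach a given accuracy, which depends on the quality of the first-pass model that generated the list.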
Published
2011
Pages
5532-5535
Proceedings
Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Conference
International Conference on Acoustics, Speech and Signal Processing 2011, Praha, CZ
ISBN
978-1-4577-0537-3
Publisher
IEEE Signal Processing Society
Place
Praha, CZ
BibTeX
@INPROCEEDINGS{FITPUB9659,
   author = "Anoop Deoras and Tom\'{a}\v{s} Mikolov and Stefan Kombrink and Martin Karafi\'{a}t and Sanjeev Khudanpur",
   title = "Variational Approximation of Long-span Language Models for LVCSR",
   pages = "5532--5535",
   booktitle = "Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011",
   year = 2011,
   location = "Praha, CZ",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-4577-0537-3",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9659"
}