Result Details

Convolutive Bottleneck Network Features for LVCSR

VESELÝ, K.; KARAFIÁT, M.; GRÉZL, F. Convolutive Bottleneck Network Features for LVCSR. Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011. p. 42-47. ISBN: 978-1-4673-0366-8.
Type
conference paper
Language
English
Authors
Karel Veselý, Martin Karafiát, František Grézl
Abstract

Workshop paper on novel features for a tandem LVCSR system, based on a Convolutive Bottleneck Network. It extends previous work on the Universal Context network by using a linear bottleneck and by expanding the architecture into a Convolutive Bottleneck Network, so that all the parameters are trained together.
Keywords

Bottleneck features, Tandem LVCSR system, linear bottleneck, Convolutional Bottleneck Network

URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2011/vesely_asru2011_00042.pdf
Annotation

In this paper, we focus on improvements of the bottleneck ANN in a Tandem LVCSR system. First, the influence of the training set size and the ANN size is evaluated. Second, a very positive effect of the linear bottleneck is shown. Finally, a Convolutive Bottleneck Network is proposed as an extension of the current state-of-the-art Universal Context Network. The proposed training method leads to a 5.5% relative reduction of WER compared to the Universal Context ANN baseline. The relative improvement compared to the 5-layer single-bottleneck network is 17.7%. The dataset ctstrain07, composed of more than 2000 hours of English Conversational Telephone Speech, was used for the experiments. The TNet toolkit with a CUDA GPGPU implementation was used for fast training.
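To make the idea of a linear bottleneck concrete, the following is a minimal sketch of a 5-layer single-bottleneck network in which the bottleneck layer has no nonlinearity and its activations are taken as features. It is an illustration under stated assumptions, not the paper's implementation: the layer sizes, the random weights, and the names `init_params` and `forward` are all hypothetical, and NumPy stands in for the TNet toolkit.

```python
import numpy as np

# All sizes are illustrative assumptions, not values taken from the paper.
N_IN, N_HID, N_BN, N_OUT = 120, 256, 30, 40


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    z = x - x.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def init_params(rng):
    """Random weights for input -> hidden -> bottleneck -> hidden -> output."""
    sizes = [N_IN, N_HID, N_BN, N_HID, N_OUT]
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(x, params):
    """Return (bottleneck features, output posteriors) for a batch of frames."""
    (w1, b1), (w2, b2), (w3, b3), (w4, b4) = params
    h1 = sigmoid(x @ w1 + b1)
    bn = h1 @ w2 + b2              # linear bottleneck: affine transform only
    h2 = sigmoid(bn @ w3 + b3)
    post = softmax(h2 @ w4 + b4)
    return bn, post


rng = np.random.default_rng(0)
params = init_params(rng)
frames = rng.normal(size=(4, N_IN))   # four hypothetical input frames
feats, post = forward(frames, params)
print(feats.shape, post.shape)
```

In a tandem setup, `feats` would be passed on as input features to the recognizer, while `post` is only needed while training the network.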

Published
2011
Pages
42–47
Proceedings
Proceedings of ASRU 2011
Conference
IEEE 2011 Workshop on Automatic Speech Recognition and Understanding
ISBN
978-1-4673-0366-8
Publisher
IEEE Signal Processing Society
Place
Big Island, Hawaii
BibTeX
@inproceedings{BUT76443,
  author="Karel {Veselý} and Martin {Karafiát} and František {Grézl}",
  title="Convolutive Bottleneck Network Features for LVCSR",
  booktitle="Proceedings of ASRU 2011",
  year="2011",
  pages="42--47",
  publisher="IEEE Signal Processing Society",
  address="Big Island, Hawaii",
  isbn="978-1-4673-0366-8",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/vesely_asru2011_00042.pdf"
}
Projects
Multilingual recognition and search in speech for electronic dictionaries, MPO, TIP, FR-TI1/034, start: 2009-09-01, end: 2013-08-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funding from the state budget of the Czech Republic (e.g. VZ, VC), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed
Technologies of speech processing for efficient human-machine communication, TAČR, ALFA Programme of Applied Research and Experimental Development, TA01011328, start: 2011-01-01, end: 2014-12-31, completed
Theory and applications of phoneme posterior estimation in speech processing, GACR, Doctoral grants, GP102/09/P635, start: 2009-01-01, end: 2011-12-31, completed