Result Details
BUT 2014 Babel System: Analysis of adaptation in NN based systems
Grézl František, Ing., Ph.D., DCGM (FIT)
Veselý Karel, Ing., Ph.D., DCGM (FIT)
Hannemann Mirko, Ph.D., DCGM (FIT)
Szőke Igor, Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
The paper describes the BUT 2014 Babel system and analyzes adaptation in neural-network-based systems.
speech recognition, discriminative training, bottle-neck neural networks, deep neural networks, adaptation of neural networks, fundamental frequency
Features based on a hierarchy of neural networks with compressive layers - Stacked Bottle-Neck (SBN) features - were recently shown to provide excellent performance in LVCSR systems. This paper summarizes several techniques investigated in our work towards the Babel 2014 evaluations: (1) using several versions of fundamental frequency (F0) estimates, (2) semi-supervised training on un-transcribed data, and, most importantly, (3) adapting the NN structure at different levels. They are tested on three 2014 Babel languages with full GMM- and DNN-based systems. Separately and in combination, they are shown to outperform the baselines and confirm the usefulness of bottle-neck features in current ASR systems.
@inproceedings{BUT111667,
author="Martin {Karafiát} and František {Grézl} and Karel {Veselý} and Mirko {Hannemann} and Igor {Szőke} and Jan {Černocký}",
title="BUT 2014 Babel System: Analysis of adaptation in NN based systems",
booktitle="Proceedings of Interspeech 2014",
year="2014",
pages="3002--3006",
publisher="International Speech Communication Association",
address="Singapore",
isbn="978-1-63439-435-2",
url="http://www.isca-speech.org/archive/interspeech_2014/i14_3002.html"
}
IARPA Building Speech Recognition for Keyword Search in a New Language in a Week with Limited Training Data (BABEL) - Babelon, BBN, start: 2012-03-05, end: 2016-11-04, completed
Speech recognition for low-resource languages, GACR, Postdoctoral grants, GPP202/12/P604, start: 2012-01-01, end: 2014-12-31, completed
Technologies of speech processing for efficient human-machine communication, TAČR, ALFA Programme of Applied Research and Experimental Development, TA01011328, start: 2011-01-01, end: 2014-12-31, completed
Processing, recognition and imaging of multimedia and 3D data, BUT, BUT internal projects, FIT-S-14-2506, start: 2014-01-01, end: 2016-12-31, completed