Faculty of Information Technology, BUT

Publication Details

BUT OpenSAT 2017 speech recognition system

KARAFIÁT Martin, BASKAR Murali K., SZŐKE Igor, MALENOVSKÝ Vladimír, VESELÝ Karel, GRÉZL František, BURGET Lukáš and ČERNOCKÝ Jan. BUT OpenSAT 2017 speech recognition system. In: Proceedings of Interspeech 2018. Hyderabad: International Speech Communication Association, 2018, pp. 2638-2642. ISSN 1990-9772. Available from: https://www.isca-speech.org/archive/Interspeech_2018/abstracts/2457.html
Czech title
VUT systém rozpoznávání řeči pro OpenSAT 2017
Type
conference paper
Language
English
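Authors
KARAFIÁT Martin, BASKAR Murali K., SZŐKE Igor, MALENOVSKÝ Vladimír, VESELÝ Karel, GRÉZL František, BURGET Lukáš, ČERNOCKÝ Jan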
URL
Keywords
speech recognition, multilingual training, BLSTM, data augmentation, robustness
Abstract
This paper describes the BUT automatic speech recognition (ASR) systems for two domains in the OpenSAT evaluations: Low Resourced Languages and Public Safety Communications. The first domain was challenging because of the lack of training data, so multilingual approaches to BLSTM training were employed, together with the recently published Residual Memory Networks, which require less training data. Combining both approaches led to superior performance. The second domain was challenging because of the extreme recording conditions: a specific channel, speakers under stress, and high noise levels. Data augmentation proved very important for reaching reasonably good performance.
Published
2018
Pages
2638-2642
Journal
Proceedings of Interspeech, vol. 2018, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2018
Conference
Interspeech 2018, Hyderabad, India, IN
Publisher
International Speech Communication Association
Place
Hyderabad, IN
DOI
10.21437/Interspeech.2018-2457
BibTeX
@INPROCEEDINGS{FITPUB11838,
   author = "Martin Karafi\'{a}t and K. Murali Baskar and Igor Sz\H{o}ke and Vladim\'{i}r Malenovsk\'{y} and Karel Vesel\'{y} and Franti\v{s}ek Gr\'{e}zl and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y}",
   title = "BUT OpenSAT 2017 speech recognition system",
   pages = "2638--2642",
   booktitle = "Proceedings of Interspeech 2018",
   journal = "Proceedings of Interspeech",
   volume = 2018,
   number = 9,
   year = 2018,
   location = "Hyderabad, IN",
   publisher = "International Speech Communication Association",
   ISSN = "1990-9772",
   doi = "10.21437/Interspeech.2018-2457",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11838"
}