Result Details

Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS

GRÉZL, F.; KARAFIÁT, M. Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS. In Proceeding of SLT 2016. San Diego: IEEE Signal Processing Society, 2016. p. 629-636. ISBN: 978-1-5090-4903-5.
Type
conference paper
Language
English
Authors
František Grézl, Martin Karafiát
Abstract

In this paper, we have evaluated multilingual techniques for a single-source-language scenario. Since it is hard to obtain coherent multilingual corpora usable for multilingual training, using a single, well-resourced language instead is quite attractive.

Keywords

DNN topology, Stacked Bottle-neck, feature extraction, multilingual training, system porting, low resource

URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2016/grezl_slt2016_0000629.pdf
Annotation

In this paper, we analyze the feasibility of using a single well-resourced language, English, as a source language for multilingual techniques in the context of a Stacked Bottle-Neck tandem system. The effect of the amount of data and the number of tied states in the source language on the performance of the ported system is evaluated together with different porting strategies. Generally, increasing both the amount of data and the level of detail is beneficial, with the greater effect observed for increasing the number of tied states. The modified neural network structure, shown to be useful for multilingual porting, was also evaluated with its specific porting procedure. Using the original NN structure in combination with the modified adapt-adapt porting strategy was found to be the best. It achieves a relative improvement of 3.5-8.8% on a variety of target languages. These results are comparable with those obtained using multilingual NNs pretrained on 7 languages.

Published
2016
Pages
629–636
Proceedings
Proceeding of SLT 2016
Conference
2016 IEEE Workshop on Spoken Language Technology
ISBN
978-1-5090-4903-5
Publisher
IEEE Signal Processing Society
Place
San Diego
DOI
10.1109/SLT.2016.7846329
UT WoS
000399128000092
EID Scopus
BibTeX
@inproceedings{BUT132605,
  author="František {Grézl} and Martin {Karafiát}",
  title="Boosting Performance on Low-resource Languages by Standard Corpora: AN ANALYSIS",
  booktitle="Proceeding of SLT 2016",
  year="2016",
  pages="629--636",
  publisher="IEEE Signal Processing Society",
  address="San Diego",
  doi="10.1109/SLT.2016.7846329",
  isbn="978-1-5090-4903-5",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2016/grezl_slt2016_0000629.pdf"
}
Projects
IARPA Building Speech Recognition for Keyword Search in a New Language in a Week with Limited Training Data (BABEL) - Babelon, BBN, start: 2012-03-05, end: 2016-11-04, completed
IT4Innovations excellence in science, MŠMT, Národní program udržitelnosti II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed
Meeting Assistant (MINT), TAČR, Program aplikovaného výzkumu a experimentálního vývoje ALFA, TA04011311, start: 2014-10-01, end: 2017-12-31, completed
Research groups
Departments