Result Details
Semi-supervised DNN training with word selection for ASR
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
The article is about semi-supervised DNN training with word selection for Automatic Speech Recognition (ASR).
semi-supervised training, DNN, word selection, granularity of confidences
Not all questions related to the semi-supervised training of a hybrid ASR system with a DNN acoustic model have been investigated in depth. In this paper, we focus on the granularity of confidences (per-sentence, per-word, per-frame) and on how the data should be used (data selection by masks, or mini-batch SGD with confidences as weights). We then propose to re-tune the system with the manually transcribed data, using both frame CE training and sMBR training. Our preferred semi-supervised recipe, which is both simple and efficient, is as follows: we select words according to the word accuracy we obtain on the development set. This recipe, which does not rely on a grid search of the training hyperparameter, generalized well for Babel Vietnamese (transcribed 11h, untranscribed 74h), Babel Bengali (transcribed 11h, untranscribed 58h) and our custom Switchboard setup (transcribed 14h, untranscribed 95h). We obtained absolute WER improvements of 2.5% for Vietnamese, 2.3% for Bengali and 3.2% for Switchboard.
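The two ways of using untranscribed data named in the abstract (data selection by masks, and confidences as weights in mini-batch SGD) can be illustrated with a minimal sketch. This is not the authors' implementation: the word tuple layout, the function names `frame_weights` and `weighted_frame_ce`, and the `threshold` parameter are hypothetical, and the abstract does not specify how the selection criterion is derived from the development-set word accuracy.

```python
from typing import List, Sequence, Tuple

# Hypothetical layout: (start_frame, end_frame_exclusive, word_confidence)
Word = Tuple[int, int, float]


def frame_weights(words: List[Word], num_frames: int,
                  threshold: float, hard_mask: bool = True) -> List[float]:
    """Expand per-word confidences to per-frame training weights.

    hard_mask=True  -> data selection by mask: frames of words whose
                       confidence reaches `threshold` get 1.0, others 0.0.
    hard_mask=False -> the word confidence itself is used as frame weight.
    Frames not covered by any word keep weight 0.0.
    """
    weights = [0.0] * num_frames
    for start, end, conf in words:
        if hard_mask:
            w = 1.0 if conf >= threshold else 0.0
        else:
            w = conf
        for t in range(start, min(end, num_frames)):
            weights[t] = w
    return weights


def weighted_frame_ce(log_probs: Sequence[Sequence[float]],
                      targets: Sequence[int],
                      weights: Sequence[float]) -> float:
    """Frame cross-entropy where each frame's loss is scaled by its weight.

    log_probs[t][k] is the log-posterior of class k at frame t, and
    targets[t] is the class label taken from the automatic transcript.
    The sum is normalized by the total weight (0.0 if no frame is kept).
    """
    total = sum(-w * lp[y] for lp, y, w in zip(log_probs, targets, weights))
    norm = sum(weights)
    return total / norm if norm > 0 else 0.0


# Toy utterance: two recognized words over six frames.
words = [(0, 3, 0.9), (3, 5, 0.4)]
mask = frame_weights(words, num_frames=6, threshold=0.7)
soft = frame_weights(words, num_frames=6, threshold=0.7, hard_mask=False)
```

With the hard mask, only the frames of the first word contribute to the loss; with soft weights, every aligned frame contributes in proportion to its word's confidence.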
@inproceedings{BUT144493,
author="Karel {Veselý} and Lukáš {Burget} and Jan {Černocký}",
title="Semi-supervised DNN training with word selection for ASR",
booktitle="Proceedings of Interspeech 2017",
year="2017",
journal="Proceedings of Interspeech",
volume="2017",
number="08",
pages="3687--3691",
publisher="International Speech Communication Association",
address="Stockholm",
doi="10.21437/Interspeech.2017-1385",
issn="1990-9772",
url="http://www.isca-speech.org/archive/Interspeech_2017/pdfs/1385.PDF"
}
IT4Innovations excellence in science, MŠMT, National Sustainability Programme II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed
Meeting Assistant (MINT), TAČR, ALFA Programme of Applied Research and Experimental Development, TA04011311, start: 2014-10-01, end: 2017-12-31, completed
Processing, visualization and analysis of multimedia and 3D data, BUT, BUT internal projects, FIT-S-17-3984, start: 2017-03-01, end: 2020-02-29, completed