Result Details

Data selection by sequence summarizing neural network in mismatch condition training

ŽMOLÍKOVÁ, K.; KARAFIÁT, M.; VESELÝ, K.; DELCROIX, M.; WATANABE, S.; BURGET, L.; ČERNOCKÝ, J. Data selection by sequence summarizing neural network in mismatch condition training. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 2354-2358. ISBN: 978-1-5108-3313-5.
Type
conference paper
Language
English
Authors
Abstract

Data augmentation is a simple and efficient technique to improve the robustness of a speech recognizer when deployed in mismatched training-test conditions. Our paper proposes a new approach for selecting data with respect to similarity of acoustic conditions. The similarity is computed based on a sequence summarizing neural network which extracts vectors containing acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. The results are reported on a mismatched condition using AMI training set with the proposed data selection and CHiME3 test set.
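
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which similarity measure or selection rules were used, so cosine similarity to the mean target summary vector is an assumption here, and the function names (`cosine`, `mean_vector`, `select_utterances`) and the toy two-dimensional vectors are hypothetical. In the paper the summary vectors come from the sequence summarizing neural network; here they are simply given as lists of numbers.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_utterances(train_vectors, target_vectors, keep):
    """Return indices of the `keep` training utterances most similar to the target.

    Assumption: the target condition is represented by the mean of its
    summary vectors, and similarity is measured by cosine similarity.
    """
    target = mean_vector(target_vectors)
    scored = [(cosine(vec, target), idx) for idx, vec in enumerate(train_vectors)]
    # Sort by decreasing similarity; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: -pair[0])
    return [idx for _, idx in scored[:keep]]


# Toy summary vectors (hypothetical values, two dimensions for readability).
train = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
target = [[1.0, 0.1], [1.0, -0.1]]
selected = select_utterances(train, target, keep=2)
```

With these toy values, `selected` holds the indices of the two training vectors pointing in nearly the same direction as the target mean. Other selection rules, such as a similarity threshold instead of a fixed count, would fit the same interface.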

Keywords

Automatic speech recognition, Data augmentation, Data selection, Mismatch training condition, Sequence summarization

URL
Published
2016
Pages
2354–2358
Proceedings
Proceedings of Interspeech 2016
Conference
Interspeech Conference
ISBN
978-1-5108-3313-5
Publisher
International Speech Communication Association
Place
San Francisco
DOI
10.21437/Interspeech.2016-741
UT WoS
000409394401175
EID Scopus
BibTeX
@inproceedings{BUT132600,
  author="Kateřina {Žmolíková} and Martin {Karafiát} and Karel {Veselý} and Marc {Delcroix} and Shinji {Watanabe} and Lukáš {Burget} and Jan {Černocký}",
  title="Data selection by sequence summarizing neural network in mismatch condition training",
  booktitle="Proceedings of Interspeech 2016",
  year="2016",
  pages="2354--2358",
  publisher="International Speech Communication Association",
  address="San Francisco",
  doi="10.21437/Interspeech.2016-741",
  isbn="978-1-5108-3313-5",
  url="https://www.semanticscholar.org/paper/Data-Selection-by-Sequence-Summarizing-Neural-Zmol%C3%ADkov%C3%A1-Karafi%C3%A1t/bc1832e8b8d4e5edf987e1562b578bd9aa5e18a9"
}
Files
Projects
Applying Pilots Models for Safer Aircraft, MŠMT, Support for projects of the Seventh Framework Programme of the European Community for research, technological development and demonstration (2007 to 2013) under Act No. 171/2007 Coll., 7E13047, start: 2013-09-01, end: 2016-08-31, completed
Meeting Assistant (MINT), TAČR, ALFA Programme of applied research and experimental development, TA04011311, start: 2014-10-01, end: 2017-12-31, completed
Processing, recognition and imaging of multimedia and 3D data, BUT, BUT internal projects, FIT-S-14-2506, start: 2014-01-01, end: 2016-12-31, completed