Result Details

Sequence Summarizing Neural Networks for Spoken Language Recognition

PEŠÁN, J.; BURGET, L.; ČERNOCKÝ, J. Sequence Summarizing Neural Networks for Spoken Language Recognition. In Proceedings of Interspeech 2016. San Francisco: International Speech Communication Association, 2016. p. 3285-3289. ISBN: 978-1-5108-3313-5.
Type
conference paper
Language
English
Authors
Jan Pešán, Lukáš Burget, Jan Černocký
Abstract

This paper explores the use of Sequence Summarizing Neural Networks (SSNNs) as a variant of deep neural networks (DNNs) for classifying sequences. In this work, they are applied to the task of spoken language recognition. Unlike other classification tasks in speech processing, where the DNN needs to produce a per-frame output, language is considered constant during an utterance. We introduce a summarization component into the DNN structure, producing one set of language posteriors per utterance. The DNN is trained with an appropriately modified gradient-descent algorithm. In our initial experiments, the SSNN results are compared to a single state-of-the-art i-vector based baseline system of similar complexity (i.e., no system fusion, etc.). For some conditions, SSNNs are able to provide performance comparable to the baseline system. A relative improvement of up to 30% is obtained with score-level fusion of the baseline and SSNN systems.
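
The idea of a summarization component can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the abstract does not say how frames are summarized, so mean pooling over time is an assumption here, and the layer sizes, the random initialization, and the names `ssnn_forward`, `init_layer`, and `affine` are hypothetical. Training with the modified gradient-descent algorithm is not shown.

```python
import math
import random


def affine(x, weights, bias):
    """Return weights @ x + bias for one input vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def ssnn_forward(frames, frame_layer, utt_layer):
    """Map a variable-length list of feature frames to one posterior vector."""
    # Frame-level part: the same layer is applied to every frame independently.
    hidden = [[math.tanh(a) for a in affine(f, *frame_layer)] for f in frames]
    # Summarization (ASSUMED to be a mean over time): one vector per utterance.
    n = len(hidden)
    summary = [sum(col) / n for col in zip(*hidden)]
    # Utterance-level part: a single set of language posteriors.
    return softmax(affine(summary, *utt_layer))


rng = random.Random(0)
feat_dim, hidden_dim, n_langs = 4, 8, 3  # toy sizes, not taken from the paper
frame_layer = init_layer(rng, hidden_dim, feat_dim)
utt_layer = init_layer(rng, n_langs, hidden_dim)

# Two utterances of different length, filled with random stand-in features.
short_utt = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for _ in range(5)]
long_utt = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for _ in range(50)]

post_short = ssnn_forward(short_utt, frame_layer, utt_layer)
post_long = ssnn_forward(long_utt, frame_layer, utt_layer)
```

Both utterances yield a posterior vector of the same size regardless of their length, which is the property the abstract describes: one set of language posteriors per utterance rather than one per frame.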

Keywords

Sequence Summarizing Neural Network, DNN, i-vectors

URL
https://www.researchgate.net/publication/307889421_Sequence_Summarizing_Neural_Networks_for_Spoken_Language_Recognition
Annotation

This paper deals with sequence summarizing neural networks for spoken language recognition.

Published
2016
Pages
3285–3289
Proceedings
Proceedings of Interspeech 2016
Conference
Interspeech Conference
ISBN
978-1-5108-3313-5
Publisher
International Speech Communication Association
Place
San Francisco
DOI
10.21437/Interspeech.2016-764
UT WoS
000409394402038
EID Scopus
BibTeX
@inproceedings{BUT131019,
  author="Jan {Pešán} and Lukáš {Burget} and Jan {Černocký}",
  title="Sequence Summarizing Neural Networks for Spoken Language Recognition",
  booktitle="Proceedings of Interspeech 2016",
  year="2016",
  pages="3285--3289",
  publisher="International Speech Communication Association",
  address="San Francisco",
  doi="10.21437/Interspeech.2016-764",
  isbn="978-1-5108-3313-5",
  url="https://www.researchgate.net/publication/307889421_Sequence_Summarizing_Neural_Networks_for_Spoken_Language_Recognition"
}
Files
Projects
Big speech data analytics for contact centers, EU, Horizon 2020, start: 2015-01-01, end: 2017-12-31, completed
DARPA Robust Automatic Transcription of Speech (RATS) - RATS Patrol II, BBN, start: 2015-02-23, end: 2017-03-31, completed
Information mining in speech acquired by distant microphones, MV, Bezpečnostní výzkum České republiky 2015-2020, VI20152020025, start: 2015-10-01, end: 2020-09-30, completed
IT4Innovations excellence in science, MŠMT, Národní program udržitelnosti II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed