Publication detail

Sequence Summarizing Neural Network for Speaker Adaptation

VESELÝ, K.; WATANABE, S.; ŽMOLÍKOVÁ, K.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J. Sequence Summarizing Neural Network for Speaker Adaptation. In Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016. Shanghai: IEEE Signal Processing Society, 2016. p. 5315-5319. ISBN: 978-1-4799-9988-0.
Type
conference proceedings paper
Language
English
Abstract

In this paper, we propose a DNN adaptation technique in which the i-vector extractor is replaced by a Sequence Summarizing Neural Network (SSNN). Like an i-vector extractor, the SSNN produces a "summary vector" that represents an acoustic summary of an utterance. This vector is then appended to the input of the main network, and both networks are trained together by optimizing a single loss function. The i-vector and SSNN speaker adaptation methods are compared on AMI meeting data. The results show comparable performance of the two techniques on an FBANK system with frame-classification training. Moreover, appending both the i-vector and the "summary vector" to the FBANK features leads to an additional improvement, comparable to the performance of an FMLLR-adapted DNN system.
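
The data flow described in the abstract can be illustrated with a minimal NumPy sketch of the forward pass. This is not the authors' implementation: the layer sizes, the tanh hidden layers, the random initialization, and in particular the averaging of per-frame outputs over time to obtain one vector per utterance are assumptions made for illustration, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_layer(n_in, n_out):
    """Return small random weights and zero biases for one affine layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def summary_vector(feats, params):
    """Map an utterance (frames x feat_dim) to one fixed-size vector.

    Assumption: per-frame outputs are averaged over time to form the summary.
    """
    (w1, b1), (w2, b2) = params
    hidden = np.tanh(feats @ w1 + b1)
    per_frame = hidden @ w2 + b2
    return per_frame.mean(axis=0)


def main_network(feats, summary, params):
    """Append the summary vector to every frame and return class posteriors."""
    (w1, b1), (w2, b2) = params
    tiled = np.tile(summary, (feats.shape[0], 1))
    x = np.concatenate([feats, tiled], axis=1)
    hidden = np.tanh(x @ w1 + b1)
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def frame_cross_entropy(posteriors, targets):
    """Single frame-level loss that depends on both networks' parameters."""
    picked = posteriors[np.arange(len(targets)), targets]
    return float(-np.mean(np.log(picked)))


# Hypothetical dimensions, chosen only to keep the example small.
FEAT_DIM, SUMMARY_DIM, HIDDEN, N_CLASSES = 40, 8, 32, 10
ssnn_params = [init_layer(FEAT_DIM, HIDDEN), init_layer(HIDDEN, SUMMARY_DIM)]
main_params = [init_layer(FEAT_DIM + SUMMARY_DIM, HIDDEN),
               init_layer(HIDDEN, N_CLASSES)]

# One synthetic utterance of 120 frames with random frame labels.
utterance = rng.normal(size=(120, FEAT_DIM))
targets = rng.integers(0, N_CLASSES, size=120)

summary = summary_vector(utterance, ssnn_params)
posteriors = main_network(utterance, summary, main_params)
loss = frame_cross_entropy(posteriors, targets)
```

Because the loss is computed from the main network's output, which in turn depends on the summary vector, gradients of this one loss would reach the parameters of both networks. The sketch stops at the forward pass and does not show the training loop.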

Keywords

DNN, adaptation, i-vector, sequence summary, SSNN

Year
2016
Pages
5315–5319
Proceedings
Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016
Conference
41st IEEE International Conference on Acoustics, Speech and Signal Processing
ISBN
978-1-4799-9988-0
Publisher
IEEE Signal Processing Society
Location
Shanghai
DOI
10.1109/ICASSP.2016.7472692
UT WoS
000388373405093
BibTeX
@inproceedings{BUT130964,
  author="Karel {Veselý} and Shinji {Watanabe} and Kateřina {Žmolíková} and Martin {Karafiát} and Lukáš {Burget} and Jan {Černocký}",
  title="Sequence Summarizing Neural Network for Speaker Adaptation",
  booktitle="Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), 2016",
  year="2016",
  pages="5315--5319",
  publisher="IEEE Signal Processing Society",
  address="Shanghai",
  doi="10.1109/ICASSP.2016.7472692",
  isbn="978-1-4799-9988-0",
  url="https://www.fit.vut.cz/research/publication/11145/"
}
Projects
Dolování infoRmAcí z řeči Pořízené vzdÁlenými miKrofony, MV, Security Research of the Czech Republic 2015-2020, VI20152020025, start: 2015-10-01, end: 2020-09-30, completed
Meeting assistant (MINT), TAČR, ALFA Programme of Applied Research and Experimental Development, TA04011311, start: 2014-10-01, end: 2017-12-31, completed