Publication Details
End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA
ROHDIN Johan A., SILNOVA Anna, DIEZ Sánchez Mireia, PLCHOT Oldřich, MATĚJKA Pavel and BURGET Lukáš. End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA. In: Proceedings of ICASSP. Calgary: IEEE Signal Processing Society, 2018, pp. 4874-4878. ISBN 978-1-5386-4658-8.
Czech title
End-to-end DNN rozpoznávání mluvčího inspirované i-vektory a PLDA
Type
conference paper
Language
English
Authors
Rohdin Johan A., Dr. (DCGM FIT BUT)
Silnova Anna, MSc. (DCGM FIT BUT)
Diez Sánchez Mireia, M.Sc., Ph.D. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
URL
Keywords
Speaker verification, DNN, end-to-end
Abstract
Recently, several end-to-end speaker verification systems based on
deep neural networks (DNNs) have been proposed. These systems
have been proven to be competitive for text-dependent tasks as well
as for text-independent tasks with short utterances. However, for
text-independent tasks with longer utterances, end-to-end systems
are still outperformed by standard i-vector + PLDA systems. In this
work, we develop an end-to-end speaker verification system that is
initialized to mimic an i-vector + PLDA baseline. The system is
then further trained in an end-to-end manner but regularized so that
it does not deviate too far from the initial system. In this way we
mitigate overfitting which normally limits the performance of end-to-end
systems. The proposed system outperforms the i-vector +
PLDA baseline on both long and short duration utterances.
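
The abstract does not say which regularizer keeps the trained system close to its initialization. As a minimal sketch of the general idea only, the Python below assumes a squared-distance penalty between the current parameters and the initial ones, added to a task loss. The function names, the flat parameter lists, and the `weight` and `lr` values are illustrative assumptions, not the paper's method.

```python
# Sketch only: an L2 penalty toward the initial parameters is an ASSUMED
# regularizer; the abstract does not specify the one used in the paper.

def regularized_loss(task_loss, params, init_params, weight):
    """Task loss plus weight * squared distance from the initial parameters."""
    penalty = sum((p - p0) ** 2 for p, p0 in zip(params, init_params))
    return task_loss + weight * penalty


def regularized_gradient(task_grad, params, init_params, weight):
    """Gradient of regularized_loss with respect to params."""
    return [g + 2.0 * weight * (p - p0)
            for g, p, p0 in zip(task_grad, params, init_params)]


def sgd_step(params, task_grad, init_params, weight, lr):
    """One gradient step on the regularized objective."""
    grad = regularized_gradient(task_grad, params, init_params, weight)
    return [p - lr * g for p, g in zip(params, grad)]


# Hypothetical values: parameters that have drifted from their initialization.
init_params = [1.0, -2.0]   # stands in for the baseline-mimicking initial system
params = [1.5, -2.0]        # stands in for the system after some training

loss = regularized_loss(0.3, params, init_params, weight=0.1)

# With a zero task gradient, the penalty alone pulls params back toward init.
pulled = sgd_step(params, [0.0, 0.0], init_params, weight=0.1, lr=1.0)
```

A larger `weight` keeps the parameters nearer the initial system, while `weight = 0` reduces to unregularized training.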
Published
2018
Pages
4874-4878
Proceedings
Proceedings of ICASSP
Conference
IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, CA
ISBN
978-1-5386-4658-8
Publisher
IEEE Signal Processing Society
Place
Calgary, CA
DOI
10.1109/ICASSP.2018.8461958
BibTeX
@INPROCEEDINGS{FITPUB11724,
  author = "A. Johan Rohdin and Anna Silnova and Mireia S\'{a}nchez Diez and Old\v{r}ich Plchot and Pavel Mat\v{e}jka and Luk\'{a}\v{s} Burget",
  title = "End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA",
  pages = "4874--4878",
  booktitle = "Proceedings of ICASSP",
  year = 2018,
  location = "Calgary, CA",
  publisher = "IEEE Signal Processing Society",
  ISBN = "978-1-5386-4658-8",
  doi = "10.1109/ICASSP.2018.8461958",
  language = "english",
  url = "https://www.fit.vut.cz/research/publication/11724"
}