Result Details

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification

KOCKMANN, M.; FERRER, L.; BURGET, L.; ČERNOCKÝ, J. iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. no. 8, p. 265-268. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.
Type
conference paper
Language
English
Authors
Kockmann Marcel, Dipl.-Ing., Ph.D., FIT (FIT)
Ferrer Luciana
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Abstract

This publication presents the first results on the use of total variability modeling of the mean supervector space for a set of prosodic features. We show that this iVector approach outperforms the standard JFA approach originally proposed for these features. We note that this improvement over JFA is observed only when the iVectors are modeled using the PLDA back end.

Keywords

speaker verification, prosody, JFA, iVector, SMM, fusion

URL
http://www.fit.vutbr.cz/research/groups/speech/publi/2011/kockmann_interspeech2011_677.pdf
Annotation

In this paper we apply the promising iVector extraction technique followed by PLDA modeling to simple prosodic contour features. With this procedure we achieve results comparable to a system that models much more complex prosodic features using our recently proposed SMM-based iVector modeling technique. We then propose a combination of both prosodic iVectors by joint PLDA modeling that leads to significant improvements over individual systems with an EER of 5.4% on NIST SRE 2008 telephone data. Finally, we can combine these two prosodic iVector front ends with a baseline cepstral iVector system to achieve up to 21% relative reduction in new DCF.
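
For readers unfamiliar with the metrics quoted in the annotation, the following is a minimal sketch of how an equal error rate (EER) and a relative reduction of an error metric are commonly computed from verification scores. The function names and the toy scores are illustrative assumptions and are not taken from the paper; the threshold sweep is a simple approximation, not the official NIST scoring procedure.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) from two lists of scores.

    Assumption: a higher score means "more likely the same speaker".
    """
    # Use every observed score as a candidate decision threshold.
    thresholds = sorted(list(target_scores) + list(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # Miss rate: target trials rejected (score below the threshold).
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials accepted (score at or above it).
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        # Keep the operating point where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2.0
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error metric, e.g. 0.21 for a 21% reduction."""
    return (baseline - improved) / baseline


# Toy scores, made up purely for illustration.
toy_targets = [2.0, 1.5, 0.5, -0.2]
toy_nontargets = [-1.0, -0.5, 0.1, 0.8]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

On these toy scores the sketch yields an EER of 0.25, that is, 25%. The relative reduction helper expresses a change such as the one reported for the new DCF as a fraction of the baseline value.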

Published
2011
Pages
265–268
Journal
Proceedings of Interspeech, vol. 2011, no. 8, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2011
Conference
Interspeech Conference
ISBN
978-1-61839-270-1
Publisher
International Speech Communication Association
Place
Florence
BibTeX
@inproceedings{BUT76448,
  author="Marcel {Kockmann} and Luciana {Ferrer} and Lukáš {Burget} and Jan {Černocký}",
  title="iVector Fusion of Prosodic and Cepstral Features for Speaker Verification",
  booktitle="Proceedings of Interspeech 2011",
  year="2011",
  journal="Proceedings of Interspeech",
  volume="2011",
  number="8",
  pages="265--268",
  publisher="International Speech Communication Association",
  address="Florence",
  isbn="978-1-61839-270-1",
  issn="1990-9772",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/kockmann_interspeech2011_677.pdf"
}
Projects
IARPA Biometrics Exploitation Science and Technology (BEST) - Promoting Robustness in Speaker Modeling (PRISM), IARPA, start: 2009-12-07, end: 2011-12-30, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funds from the Czech state budget (e.g. VZ, VC), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed
Theory and applications of phoneme posterior estimation in speech processing, GACR, Doctoral grants, GP102/09/P635, start: 2009-01-01, end: 2011-12-31, completed