Result Details

Prosodic Speaker Verification using Subspace Multinomial Models with Intersession Compensation

KOCKMANN, M.; BURGET, L.; GLEMBEK, O.; FERRER, L.; ČERNOCKÝ, J. Prosodic Speaker Verification using Subspace Multinomial Models with Intersession Compensation. Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010). Proceedings of Interspeech. Makuhari, Chiba, Japan: International Speech Communication Association, 2010. no. 9, p. 1061-1064. ISBN: 978-1-61782-123-3. ISSN: 1990-9772.
Type
conference paper
Language
English
Authors
Kockmann Marcel, Dipl.-Ing., Ph.D., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Glembek Ondřej, Ing., Ph.D., DCGM (FIT)
Ferrer Luciana
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Abstract

This paper proposes a novel approach to modeling prosodic features. The model is based on the idea of introducing a subspace of model parameters.

Keywords

speaker verification, prosody, JFA, multinomial model

Annotation

We propose a novel approach to modeling prosodic features. Inspired by the Joint Factor Analysis (JFA) model, our model is based on the same idea of introducing a subspace of model parameters. However, the underlying Gaussian mixture distribution of JFA is replaced by a multinomial distribution in order to model sequences of discrete units rather than continuous features. In this work, we use the subspace model as a feature extractor for support vector machines (SVMs), similar to the recently proposed JFA in total variability space. We show that the model can reduce high-dimensional count vectors to a low dimension while keeping system performance stable. With additional intersession compensation, we achieve a 30% relative improvement over the baseline system and reach an equal error rate of 8.8% on the NIST 2006 SRE dataset.
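
The idea of using a subspace multinomial model as a feature extractor can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual log-linear parameterization, in which the multinomial probabilities for one utterance are softmax(m + T w), and it assumes that the offset m and subspace matrix T are already given (here they are random placeholders). The low-dimensional vector w is estimated from a count vector by plain gradient ascent, and all names (smm_probs, extract_w, the sizes C and R) are hypothetical.

```python
import numpy as np

def smm_probs(m, T, w):
    """Multinomial probabilities phi = softmax(m + T @ w)."""
    logits = m + T @ w
    logits = logits - logits.max()  # subtract max for numerical stability
    e = np.exp(logits)
    return e / e.sum()

def log_likelihood(counts, m, T, w):
    """Multinomial log-likelihood of the counts (constant term omitted)."""
    return float(counts @ np.log(smm_probs(m, T, w)))

def extract_w(counts, m, T, steps=500):
    """Estimate the low-dimensional vector w by gradient ascent.

    The gradient of the per-count log-likelihood is T^T (n/N - phi).
    The step size 1 / ||T||_2^2 is a conservative choice that keeps
    the ascent monotone for this concave objective.
    """
    w = np.zeros(T.shape[1])
    n_total = counts.sum()
    lr = 1.0 / np.linalg.norm(T, 2) ** 2
    for _ in range(steps):
        phi = smm_probs(m, T, w)
        w = w + lr * (T.T @ (counts / n_total - phi))
    return w

# Toy data: C discrete units, R-dimensional subspace (assumed sizes).
rng = np.random.default_rng(0)
C, R = 50, 3
m = np.log(np.full(C, 1.0 / C))          # placeholder offset (uniform)
T = 0.1 * rng.standard_normal((C, R))    # placeholder subspace matrix
w_true = rng.standard_normal(R)
counts = rng.multinomial(500, smm_probs(m, T, w_true)).astype(float)

w = extract_w(counts, m, T)  # R-dimensional feature vector for this utterance
```

In a setup like the one described, a vector such as w would replace the high-dimensional count vector as the SVM input. Training m and T, and the intersession compensation step, are outside the scope of this sketch.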

Published
2010
Pages
1061–1064
Journal
Proceedings of Interspeech, vol. 2010, no. 9, ISSN 1990-9772
Proceedings
Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)
Conference
Interspeech Conference
ISBN
978-1-61782-123-3
Publisher
International Speech Communication Association
Place
Makuhari, Chiba, Japan
BibTeX
@inproceedings{BUT34952,
  author="Marcel {Kockmann} and Lukáš {Burget} and Ondřej {Glembek} and Luciana {Ferrer} and Jan {Černocký}",
  title="Prosodic Speaker Verification using Subspace Multinomial Models with Intersession Compensation",
  booktitle="Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)",
  year="2010",
  journal="Proceedings of Interspeech",
  volume="2010",
  number="9",
  pages="1061--1064",
  publisher="International Speech Communication Association",
  address="Makuhari, Chiba, Japan",
  isbn="978-1-61782-123-3",
  issn="1990-9772",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2010/kockmann_interspeech2010_IS100048.pdf"
}
Projects
Mobile Biometry, MŠMT, Support for projects of the Seventh Framework Programme of the European Community for research, technological development and demonstration (2007 to 2013) under Act No. 171/2007 Coll., 7E08042, start: 2008-01-01, end: 2010-12-31, completed
Recognition and presentation of multimedia data, BUT, BUT internal projects, FIT-S-10-2, 2010, start: 2010-04-01, end: 2010-12-31, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funds of the Czech state budget (e.g. research plans, research centres), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Speech Recognition under Real-World Conditions, GACR, Standard projects, GA102/08/0707, start: 2008-01-01, end: 2011-12-31, completed