Result Details

Pairwise Discriminative Speaker Verification in the I-Vector Space

CUMANI, S.; BRUMMER, J.; BURGET, L.; LAFACE, P.; PLCHOT, O.; VASILAKAKIS, V. Pairwise Discriminative Speaker Verification in the I-Vector Space. IEEE Transactions on Audio Speech and Language Processing, 2013, vol. 2013, no. 6, p. 1217-1227. ISSN: 1558-7916.

Type

journal article

Language

English

Authors

Cumani Sandro, Ph.D.
Brummer Johan Nikolaas Langenhoven, Dr.
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Laface Pietro
Plchot Oldřich, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Vasilakakis Vasileios

Abstract

In this work we present a novel framework for discriminative training of speaker verification systems, where a trial is represented, as in the PLDA approach, by an i-vector pair, and the task is discrimination between same-speaker and different-speaker classes. This pairwise SVM approach provides a more natural paradigm to speaker verification compared to the classical one-vs-all discriminative training.

Keywords

Discriminative training, I-vector, large-scale training, probabilistic linear discriminant analysis, speaker recognition, support vector machines

Annotation

This work presents a new and efficient approach to discriminative speaker verification in the i-vector space. We illustrate the development of a linear discriminative classifier that is trained to discriminate between the hypotheses that the pair of feature vectors in a trial belongs to the same speaker or to different speakers. This approach is an alternative to the usual discriminative setup, which discriminates between one speaker and all the other speakers. We use a discriminative classifier based on a Support Vector Machine (SVM) that is trained to estimate the parameters of a symmetric quadratic function approximating a log-likelihood ratio score, without explicitly modeling the i-vector distributions as generative Probabilistic Linear Discriminant Analysis (PLDA) models do. Training these models is feasible because it is not necessary to expand the i-vector pairs, which would be expensive or even impossible even for medium-sized training sets. The results of experiments performed on the tel-tel extended core condition of the NIST 2010 Speaker Recognition Evaluation are competitive with those obtained by generative models, in terms of normalized Detection Cost Function and Equal Error Rate. Moreover, we show that it is possible to train a gender-independent discriminative model that achieves state-of-the-art accuracy, comparable to that of a gender-dependent system, saving memory and execution time in both training and testing.
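
The symmetric quadratic score mentioned in the annotation can be illustrated with a small NumPy sketch. The parameterization below (matrices `Lam` and `Gam`, vector `c`, offset `k`) is an assumption chosen for illustration, as are all function names and the random data; the record itself does not give the formula, and no SVM training is shown. The sketch only demonstrates why scores for all pairs can be obtained from a few matrix products, without building an expanded vector for each i-vector pair.

```python
import numpy as np

def pair_score(x, y, Lam, Gam, c, k):
    """Symmetric quadratic score for a single trial (x, y).

    Assumed form: x'Lam y + y'Lam x + x'Gam x + y'Gam y + (x + y)'c + k.
    """
    return (x @ Lam @ y + y @ Lam @ x
            + x @ Gam @ x + y @ Gam @ y
            + (x + y) @ c + k)

def all_pair_scores(X, Lam, Gam, c, k):
    """Scores for every pair of rows of X using matrix products only.

    No per-pair expanded feature vector is ever formed.
    """
    # Cross terms x'Lam y + y'Lam x for all pairs at once.
    cross = X @ (Lam + Lam.T) @ X.T
    # Per-vector terms x'Gam x + x'c, one value per row of X.
    single = np.einsum("id,de,ie->i", X, Gam, X) + X @ c
    return cross + single[:, None] + single[None, :] + k

# Hypothetical toy data: 4 "i-vectors" of dimension 5, random parameters.
rng = np.random.default_rng(0)
n, d = 4, 5
X = rng.standard_normal((n, d))
Lam = rng.standard_normal((d, d))
Gam = rng.standard_normal((d, d))
c = rng.standard_normal(d)
k = 0.3

S = all_pair_scores(X, Lam, Gam, c, k)
```

Here `S[i, j]` equals `pair_score(X[i], X[j], Lam, Gam, c, k)`, and `S` is symmetric by construction, so swapping the two sides of a trial does not change its score.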

Published

2013

Pages

1217–1227

Journal

IEEE Transactions on Audio Speech and Language Processing, vol. 2013, no. 6, ISSN 1558-7916

DOI

10.1109/TASL.2013.2245655

UT WoS

000316475600001

EID Scopus

2-s2.0-84897937000

BibTeX

@article{BUT103568,
  author="Sandro {Cumani} and Johan Nikolaas Langenhoven {Brummer} and Lukáš {Burget} and Pietro {Laface} and Oldřich {Plchot} and Vasileios {Vasilakakis}",
  title="Pairwise Discriminative Speaker Verification in the I-Vector Space",
  journal="IEEE Transactions on Audio Speech and Language Processing",
  year="2013",
  volume="2013",
  number="6",
  pages="1217--1227",
  doi="10.1109/TASL.2013.2245655",
  issn="1558-7916",
  url="https://ieeexplore.ieee.org/abstract/document/6466371"
}

Projects

IT4Innovations Centre of Excellence, MŠMT, Operational Programme Research and Development for Innovations, ED1.1.00/02.0070, start: 2011-01-01, end: 2015-12-31, completed
DARPA Robust Automatic Transcription of Speech (RATS) - RATS Patrol I, BBN, start: 2010-09-23, end: 2014-06-30, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional resources of the Czech state budget (e.g. research plans, research centres), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running
Support of Interdisciplinary Excellence Research Teams Establishment at BUT, EU, OP Education for Competitiveness - Support area 2.3 - Human resources in R&D, EE2.3.30.0005, start: 2012-07-01, end: 2015-06-30, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)