Faculty of Information Technology, BUT

Publication Details

Exploiting i-vector posterior covariances for short-duration language recognition

CUMANI Sandro, PLCHOT Oldřich and FÉR Radek. Exploiting i-vector posterior covariances for short-duration language recognition. In: Proceedings of Interspeech 2015. Dresden: International Speech Communication Association, 2015, pp. 1002-1006. ISBN 978-1-5108-1790-6. ISSN 1990-9772.
Czech title
Využití posteriorních kovariancí i-vektorů pro rozpoznávání jazyka z krátkých nahrávek
Type
conference paper
Language
English
Authors
Cumani Sandro (POLITO)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Fér Radek, Ing. (DCGM FIT BUT)
Keywords
i-vector, uncertainty, calibration, stacked bottleneck features, language identification
Abstract
This work proposes an approach that accounts for the uncertainty of the i-vector extraction process within the framework of generative Gaussian models for language recognition.
Annotation
Linear models in i-vector space have been shown to be an effective solution not only for speaker identification, but also for language recognition. The i-vector extraction process, however, is affected by several factors, such as the noise level, the acoustic content of the utterance, and the duration of the spoken segments. These factors influence both the i-vector estimate and its uncertainty, which is represented by the i-vector posterior covariance matrix. Modeling i-vector uncertainty with Probabilistic Linear Discriminant Analysis has been shown to be effective for short-duration speaker identification. This paper extends the approach to language recognition, analyzing the effects of i-vector covariances on a state-of-the-art Gaussian classifier, and proposes an effective solution for reducing the average detection cost (Cavg) for short segments.
Published
2015
Pages
1002-1006
Journal
Proceedings of Interspeech, vol. 2015, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2015
Conference
INTERSPEECH 2015, Dresden, DE
ISBN
978-1-5108-1790-6
Publisher
International Speech Communication Association
Place
Dresden, DE
BibTeX
@INPROCEEDINGS{FITPUB10967,
   author = "Sandro Cumani and Old\v{r}ich Plchot and Radek F\'{e}r",
   title = "Exploiting i-vector posterior covariances for short-duration language recognition",
   pages = "1002--1006",
   booktitle = "Proceedings of Interspeech 2015",
   journal = "Proceedings of Interspeech",
   volume = 2015,
   number = 09,
   year = 2015,
   location = "Dresden, DE",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-5108-1790-6",
   ISSN = "1990-9772",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/10967"
}