Result Details

Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge

MATĚJKA, P.; PLCHOT, O.; ZEINALI, H.; MOŠNER, L.; SILNOVA, A.; BURGET, L.; NOVOTNÝ, O.; GLEMBEK, O. Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge. In Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. no. 9, p. 2448-2452. ISSN: 1990-9772.

Type

conference paper

Language

English

Authors

Matějka Pavel, Ing., Ph.D., DCGM (FIT)
Plchot Oldřich, Ing., Ph.D., DCGM (FIT)
Zeinali Hossein, Ph.D., DCGM (FIT)
Mošner Ladislav, Ing., DCGM (FIT)
Silnova Anna, M.Sc., Ph.D., DCGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Novotný Ondřej, Ing., Ph.D., DCGM (FIT)
Glembek Ondřej, Ing., Ph.D., DCGM (FIT)

Abstract

This paper is a post-evaluation analysis of our efforts in the VOiCES 2019 Speaker Recognition challenge. All systems in the fixed condition are based on x-vectors with different features and DNN topologies. The single best system reaches a minDCF of 0.38 (5.25% EER), and a fusion of 3 systems yields a minDCF of 0.34 (4.87% EER). We also analyze how speaker verification (SV) systems evolved in the last few years and show results on the SITW 2016 Challenge as well. EER on the core-core condition of the SITW 2016 challenge dropped from 5.85% to 1.65% for system fusions submitted for SITW 2016 and VOiCES 2019, respectively. The less restrictive open condition allowed us to use external data for PLDA adaptation and achieve an additional small performance improvement. In our submission to the open condition, we used three x-vector systems and one system based on i-vectors.

Keywords

Far-Field Scenarios, analysis, voices

URL

https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2471.pdf

Published

2019

Pages

2448–2452

Journal

Proceedings of Interspeech, vol. 2019, no. 9, ISSN 1990-9772

Proceedings

Proceedings of Interspeech

Conference

Interspeech Conference

Publisher

International Speech Communication Association

Place

Graz

DOI

10.21437/Interspeech.2019-2471

UT WoS

000831796402122

EID Scopus

2-s2.0-85074694077

BibTeX

@inproceedings{BUT159997,
  author="Pavel {Matějka} and Oldřich {Plchot} and Hossein {Zeinali} and Ladislav {Mošner} and Anna {Silnova} and Lukáš {Burget} and Ondřej {Novotný} and Ondřej {Glembek}",
  title="Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge",
  booktitle="Proceedings of Interspeech",
  year="2019",
  journal="Proceedings of Interspeech",
  volume="2019",
  number="9",
  pages="2448--2452",
  publisher="International Speech Communication Association",
  address="Graz",
  doi="10.21437/Interspeech.2019-2471",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2471.pdf"
}

Files

pdf matejka_is2019_192471.pdf 200 kB

Projects

DARPA Low Resource Languages for Emergent Incidents (LORELEI) - Exploiting Language Information for Situational Awareness (ELISA), University of Southern California, start: 2015-09-01, end: 2020-03-31, completed
Improving Robustness in Automatic Speaker Recognition, GACR, Juniorské granty, GJ17-23870Y, start: 2017-01-01, end: 2019-12-31, completed
Information mining in speech acquired by distant microphones, MV, Bezpečnostní výzkum České republiky 2015-2020, VI20152020025, start: 2015-10-01, end: 2020-09-30, completed
IT4Innovations excellence in science, MŠMT, Národní program udržitelnosti II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed
Neural networks for signal processing and speech data mining, TAČR, Program na podporu aplikovaného výzkumu ZÉTA, TJ01000208, start: 2018-01-01, end: 2019-12-31, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty excelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust SPEAKER DIariazation systems using Bayesian inferenCE and deep learning methods, EU, Horizon 2020, start: 2017-03-01, end: 2019-02-28, completed
Sequence summarizing neural networks for speaker recognition, EU, Horizon 2020, 5SA15094, start: 2016-07-01, end: 2019-06-30, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)