Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge

ALAM, J.; BOULIANNE, G.; BURGET, L.; DAHMANE, M.; DIEZ SÁNCHEZ, M.; GLEMBEK, O.; LALONDE, M.; LOZANO DÍEZ, A.; MATĚJKA, P.; MIZERA, P.; MOŠNER, L.; NOISEUX, C.; MONTEIRO, J.; NOVOTNÝ, O.; PLCHOT, O.; ROHDIN, J.; SILNOVA, A.; SLAVÍČEK, J.; STAFYLAKIS, T.; ST-CHARLES, P.; WANG, S.; ZEINALI, H. Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge. Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Tokyo: International Speech Communication Association, 2020. no. 11, p. 289-295. ISSN: 2312-2846.

Type

conference paper

Language

English

Authors

Alam Jahangir
Boulianne Gilles
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
DAHMANE, M.
DIEZ SÁNCHEZ, M.
Glembek Ondřej, Ing., Ph.D., DCGM (FIT)
LALONDE, M.
LOZANO DÍEZ, A.
Matějka Pavel, Ing., Ph.D., DCGM (FIT)
MIZERA, P.
Mošner Ladislav, Ing., DCGM (FIT)
NOISEUX, C.
MONTEIRO, J.
Novotný Ondřej, Ing., Ph.D., DCGM (FIT)
Plchot Oldřich, Ing., Ph.D., DCGM (FIT)
Rohdin Johan Andréas, M.Sc., Ph.D., FIT (FIT), DCGM (FIT)
Silnova Anna, M.Sc., Ph.D., DCGM (FIT)
SLAVÍČEK, J.
Stafylakis Themos
ST-CHARLES, P.
Wang Shuai
Zeinali Hossein, Ph.D.

Abstract

We present a condensed description and analysis of the joint submission of ABC team for NIST SRE 2019, by BUT, CRIM, Phonexia, Omilia and UAM. We concentrate on challenges that arose during development and we analyze the results obtained on the evaluation data and on our development sets. The conversational telephone speech (CMN2) condition is challenging for current state-of-the-art systems, mainly due to the language mismatch between training and test data. We show that a combination of adversarial domain adaptation, backend adaptation and score normalization can mitigate this mismatch. On the VAST condition, we demonstrate the importance of deploying diarization when dealing with multi-speaker utterances and the drastic improvements that can be obtained by combining audio and visual modalities.

Keywords

speaker verification, NIST SRE, CMN, VAST, system fusion.

URL

https://www.isca-speech.org/archive/Odyssey_2020/abstracts/73.html

Published

2020

Pages

289–295

Journal

Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland, vol. 2020, no. 11, ISSN 2312-2846

Proceedings

Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop

Conference

Odyssey 2020: The Speaker and Language Recognition Workshop

Publisher

International Speech Communication Association

Place

Tokyo

DOI

10.21437/Odyssey.2020-41

BibTeX

@inproceedings{BUT164070,
  author="ALAM, J. and BOULIANNE, G. and BURGET, L. and DAHMANE, M. and DIEZ SÁNCHEZ, M. and GLEMBEK, O. and LALONDE, M. and LOZANO DÍEZ, A. and MATĚJKA, P. and MIZERA, P. and MOŠNER, L. and NOISEUX, C. and MONTEIRO, J. and NOVOTNÝ, O. and PLCHOT, O. and ROHDIN, J. and SILNOVA, A. and SLAVÍČEK, J. and STAFYLAKIS, T. and ST-CHARLES, P. and WANG, S. and ZEINALI, H.",
  title="Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge",
  booktitle="Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop",
  year="2020",
  journal="Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland",
  volume="2020",
  number="11",
  pages="289--295",
  publisher="International Speech Communication Association",
  address="Tokyo",
  doi="10.21437/Odyssey.2020-41",
  issn="2312-2846",
  url="https://www.isca-speech.org/archive/Odyssey_2020/abstracts/73.html"
}

Files

pdf alam_odyssey2020_73.pdf 221 kB

Projects

Employment of artificial intelligence into an emergency call reception, MV, Program bezpečnostního výzkumu ČR v letech 2015-2022 (BV III/1-VS), VI20192022169, start: 2019-07-04, end: 2022-05-31, completed
Information mining in speech acquired by distant microphones, MV, Bezpečnostní výzkum České republiky 2015-2020, VI20152020025, start: 2015-10-01, end: 2020-09-30, completed
IT4Innovations excellence in science, MŠMT, Národní program udržitelnosti II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed
Modern methods for processing, analysis and visualization of multimedia and 3D data, BUT, Vnitřní projekty VUT, FIT-S-20-6460, start: 2020-03-01, end: 2023-02-28, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty exelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Real time network, text, and speaker analytics for combating organized crime, EU, Horizon 2020, start: 2019-09-01, end: 2022-12-31, completed
Robust End-To-End SPEAKER recognition based on deep learning and attention models, EU, Horizon 2020, start: 2019-06-01, end: 2021-01-31, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)