Result detail

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

STAFYLAKIS, T.; SILNOVA, A.; ROHDIN, J.; PLCHOT, O.; BURGET, L. Challenging margin-based speaker embedding extractors by using the variational information bottleneck. In Proceedings of Interspeech 2024. Proceedings of Interspeech. Kos: International Speech Communication Association, 2024. no. 9, p. 3220-3224. ISSN: 1990-9772.
Type
conference proceedings paper
Language
English
Authors
Stafylakis Themos
Silnova Anna, M.Sc., Ph.D., UPGM (FIT)
Rohdin Johan Andréas, M.Sc., Ph.D., FIT (FIT), UPGM (FIT)
Plchot Oldřich, Ing., Ph.D., UPGM (FIT)
Burget Lukáš, doc. Ing., Ph.D., UPGM (FIT)
Abstract

Speaker embedding extractors are typically trained using a classification loss over the training speakers. During the last few years, the standard softmax/cross-entropy loss has been replaced by margin-based losses, yielding significant improvements in speaker recognition accuracy. Motivated by the fact that the margin merely reduces the logit of the target speaker during training, we consider a probabilistic framework that has a similar effect. The variational information bottleneck provides a principled mechanism for making deterministic nodes stochastic, resulting in an implicit reduction of the posterior of the target speaker. We experiment with a wide range of speaker recognition benchmarks and scoring methods and report results competitive with those obtained with the state-of-the-art Additive Angular Margin loss.

Keywords

speaker recognition, variational information bottleneck

URL
https://www.isca-archive.org/interspeech_2024/stafylakis24_interspeech.pdf
Year
2024
Pages
3220–3224
Journal
Proceedings of Interspeech, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2024
Conference
Interspeech Conference
Publisher
International Speech Communication Association
Place
Kos
DOI
10.21437/Interspeech.2024-2058
BibTeX
@inproceedings{BUT193738,
  author="Themos {Stafylakis} and Anna {Silnova} and Johan Andréas {Rohdin} and Oldřich {Plchot} and Lukáš {Burget}",
  title="Challenging margin-based speaker embedding extractors by using the variational information bottleneck",
  booktitle="Proceedings of Interspeech 2024",
  year="2024",
  journal="Proceedings of Interspeech",
  volume="2024",
  number="9",
  pages="3220--3224",
  publisher="International Speech Communication Association",
  address="Kos",
  doi="10.21437/Interspeech.2024-2058",
  issn="1990-9772",
  url="https://www.isca-archive.org/interspeech_2024/stafylakis24_interspeech.pdf"
}
Projects
Tools for combating voice DeepFakes, MV, Security Research Programme of the Czech Republic 2021-2026: development, testing and evaluation of new security technologies (SECTECH) - 2nd public tender, VB02000060, start: 2024-01-01, end: 2026-12-31, ongoing
Exchanges for Speech Research and Technologies, EU, Horizon 2020, start: 2021-01-01, end: 2025-12-31, ongoing