Bayesian Learning for Domain-Invariant Speaker Verification and Anti-Spoofing

LI, J.; MAK, M.; ROHDIN, J.; LEE, K.; HERMANSKY, H. Bayesian Learning for Domain-Invariant Speaker Verification and Anti-Spoofing. In Proceedings of the Annual Conference of the International Speech Communication Association Interspeech. Interspeech. Rotterdam: International Speech Communication Association, 2025. p. 1123-1127.

Type

conference paper

Language

English

Authors

Li Jin
Mak Man Wai
Rohdin Johan Andréas, M.Sc., Ph.D., FIT (FIT), DCGM (FIT)
Lee Kong Aik
Heřmanský Hynek, prof. Ing., Dr. Eng., DCGM (FIT)

Abstract

The performance of automatic speaker verification (ASV) and anti-spoofing drops seriously under real-world domain mismatch conditions. The relaxed instance frequency-wise normalization (RFN), which normalizes the frequency components based on the feature statistics along the time and channel axes, is a promising approach to reducing the domain dependence in the feature maps of a speaker embedding network. We advocate that the different frequencies should receive different weights and that the weights' uncertainty due to domain shift should be accounted for. To these ends, we propose leveraging variational inference to model the posterior distribution of the weights, which results in Bayesian weighted RFN (BWRFN). This approach overcomes the limitations of fixed-weight RFN, making it more effective under domain mismatch conditions. Extensive experiments on cross-dataset ASV, cross-TTS anti-spoofing, and spoofing-robust ASV show that BWRFN is significantly better than WRFN and RFN.
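The normalization idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes feature maps shaped (batch, channel, frequency, time), assumes the "relaxed" part mixes the frequency-wise normalization with a layer-style normalization through a hypothetical coefficient `lam`, and assumes a diagonal Gaussian over the per-frequency weights sampled with the reparameterization trick. All function and parameter names are hypothetical.

```python
import numpy as np

def frequency_wise_norm(x, eps=1e-5):
    # x: (batch, channel, freq, time). Statistics are taken over the channel
    # and time axes, separately for each instance and each frequency bin.
    mean = x.mean(axis=(1, 3), keepdims=True)
    var = x.var(axis=(1, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Statistics over all non-batch axes (assumed form of the relaxation term).
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def sample_freq_weights(mu, log_sigma, rng):
    # Assumed Gaussian posterior: w = mu + sigma * eps, with eps ~ N(0, I).
    return mu + np.exp(log_sigma) * rng.standard_normal(mu.shape)

def weighted_rfn(x, freq_weights, lam=0.5):
    # Mix the two normalizations, then scale each frequency bin by its weight.
    relaxed = lam * layer_norm(x) + (1.0 - lam) * frequency_wise_norm(x)
    return relaxed * freq_weights[None, None, :, None]

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5))            # toy feature map
w = sample_freq_weights(np.ones(4), np.full(4, -2.0), rng)
y = weighted_rfn(x, w)                           # same shape as x
```

Drawing a fresh weight sample on each forward pass is one way to expose the network to weight uncertainty; a fixed-weight variant would correspond to using `mu` directly.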

Keywords

anti-spoofing | Bayesian learning | domain generalization | speaker verification

URL

https://www.isca-archive.org/interspeech_2025/li25h_interspeech.pdf

Published

2025

Pages

1123–1127

Journal

Interspeech, ISSN

Proceedings

Proceedings of the Annual Conference of the International Speech Communication Association Interspeech

Conference

Interspeech Conference

Publisher

International Speech Communication Association

Place

Rotterdam

DOI

10.21437/Interspeech.2025-655

EID Scopus

2-s2.0-105020044480

BibTeX

@inproceedings{BUT199931,
  author="Jin {Li} and Man Wai {Mak} and Johan Andréas {Rohdin} and Kong Aik {Lee} and Hynek {Heřmanský}",
  title="Bayesian Learning for Domain-Invariant Speaker Verification and Anti-Spoofing",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association Interspeech",
  year="2025",
  journal="Interspeech",
  pages="1123--1127",
  publisher="International Speech Communication Association",
  address="Rotterdam",
  doi="10.21437/Interspeech.2025-655",
  url="https://www.isca-archive.org/interspeech_2025/li25h_interspeech.pdf"
}

Projects

Linguistics, Artificial Intelligence and Language and Speech Technologies: from Research to Applications, EU, Intersectoral Cooperation (MEZISEKTOROVÁ SPOLUPRÁCE), EH23_020/0008518, start: 2025-01-01, end: 2028-12-31, running

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)
Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)