Result Details

Bayesian HMM based x-vector clustering for Speaker Diarization

DIEZ SÁNCHEZ, M.; BURGET, L.; WANG, S.; ROHDIN, J.; ČERNOCKÝ, J. Bayesian HMM based x-vector clustering for Speaker Diarization. In Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. no. 9, p. 346-350. ISSN: 1990-9772.
Type
conference paper
Language
English
Authors
Mireia Diez Sánchez, Lukáš Burget, Shuai Wang, Johan Andréas Rohdin, Jan Černocký
Abstract

This paper presents a simplified version of the previously proposed diarization algorithm based on Bayesian Hidden Markov Models, which uses Variational Bayesian inference for very fast and robust clustering of x-vectors (neural network based speaker embeddings). The presented results show that this clustering algorithm provides significant improvements in diarization performance as compared to the previously used Agglomerative Hierarchical Clustering. The output of this system can be further employed as an initialization for a second stage VB diarization system, using frame-wise MFCC features as input, to obtain optimal results.

Keywords

Speaker Diarization, Variational Bayes, HMM, x-vector, DIHARD

URL
https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2813.pdf
Published
2019
Pages
346–350
Journal
Proceedings of Interspeech, vol. 2019, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech
Conference
Interspeech Conference
Publisher
International Speech Communication Association
Place
Graz
DOI
10.21437/Interspeech.2019-2813
UT WoS
000831796400070
BibTeX
@inproceedings{BUT159992,
  author="Mireia {Diez Sánchez} and Lukáš {Burget} and Shuai {Wang} and Johan Andréas {Rohdin} and Jan {Černocký}",
  title="Bayesian HMM based x-vector clustering for Speaker Diarization",
  booktitle="Proceedings of Interspeech",
  year="2019",
  journal="Proceedings of Interspeech",
  volume="2019",
  number="9",
  pages="346--350",
  publisher="International Speech Communication Association",
  address="Graz",
  doi="10.21437/Interspeech.2019-2813",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2813.pdf"
}
Projects
Information mining in speech acquired by distant microphones, MV, Bezpečnostní výzkum České republiky 2015-2020, VI20152020025, start: 2015-10-01, end: 2020-09-30, completed
IT4Innovations excellence in science, MŠMT, Národní program udržitelnosti II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty excelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust SPEAKER DIarization systems using Bayesian inferenCE and deep learning methods, EU, Horizon 2020, start: 2017-03-01, end: 2019-02-28, completed
Sequence summarizing neural networks for speaker recognition, EU, Horizon 2020, 5SA15094, start: 2016-07-01, end: 2019-06-30, completed
Zpracování, zobrazování a analýza multimediálních a 3D dat, BUT, Vnitřní projekty VUT, FIT-S-17-3984, start: 2017-03-01, end: 2020-02-29, completed