Publication Details

BUT System for CHiME-6 Challenge

ŽMOLÍKOVÁ Kateřina, KOCOUR Martin, LANDINI Federico Nicolás, BENEŠ Karel, KARAFIÁT Martin, VYDANA Hari K., LOZANO Díez Alicia, PLCHOT Oldřich, BASKAR Murali K., ŠVEC Ján, MOŠNER Ladislav, MALENOVSKÝ Vladimír, BURGET Lukáš, YUSUF Bolaji, NOVOTNÝ Ondřej, GRÉZL František, SZŐKE Igor and ČERNOCKÝ Jan. BUT System for CHiME-6 Challenge. In: Proceedings of CHiME 2020 Virtual Workshop. Barcelona: University of Sheffield, 2020, pp. 1-3. Available from: https://chimechallenge.github.io/chime2020-workshop/programme.html
Czech title
Systém VUT v Brně pro CHiME-6 Challenge
Type
conference paper
Language
English
Authors
Žmolíková Kateřina, Ing. (DCGM FIT BUT)
Kocour Martin, Ing. (DCGM FIT BUT)
Landini Federico Nicolás (DCGM FIT BUT)
Beneš Karel, Ing. (DCGM FIT BUT)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Vydana Hari K. (DCGM FIT BUT)
Lozano Díez Alicia, Ph.D. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Baskar Murali K. (DCGM FIT BUT)
Švec Ján, Ing. (DCGM FIT BUT)
Mošner Ladislav, Ing. (DCGM FIT BUT)
Malenovský Vladimír, Ing., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Yusuf Bolaji (DCGM FIT BUT)
Novotný Ondřej, Ing. (DCGM FIT BUT)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, doc. Dr. Ing. (DCGM FIT BUT)
URL
Keywords

diarization, neural network, acoustic model, language model, enhancement

Abstract

This paper describes BUT's efforts in developing a system for the CHiME-6 challenge with far-field dinner party recordings [1]. Our experiments cover both the diarization and speech recognition parts of the system. For diarization, we employ the VBx framework, which uses a Bayesian hidden Markov model with eigenvoice priors on x-vectors. For acoustic modeling, we explore using different subsets of data for training, different neural network architectures, discriminative training, more robust i-vectors, and semi-supervised training on VoxCeleb data. In addition, we perform experiments with a neural network-based language model, exploring how to overcome the small size of the text corpus and incorporate across-segment context. When fusing our best systems, we achieve 41.21% / 42.55% WER on Track 1 and 55.15% / 69.04% WER on Track 2, for development and evaluation, respectively.

Published
2020
Pages
1-3
Proceedings
Proceedings of CHiME 2020 Virtual Workshop
Conference
The 6th International Workshop on Speech Processing in Everyday Environments, Barcelona - Virtual Workshop - satellite event to ICASSP 2020, ES
Publisher
University of Sheffield
Place
Barcelona, ES
BibTeX
@INPROCEEDINGS{FITPUB12283,
   author = "Kate\v{r}ina \v{Z}mol\'{i}kov\'{a} and Martin Kocour and Nicol\'{a}s Federico Landini and Karel Bene\v{s} and Martin Karafi\'{a}t and K. Hari Vydana and Alicia D\'{i}ez Lozano and Old\v{r}ich Plchot and K. Murali Baskar and J\'{a}n \v{S}vec and Ladislav Mo\v{s}ner and Vladim\'{i}r Malenovsk\'{y} and Luk\'{a}\v{s} Burget and Bolaji Yusuf and Ond\v{r}ej Novotn\'{y} and Franti\v{s}ek Gr\'{e}zl and Igor Sz\H{o}ke and Jan \v{C}ernock\'{y}",
   title = "BUT System for CHiME-6 Challenge",
   pages = "1--3",
   booktitle = "Proceedings of CHiME 2020 Virtual Workshop",
   year = 2020,
   location = "Barcelona, ES",
   publisher = "University of Sheffield",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12283"
}