Project Details
Robust Processing of Recordings for Operations and Security (Robustní zpracování nahrávek pro operativu a bezpečnost)
Project Period: 1. 10. 2020 - 30. 9. 2025
Project Type: grant
Code: VJ01010108
Agency: Ministry of Interior of the Czech Republic
Program: PROGRAM STRATEGICKÁ PODPORA ROZVOJE BEZPEČNOSTNÍHO VÝZKUMU ČR 2019-2025 (IMPAKT 1) PODPROGRAMU 1 SPOLEČNÉ VÝZKUMNÉ PROJEKTY (BV IMP1/1VS)
Keywords: speech recognition, robust, recordings, operations, security
The project aims to strengthen the competencies of two leading Czech research institutes in mining speech information from real recordings in the security domain, to unify and better coordinate their work, and to cooperate closely with the security forces so that research results are put into practice in investigation and intelligence. This goal includes advances in robust automatic speech recognition (ASR), training and adaptation of ASR systems for different environments, determining who is speaking when in a recording (diarization), and searching recordings by means of acoustic queries (Query by Example).
Diez Sánchez Mireia, M.Sc., Ph.D. (UPGM FIT VUT), team leader
Matějka Pavel, Ing., Ph.D. (UPGM FIT VUT), team leader
Szőke Igor, Ing., Ph.D. (UPGM FIT VUT), team leader
Malenovský Vladimír, Ing., Ph.D. (UPGM FIT VUT)
Mošner Ladislav, Ing. (UPGM FIT VUT)
Plchot Oldřich, Ing., Ph.D. (UPGM FIT VUT)
Schwarz Petr, Ing., Ph.D. (UPGM FIT VUT)
Silnova Anna, MSc., Ph.D. (UPGM FIT VUT)
2022
- SILNOVA Anna, STAFYLAKIS Themos, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., MATĚJKA Pavel, BURGET Lukáš, GLEMBEK Ondřej and BRUMMER Johan Nikolaas Langenhoven. Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. In: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022, pp. 9-16.
- KOCOUR Martin, UMESH Jahnavi, KARAFIÁT Martin, ŠVEC Ján, LOPEZ Fernando, BENEŠ Karel, DIEZ Sánchez Mireia, SZŐKE Igor, LUQUE Jordi, VESELÝ Karel, BURGET Lukáš and ČERNOCKÝ Jan. BCN2BRNO: ASR System Fusion for Albayzin 2022 Speech to Text Challenge. In: Proceedings of IberSpeech 2022. Granada: International Speech Communication Association, 2022, pp. 276-280.
- ALAM Jahangir, BURGET Lukáš, GLEMBEK Ondřej, MATĚJKA Pavel, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna and STAFYLAKIS Themos et al. Development of ABC systems for the 2021 edition of NIST Speaker Recognition evaluation. In: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022, pp. 346-353.
- LANDINI Federico Nicolás, LOZANO Díez Alicia, DIEZ Sánchez Mireia and BURGET Lukáš. From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 5095-5099. ISSN 1990-9772.
- MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022, pp. 7982-7986. ISBN 978-1-6654-0540-9.
- MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Multisv: Dataset for Far-Field Multi-Channel Speaker Verification. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022, pp. 7977-7981. ISBN 978-1-6654-0540-9.
- BRUMMER Johan Nikolaas Langenhoven, SWART Albert du Preez, MOŠNER Ladislav, SILNOVA Anna, PLCHOT Oldřich, STAFYLAKIS Themos and BURGET Lukáš. Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 1446-1450. ISSN 1990-9772.
- STAFYLAKIS Themos, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna, BURGET Lukáš and ČERNOCKÝ Jan. Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 605-609. ISSN 1990-9772.
2021
- LANDINI Federico Nicolás, GLEMBEK Ondřej, MATĚJKA Pavel, ROHDIN Johan A., BURGET Lukáš, DIEZ Sánchez Mireia and SILNOVA Anna. Analysis of the BUT Diarization System for Voxconverse Challenge. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 5819-5823. ISBN 978-1-7281-7605-5.
- KARAFIÁT Martin, VESELÝ Karel, ČERNOCKÝ Jan, PROFANT Ján, NYTRA Jiří, HLAVÁČEK Miroslav and PAVLÍČEK Tomáš. Analysis of X-Vectors for Low-Resource Speech Recognition. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 6998-7002. ISBN 978-1-7281-7605-5.