Project detail
Neural Representations in multi-modal and multi-lingual modeling
Project period: 1 January 2019 - 31 December 2023
Project type: grant
Code: GX19-26934X
Agency: Czech Science Foundation (Grantová agentura České republiky)
Program: Grant projects of excellence in basic research EXPRO - 2019
Keywords: deep machine learning; neural networks; continuous representations; natural language processing; speech and text processing; machine translation; multimodality; multilinguality
The NEUREM3 project brings together basic research in speech processing (SP) and natural language processing (NLP), with an emphasis on multilinguality and multimodality (processing speech and text with the support of visual information). At the core of current deep learning methods lie continuous vector representations, which neural networks build on their own during training. Although neural networks often achieve excellent empirical results, knowledge and understanding of the learned representations remain insufficient. NEUREM3 aims to fill this gap by studying neural representations for units of text and speech of varying extent (from phonemes and letters up to speeches and documents), as well as representations obtained for isolated tasks and for several tasks at once (multi-tasking). NEUREM3 will improve both the architectures and the training techniques of neural networks so that they can be trained on incomplete or incoherent data.
Karafiát Martin, Ing., Ph.D. (UPGM FIT VUT), co-investigator
Veselý Karel, Ing., Ph.D. (UPGM FIT VUT), co-investigator
Baskar Murali K. (UPGM FIT VUT)
Beneš Karel, Ing. (UPGM FIT VUT)
Han Jiangyu, M.Eng. (UPGM FIT VUT)
Kesiraju Santosh (UPGM FIT VUT)
Peng Junyi, Msc. Eng. (UPGM FIT VUT)
Plchot Oldřich, Ing., Ph.D. (UPGM FIT VUT)
Rohdin Johan A., Dr. (UPGM FIT VUT)
Sarvaš Marek, Bc. (UPGM FIT VUT)
2024
- HAN Jiangyu, LANDINI Federico Nicolás, ROHDIN Johan A., DIEZ Sánchez Mireia, BURGET Lukáš, CAO Yuhang, LU Heng and ČERNOCKÝ Jan. Diacorrect: Error Correction Back-End for Speaker Diarization. In: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024, pp. 11181-11185. ISBN 979-8-3503-4485-1.
- LANDINI Federico Nicolás, DIEZ Sánchez Mireia, STAFYLAKIS Themos and BURGET Lukáš. DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors. IEEE Transactions on Audio, Speech, and Language Processing, vol. 32, no. 7, 2024, pp. 3450-3465. ISSN 1558-7916.
- KLEMENT Dominik, DIEZ Sánchez Mireia, LANDINI Federico Nicolás, BURGET Lukáš, SILNOVA Anna, DELCROIX Marc and TAWARA Naohiro. Discriminative Training of VBx Diarization. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024, pp. 11871-11875. ISBN 979-8-3503-4485-1.
- BENEŠ Karel, KOCOUR Martin and BURGET Lukáš. Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024, pp. 11276-11280. ISBN 979-8-3503-4485-1.
- PENG Junyi, DELCROIX Marc, OCHIAI Tsubasa, ASHIHARA Takanori, PLCHOT Oldřich, ARAKI Shoko and ČERNOCKÝ Jan. Probing Self-Supervised Learning Models With Target Speech Extraction. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024, pp. 535-539. ISBN 979-8-3503-7451-3.
- PENG Junyi, DELCROIX Marc, OCHIAI Tsubasa, PLCHOT Oldřich, ARAKI Shoko and ČERNOCKÝ Jan. Target Speech Extraction with Pre-Trained Self-Supervised Learning Models. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Seoul: IEEE Signal Processing Society, 2024, pp. 10421-10425. ISBN 979-8-3503-4485-1.
2023
- SILNOVA Anna, SLAVÍČEK Josef, MOŠNER Ladislav, KLČO Michal, PLCHOT Oldřich, MATĚJKA Pavel, PENG Junyi, STAFYLAKIS Themos and BURGET Lukáš. ABC System Description for NIST LRE 2022. In: Proceedings of NIST LRE 2022 Workshop. Washington DC: National Institute of Standards and Technology, 2023, pp. 1-5.
- PENG Junyi, PLCHOT Oldřich, STAFYLAKIS Themos, MOŠNER Ladislav, BURGET Lukáš and ČERNOCKÝ Jan. An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification. In: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023, pp. 555-562. ISBN 978-1-6654-7189-3.
- KESIRAJU Santosh, BENEŠ Karel, TIKHONOV Maksim and ČERNOCKÝ Jan. BUT Systems for IWSLT 2023 Marathi - Hindi Low Resource Speech Translation Task. In: 20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference. Toronto (in-person and online): Association for Computational Linguistics, 2023, pp. 227-234. ISBN 978-1-959429-84-5.
- MATĚJKA Pavel, SILNOVA Anna, SLAVÍČEK Josef, MOŠNER Ladislav, PLCHOT Oldřich, KLČO Michal, PENG Junyi, STAFYLAKIS Themos and BURGET Lukáš. Description and Analysis of ABC Submission to NIST LRE 2022. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Dublin: International Speech Communication Association, 2023, pp. 511-515. ISSN 1990-9772.
- YUSUF Bolaji, ČERNOCKÝ Jan and SARAÇLAR Murat. End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, vol. 31, no. 08, 2023, pp. 3070-3080. ISSN 2329-9290.
- STAFYLAKIS Themos, MOŠNER Ladislav, KAKOUROS Sofoklis, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations. In: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings. Doha: IEEE Signal Processing Society, 2023, pp. 1136-1143. ISBN 978-1-6654-7189-3.
- PENG Junyi, PLCHOT Oldřich, STAFYLAKIS Themos, MOŠNER Ladislav, BURGET Lukáš and ČERNOCKÝ Jan. Improving Speaker Verification with Self-Pretrained Transformer Models. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Dublin: International Speech Communication Association, 2023, pp. 5361-5365. ISSN 1990-9772.
- MOŠNER Ladislav, PLCHOT Oldřich, PENG Junyi, BURGET Lukáš and ČERNOCKÝ Jan. Multi-Channel Speech Separation with Cross-Attention and Beamforming. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Dublin: International Speech Communication Association, 2023, pp. 1693-1697. ISSN 1990-9772.
- LANDINI Federico Nicolás, DIEZ Sánchez Mireia, LOZANO Díez Alicia and BURGET Lukáš. Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization. In: Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023, pp. 1-5. ISBN 978-1-7281-6327-7.
- DELCROIX Marc, TAWARA Naohiro, DIEZ Sánchez Mireia, LANDINI Federico Nicolás, SILNOVA Anna, OGAWA Atsunori, NAKATANI Tomohiro, BURGET Lukáš and ARAKI Shoko. Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Dublin: International Speech Communication Association, 2023, pp. 3477-3481. ISSN 1990-9772.
- PENG Junyi, STAFYLAKIS Themos, GU Rongzhi, PLCHOT Oldřich, MOŠNER Ladislav, BURGET Lukáš and ČERNOCKÝ Jan. Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Rhodes Island: IEEE Signal Processing Society, 2023, pp. 1-5. ISBN 978-1-7281-6327-7.
- KAKOUROS Sofoklis, STAFYLAKIS Themos, MOŠNER Ladislav and BURGET Lukáš. Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing. In: Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023, pp. 1-5. ISBN 978-1-7281-6327-7.
- KESIRAJU Santosh, SARVAŠ Marek, PAVLÍČEK Tomáš, MACAIRE Cécile and CIUBA Alejandro. Strategies for Improving Low Resource Speech to Text Translation Relying on Pre-trained ASR Models. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Dublin: International Speech Communication Association, 2023, pp. 2148-2152. ISSN 1990-9772.
- SILNOVA Anna, BRUMMER Johan Nikolaas Langenhoven, SWART Albert du Preez and BURGET Lukáš. Toroidal Probabilistic Spherical Discriminant Analysis. In: Proceedings of ICASSP 2023. Rhodes Island: IEEE Signal Processing Society, 2023, pp. 1-5. ISBN 978-1-7281-6327-7.
- YU Dong, GONG Yifan, PICHENY Michael Alan, RAMABHADRAN Bhuvana, HAKKANI-TÜR Dilek, PRASAD Rohit, ZEN Heiga, SKOGLUND Jan, ČERNOCKÝ Jan, BURGET Lukáš and MOHAMED Abdelrahman. Twenty-Five Years of Evolution in Speech and Language Processing. IEEE Signal Processing Magazine, vol. 40, no. 5, 2023, pp. 27-39. ISSN 1558-0792.
2022
- SILNOVA Anna, STAFYLAKIS Themos, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., MATĚJKA Pavel, BURGET Lukáš, GLEMBEK Ondřej and BRUMMER Johan Nikolaas Langenhoven. Analyzing speaker verification embedding extractors and back-ends under language and channel mismatch. In: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022, pp. 9-16.
- LANDINI Federico Nicolás, PROFANT Ján, DIEZ Sánchez Mireia and BURGET Lukáš. Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks. Computer Speech and Language, vol. 71, no. 101254, 2022, pp. 1-16. ISSN 0885-2308.
- KOCOUR Martin, UMESH Jahnavi, KARAFIÁT Martin, ŠVEC Ján, LOPEZ Fernando, BENEŠ Karel, DIEZ Sánchez Mireia, SZŐKE Igor, LUQUE Jordi, VESELÝ Karel, BURGET Lukáš and ČERNOCKÝ Jan. BCN2BRNO: ASR System Fusion for Albayzin 2022 Speech to Text Challenge. In: Proceedings of IberSpeech 2022. Granada: International Speech Communication Association, 2022, pp. 276-280.
- ALAM Jahangir, BURGET Lukáš, GLEMBEK Ondřej, MATĚJKA Pavel, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna, STAFYLAKIS Themos et al. Development of ABC systems for the 2021 edition of NIST Speaker Recognition evaluation. In: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022, pp. 346-353.
- HAN Jiangyu, LONG Yanhua, BURGET Lukáš and ČERNOCKÝ Jan. DPCCN: Densely-Connected Pyramid Complex Convolutional Network for Robust Speech Separation and Extraction. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022, pp. 7292-7296. ISBN 978-1-6654-0540-9.
- LANDINI Federico Nicolás, LOZANO Díez Alicia, DIEZ Sánchez Mireia and BURGET Lukáš. From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 5095-5099. ISSN 1990-9772.
- KIŠŠ Martin, KOHÚT Jan, BENEŠ Karel and HRADIŠ Michal. Importance of Textlines in Historical Document Classification. In: Uchida, S., Barney, E., Eglin, V. (eds) Document Analysis Systems. Lecture Notes in Computer Science, vol. 13237. La Rochelle: Springer Nature Switzerland AG, 2022, pp. 158-170. ISBN 978-3-031-06554-5.
- PENG Junyi, GU Rongzhi, MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Learnable Sparse Filterbank for Speaker Verification. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 5110-5114. ISSN 1990-9772.
- MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022, pp. 7982-7986. ISBN 978-1-6654-0540-9.
- MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Multisv: Dataset for Far-Field Multi-Channel Speaker Verification. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022, pp. 7977-7981. ISBN 978-1-6654-0540-9.
- BURGET Lukáš and BOJAR Ondřej. NEUREM3 Interim Research Report. Brno: Ústav počítačové grafiky a multimédií FIT VUT v Brně, 2022.
- ONDEL Yang Lucas Antoine Francois, YUSUF Bolaji, BURGET Lukáš and SARAÇLAR Murat. Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, vol. 30, no. 5, 2022, pp. 1902-1917. ISSN 2329-9290.
- BRUMMER Johan Nikolaas Langenhoven, SWART Albert du Preez, MOŠNER Ladislav, SILNOVA Anna, PLCHOT Oldřich, STAFYLAKIS Themos and BURGET Lukáš. Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 1446-1450. ISSN 1990-9772.
- PENG Junyi, ZHANG Chunlei, ČERNOCKÝ Jan and YU Dong. Progressive contrastive learning for self-supervised text-independent speaker verification. In: Proceedings of The Speaker and Language Recognition Workshop (Odyssey 2022). Beijing: International Speech Communication Association, 2022, pp. 17-24.
- NADIMPALLI Vijaya Lakshmi V., KESIRAJU Santosh, BANKA Rohith, KETHIREDDY Rashmi and GANGASHETTY Suryakanth V. Resources and Benchmarks for Keyword Search in Spoken Audio From Low-Resource Indian Languages. IEEE Access, vol. 10, no. 2022, 2022, pp. 34789-34799. ISSN 2169-3536.
- BASKAR Murali K., HERZIG Tim, NGUYEN Diana, DIEZ Sánchez Mireia, POLZEHL Tim, BURGET Lukáš and ČERNOCKÝ Jan. Speaker adaptation for Wav2vec2 based dysarthric ASR. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 3403-3407. ISSN 1990-9772.
- EGOROVA Ekaterina, VYDANA Hari K., BURGET Lukáš and ČERNOCKÝ Jan. Spelling-Aware Word-Based End-to-End ASR. IEEE Signal Processing Letters, vol. 29, no. 29, 2022, pp. 1729-1733. ISSN 1558-2361.
- STAFYLAKIS Themos, MOŠNER Ladislav, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna, BURGET Lukáš and ČERNOCKÝ Jan. Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Incheon: International Speech Communication Association, 2022, pp. 605-609. ISSN 1990-9772.
2021
- YUSUF Bolaji, ONDEL Yang Lucas Antoine Francois, BURGET Lukáš, ČERNOCKÝ Jan and SARAÇLAR Murat. A Hierarchical Subspace Model for Language-Attuned Acoustic Unit Discovery. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 3710-3714. ISBN 978-1-7281-7605-5.
- LANDINI Federico Nicolás, GLEMBEK Ondřej, MATĚJKA Pavel, ROHDIN Johan A., BURGET Lukáš, DIEZ Sánchez Mireia and SILNOVA Anna. Analysis of the BUT Diarization System for Voxconverse Challenge. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 5819-5823. ISBN 978-1-7281-7605-5.
- KARAFIÁT Martin, VESELÝ Karel, ČERNOCKÝ Jan, PROFANT Ján, NYTRA Jiří, HLAVÁČEK Miroslav and PAVLÍČEK Tomáš. Analysis of X-Vectors for Low-Resource Speech Recognition. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 6998-7002. ISBN 978-1-7281-7605-5.
- KIŠŠ Martin, BENEŠ Karel and HRADIŠ Michal. AT-ST: Self-Training Adaptation Strategy for OCR in Domains with Limited Transcriptions. In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition - ICDAR 2021. Lecture Notes in Computer Science, vol. 12824. Lausanne: Springer Nature Switzerland AG, 2021, pp. 463-477. ISBN 978-3-030-86336-4.
- KOCOUR Martin, CÁMBARA Guillermo, LUQUE Jordi, BONET David, FARRÚS Mireia, KARAFIÁT Martin, VESELÝ Karel and ČERNOCKÝ Jan. BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge. In: Proceedings of IberSPEECH 2021. Valladolid: International Speech Communication Association, 2021, pp. 113-117.
- LANDINI Federico Nicolás, LOZANO Díez Alicia, BURGET Lukáš, DIEZ Sánchez Mireia, SILNOVA Anna, ŽMOLÍKOVÁ Kateřina, GLEMBEK Ondřej, MATĚJKA Pavel, STAFYLAKIS Themos and BRUMMER Johan Nikolaas Langenhoven. BUT System Description for The Third DIHARD Speech Diarization Challenge. In: Proceedings available at Dihard Challenge Github. on-line by LDC and University of Pennsylvania, 2021, pp. 1-5.
- BASKAR Murali K., BURGET Lukáš, WATANABE Shinji, ASTUDILLO Ramon and ČERNOCKÝ Jan. Eat: Enhanced ASR-TTS for Self-Supervised Speech Recognition. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 6753-6757. ISBN 978-1-7281-7605-5.
- PENG Junyi, QU Xiaoyang, GU Rongzhi, WANG Jianzong, XIAO Jing, BURGET Lukáš and ČERNOCKÝ Jan. Effective Phase Encoding for End-To-End Speaker Verification. In: Proceedings Interspeech 2021. Brno: International Speech Communication Association, 2021, pp. 2366-2370. ISSN 1990-9772.
- PENG Junyi, QU Xiaoyang, WANG Jianzong, GU Rongzhi, XIAO Jing, BURGET Lukáš and ČERNOCKÝ Jan. ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Brno: International Speech Communication Association, 2021, pp. 511-515. ISSN 1990-9772.
- ŽMOLÍKOVÁ Kateřina, DELCROIX Marc, BURGET Lukáš, NAKATANI Tomohiro and ČERNOCKÝ Jan. Integration of Variational Autoencoder and Spatial Clustering for Adaptive Multi-Channel Neural Speech Separation. In: 2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings. Shenzhen - virtual: IEEE Signal Processing Society, 2021, pp. 889-896. ISBN 978-1-7281-7066-4.
- VYDANA Hari K., KARAFIÁT Martin, ŽMOLÍKOVÁ Kateřina, BURGET Lukáš and ČERNOCKÝ Jan. Jointly Trained Transformers Models for Spoken Language Translation. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021, pp. 7513-7517. ISBN 978-1-7281-7605-5.
- EGOROVA Ekaterina, VYDANA Hari K., BURGET Lukáš and ČERNOCKÝ Jan. Out-of-Vocabulary Words Detection with Attention and CTC Alignments in an End-to-End ASR System. In: Proceedings Interspeech 2021. Brno: International Speech Communication Association, 2021, pp. 2901-2905. ISSN 1990-9772.
- STAFYLAKIS Themos, ROHDIN Johan A. and BURGET Lukáš. Speaker embeddings by modeling channel-wise correlations. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Brno: International Speech Communication Association, 2021, pp. 501-505. ISSN 1990-9772.
- BENEŠ Karel and BURGET Lukáš. Text Augmentation for Language Models in High Error Recognition Scenario. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Brno: International Speech Communication Association, 2021, pp. 1872-1876. ISSN 1990-9772.
- VYDANA Hari K., KARAFIÁT Martin, BURGET Lukáš and ČERNOCKÝ Jan. The IWSLT 2021 BUT Speech Translation Systems. In: Proceedings of 18th International Conference on Spoken Language Translation (IWSLT). Bangkok, on-line: Association for Computational Linguistics, 2021, pp. 75-83. ISBN 978-1-7138-3378-9.
2020
- MATĚJKA Pavel, PLCHOT Oldřich, GLEMBEK Ondřej, BURGET Lukáš, ROHDIN Johan A., ZEINALI Hossein, MOŠNER Ladislav, SILNOVA Anna, NOVOTNÝ Ondřej, DIEZ Sánchez Mireia and ČERNOCKÝ Jan. 13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE. Computer Speech and Language, vol. 2020, no. 63, pp. 1-15. ISSN 0885-2308.
- ALAM Jahangir, BOULIANNE Gilles, BURGET Lukáš, DAHMANE Mohamed, DIEZ Sánchez Mireia, GLEMBEK Ondřej, LALONDE Marc, LOZANO Díez Alicia, MATĚJKA Pavel, MIZERA Petr, MOŠNER Ladislav, NOISEUX Cédric, MONTEIRO Joao, NOVOTNÝ Ondřej, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna, SLAVÍČEK Josef, STAFYLAKIS Themos, ST-CHARLES Pierre-Luc, WANG Shuai and ZEINALI Hossein. Analysis of ABC Submission to NIST SRE 2019 CMN and VAST Challenge. In: Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Tokyo: International Speech Communication Association, 2020, pp. 289-295. ISSN 2312-2846.
- DIEZ Sánchez Mireia, BURGET Lukáš, LANDINI Federico Nicolás and ČERNOCKÝ Jan. Analysis of Speaker Diarization based on Bayesian HMM with Eigenvoice Priors. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, vol. 28, no. 1, 2020, pp. 355-368. ISSN 2329-9290.
- ZULUAGA-GOMEZ Juan, MOTLÍČEK Petr, ZHAN Qingran, VESELÝ Karel and BRAUN Rudolf. Automatic Speech Recognition Benchmark for Air-Traffic Communications. In: Proceedings of Interspeech 2020. Shanghai: International Speech Communication Association, 2020, pp. 2297-2301. ISSN 1990-9772.
- BURGET Lukáš, GLEMBEK Ondřej, LOZANO Díez Alicia, MATĚJKA Pavel, NOVOTNÝ Ondřej, PLCHOT Oldřich, PULUGUNDLA Bhargav, ROHDIN Johan A., SILNOVA Anna and VESELÝ Karel. BUT System Description to SdSV Challenge 2020. In: Proceedings of Short-duration Speaker Verification Challenge 2020 Workshop. Shanghai, on-line event of Interspeech 2020 Conference, 2020, pp. 1-5.
- LOZANO Díez Alicia, SILNOVA Anna, PULUGUNDLA Bhargav, ROHDIN Johan A., VESELÝ Karel, BURGET Lukáš, PLCHOT Oldřich, GLEMBEK Ondřej, NOVOTNÝ Ondřej and MATĚJKA Pavel. BUT Text-Dependent Speaker Verification System for SdSV Challenge 2020. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. Shanghai: International Speech Communication Association, 2020, pp. 761-765. ISSN 1990-9772.
- ROHDIN Johan A., SILNOVA Anna, DIEZ Sánchez Mireia, PLCHOT Oldřich, MATĚJKA Pavel, BURGET Lukáš and GLEMBEK Ondřej. End-to-end DNN based text-independent speaker recognition for long and short utterances. Computer Speech and Language, vol. 2020, no. 59, pp. 22-35. ISSN 0885-2308.
- WANG Shuai, ROHDIN Johan A., PLCHOT Oldřich, BURGET Lukáš, YU Kai and ČERNOCKÝ Jan. Investigation of Specaugment for Deep Speaker Embedding Learning. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Barcelona: IEEE Signal Processing Society, 2020, pp. 7139-7143. ISBN 978-1-5090-6631-5.
- SILNOVA Anna, BRUMMER Johan Nikolaas Langenhoven, ROHDIN Johan A., STAFYLAKIS Themos and BURGET Lukáš. Probabilistic embeddings for speaker diarization. In: Proceedings of Odyssey 2020 The Speaker and Language Recognition Workshop. Tokyo: International Speech Communication Association, 2020, pp. 24-31. ISSN 2312-2846.
2019
- ZEINALI Hossein, ČERNOCKÝ Jan and BURGET Lukáš. A multi purpose and large scale speech corpus in Persian and English for speaker and speech Recognition: the DeepMine database. In: IEEE Automatic Speech Recognition and Understanding Workshop - Proceedings (ASRU). Sentosa, Singapore: IEEE Signal Processing Society, 2019, pp. 397-402. ISBN 978-1-7281-0306-8.
- ALAM Jahangir, BOULIANNE Gilles, BURGET Lukáš, GLEMBEK Ondřej, LOZANO Díez Alicia, MATĚJKA Pavel, MIZERA Petr, MOŠNER Ladislav, NOVOTNÝ Ondřej, PLCHOT Oldřich, ROHDIN Johan A., SILNOVA Anna, SLAVÍČEK Josef, STAFYLAKIS Themos, WANG Shuai, ZEINALI Hossein, DAHMANE Mohamed, ST-CHARLES Pierre-Luc, LALONDE Marc, NOISEUX Cédric and MONTEIRO Joao. ABC System Description for NIST Multimedia Speaker Recognition Evaluation 2019. In: Proceedings of NIST 2019 SRE Workshop. Sentosa, Singapore: National Institute of Standards and Technology, 2019, pp. 1-7.
- MATĚJKA Pavel, PLCHOT Oldřich, ZEINALI Hossein, MOŠNER Ladislav, SILNOVA Anna, BURGET Lukáš, NOVOTNÝ Ondřej and GLEMBEK Ondřej. Analysis of BUT Submission in Far-Field Scenarios of VOiCES 2019 Challenge. In: Proceedings of Interspeech. Graz: International Speech Communication Association, 2019, pp. 2448-2452. ISSN 1990-9772.
- NOVOTNÝ Ondřej, PLCHOT Oldřich, GLEMBEK Ondřej, ČERNOCKÝ Jan and BURGET Lukáš. Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition. Computer Speech and Language, vol. 2019, no. 58, pp. 403-421. ISSN 0885-2308.
- DIEZ Sánchez Mireia, BURGET Lukáš, WANG Shuai, ROHDIN Johan A. and ČERNOCKÝ Jan. Bayesian HMM based x-vector clustering for Speaker Diarization. In: Proceedings of Interspeech. Graz: International Speech Communication Association, 2019, pp. 346-350. ISSN 1990-9772.
- ONDEL Yang Lucas Antoine Francois, VYDANA Hari K., BURGET Lukáš and ČERNOCKÝ Jan. Bayesian Subspace Hidden Markov Model for Acoustic Unit Discovery. In: Proceedings of Interspeech 2019. Graz: International Speech Communication Association, 2019, pp. 261-265. ISSN 1990-9772.
- ZEINALI Hossein, WANG Shuai, SILNOVA Anna, MATĚJKA Pavel and PLCHOT Oldřich. BUT System Description to VoxCeleb Speaker Recognition Challenge 2019. In: Proceedings of The VoxCeleb Challenge Workshop 2019. Graz, 2019, pp. 1-4.
- NOVOTNÝ Ondřej, PLCHOT Oldřich, GLEMBEK Ondřej and BURGET Lukáš. Factorization of Discriminatively Trained i-Vector Extractor for Speaker Recognition. In: Proceedings of Interspeech. Graz: International Speech Communication Association, 2019, pp. 4330-4334. ISSN 1990-9772.
- WANG Shuai, ROHDIN Johan A., BURGET Lukáš, PLCHOT Oldřich, QIAN Yanmin, YU Kai and ČERNOCKÝ Jan. On the Usage of Phonetic Information for Text-independent Speaker Embedding Extraction. In: Proceedings of Interspeech. Graz: International Speech Communication Association, 2019, pp. 1148-1152. ISSN 1990-9772.
- STAFYLAKIS Themos, ROHDIN Johan A., PLCHOT Oldřich, MIZERA Petr and BURGET Lukáš. Self-supervised speaker embeddings. In: Proceedings of Interspeech. Graz: International Speech Communication Association, 2019, pp. 2863-2867. ISSN 1990-9772.
- ŽMOLÍKOVÁ Kateřina, DELCROIX Marc, KINOSHITA Keisuke, OCHIAI Tsubasa, NAKATANI Tomohiro, BURGET Lukáš and ČERNOCKÝ Jan. SpeakerBeam: Speaker Aware Neural Network for Target Speaker Extraction in Speech Mixtures. IEEE Journal of Selected Topics in Signal Processing, vol. 13, no. 4, 2019, pp. 800-814. ISSN 1932-4553.
2020
- Bayesian HMM-based clustering of x-vectors - VBx, software, 2020
Authors: Diez Sánchez Mireia, Landini Federico Nicolás, Burget Lukáš