Result Details
Analysis of X-Vectors for Low-Resource Speech Recognition
        KARAFIÁT, M.; VESELÝ, K.; ČERNOCKÝ, J.; PROFANT, J.; NYTRA, J.; HLAVÁČEK, M.; PAVLÍČEK, T. Analysis of X-Vectors for Low-Resource Speech Recognition. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Ontario: IEEE Signal Processing Society, 2021. p. 6998-7002.  ISBN: 978-1-7281-7605-5.
    
                Type
            
        
                conference paper
            
        
                Language
            
        
                English
            
        
            Authors
            
        
                Karafiát Martin, Ing., Ph.D., DCGM (FIT)
                
Veselý Karel, Ing., Ph.D., DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Profant Ján, Ing.
Nytra Jiří, Bc.
HLAVÁČEK, M.
Pavlíček Tomáš, Ing.
                    Abstract
            
        The paper presents a study of usability of x-vectors for adaptation of automatic speech recognition (ASR) systems. X-vectors are Neural Network (NN)-based speaker embeddings recently proposed in speaker recognition (SR). They quickly replaced common i-vectors and became new state-of-the-art technique. Here, the same approach is adopted for ASR with the hope of similar outcome. All experiments were done on ASR for the latest IARPA MATERIAL evaluation running on Pashto language. Over 1% absolute improvement was observed with x-vectors over traditional i-vectors, even when the x-vector extractor was not trained on target Pashto data.
                Keywords
            
        speech recognition, adaptation, x-vectors, data augmentation, robustness
                URL
            
        
                Published
            
            
                    2021
                    
                
            
                    Pages
                
            
                        6998–7002
                
            
                        Proceedings
                
            
                    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
                
            
                    Conference
                
            
                    2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
                
            
                    ISBN
                
            
                    978-1-7281-7605-5
                
            
                    Publisher
                
            
                    IEEE Signal Processing Society
                
            
                    Place
                
            
                    Toronto, Ontario
                
            
                    DOI
                
            
                    10.1109/ICASSP39728.2021.9414725
                
            
                    UT WoS
                
            
                    000704288407055
                
            
                EID Scopus
                
            
                    BibTeX
                
            @inproceedings{BUT175794,
  author="KARAFIÁT, M. and VESELÝ, K. and ČERNOCKÝ, J. and PROFANT, J. and NYTRA, J. and HLAVÁČEK, M. and PAVLÍČEK, T.",
  title="Analysis of X-Vectors for Low-Resource Speech Recognition",
  booktitle="ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
  year="2021",
  pages="6998--7002",
  publisher="IEEE Signal Processing Society",
  address="Toronto, Ontario",
  doi="10.1109/ICASSP39728.2021.9414725",
  isbn="978-1-7281-7605-5",
  url="https://www.fit.vut.cz/research/publication/12525/"
}
                Files
            
        
                Projects
            
        
        
    
    
        IARPA Machine Translation for English Retrieval of Information in Any Language (MATERIAL) - Foreign Language Automated Information Retrieval (FLAIR), IARPA, start: 2017-09-21, end: 2021-10-22, completed
                
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty exelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Real time network, text, and speaker analytics for combating organized crime, EU, Horizon 2020, start: 2019-09-01, end: 2022-12-31, completed
Robust processing of recordings for operations and security, MV, PROGRAM STRATEGICKÁ PODPORA ROZVOJE BEZPEČNOSTNÍHO VÝZKUMU ČR 2019-2025 (IMPAKT 1) PODPROGRAMU 1 SPOLEČNÉ VÝZKUMNÉ PROJEKTY (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed
                Research groups
            
        
                Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)
            
        
                Departments