Result Details
Bayesian HMM based x-vector clustering for Speaker Diarization
        DIEZ SÁNCHEZ, M.; BURGET, L.; WANG, S.; ROHDIN, J.; ČERNOCKÝ, J. Bayesian HMM based x-vector clustering for Speaker Diarization. In Proceedings of Interspeech. Graz: International Speech Communication Association, 2019. no. 9, p. 346-350. ISSN: 1990-9772.
    
                Type
            
        
                conference paper
            
        
                Language
            
        
                English
            
        
            Authors
            
        
                Diez Sánchez Mireia, M.Sc., Ph.D., DCGM (FIT)
                
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Wang Shuai, FIT (FIT), DCGM (FIT)
Rohdin Johan Andréas, M.Sc., Ph.D., FIT (FIT), DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
                    Abstract
            
        This paper presents a simplified version of the previously proposed diarization algorithm based on Bayesian Hidden Markov Models, which uses Variational Bayesian inference for very fast and robust clustering of x-vectors (neural network based speaker embeddings). The presented results show that this clustering algorithm provides significant improvements in diarization performance as compared to the previously used Agglomerative Hierarchical Clustering. The output of this system can be further employed as an initialization for a second stage VB diarization system, using frame-wise MFCC features as input, to obtain optimal results.
                Keywords
            
        Speaker Diarization, Variational Bayes, HMM, x-vector, DIHARD
                URL
            
        https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2813.pdf
                Published
            
            
                    2019
                    
                
            
                    Pages
                
            
                        346–350
                
            
                    Journal
                
            
                    Proceedings of Interspeech, vol. 2019, no. 9, ISSN 1990-9772
                
            
                        Proceedings
                
            
                    Proceedings of Interspeech
                
            
                    Conference
                
            
                    Interspeech Conference
                
            
                    Publisher
                
            
                    International Speech Communication Association
                
            
                    Place
                
            
                    Graz
                
            
                    DOI
                
            
                    10.21437/Interspeech.2019-2813
                
            
                    UT WoS
                
            
                    000831796400070
                
            
                EID Scopus
                
            
                    BibTeX
                
            @inproceedings{BUT159992,
  author="Mireia {Diez Sánchez} and Lukáš {Burget} and Shuai {Wang} and Johan Andréas {Rohdin} and Jan {Černocký}",
  title="Bayesian HMM based x-vector clustering for Speaker Diarization",
  booktitle="Proceedings of Interspeech",
  year="2019",
  journal="Proceedings of Interspeech",
  volume="2019",
  number="9",
  pages="346--350",
  publisher="International Speech Communication Association",
  address="Graz",
  doi="10.21437/Interspeech.2019-2813",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2019/pdfs/2813.pdf"
}
                
                Files
            
        
                Projects
            
        
        
            
        
    
    
        Information mining in speech acquired by distant microphones, MV, Bezpečnostní výzkum České republiky 2015-2020, VI20152020025, start: 2015-10-01, end: 2020-09-30, completed
                
IT4Innovations excellence in science, MŠMT, Národní program udržitelnosti II, LQ1602, start: 2016-01-01, end: 2020-12-31, completed
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grantové projekty exelence v základním výzkumu EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust SPEAKER DIariazation systems using Bayesian inferenCE and deep learning methods, EU, Horizon 2020, start: 2017-03-01, end: 2019-02-28, completed
Sequence summarizing neural networks for speaker recognition, EU, Horizon 2020, 5SA15094, start: 2016-07-01, end: 2019-06-30, completed
Zpracování, zobrazování a analýza multimediálních a 3D dat, BUT, Vnitřní projekty VUT, FIT-S-17-3984, start: 2017-03-01, end: 2020-02-29, completed
                Research groups
            
        
                Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)
            
        
                Departments