Result Details

DiaCorrect: Error Correction Back-End for Speaker Diarization

HAN, J.; LANDINI, F.; ROHDIN, J.; DIEZ SÁNCHEZ, M.; BURGET, L.; CAO, Y.; LU, H.; ČERNOCKÝ, J. DiaCorrect: Error Correction Back-End for Speaker Diarization. In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024. p. 11181-11185. ISBN: 979-8-3503-4485-1.
Type
conference paper
Language
English
Authors
Han Jiangyu, DCGM (FIT)
Landini Federico Nicolás, Ph.D., DCGM (FIT)
Rohdin Johan Andréas, M.Sc., Ph.D., FIT (FIT), DCGM (FIT)
DIEZ SÁNCHEZ, M.
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
CAO, Y.
LU, H.
Černocký Jan, prof. Dr. Ing., DCGM (FIT)
Abstract

In this work, we propose an error correction framework, named
DiaCorrect, to refine the output of a diarization system in a simple
yet effective way. This method is inspired by error correction
techniques in automatic speech recognition. Our model consists
of two parallel convolutional encoders and a transformer-based
decoder. By exploiting the interactions between the input
recording and the initial system's outputs, DiaCorrect can
automatically correct the initial speaker activities to minimize
the diarization errors. Experiments on 2-speaker telephony data
show that the proposed DiaCorrect can effectively improve the
initial model's results. Our source code is publicly available at
https://github.com/BUTSpeechFIT/diacorrect.
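
The abstract speaks of correcting speaker activities to reduce diarization errors. As a rough illustration of what such errors look like at the frame level, the sketch below scores a 2-speaker hypothesis against a reference and splits the errors into missed speech, false alarm, and speaker confusion. It is a simplified frame-level diarization error rate with no forgiveness collar, not the authors' evaluation code; the function name `frame_der` and the toy activity sequences are hypothetical and are not results from the paper.

```python
from itertools import permutations


def frame_der(ref, hyp):
    """Simplified frame-level diarization error rate (no collar).

    ref, hyp: equal-length lists of frames; each frame is a tuple of 0/1
    activity flags, one per speaker. Speaker labels in hyp are arbitrary,
    so every label permutation is tried and the best one is kept.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    n_spk = len(ref[0])
    total_ref = sum(sum(frame) for frame in ref)
    if total_ref == 0:
        raise ValueError("reference contains no speech")

    best = None
    for perm in permutations(range(n_spk)):
        miss = fa = conf = 0
        for r, h in zip(ref, hyp):
            h_mapped = [h[p] for p in perm]
            n_ref, n_hyp = sum(r), sum(h_mapped)
            n_ok = sum(1 for a, b in zip(r, h_mapped) if a and b)
            miss += max(0, n_ref - n_hyp)       # reference speech not detected
            fa += max(0, n_hyp - n_ref)         # speech detected where none exists
            conf += min(n_ref, n_hyp) - n_ok    # speech given to the wrong speaker
        errors = miss + fa + conf
        if best is None or errors < best[0]:
            best = (errors, miss, fa, conf)

    errors, miss, fa, conf = best
    return {"der": errors / total_ref, "miss": miss, "fa": fa, "conf": conf}


# Toy data (hypothetical): six frames, two speakers.
reference = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (0, 0)]
initial = [(1, 0), (0, 0), (1, 0), (0, 1), (1, 0), (0, 1)]

print(frame_der(reference, initial))
```

An error-correction back-end of the kind described in the abstract would take the recording together with activities such as `initial` and aim to produce activities that score lower under a metric of this type.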

Keywords

Speaker diarization, error correction, conversational
telephone speech

URL
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446968
Published
2024
Pages
11181–11185
Proceedings
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference
2024 IEEE International Conference on Acoustics, Speech and Signal Processing IEEE
ISBN
979-8-3503-4485-1
Publisher
IEEE Signal Processing Society
Place
Seoul
DOI
10.1109/ICASSP48485.2024.10446968
BibTeX
@inproceedings{BUT189697,
  author="HAN, J. and LANDINI, F. and ROHDIN, J. and DIEZ SÁNCHEZ, M. and BURGET, L. and CAO, Y. and LU, H. and ČERNOCKÝ, J.",
  title="DiaCorrect: Error Correction Back-End for Speaker Diarization",
  booktitle="ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
  year="2024",
  pages="11181--11185",
  publisher="IEEE Signal Processing Society",
  address="Seoul",
  doi="10.1109/ICASSP48485.2024.10446968",
  isbn="979-8-3503-4485-1",
  url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446968"
}
Projects
Exchanges for SPEech ReseArch aNd TechnOlogies, EU, Horizon 2020, start: 2021-01-01, end: 2025-12-31, running
Neural Representations in multi-modal and multi-lingual modeling, GACR, Grant Projects of Excellence in Basic Research EXPRO - 2019, GX19-26934X, start: 2019-01-01, end: 2023-12-31, completed
Robust processing of recordings for operations and security, MV, Programme Strategic Support for the Development of Security Research in the Czech Republic 2019-2025 (IMPAKT 1), Sub-programme 1 Joint Research Projects (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed