Publication Details
DiaCorrect: Error Correction Back-End for Speaker Diarization
Landini Federico Nicolás, Ph.D. (RG SPEECH)
Rohdin Johan Andréas, M.Sc., Ph.D. (DCGM)
DIEZ SÁNCHEZ, M.
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
CAO, Y.
LU, H.
Černocký Jan, prof. Dr. Ing. (DCGM)
Speaker diarization, error correction, conversational telephone speech
In this work, we propose an error correction framework, named
DiaCorrect, to refine the output of a diarization system in a simple
yet effective way. This method is inspired by error correction
techniques in automatic speech recognition. Our model consists
of two parallel convolutional encoders and a transformer-based
decoder. By exploiting the interactions between the input
recording and the initial system's outputs, DiaCorrect can
automatically correct the initial speaker activities to minimize
the diarization errors. Experiments on 2-speaker telephony data
show that the proposed DiaCorrect can effectively improve the
initial model's results. Our source code is publicly available at
https://github.com/BUTSpeechFIT/diacorrect.
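To make "correcting the initial speaker activities to minimize the diarization errors" concrete, the sketch below computes a simple frame-level diarization error rate (missed speech, false alarm, and speaker confusion, under the best speaker permutation) for binary speaker activities. It is an illustration only and not the DiaCorrect model or the paper's scoring setup: the function name `frame_der`, the toy activity lists, and the frame-level scoring without a collar are all assumptions made for this example.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Frame-level diarization error rate for binary speaker activities.

    reference, hypothesis: equal-length lists of frames; each frame is a
    tuple of 0/1 activity flags, one per speaker (hypothetical format).
    """
    n_spk = len(reference[0])
    # Total number of active reference speaker-frames (the DER denominator).
    ref_speech = sum(sum(frame) for frame in reference)
    if ref_speech == 0:
        raise ValueError("reference contains no speech")

    best_errors = None
    # Speaker labels are arbitrary, so score every output-to-reference mapping.
    for perm in permutations(range(n_spk)):
        miss = false_alarm = confusion = 0
        for ref, hyp in zip(reference, hypothesis):
            hyp = [hyp[p] for p in perm]
            n_ref, n_hyp = sum(ref), sum(hyp)
            n_correct = sum(r and h for r, h in zip(ref, hyp))
            miss += max(0, n_ref - n_hyp)
            false_alarm += max(0, n_hyp - n_ref)
            confusion += min(n_ref, n_hyp) - n_correct
        errors = miss + false_alarm + confusion
        if best_errors is None or errors < best_errors:
            best_errors = errors
    return best_errors / ref_speech


# Toy 2-speaker example (made-up data, five frames).
reference = [(1, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
initial = [(0, 1), (0, 1), (0, 1), (1, 0), (1, 0)]    # misses overlap, adds a false alarm
corrected = [(0, 1), (0, 1), (1, 1), (1, 0), (0, 0)]  # same labels after an ideal correction

print(frame_der(reference, initial))    # 0.4
print(frame_der(reference, corrected))  # 0.0
```

In this toy case the initial output has one missed speaker-frame and one false-alarm frame out of five reference speaker-frames; a correction back-end would aim to move the activities toward the second, error-free output.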
@inproceedings{BUT189697,
author="HAN, J. and LANDINI, F. and ROHDIN, J. and DIEZ SÁNCHEZ, M. and BURGET, L. and CAO, Y. and LU, H. and ČERNOCKÝ, J.",
title="DiaCorrect: Error Correction Back-End for Speaker Diarization",
booktitle="ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
year="2024",
pages="11181--11185",
publisher="IEEE Signal Processing Society",
address="Seoul",
doi="10.1109/ICASSP48485.2024.10446968",
isbn="979-8-3503-4485-1",
url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446968"
}