Publication Details
TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models
PENG, J.
ASHIHARA, T.
Delcroix Marc
OCHIAI, T.
Plchot Oldřich, Ing., Ph.D. (DCGM)
ARAKI, S.
Černocký Jan, prof. Dr. Ing. (DCGM)
Self-supervised learning, target-speaker speech processing, speech recognition,
speech enhancement, voice activity detection
Self-supervised learning (SSL) models have significantly advanced speech
processing tasks, and several benchmarks have been proposed to validate their
effectiveness. However, previous benchmarks have primarily focused on
single-speaker scenarios, with less exploration of target-speaker tasks in noisy,
multi-talker conditions, a more challenging yet practical case. In this paper, we
introduce the Target-Speaker Speech Processing Universal Performance Benchmark
(TS-SUPERB), which includes four widely recognized target-speaker processing
tasks that require identifying the target speaker and extracting information from
the speech mixture. In our benchmark, the speaker embedding extracted from
enrollment speech is used as a clue to condition downstream models. The benchmark
results reveal the importance of evaluating SSL models in target-speaker
scenarios, demonstrating that performance cannot be easily inferred from related
single-speaker tasks. Moreover, by using a unified SSL-based target speech
encoder, consisting of a speaker encoder and an extractor module, we also
investigate joint optimization across target-speaker (TS) tasks to leverage mutual information and
demonstrate its effectiveness.
@inproceedings{BUT198051,
author="PENG, J. and ASHIHARA, T. and DELCROIX, M. and OCHIAI, T. and PLCHOT, O. and ARAKI, S. and ČERNOCKÝ, J.",
title="TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models",
booktitle="Proceedings of ICASSP 2025",
year="2025",
pages="1--5",
publisher="IEEE Biometric Council",
address="Hyderabad",
doi="10.1109/ICASSP49660.2025.10887574",
isbn="979-8-3503-6874-1",
url="https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10887574"
}