SCDF: A Speaker Characteristics DeepFake Speech Dataset for Bias Analysis
Srna Karel, Bc., FIT (FIT), DCGM (FIT)
Firc Anton, Ing., Ph.D., DITS (FIT)
Malinka Kamil, doc. Mgr., Ph.D., DITS (FIT)
Despite growing attention to deepfake speech detection, the aspects of
bias and fairness remain underexplored in the speech domain. To address
this gap, we introduce the Speaker Characteristics Deepfake (SCDF)
dataset: a novel, richly annotated resource enabling systematic
evaluation of demographic biases in deepfake speech detection. SCDF
contains over 237,000 utterances in a balanced representation of both
male and female speakers spanning five languages and a wide age range.
We evaluate several state-of-the-art detectors and show that speaker
characteristics significantly influence detection performance, revealing
disparities across sex, language, age, and synthesizer type. These
findings highlight the need for bias-aware development and provide a
foundation for building non-discriminatory deepfake detection systems
aligned with ethical and regulatory standards.
Keywords: Bias, Fairness, Dataset, Deepfake Speech, Anti-spoofing
@inproceedings{BUT198600,
author="Vojtěch {Staněk} and Karel {Srna} and Anton {Firc} and Kamil {Malinka}",
title="SCDF: A Speaker Characteristics DeepFake Speech Dataset for Bias Analysis",
booktitle="Proceedings of the 24th International Conference of the Biometrics Special Interest Group (BIOSIG 2025)",
year="2025",
journal="GI-Edition. Proceedings",
volume="2025",
number="367",
pages="55--64",
publisher="Gesellschaft für Informatik e.V.",
address="Darmstadt",
doi="10.18420/biosig_2025_005",
issn="1617-5468",
url="https://doi.org/10.18420/biosig_2025_005"
}