Result Details

Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals

PEŠÁN, J.; JUŘÍK, V.; RŮŽIČKOVÁ, A.; SVOBODA, V.; JANOUŠEK, O.; NĚMCOVÁ, A.; BOJANOVSKÁ, H.; ALDABAGHOVÁ, J.; KYSLÍK, F.; VODIČKOVÁ, K.; SODOMOVÁ, A.; BARTYS, P.; CHUDÝ, P.; ČERNOCKÝ, J. Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals. Scientific Data, 2024, vol. 11, no. 1, p. 1-9. ISSN: 2052-4463.

Type

journal article

Language

English

Authors

Pešán Jan, Ing.
Juřík Vojtěch, Mgr., Ph.D., AIU (FCE)
Růžičková Alexandra
Svoboda Vojtěch, Bc.
Janoušek Oto, Ing., Ph.D., UBMI (FEEC)
Němcová Andrea, Ing., Ph.D., UBMI (FEEC)
Bojanovská Hana, Bc.
Aldabaghová Jasmína, Mgr.
Kyslík Filip
Vodičková Kateřina, Bc.
Sodomová Adéla, Bc.
Bartys Patrik, Mgr.
Chudý Peter, doc. Ing., Ph.D., MBA, DCGM (FIT)
Černocký Jan, prof. Dr. Ing., DCGM (FIT)

Abstract

Early identification of cognitive or physical overload is critical in fields where human decision making matters when preventing threats to safety and property. Pilots, drivers, surgeons, and operators of nuclear plants are among those affected by this challenge, as acute stress can impair their cognition. In this context, the significance of paralinguistic automatic speech processing increases for early stress detection. The intensity, intonation, and cadence of an utterance are examples of paralinguistic traits that determine the meaning of a sentence and are often lost in the verbatim transcript. To address this issue, tools are being developed to recognize paralinguistic traits effectively. However, a data bottleneck still exists in the training of paralinguistic speech traits, and the lack of high-quality reference data for the training of artificial systems persists. Regarding this, we present an original empirical dataset collected using the BESST experimental protocol for capturing speech signals under induced stress. With this data, our aim is to promote the development of pre-emptive intervention systems based on stress estimation from speech.

Keywords

speech, stress, machine learning

URL

https://www.nature.com/articles/s41597-024-03991-w

Published

2024

Pages

1–9

Journal

Scientific Data, vol. 11, no. 1, ISSN 2052-4463

Publisher

Springer Nature

DOI

10.1038/s41597-024-03991-w

UT WoS

001353330000007

EID Scopus

2-s2.0-85209350842

BibTeX

@article{BUT193434,
  author="Jan {Pešán} and Vojtěch {Juřík} and Alexandra {Růžičková} and Vojtěch {Svoboda} and Oto {Janoušek} and Andrea {Němcová} and Hana {Bojanovská} and Jasmína {Aldabaghová} and Filip {Kyslík} and Kateřina {Vodičková} and Adéla {Sodomová} and Patrik {Bartys} and Peter {Chudý} and Jan {Černocký}",
  title="Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals",
  journal="Scientific Data",
  year="2024",
  volume="11",
  number="1",
  pages="1--9",
  doi="10.1038/s41597-024-03991-w",
  url="https://www.nature.com/articles/s41597-024-03991-w"
}

Files

pesan_sci data_2024_s41597-024-03991-w.pdf (PDF, 2 MB)

Projects

Multilingual and Cross-cultural interactions for context-aware, and bias-controlled dialogue systems for safety-critical applications, EU, HORIZON EUROPE, start: 2024-01-01, end: 2026-12-31, running
Robust processing of recordings for operations and security, MV, Programme of Strategic Support for the Development of Security Research in the Czech Republic 2019-2025 (IMPAKT 1), Sub-programme 1: Joint Research Projects (BV IMP1/1VS), VJ01010108, start: 2020-10-01, end: 2025-09-30, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Biomedical Engineering (UBMI)
Department of Computer Graphics and Multimedia (DCGM)
Institute of Computer Aided Engineering and Computer Science (AIU)