Publication Details
Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing
Stafylakis Themos (OMILIA)
Mošner Ladislav, Ing. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
emotion recognition, self-supervised features, IEMOCAP, HuBERT, WavLM, wav2vec 2.0
When recognizing emotions from speech, we encounter two common problems: how to optimally capture emotion-relevant information from the speech signal and how to best quantify or categorize the noisy subjective emotion labels. Self-supervised pre-trained representations can robustly capture information from speech, enabling state-of-the-art results in many downstream tasks including emotion recognition. However, better ways of aggregating the information across time need to be considered as the relevant emotion information is likely to appear piecewise and not uniformly across the signal. For the labels, we need to take into account that there is a substantial degree of noise that comes from the subjective human annotations. In this paper, we propose a novel approach to attentive pooling based on correlations between the representations' coefficients combined with label smoothing, a method aiming to reduce the confidence of the classifier on the training labels. We evaluate our proposed approach on the benchmark dataset IEMOCAP, and demonstrate high performance surpassing that in the literature. The code to reproduce the results is available at github.com/skakouros/s3prl_attentive_correlation.
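The two ideas named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (that is in the repository linked above); it only shows one plausible reading of the terms. The function names, the smoothing value `epsilon=0.1`, the four-class setup, and the choice to keep only the upper triangle of the correlation matrix are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(target_index, num_classes, epsilon=0.1):
    """Standard label smoothing: mix the one-hot target with a uniform
    distribution. epsilon=0.1 is an illustrative value, not taken from the paper."""
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) * (1.0 if k == target_index else 0.0) + uniform
        for k in range(num_classes)
    ]


def cross_entropy(logits, target_dist):
    """Cross-entropy between softmax(logits) and a soft target distribution."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target_dist, probs))


def attentive_correlation_pooling(frames, attention_scores, eps=1e-8):
    """Hypothetical pooling sketch.

    frames: list of T frame vectors, each with C channels.
    attention_scores: list of T unnormalized per-frame scores.
    Returns the C*(C-1)/2 off-diagonal entries (upper triangle) of the
    attention-weighted channel correlation matrix.
    """
    weights = softmax(attention_scores)
    num_channels = len(frames[0])
    # Attention-weighted mean of each channel over time.
    mean = [
        sum(w * f[c] for w, f in zip(weights, frames))
        for c in range(num_channels)
    ]
    # Attention-weighted covariance between every pair of channels.
    cov = [
        [
            sum(w * (f[i] - mean[i]) * (f[j] - mean[j])
                for w, f in zip(weights, frames))
            for j in range(num_channels)
        ]
        for i in range(num_channels)
    ]
    std = [math.sqrt(cov[c][c] + eps) for c in range(num_channels)]
    # Normalize covariances to correlations; keep each channel pair once.
    return [
        cov[i][j] / (std[i] * std[j])
        for i in range(num_channels)
        for j in range(i + 1, num_channels)
    ]


if __name__ == "__main__":
    # Four classes is an assumed setup chosen for the example.
    soft_target = smooth_labels(target_index=2, num_classes=4)
    print(soft_target)
    print(cross_entropy([0.5, -1.0, 2.0, 0.1], soft_target))
    # Three frames, two channels, equal attention on every frame.
    print(attentive_correlation_pooling([[1, 2], [2, 4], [3, 6]], [0.0, 0.0, 0.0]))
```

In this sketch the attention scores are passed in directly; in a trained model they would be produced by a learned layer, and the pooled correlation vector would be fed to the emotion classifier, which is then trained against the smoothed targets.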
@INPROCEEDINGS{FITPUB13054,
   author = "Sofoklis Kakouros and Themos Stafylakis and Ladislav Mo\v{s}ner and Luk\'{a}\v{s} Burget",
   title = "Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing",
   pages = "1--5",
   booktitle = "Proceedings of ICASSP 2023",
   year = 2023,
   location = "Rhodes Island, GR",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-7281-6327-7",
   doi = "10.1109/ICASSP49357.2023.10094673",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13054"
}