Result detail

FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification

APAROVICH, M.; KESIRAJU, S.; DUFKOVÁ, A.; SMRŽ, P. FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023). Toronto (online): Association for Computational Linguistics, 2023. p. 1518-1524. ISBN: 978-1-959429-99-9.
Type
conference proceedings paper
Language
English
Authors
Aparovich Maksim
Kesiraju Santosh, Ph.D., UPGM (FIT)
Dufková Aneta, Ing.
Smrž Pavel, doc. RNDr., Ph.D., UPGM (FIT)
Abstract

This paper presents our proposed method for SemEval-2023 Task 12, which focuses on sentiment analysis for low-resource African languages. Our method utilizes a language-centric domain adaptation approach based on adversarial training, where a small version of Afro-XLM-Roberta serves as a generator model and a feed-forward network as a discriminator. We participated in all three subtasks: monolingual (12 tracks), multilingual (1 track), and zero-shot (2 tracks). Our results show an improvement in weighted F1 for 13 out of 15 tracks, with a maximum increase of 4.3 points for Moroccan Arabic compared to the baseline. We observed that using language family-based labels along with sequence-level input representations for the discriminator model improves the quality of cross-lingual sentiment analysis for languages unseen during training. Additionally, our experimental results suggest that training the system on languages that are close in the language family tree enhances the quality of sentiment analysis for low-resource languages. Lastly, the computational complexity of the prediction step was kept at the same level, which makes the approach interesting from a practical perspective. The code of the approach can be found in our repository.
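
The abstract reports its gains in weighted F1. As a reminder of how that metric is usually computed, here is a minimal sketch, assuming the standard support-weighted average of per-class F1 scores. It is not the task's official scorer; the function name `weighted_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative helper)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label, n_gold in support.items():
        # True positives and predicted count for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of gold labels.
        score += (n_gold / total) * f1
    return score


# Toy labels, invented for illustration only.
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "negative", "positive", "positive"]
print(round(weighted_f1(gold, pred), 3))
```

Because each class is weighted by its gold-label frequency, a class that is never predicted correctly (here `neutral`) lowers the score in proportion to how often it occurs.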

Keywords

sentiment analysis, cross-lingual sentiment analysis, domain adaptation, adversarial training, low-resource languages, African languages, transformer, feed-forward neural network

URL
https://aclanthology.org/2023.semeval-1.209/
Year
2023
Pages
1518–1524
Proceedings
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Conference
The 61st Annual Meeting of the Association for Computational Linguistics
ISBN
978-1-959429-99-9
Publisher
Association for Computational Linguistics
Place
Toronto (online)
DOI
10.18653/v1/2023.semeval-1.209
UT WoS
001281001900208
BibTeX
@inproceedings{BUT187994,
  author="Maksim {Aparovich} and Santosh {Kesiraju} and Aneta {Dufková} and Pavel {Smrž}",
  title="FIT BUT at SemEval-2023 Task 12: Sentiment Without Borders - Multilingual Domain Adaptation for Low-Resource Sentiment Classification",
  booktitle="Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
  year="2023",
  pages="1518--1524",
  publisher="Association for Computational Linguistics",
  address="Toronto (online)",
  doi="10.18653/v1/2023.semeval-1.209",
  isbn="978-1-959429-99-9",
  url="https://aclanthology.org/2023.semeval-1.209/"
}
Projects
Automation of development operations (DevOps) with the help of artificial intelligence, EU, Horizon 2020, 8A21015, 101007350, start: 2021-04-01, end: 2024-03-31, ongoing
Contemporary methods of processing, analysis, and visualization of multimedia and 3D data, BUT, BUT internal projects, FIT-S-23-8278, start: 2023-03-01, end: 2026-02-28, ongoing