Thesis Details

Multilingual Open-Domain Question Answering

Master's Thesis
Student: Slávka Michal
Academic Year: 2020/2021
Supervisor: Fajčík Martin, Ing.
Czech title
Vícejazyčný systém pro odpovídání na otázky nad otevřenou doménou
Language
English
Abstract

This thesis explores automatic Multilingual Open-Domain Question Answering and proposes approaches to this less explored research area. More precisely, it examines whether (i) using an English system is sufficient, (ii) multilingual models can benefit from the question being translated into other languages, or (iii) avoiding translation is the better choice. An English system based on the T5 model that relies on machine translation is compared with natively multilingual systems based on the multilingual mT5 model. The English system with machine translation only slightly outperforms its monolingual counterparts in multiple tasks. Although the English system was trained on a much larger dataset than the multilingual models, the results were comparable. This suggests that natively multilingual systems are a promising approach for future research. I also present a method for retrieving multilingual evidence using the BM25 ranking algorithm and compare it with English retrieval. Using multilingual evidence appears to be beneficial and improves the performance of the systems.

Keywords

Natural Language Processing, Question Answering, Information Retrieval, Multilingual, BM25, Transformers

Department
Degree Programme
Information Technology and Artificial Intelligence, Specialization Machine Learning
Status
defended, grade C
Date
23 June 2021
Reviewer
Committee
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT), chair
Bařina David, Ing., Ph.D. (DCGM FIT BUT), member
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT), member
Čadík Martin, doc. Ing., Ph.D. (DCGM FIT BUT), member
Češka Milan, doc. RNDr., Ph.D. (DITS FIT BUT), member
Rozman Jaroslav, Ing., Ph.D. (DITS FIT BUT), member
Citation
SLÁVKA, Michal. Multilingual Open-Domain Question Answering. Brno, 2021. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2021-06-23. Supervised by Fajčík Martin. Available from: https://www.fit.vut.cz/study/thesis/23369/
BibTeX
@mastersthesis{FITMT23369,
    author = "Michal Sl\'{a}vka",
    type = "Master's thesis",
    title = "Multilingual Open-Domain Question Answering",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2021,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/23369/"
}