Thesis Details

Strojové učení pro odpovídání na otázky v přirozeném jazyce

Bachelor's Thesis
Student: Sasín Jonáš
Academic Year: 2020/2021
Supervisor: Smrž Pavel, doc. RNDr., Ph.D.
English title
Machine Learning for Natural Language Question Answering

This thesis deals with natural language question answering using Czech Wikipedia. Question answering systems are growing in popularity, but most of them are developed for English. The main purpose of this work is to explore the available approaches and datasets and to create such a system for Czech. In the thesis, I focus on two approaches. The first uses the English ALBERT model together with machine translation of passages. The second uses multilingual BERT. Several variants of the system are compared, and options for retrieving relevant passages are also discussed. A standard evaluation is provided for every tested variant. The best version of the system was evaluated on the SQAD v3.0 dataset, reaching 0.44 EM (exact match) and 0.55 F1, which is an excellent result compared to other existing systems. The main contribution of this work is an analysis of the existing possibilities and a benchmark for the further development of better question answering systems for Czech.


natural language processing, NLP, Czech, question answering, machine learning, knowledge mining, Wikipedia, open-domain, SQAD, ALBERT, BERT, BM25 

Degree Programme
Information Technology
defended, grade B
16 June 2021
Smrž Pavel, doc. RNDr., Ph.D. (DCGM FIT BUT), chair
Burgetová Ivana, Ing., Ph.D. (DIFS FIT BUT), member
Kreslíková Jitka, doc. RNDr., CSc. (DIFS FIT BUT), member
Peringer Petr, Dr. Ing. (DITS FIT BUT), member
Strnadel Josef, Ing., Ph.D. (DCSY FIT BUT), member
SASÍN, Jonáš. Strojové učení pro odpovídání na otázky v přirozeném jazyce. Brno, 2021. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2021-06-16. Supervised by Smrž Pavel. Available from:
    author = "Jon\'{a}\v{s} Sas\'{i}n",
    type = "Bachelor's thesis",
    title = "Strojov\'{e} u\v{c}en\'{i} pro odpov\'{i}d\'{a}n\'{i} na ot\'{a}zky v p\v{r}irozen\'{e}m jazyce",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2021,
    location = "Brno, CZ",
    language = "czech",
    url = ""