Thesis Details

Uspořádání fragmentů textu s pomocí jazykového modelu

Master's Thesis
Student: Holubec Michael
Academic Year: 2021/2022
Supervisor: Beneš Karel, Ing.
English title
Reordering Text Fragments Using a Language Model
Language
Czech
Abstract

The aim of this work is to construct a language model and experimentally verify its effectiveness in identifying reading order. For this purpose, a language model with an LSTM architecture was constructed. The work designs and implements three methods for identifying reading order: language analysis, spatial analysis, and combined analysis. Language analysis and combined analysis use the constructed language model. The success of the language model and of all three methods was measured on three datasets containing newspaper articles. Language analysis reaches 57.6%, spatial analysis reaches 91.6%, and combined analysis achieves the best result, 92.9%. The work shows that the language model can be used to identify reading order but the use of additional data (e.g. spatial data

Keywords

Reading order, Language model, Language analysis, Spatial analysis

Degree Programme
Information Technology and Artificial Intelligence, Specialization Information Systems and Databases
Status
defended, grade A
Date
21 June 2022
Committee
Kolář Dušan, doc. Dr. Ing. (DIFS FIT BUT), chair
Bartík Vladimír, Ing., Ph.D. (DIFS FIT BUT), member
Hruška Tomáš, prof. Ing., CSc. (DIFS FIT BUT), member
Hynek Jiří, Ing., Ph.D. (DIFS FIT BUT), member
Veselý Vladimír, Ing., Ph.D. (DIFS FIT BUT), member
Vojnar Tomáš, prof. Ing., Ph.D. (DITS FIT BUT), member
Citation
HOLUBEC, Michael. Uspořádání fragmentů textu s pomocí jazykového modelu. Brno, 2022. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2022-06-21. Supervised by Beneš Karel. Available from: https://www.fit.vut.cz/study/thesis/23379/
BibTeX
@mastersthesis{FITMT23379,
    author = "Michael Holubec",
    type = "Master's thesis",
    title = "Uspo\v{r}\'{a}d\'{a}n\'{i} fragment\r{u} textu s pomoc\'{i} jazykov\'{e}ho modelu",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2022,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/23379/"
}