Thesis Details
Page Layout Analysis of Text Documents Using Deep Neural Networks (Analýza rozložení stran textových dokumentů pomocí hlubokých neuronových sítí)
The goal of this thesis is to create a tool for analyzing the page layouts of text documents. The problem is solved using convolutional neural networks, specifically the U-Net architecture. The network is trained with a cross-entropy loss function weighted by a weight map. Paragraph regions are obtained through connected component analysis. Experiments are evaluated using the Symmetric Best Dice object metric. They show that it is better to use all paragraph edges than to focus only on vertical paragraph edges. In addition, batch sampling strategies and adaptive resolution improve the analysis results. The experiments also cover the use of separators, which is useful when analyzing multi-column documents.
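The two post-processing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names are hypothetical, masks are plain lists of 0/1 rows, and 4-connectivity is an assumption (the abstract does not state which connectivity is used). Symmetric Best Dice is computed here in its commonly used form: for each region in one set take the best Dice score against the other set, average these, and take the minimum over both directions.

```python
from collections import deque


def label_components(mask):
    """Return the 4-connected foreground regions of a binary mask.

    `mask` is a list of rows of 0/1 values; each region is returned
    as a set of (row, col) pixel coordinates.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            region = set()
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def dice(a, b):
    """Dice coefficient of two pixel sets."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def best_dice(xs, ys):
    """Average, over regions in xs, of the best Dice score against ys."""
    if not xs:
        return 1.0 if not ys else 0.0
    return sum(max((dice(x, y) for y in ys), default=0.0) for x in xs) / len(xs)


def symmetric_best_dice(pred_regions, true_regions):
    """Minimum of the best-Dice scores taken in both directions."""
    return min(best_dice(pred_regions, true_regions),
               best_dice(true_regions, pred_regions))


# Toy example: the prediction splits one ground-truth region into two.
predicted = [[1, 1, 0, 1],
             [1, 1, 0, 1],
             [0, 0, 0, 0],
             [1, 1, 1, 0]]
truth = [[1, 1, 0, 1],
         [1, 1, 0, 1],
         [0, 0, 0, 1],
         [1, 1, 1, 1]]

score = symmetric_best_dice(label_components(predicted),
                            label_components(truth))
print(f"Symmetric Best Dice: {score:.3f}")
```

Taking the minimum of the two directions penalizes both over-segmentation (one paragraph split into several regions) and under-segmentation (several paragraphs merged into one), which is why an object-level metric of this kind suits paragraph detection better than a plain pixel accuracy.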
computer vision, deep neural networks, page layout analysis, image segmentation, U-Net, artificial intelligence
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Čadík Martin, doc. Ing., Ph.D. (DCGM FIT BUT), member
Křivka Zbyněk, Ing., Ph.D. (DIFS FIT BUT), member
Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), member
@bachelorsthesis{FITBT20900, author = "David Endrych", type = "Bachelor's thesis", title = "Anal\'{y}za rozlo\v{z}en\'{i} stran textov\'{y}ch dokument\r{u} pomoc\'{i} hlubok\'{y}ch neuronov\'{y}ch s\'{i}t\'{i}", school = "Brno University of Technology, Faculty of Information Technology", year = 2019, location = "Brno, CZ", language = "czech", url = "https://www.fit.vut.cz/study/thesis/20900/" }