Thesis Details

Neural Networks for Automatic Table Recognition

Master's Thesis
Student: Piwowarski Lukáš
Academic Year: 2021/2022
Supervisor: Hradiš Michal, Ing., Ph.D.
Czech title
Automatické rozpoznávání tabulek pomocí neuronových sítí
Language
English
Abstract

This thesis introduces the reader to current table recognition techniques, used mainly to extract information from historical handwritten and printed tables. We also introduce a method based on a graph neural network, inspired by the presented techniques. The method consists of three phases: graph initialization, node and edge classification, and graph-to-text transformation. In the graph initialization phase, we use a node visibility algorithm and OCR to create an initial graph representation of the input table. In the node and edge classification phase, the nodes and edges of this graph are classified. In the graph-to-text transformation phase, we fit the graph's nodes into a grid, which is then used to produce the final text representation of the table. On the ABP dataset, the implemented model achieved a precision of 68 % for horizontal neighbour detection, 71 % for vertical neighbour detection, and 85 % for cell detection.
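
The three phases described in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: the names `Cell`, `init_edges`, `grid_index` and `table_to_text` are hypothetical, the node visibility algorithm is reduced here to an assumed "nearest overlapping box to the right and below" rule, OCR output is assumed to be given as text boxes, and the learned node/edge classification phase is skipped (every initial edge is kept).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """A text box as OCR might return it (hypothetical structure)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def overlaps(a0, a1, b0, b1):
    """True if the intervals (a0, a1) and (b0, b1) share some length."""
    return min(a1, b1) > max(a0, b0)


def init_edges(cells):
    """Phase 1 (simplified): link each box to the nearest box to its
    right and the nearest box below it. Returns index pairs (i, j)."""
    horizontal, vertical = [], []
    for i, a in enumerate(cells):
        right = [(b.x0, j) for j, b in enumerate(cells)
                 if j != i and b.x0 >= a.x1
                 and overlaps(a.y0, a.y1, b.y0, b.y1)]
        below = [(b.y0, j) for j, b in enumerate(cells)
                 if j != i and b.y0 >= a.y1
                 and overlaps(a.x0, a.x1, b.x0, b.x1)]
        if right:
            horizontal.append((i, min(right)[1]))
        if below:
            vertical.append((i, min(below)[1]))
    return horizontal, vertical


def grid_index(n, edges):
    """Longest-path position of each node along directed edges (i -> j).
    The edges always point right or down, so they form a DAG and
    n relaxation passes are enough to converge."""
    index = [0] * n
    for _ in range(n):
        for i, j in edges:
            index[j] = max(index[j], index[i] + 1)
    return index


def table_to_text(cells, sep=" | "):
    """Phase 3 (simplified): fit nodes into a grid and render it as text.
    Phase 2 (classification) is omitted; all edges are assumed correct."""
    if not cells:
        return ""
    horizontal, vertical = init_edges(cells)
    cols = grid_index(len(cells), horizontal)
    rows = grid_index(len(cells), vertical)
    grid = [[""] * (max(cols) + 1) for _ in range(max(rows) + 1)]
    for cell, r, c in zip(cells, rows, cols):
        grid[r][c] = cell.text
    return "\n".join(sep.join(row) for row in grid)


# Made-up 2x2 example table.
example = [
    Cell("Name", 0, 0, 10, 5), Cell("Year", 20, 0, 30, 5),
    Cell("Anna", 0, 10, 10, 15), Cell("1890", 20, 10, 30, 15),
]
print(table_to_text(example))
```

In a learned pipeline, a classifier would sit between `init_edges` and `grid_index` and discard or relabel edges before the grid is built; the sketch only shows how neighbour relations could be turned into row and column positions.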

Keywords

table recognition, graph neural network, transformer neural network, edge discovery, node discovery, optical character recognition, table recognition datasets, graph initialization, table recognition evaluation

Status
defended, grade A
Date
20 June 2022
Committee
Zemčík Pavel, prof. Dr. Ing. (DCGM FIT BUT), chair
Beran Vítězslav, Ing., Ph.D. (DCGM FIT BUT), member
Čadík Martin, doc. Ing., Ph.D. (DCGM FIT BUT), member
Juránek Roman, Ing., Ph.D. (DCGM FIT BUT), member
Křivka Zbyněk, Ing., Ph.D. (DIFS FIT BUT), member
Milet Tomáš, Ing., Ph.D. (DCGM FIT BUT), member
Citation
PIWOWARSKI, Lukáš. Neural Networks for Automatic Table Recognition. Brno, 2022. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2022-06-20. Supervised by Hradiš Michal. Available from: https://www.fit.vut.cz/study/thesis/24864/
BibTeX
@mastersthesis{FITMT24864,
    author = "Luk\'{a}\v{s} Piwowarski",
    type = "Master's thesis",
    title = "Neural Networks for Automatic Table Recognition",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2022,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/24864/"
}