Neural Networks for Automatic Table Recognition

Master's Thesis
Student: Piwowarski Lukáš
Academic Year: 2021/2022
Supervisor: Hradiš Michal, Ing., Ph.D.
Automatické rozpoznávání tabulek pomocí neuronových sítí

This thesis introduces the reader to current table recognition techniques, which are mainly used to extract information from historical handwritten and printed tables. It also proposes a method based on a graph neural network that is inspired by the presented techniques. The method consists of three phases: graph initialization, node and edge classification, and graph-to-text transformation. In the graph initialization phase, a node visibility algorithm and OCR are used to create an initial graph representation of the input table. In the node and edge classification phase, the nodes and edges of this graph are classified. In the graph-to-text transformation phase, the graph's nodes are fitted into a grid, which is then used to produce the final text representation of the table. On the ABP dataset, the implemented model achieved a precision of 68 % for horizontal neighbour detection, 71 % for vertical neighbour detection, and 85 % for cell detection.
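The graph initialization phase and the neighbour detection precision can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Box` type, the function names and the nearest-neighbour rule are assumptions made for illustration, and the rule is a simplified stand-in for the node visibility algorithm. Each box stands for one OCR text region and is linked to the closest box to its right and the closest box below it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned text region, e.g. as returned by an OCR engine (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str = ""


def overlaps(a0, a1, b0, b1):
    """True if the intervals (a0, a1) and (b0, b1) share more than a point."""
    return min(a1, b1) > max(a0, b0)


def right_neighbour(i, boxes):
    """Index of the nearest box to the right of boxes[i] on the same row band, or None."""
    a = boxes[i]
    candidates = [
        j for j, b in enumerate(boxes)
        if j != i and b.x0 >= a.x1 and overlaps(a.y0, a.y1, b.y0, b.y1)
    ]
    return min(candidates, key=lambda j: boxes[j].x0, default=None)


def bottom_neighbour(i, boxes):
    """Index of the nearest box below boxes[i] in the same column band, or None."""
    a = boxes[i]
    candidates = [
        j for j, b in enumerate(boxes)
        if j != i and b.y0 >= a.y1 and overlaps(a.x0, a.x1, b.x0, b.x1)
    ]
    return min(candidates, key=lambda j: boxes[j].y0, default=None)


def build_graph(boxes):
    """Undirected edges (i, j) with i < j linking each box to its nearest neighbours."""
    edges = set()
    for i in range(len(boxes)):
        for j in (right_neighbour(i, boxes), bottom_neighbour(i, boxes)):
            if j is not None:
                edges.add((min(i, j), max(i, j)))
    return edges


def precision(predicted, truth):
    """Fraction of predicted edges that also appear in the ground truth."""
    predicted, truth = set(predicted), set(truth)
    return len(predicted & truth) / len(predicted) if predicted else 0.0


# A 2 x 2 table: two rows, two columns.
boxes = [
    Box(0, 0, 10, 5, "a"), Box(20, 0, 30, 5, "b"),
    Box(0, 10, 10, 15, "c"), Box(20, 10, 30, 15, "d"),
]
edges = build_graph(boxes)
```

In this sketch the graph for the 2 x 2 table contains the four edges between horizontally and vertically adjacent cells. Calling `precision` on a set of predicted edges against such a reference set gives a value comparable in kind to the neighbour detection precisions reported above, although the thesis may define its metrics differently.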


table recognition, graph neural network, transformer neural network, edge discovery, node discovery, optical character recognition, table recognition datasets, graph initialization, table recognition evaluation

defended, grade A
20 June 2022
