Thesis Details
Extrakce dat z popisu zboží (Data Extraction from Product Descriptions)
This work concentrates on the design and implementation of automated support for data extraction from product descriptions; the resulting system is intended for use in e-shops. The work introduces current approaches to information extraction from HTML documents, focusing chiefly on wrappers and methods for their induction. The visual approach to information extraction is also mentioned. The design part of the work describes the system requirements and basic principles, followed by a detailed description of a path tracing algorithm in the document object model. The last section evaluates the results of experiments carried out with the implemented system.
Information extraction, wrapper, wrapper induction, webshop, e-shop, JavaScript, DOM.
Burget Radek, doc. Ing., Ph.D. (DIFS FIT BUT), member
Drahanský Martin, prof. Ing., Dipl.-Ing., Ph.D. (DITS FIT BUT), member
Matoušek Petr, doc. Ing., Ph.D., M.A. (DIFS FIT BUT), member
Šafařík Jiří, prof. Ing., CSc. (WBU in Pilsen), member
Vojnar Tomáš, prof. Ing., Ph.D. (DITS FIT BUT), member
@mastersthesis{FITMT7080,
  author = "Vojt\v{e}ch Sl\'{a}ma",
  type = "Master's thesis",
  title = "Extrakce dat z popisu zbo\v{z}\'{i}",
  school = "Brno University of Technology, Faculty of Information Technology",
  year = 2008,
  location = "Brno, CZ",
  language = "czech",
  url = "https://www.fit.vut.cz/study/thesis/7080/"
}