Publication details
HTML Document Analysis for Information Extraction
BURGET, R. HTML Document Analysis for Information Extraction. Proceedings of 8th EEICT conference. Brno: Faculty of Information Technology BUT, 2002. p. 426-430. ISBN: 80-214-2116-9.
Type
conference proceedings paper
Language
English
Authors
Radek Burget
Abstract
Today's World Wide Web contains a vast amount of information stored in HTML documents. However, the HTML language primarily describes the look of the documents and does not contain facilities for describing the structure of the contained data. In this paper we propose a model of a Web site that describes the logical structure of the contained data. Furthermore, we propose methods for creating such a model by analyzing the look and the structure of HTML documents.
Keywords
HTML Analysis, Information Extraction
Year
2002
Pages
426–430
Proceedings
Proceedings of 8th EEICT conference
Conference
Student EEICT 2002
ISBN
80-214-2116-9
Publisher
Faculty of Information Technology BUT
Place
Brno
BibTeX
@inproceedings{BUT10014,
author="Radek {Burget}",
title="HTML Document Analysis for Information Extraction",
booktitle="Proceedings of 8th EEICT conference",
year="2002",
pages="426--430",
publisher="Faculty of Information Technology BUT",
address="Brno",
isbn="80-214-2116-9"
}