Result detail

Multi-aspect Document Content Analysis using Ontological Modelling

MILIČKA, M.; BURGET, R. Multi-aspect Document Content Analysis using Ontological Modelling. Proceedings of 9th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2014). Smolenice: Vydavateľstvo STU, 2014. p. 9-12. ISBN: 978-80-227-4267-2.

Type

conference proceedings paper

Language

English

Authors

Milička Martin, Ing., UIFS (FIT)
Burget Radek, doc. Ing., Ph.D., UIFS (FIT)

Abstract

Existing methods of information extraction from web documents are usually based on a single aspect of the document or its contents, such as the code, textual features or visual features. Due to the great variability of available online documents, it seems reasonable to combine multiple kinds of analysis in order to use all the available knowledge for identifying particular information in the document. In this paper, we propose an ontological document model that allows the results of the analysis of different document aspects to be integrated. We propose a generic architecture of an information extraction system based on this model and we show its applicability on a practical example.
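
The idea of integrating several kinds of analysis in one model can be illustrated with a minimal sketch. It is not the ontology or the architecture from the paper: the `ex:` property names, the two analyser functions and the title rule are all hypothetical, and plain Python tuples stand in for RDF triples. It only shows how statements produced by independent analyses of the same page area could be merged and then queried together.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical analysers: each one looks at a single aspect of a page area.
def visual_analysis(area_id: str) -> list[Triple]:
    return [Triple(area_id, "ex:fontSize", "24"),
            Triple(area_id, "ex:fontWeight", "bold")]


def text_analysis(area_id: str) -> list[Triple]:
    return [Triple(area_id, "ex:text", "Multi-aspect Document Content Analysis"),
            Triple(area_id, "ex:looksLike", "ex:Title")]


def merge(*results: list[Triple]) -> dict[str, dict[str, str]]:
    """Combine triples from all analysers into one description per subject."""
    model: dict[str, dict[str, str]] = defaultdict(dict)
    for triples in results:
        for t in triples:
            model[t.subject][t.predicate] = t.obj
    return dict(model)


def is_title(description: dict[str, str]) -> bool:
    """Toy rule that needs evidence from both the visual and the textual aspect."""
    return (description.get("ex:fontWeight") == "bold"
            and description.get("ex:looksLike") == "ex:Title")


model = merge(visual_analysis("ex:area1"), text_analysis("ex:area1"))
print(is_title(model["ex:area1"]))
```

Because both analysers describe the same subject identifier, the rule can draw on visual and textual evidence at once, which neither analysis provides on its own.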

Keywords

document modeling, information extraction, page segmentation, content classification, ontology, RDF

Year

2014

Pages

9–12

Proceedings

Proceedings of 9th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2014)

Conference

9th Workshop on Intelligent and Knowledge Oriented Technologies

ISBN

978-80-227-4267-2

Publisher

Vydavateľstvo STU

Place

Smolenice

BibTeX

@inproceedings{BUT111652,
  author="Martin {Milička} and Radek {Burget}",
  title="Multi-aspect Document Content Analysis using Ontological Modelling",
  booktitle="Proceedings of 9th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2014)",
  year="2014",
  pages="9--12",
  publisher="Vydavateľstvo STU",
  address="Smolenice",
  isbn="978-80-227-4267-2",
  url="https://www.fit.vut.cz/research/publication/10724/"
}

Files

pdf wikt_burget.pdf 142 kB

Projects

IT4Innovations Centre of Excellence, MŠMT, Operational Programme Research and Development for Innovations, ED1.1.00/02.0070, start: 2011-01-01, end: 2015-12-31, completed
Research on advanced ICT methods and their applications, VUT, VUT internal projects, FIT-S-14-2299, start: 2014-01-01, end: 2016-12-31, completed

Research groups

Information and Database Systems Research Group (VZ IS)

Department

Department of Information Systems (UIFS)