Thesis Details
Information Extraction from Wikipedia (Extrakce informací z Wikipedie)
This bachelor's thesis deals with converting Wikipedia articles into vertical text. The vertical-text corpus is processed with linguistic tools and converted into a format suitable for indexing, and the annotated articles are inserted into an MG4J index. Statistics are then gathered on the number of occurrences of phrases that contain certain types of relations, search patterns are created from the most frequent occurrences, and the patterns are generalized. The thesis also covers the design of a system for extracting selected relations from the articles in the index, its implementation and functional testing, and finally running extractions based on the patterns and evaluating the extraction results.
Information extraction, Wikipedia, text conversion, knowledge base, vertical text, indexing, MG4J, annotation of names and titles, pattern creation and generalization, information extraction system, relation extraction.
Kořenek Jan, doc. Ing., Ph.D. (DCSY FIT BUT), member
Květoňová Šárka, Ing., Ph.D. (DIFS FIT BUT), member
Španěl Michal, Ing., Ph.D. (DCGM FIT BUT), member
Zbořil František, doc. Ing., Ph.D. (DITS FIT BUT), member
@bachelorsthesis{FITBT17549,
    author = "Miroslav Posp\'{i}\v{s}il",
    type = "Bachelor's thesis",
    title = "Extrakce informac\'{i} z Wikipedie",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2015,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/17549/"
}