Thesis Details

Metody extrakce informace z textových dokumentů

Master's Thesis

Student: Sychra Tomáš
Academic Year: 2007/2008
Supervisor: Bartík Vladimír, Ing., Ph.D.

English title

Methods for Information Extraction in Text Documents

Language

Czech

Abstract

Knowledge discovery in text documents is a branch of data mining, but text documents differ in their properties from conventional databases. This thesis gives an overview of methods for knowledge discovery in text documents. The most common task in this area is document classification, and several approaches to text classification are described. Finally, the thesis presents the Winnow algorithm, which is expected to outperform other classification algorithms. It also describes an implementation of Winnow and summarizes the experimental results.
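
For readers unfamiliar with Winnow, the following is a minimal sketch of the Positive Winnow variant named in the keywords, and not the implementation from the thesis. It assumes each document is represented as a set of active feature indices (for example, indices of the words it contains). The function names, the threshold choice `theta = n_features`, and the factors `alpha = 2.0` and `beta = 0.5` are illustrative assumptions.

```python
from typing import Iterable, List, Set, Tuple

Example = Tuple[Set[int], bool]  # (active feature indices, is_positive)


def train_positive_winnow(
    examples: Iterable[Example],
    n_features: int,
    alpha: float = 2.0,   # promotion factor (> 1), assumed value
    beta: float = 0.5,    # demotion factor (between 0 and 1), assumed value
    epochs: int = 10,
) -> Tuple[List[float], float]:
    """Mistake-driven training with multiplicative weight updates."""
    examples = list(examples)
    weights = [1.0] * n_features      # all weights start at 1
    theta = float(n_features)         # assumed threshold choice
    for _ in range(epochs):
        mistakes = 0
        for active, label in examples:
            score = sum(weights[i] for i in active)
            predicted = score >= theta
            if label and not predicted:
                # False negative: promote weights of the active features.
                for i in active:
                    weights[i] *= alpha
                mistakes += 1
            elif predicted and not label:
                # False positive: demote weights of the active features.
                for i in active:
                    weights[i] *= beta
                mistakes += 1
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights, theta


def predict(weights: List[float], theta: float, active: Set[int]) -> bool:
    """Classify a document given its active feature indices."""
    return sum(weights[i] for i in active) >= theta


# Toy data: a document is positive exactly when feature 0 is present.
toy_examples = [
    ({0, 1}, True),
    ({1, 2}, False),
    ({0, 3}, True),
    ({2, 3}, False),
]
toy_weights, toy_theta = train_positive_winnow(toy_examples, n_features=4)
```

Weights change only when the classifier makes a mistake, and only for features present in the misclassified document, which is why the method suits sparse, high-dimensional text data. Balanced Winnow, also listed in the keywords, keeps two weight vectors (positive and negative) per feature and is not shown here.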

Keywords

text documents, information extraction, knowledge discovery, classification, categorization, linear classification, Winnow, Balanced Winnow, Positive Winnow

Department

Department of Information Systems FIT BUT

Degree Programme

Information Technology, Field of Study Information Systems

Files

Thesis text 1.6 MB

Status

defended, grade A

Date

22 February 2008

Reviewer

Burget Radek, doc. Ing., Ph.D.

Committee

Hruška Tomáš, prof. Ing., CSc. (DIFS FIT BUT), chair
Burget Radek, doc. Ing., Ph.D. (DIFS FIT BUT), member
Češka Milan, prof. RNDr., CSc. (DITS FIT BUT), member
Matoušek Petr, doc. Ing., Ph.D., M.A. (DIFS FIT BUT), member
Motyčka Arnošt, doc. Ing., CSc. (Mendelu), member
Švéda Miroslav, prof. Ing., CSc. (DIFS FIT BUT), member

Citation

SYCHRA, Tomáš. Metody extrakce informace z textových dokumentů. Brno, 2008. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2008-02-22. Supervised by Bartík Vladimír. Available from: https://www.fit.vut.cz/study/thesis/4772/

BibTeX

@mastersthesis{FITMT4772,
    author = "Tom\'{a}\v{s} Sychra",
    type = "Master's thesis",
    title = "Metody extrakce informace z textov\'{y}ch dokument\r{u}",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2008,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/4772/"
}
