Thesis Details

Sémantická podobnost textů

Bachelor's Thesis
Student: Bradáč Václav
Academic Year: 2014/2015
Supervisor: Smrž Pavel, doc. RNDr., Ph.D.
English title
Semantic Similarity of Texts
Language
Czech
Abstract

This thesis deals with determining the semantic similarity of texts, with a focus on scalability. It includes a theoretical overview of the tools used to implement the system on test data. The test corpus consists of scholarly articles in English. The aim is to analyze these articles, preprocessed to make it easier to find their semantically similar counterparts. One of the most heavily used tools is the representation of data in a vector space model.

Keywords

Semantic similarity, TF-IDF, Latent semantic analysis, Latent semantic indexing, Singular value decomposition, Latent Dirichlet allocation, Python, Gensim, PHP, Elasticsearch, MoreLikeThis

Department
Degree Programme
Information Technology
Status
defended, grade E
Date
16 June 2015
Reviewer
Committee
Meduna Alexander, prof. RNDr., CSc. (DIFS FIT BUT), chair
Beran Vítězslav, doc. Ing., Ph.D. (DCGM FIT BUT), member
Drábek Vladimír, doc. Ing., CSc. (DCSY FIT BUT), member
Křena Bohuslav, Ing., Ph.D. (DITS FIT BUT), member
Očenášek Pavel, Mgr. Ing., Ph.D. (DIFS FIT BUT), member
Citation
BRADÁČ, Václav. Sémantická podobnost textů. Brno, 2015. Bachelor's Thesis. Brno University of Technology, Faculty of Information Technology. 2015-06-16. Supervised by Smrž Pavel. Available from: https://www.fit.vut.cz/study/thesis/17736/
BibTeX
@bachelorsthesis{FITBT17736,
    author = "V\'{a}clav Brad\'{a}\v{c}",
    type = "Bachelor's thesis",
    title = "S\'{e}mantick\'{a} podobnost text\r{u}",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2015,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/17736/"
}