Thesis Details

Bioinformatic Tool for Classification of Bacteria into Taxonomic Categories Based on the Sequence of 16S rRNA Gene

Master's Thesis
Student: Valešová Nikola
Academic Year: 2018/2019
Supervisor: Smatana Stanislav, Ing.
Czech title
Bioinformatický nástroj pro klasifikaci bakterií do taxonomických kategorií na základě sekvence genu 16S rRNA
Language
English
Abstract

This thesis addresses the automated classification and recognition of bacteria from DNA obtained by sequencing. It designs and describes a new classification method based on the 16S rRNA gene segment. The method follows the tree structure of taxonomic categories and uses well-known machine learning algorithms to classify bacteria into one of the classes at the lower taxonomic level. Part of the thesis is dedicated to the implementation of the described algorithm and the evaluation of its prediction accuracy. The performance of various classifier types and their settings is examined, and the setting with the best accuracy is determined. The accuracy of the implemented algorithm is also compared with several existing methods. During validation, the implemented KTC application reached more than 45% accuracy on genus prediction on both the BLAST 16S and BLAST V4 datasets. The thesis concludes with several possibilities for improving and extending the current implementation of the algorithm.
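
The tree-structured principle described in the abstract can be illustrated with a minimal sketch. It is not the KTC implementation from the thesis: the k-mer frequency features, the nearest-centroid classifier at each node, and all names (`kmer_profile`, `NodeClassifier`, `train_tree`, `classify`, the toy lineages) are assumptions made for illustration only. The sketch trains one classifier per node of the taxonomy tree and classifies a sequence by descending from the root, choosing one child category at each level.

```python
from collections import Counter, defaultdict
from itertools import product

K = 3  # hypothetical k-mer length, chosen only for this illustration
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_profile(seq):
    """Return the normalized frequency of every DNA k-mer in seq."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[k] for k in KMERS) or 1  # avoid division by zero
    return [counts[k] / total for k in KMERS]


class NodeClassifier:
    """Nearest-centroid stand-in for whatever classifier a node would use."""

    def fit(self, profiles, labels):
        grouped = defaultdict(list)
        for profile, label in zip(profiles, labels):
            grouped[label].append(profile)
        # One centroid (mean profile) per child category of this node.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict(self, profile):
        def sq_dist(label):
            centroid = self.centroids[label]
            return sum((a - b) ** 2 for a, b in zip(profile, centroid))
        return min(self.centroids, key=sq_dist)


def train_tree(samples):
    """Train one classifier per taxonomy node.

    samples: list of (sequence, lineage) pairs, where lineage is a tuple of
    category names ordered from the highest to the lowest taxonomic level.
    Returns a dict mapping each lineage prefix (a tree node) to its classifier.
    """
    rows_by_node = defaultdict(list)
    for seq, lineage in samples:
        profile = kmer_profile(seq)
        for depth in range(len(lineage)):
            rows_by_node[lineage[:depth]].append((profile, lineage[depth]))
    return {
        node: NodeClassifier().fit(*zip(*rows))
        for node, rows in rows_by_node.items()
    }


def classify(tree, seq):
    """Descend from the root, picking one child category per level."""
    profile = kmer_profile(seq)
    path = ()
    while path in tree:
        path += (tree[path].predict(profile),)
    return path


# Made-up toy data, not taken from the BLAST 16S or BLAST V4 datasets.
toy_samples = [
    ("AAAAAAAAAAAAAAAA", ("PhylumA", "GenusA1")),
    ("AAAAAAAACCCCCCCC", ("PhylumA", "GenusA2")),
    ("GGGGGGGGGGGGGGGG", ("PhylumB", "GenusB1")),
    ("GGGGGGGGTTTTTTTT", ("PhylumB", "GenusB2")),
]
toy_tree = train_tree(toy_samples)
print(classify(toy_tree, "AAAAAAAAAAAAAAAC"))
```

Because each node only has to separate its own child categories, any classifier with a fit/predict interface could replace `NodeClassifier` here, which is one way to compare classifier types and settings as the abstract describes.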

Keywords

Machine learning, metagenomics, bacteria classification, phylogenetic tree, taxonomy, 16S rRNA, DNA sequencing, scikit-learn

Degree Programme
Information Technology, Field of Study Intelligent Systems
Status
defended, grade A
Date
17 June 2019
Committee
Zbořil František, doc. Ing., Ph.D. (DITS FIT BUT), chair
Bidlo Michal, doc. Ing., Ph.D. (DCSY FIT BUT), member
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT), member
Grézl František, Ing., Ph.D. (DCGM FIT BUT), member
Lucká Mária, prof. RNDr., Ph.D. (FIIT STU), member
Rogalewicz Adam, doc. Mgr., Ph.D. (DITS FIT BUT), member
Citation
VALEŠOVÁ, Nikola. Bioinformatic Tool for Classification of Bacteria into Taxonomic Categories Based on the Sequence of 16S rRNA Gene. Brno, 2019. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2019-06-17. Supervised by Smatana Stanislav. Available from: https://www.fit.vut.cz/study/thesis/21517/
BibTeX
@mastersthesis{FITMT21517,
    author = "Nikola Vale\v{s}ov\'{a}",
    type = "Master's thesis",
    title = "Bioinformatic Tool for Classification of Bacteria into Taxonomic Categories Based on the Sequence of 16S rRNA Gene",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2019,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/21517/"
}