Thesis Details

Multi-Criteria Clustering of Files

Master's Thesis
Student: Jasnický Matúš
Academic Year: 2020/2021
Supervisor: Zobal Lukáš, Ing.
Czech title
Multikriteriální shlukování souborů
Language
English
Abstract

This work aims to create the clustering part of a new version of Clusty, a clustering tool developed by Avast Software. Clusty is a tool for automatic analysis and online clustering of all incoming samples. The most notable shortcomings of the current version are the use of a single clustering criterion, vertical-only scalability, and a lack of support for high availability. Its strengths include good performance, interpretability of the clusters' origin, and the ability to use other techniques such as YARA rules. The designed tool overcomes these shortcomings while keeping the strengths. None of the existing clustering methods is used because none of them satisfied the requirements. Instead, three new methods are proposed, based on the method in the current version of Clusty and on standard methods. The tool uses so-called rules to allow multiple clustering methods to be used concurrently. The clustering results can be considered better than those of the current version. This work proposes a solution to the shortcomings and demonstrates usable clustering methods.

Keywords

clustering, malware, file clustering, malware clustering, online clustering, multi-criteria clustering

Department
Degree Programme
Information Technology and Artificial Intelligence, Specialization Cybersecurity
Files
Reason for publication postponement

In accordance with § 47b par. 4 of Act No. 111/1998 Coll., on Higher Education Institutions and on Amendments and Supplements to Other Acts (Higher Education Act), as amended, the publication of the diploma thesis is postponed by 3 years. The reason for the postponement is the protection of intellectual property and the fact that the thesis contains a trade secret within the meaning of the relevant provisions of Act No. 89/2012 Coll., Civil Code.

Status
defended, grade B
Date
22 June 2021
Reviewer
Committee
Hanáček Petr, doc. Dr. Ing. (DITS FIT BUT), chair
Drábek Vladimír, doc. Ing., CSc. (DCSY FIT BUT), member
Drahanský Martin, prof. Ing., Dipl.-Ing., Ph.D. (DITS FIT BUT), member
Holík Lukáš, doc. Mgr., Ph.D. (DITS FIT BUT), member
Malinka Kamil, Mgr., Ph.D. (DITS FIT BUT), member
Veselý Vladimír, Ing., Ph.D. (DIFS FIT BUT), member
Citation
JASNICKÝ, Matúš. Multi-Criteria Clustering of Files. Brno, 2021. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2021-06-22. Supervised by Zobal Lukáš. Available from: https://www.fit.vut.cz/study/thesis/23730/
BibTeX
@mastersthesis{FITMT23730,
    author = "Mat\'{u}\v{s} Jasnick\'{y}",
    type = "Master's thesis",
    title = "Multi-Criteria Clustering of Files",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2021,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/23730/"
}