Thesis Details

Beat Grep with Counters, Challenge

Master's Thesis Student: Horký Michal Academic Year: 2020/2021 Supervisor: Holík Lukáš, doc. Mgr., Ph.D.
Czech title
Rychlejší než grep pomocí čítačů
Language
English
Abstract

Regular expression matching has an irreplaceable role in software development. The speed of the matching is crucial since it can have a significant impact on the overall usability of the software. However, standard approaches for regular expression matching suffer from high complexity computation for some kinds of regexes. This makes them vulnerable to attacks based on high complexity evaluation of regexes (so-called ReDoS attacks). Regexes with counting operators, which often occurs in practice, are one of such kind. Succinct representation and fast matching of such regexes can be archived by using a novel counting-set automaton. We present a C++ implementation of a matching algorithm based on the counting-set automaton. The implementation is done within the RE2 library, which is a fast state-of-the-art regular expression matcher. We perform experiments on real-life regexes. The experiments show that implementation within the RE2 is faster than the original C# implementation.

Keywords

regular expression, regex matching, bounded repetition, counting automata, counting-set automata, RE2, C++, C#

Department
Degree Programme
Information Technology, Field of Study Information Systems
Files
Status
defended, grade B
Date
21 June 2021
Reviewer
Committee
Hruška Tomáš, prof. Ing., CSc. (DIFS FIT BUT), předseda
Burget Radek, doc. Ing., Ph.D. (DIFS FIT BUT), člen
Grégr Matěj, Ing., Ph.D. (DIFS FIT BUT), člen
Kreslíková Jitka, doc. RNDr., CSc. (DIFS FIT BUT), člen
Meduna Alexander, prof. RNDr., CSc. (DIFS FIT BUT), člen
Očenášek Pavel, Mgr. Ing., Ph.D. (DIFS FIT BUT), člen
Citation
HORKÝ, Michal. Beat Grep with Counters, Challenge. Brno, 2021. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2021-06-21. Supervised by Holík Lukáš. Available from: https://www.fit.vut.cz/study/thesis/22461/
BibTeX
@mastersthesis{FITMT22461,
    author = "Michal Hork\'{y}",
    type = "Master's thesis",
    title = "Beat Grep with Counters, Challenge",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2021,
    location = "Brno, CZ",
    language = "english",
    url = "https://www.fit.vut.cz/study/thesis/22461/"
}
Back to top