This page contains the Pattern_Match Package documentation.
Bases: netbench.pattern_match.b_symbol.b_Symbol
A base class to represent a char symbol.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised. The behavior of this method is controlled by the greedy and limit attributes. The limit attribute has higher priority than greedy: limit sets the exact number of accepted characters, and if it is set to -1, the greedy attribute is used instead. If greedy is True, the accept() method consumes as many characters from the input string as possible. If greedy is False, the accept() method consumes only the minimal number of characters from the input string with respect to the m parameter.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
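The interaction of limit and greedy described above can be illustrated with a small standalone sketch. It is not the Netbench implementation: the function name accept_repeated, the AcceptError class and the reading of m and n as minimum and maximum repetition counts of a single character are assumptions made only for this example.

```python
class AcceptError(Exception):
    """Hypothetical stand-in for accept_exception."""


def accept_repeated(text, char, m, n, greedy=True, limit=-1):
    """Consume a run of `char` from the start of `text`.

    m and n are assumed to be the minimum and maximum repetition counts.
    """
    # Count the leading characters that match, capped at n.
    run = 0
    while run < len(text) and run < n and text[run] == char:
        run += 1

    if limit != -1:
        # limit has priority: exactly `limit` characters are consumed.
        count = limit
    elif greedy:
        # Greedy: consume as many characters as possible.
        count = run
    else:
        # Non-greedy: consume only the minimum required by m.
        count = m

    # Assumption: the consumed count must lie between m and the run length.
    if count < m or count > run:
        raise AcceptError("symbol is not at the beginning of the text")
    return text[count:]
```

With text "aaab", char "a", m=1 and n=3, the greedy call leaves "b", the non-greedy call leaves "aab", and limit=2 leaves "ab" regardless of the greedy setting.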
This symbol is used only for representing PCRE constraint repetition block. This method implements prefix collision.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
Compute collision between self and compSymbol.
NOTE: This method is not implemented.
Parameter: | other (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Compute double stride using self and compSymbol. This method should be called only by double_stride method.
NOTE: This method is not implemented.
Parameters: |
|
---|---|
Returns: | New strided symbol. |
Return type: | sym_kchar |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded in string without whitespace chars. First char is used for symbol class specification and therefore is not part of the encoded symbol. Symbol class specification char for this symbol class is defined by b_symbol.io_mapper[“b_Sym_cnt_constr”].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return value of the greedy attribute.
Returns: | Value of the greedy attribute. |
---|---|
Return type: | boolean |
Return value of the limit attribute.
Returns: | Value of the limit attribute. |
---|---|
Return type: | int |
Return supported types of symbols for current type of symbol.
Returns: | Supported types of symbols for current type of symbol. |
---|---|
Return type: | list(int) |
Creates symbol from its string representation compatible with FSM tools. See export method for more details.
Parameters: |
|
---|
Return True if symbol is empty. False if is not empty.
Returns: | True if symbol is empty. False if is not empty. |
---|---|
Return type: | boolean |
Sets the greedy attribute of this symbol. This attribute controls the behavior of the accept() method.
Parameter: | greedy (boolean) – New value of the attribute: If greedy is True, the accept() method consumes as many characters from the input string as possible. If greedy is False, the accept() method consumes only the minimal number of characters from the input string with respect to the m parameter. |
---|
Sets the limit attribute of this symbol. This attribute controls the behavior of the accept() method and has higher priority than the greedy attribute.
Parameter: | limit (int) – Sets exact number of accepted characters. If -1 is set, the greedy attribute will be used. |
---|
Bases: netbench.pattern_match.b_nfa.b_nfa
Class for NFA reductions.
This method reduces the number of NFA states by prefix sharing. Only static prefixes are shared.
This method sets _compute to False, and get_compute() will return False until compute() is called.
Bases: netbench.pattern_match.b_automaton.b_Automaton
A base class for DFA automata.
Determinisation of automaton.
Parameters: |
|
---|---|
Flags: | Set Deterministic, Epsilon Free and Alphabet collision free. |
This method sets _compute to False, and get_compute() will return False until compute() is called.
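Determinisation is commonly done by subset construction. The sketch below shows that textbook technique on the transition format used in this package (tuples of start state ID, symbol ID, end state ID). It is an illustration only: the function name determinise and the use of frozensets as DFA states are assumptions, and the input is assumed to be epsilon free with a collision-free alphabet.

```python
from collections import deque


def determinise(start, transitions, final):
    """Subset construction for an epsilon-free NFA.

    transitions: set of (source state id, symbol id, target state id).
    Returns (dfa_start, dfa_transitions, dfa_final); each DFA state is a
    frozenset of NFA state ids.
    """
    # Index NFA transitions by (source state, symbol).
    step = {}
    for src, sym, dst in transitions:
        step.setdefault((src, sym), set()).add(dst)
    symbols = {sym for _, sym, _ in transitions}

    dfa_start = frozenset([start])
    dfa_transitions = set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        for sym in symbols:
            # Union of all NFA targets reachable from the current subset.
            target = frozenset(
                dst for src in current for dst in step.get((src, sym), ())
            )
            if not target:
                continue
            dfa_transitions.add((current, sym, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)

    # A DFA state is final if it contains at least one final NFA state.
    final = set(final)
    dfa_final = {subset for subset in seen if subset & final}
    return dfa_start, dfa_transitions, dfa_final
```

A real implementation would also renumber the subset states to integer ids and set the Deterministic flag; both steps are omitted here.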
Checks isomorphism of the automaton with another nfa_data object. Both automata must be deterministic and without unreachable states.
Parameter: | nfa_data (nfa_data) – nfa_data object containing second automaton. |
---|---|
Returns: | True if automata are isomorphic, False otherwise. |
Return type: | boolean |
Minimization of DFA automaton.
Raises: | ALPHABET_COLLISION_ERROR() if alphabet is not collision free. |
---|---|
Raises: | DETERMINISTIC_ERROR() if automaton is not deterministic. |
Flags: | Sets Minimal flag to true. |
This method sets _compute to False, and get_compute() will return False until compute() is called.
Report consumed memory in bytes. Naive mapping algorithm is used (2D array). Basic algorithm for this variant of mapping is: M = |states| * |alphabet| * ceil(log(|states| + 1, 2) / 8)
Returns: | Returns number of bytes. |
---|---|
Return type: | int |
Report consumed memory in bytes. Optimal mapping algorithm is used (with oracle). Basic algorithm for this variant of mapping is: M = |transitions| * ceil(log(|states|, 2) / 8)
Returns: | Returns number of bytes. |
---|---|
Return type: | int |
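Both memory formulas above can be evaluated directly. The following sketch is only a worked version of the two expressions; the function names are hypothetical, and math.log2 is used in place of log(x, 2) to avoid floating-point rounding at exact powers of two.

```python
import math


def naive_memory_bytes(num_states, alphabet_size):
    """M = |states| * |alphabet| * ceil(log(|states| + 1, 2) / 8)."""
    # Bytes needed to store one table entry (a state id or "no transition").
    bytes_per_entry = math.ceil(math.log2(num_states + 1) / 8)
    return num_states * alphabet_size * bytes_per_entry


def optimal_memory_bytes(num_states, num_transitions):
    """M = |transitions| * ceil(log(|states|, 2) / 8)."""
    # Bytes needed to store one target state id.
    bytes_per_entry = math.ceil(math.log2(num_states) / 8)
    return num_transitions * bytes_per_entry
```

For example, 256 states over a 256-symbol alphabet need two bytes per entry in the naive mapping (257 distinct values), giving 131072 bytes, while the optimal mapping needs one byte per transition.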
Bases: netbench.pattern_match.nfa_parser.nfa_parser
A meta class that wraps, under a single interface, any regular expression parsing class derived from the base class nfa_parser.
Parameters: |
|
---|
Parses the current line and returns the parsed NFA.
Returns: | Created automaton in nfa_data format. Returns None if failure happens. |
---|---|
Return type: | nfa_data or None |
Returns position in ruleset.
Returns: | Position in ruleset. |
---|---|
Return type: | int |
This function is used to specify the input file and load the whole file into the input text attribute.
Parameter: | filename (string) – Name of file. |
---|
Move to the specified line
Parameter: | line (int) – Line number. |
---|---|
Returns: | True if move was performed, Otherwise False is returned. |
Return type: | boolean |
Move to the next line (next regular expression)
Returns: | True if move was performed, Otherwise False is returned. |
---|---|
Return type: | boolean |
Returns number of lines.
Returns: | Number of lines. Each line corresponds to a single regular expression. |
---|---|
Return type: | int |
Set text to parse - can have multiple text lines
Parameter: | input_text (string) – Regular expressions. |
---|
A class to specify NFA automaton (Q,T,q0,Delta,F).
Attribute | Type | Description |
---|---|---|
states | dict(int, b_State) | Finite set of states. |
alphabet | dict(int, b_Symbol) | Symbols of alphabet. |
start | int | ID of Start state. |
transitions | set(tuple(int, int, int)) | Transitions. Format (Start state ID, Symbol ID, End state ID) |
final | set(int) | Final states. |
Flags | dict(string, any type) | Flags for specified properties. |
Flags dictionary stores various properties of automaton. The list of all supported flags is in the next table.
Flag | Type | Description |
---|---|---|
Deterministic | boolean | Is automaton deterministic? |
Strided | boolean | Is automaton strided (accept multiple characters)? |
Stride | int | Number of characters accepted at once by the automaton. |
Epsilon Free | boolean | Is automaton without epsilon transitions? |
Alphabet collision free | boolean | Is alphabet without collisions? |
Minimal | boolean | Is DFA minimal? |
Delay DFA | boolean | Is automaton Delay DFA? |
Extend_FA | boolean | Is automaton Extend DFA? |
History FA | boolean | Is automaton History DFA? |
Hybrid FA - one NFA part | boolean | Is automaton NFA part of Hybrid DFA? |
Hybrid FA - DFA part | boolean | Is automaton DFA part of Hybrid DFA? |
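To make the attribute layout concrete, here is a plain-Python stand-in built from dicts and sets. It is not the nfa_data class: states and symbols are represented by strings instead of b_State and b_Symbol objects, and the successors helper is hypothetical.

```python
# Stand-in for the nfa_data attributes listed in the tables above.
automaton = {
    "states": {0: "q0", 1: "q1"},   # state id -> state object (strings here)
    "alphabet": {0: "a", 1: "b"},   # symbol id -> symbol object (strings here)
    "start": 0,                     # id of the start state
    # Each transition is (start state id, symbol id, end state id).
    "transitions": {(0, 0, 0), (0, 1, 0), (0, 0, 1)},
    "final": {1},                   # ids of final states
    "Flags": {"Deterministic": False, "Epsilon Free": True, "Stride": 1},
}


def successors(automaton, state_id, symbol_id):
    """Return the set of states reachable from state_id on symbol_id."""
    return {
        dst
        for src, sym, dst in automaton["transitions"]
        if src == state_id and sym == symbol_id
    }
```

In this example state 0 has two successors on symbol 0, which is why the Deterministic flag is False.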
Save automaton to file in FSM format. Based on FSM man page: http://www2.research.att.com/~fsmtools/fsm/man4.html
Parameters: |
|
---|
NOTE: This method is deprecated. Use export_to_fsm().
Load automaton from file in FSM format. Based on FSM man page: http://www2.research.att.com/~fsmtools/fsm/man4.html . This method must be updated if a new symbol is added to Netbench. Raises an exception if an unknown symbol string type is found and the corresponding class cannot be determined.
Parameters: |
|
---|---|
Raises: | nfa_data_import_exception if an unknown symbol string type is found and the corresponding class cannot be determined. |
NOTE: This method is deprecated. Use import_from_fsm().
Load nfa_data from file (serialisation).
Parameter: | FileName (string) – Name of file from which nfa_data will be loaded. |
---|---|
Returns: | Object created from file is returned. |
Return type: | nfa_data |
NOTE: This method is deprecated. Use load_from_file().
Save nfa_data to file (serialisation).
Parameter: | FileName (string) – Name of file in which nfa_data will be saved. |
---|---|
Returns: | True if success, False otherwise. |
Return type: | boolean |
NOTE: This method is deprecated. Use save_to_file().
Save graphviz dot file, representing graphical structure of nfa_data.
Parameters: |
|
---|---|
Returns: | True if success, False otherwise. |
Return type: | boolean |
NOTE: This method is deprecated. Use show().
Adds states to the automaton. States can be a list, tuple, set or frozenset of state objects, or a single state object (an object of the b_State class or of a class derived from it). If a state is final, it is added to the final set of the automaton. If a state cannot be added due to an id collision, an exception is raised and a rollback of states and final is performed.
Parameter: | states (list(b_State), tuple(b_State), set(b_State), frozenset(b_State) or b_State) – States id to add. |
---|---|
Raises: | general_unsupported_type if type of states is not supported. |
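The add-with-rollback behavior can be sketched on plain containers. This is an assumption-laden illustration, not the Netbench code: State is a minimal stand-in for b_State, TypeError stands in for general_unsupported_type, and KeyError stands in for whatever exception the library raises on an id collision.

```python
class State:
    """Minimal hypothetical stand-in for b_State."""

    def __init__(self, sid, final=False):
        self.sid = sid
        self.final = final


def add_states(states, final, new_states):
    """Add states to `states` (dict id -> State) and `final` (set of ids)."""
    if isinstance(new_states, State):
        new_states = [new_states]
    elif not isinstance(new_states, (list, tuple, set, frozenset)):
        raise TypeError("unsupported type of states")

    # Snapshot both containers so a failed call leaves them unchanged.
    states_backup = dict(states)
    final_backup = set(final)
    try:
        for state in new_states:
            if state.sid in states:
                raise KeyError("state id collision: %d" % state.sid)
            states[state.sid] = state
            if state.final:
                final.add(state.sid)
    except KeyError:
        # Roll back to the snapshot, then re-raise for the caller.
        states.clear()
        states.update(states_backup)
        final.clear()
        final.update(final_backup)
        raise
```

Taking the snapshot up front keeps the rollback simple at the cost of copying both containers on every call.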
Adds symbols to the automaton. Symbols can be a list, tuple, set or frozenset of symbol objects (objects of classes derived from b_Symbol), or a single symbol object. If a symbol cannot be added due to an id collision, an exception is raised and a rollback of symbols is performed. Before calling this method, the symbols must be checked by the check_symbols() or has_symbol() method, and every symbol must be unique (the result of these methods must be False).
Parameter: | symbols (list(b_Symbol), tuple(b_Symbol), set(b_Symbol), frozenset(b_Symbol) or b_Symbol) – Symbols id to add. |
---|---|
Raises: | general_unsupported_type if type of symbols is not supported. |
Add transitions to automaton. Transitions can be list, tuple, set and frozen set of tuple(int, int, int). Transitions can also be a tuple(int, int, int).
Parameter: | transitions (list(tuple(int, int, int)), tuple(tuple(int, int, int)), set(tuple(int, int, int)), frozenset(tuple(int, int, int)) or tuple(int, int, int)) – Transitions to add. |
---|---|
Raises: | general_unsupported_type if type of transitions is not supported. |
This method checks whether the given symbols are in the automaton alphabet. Symbols can be a list, tuple, set or frozenset of symbol objects. For each symbol the result is True if the symbol is in the alphabet, otherwise False.
Parameter: | symbols (list(b_Symbol), tuple(b_Symbol), set(b_Symbol), or frozenset(b_Symbol)) – Symbols id to check. |
---|---|
Returns: | Mapping of symbols to their presence in automaton. |
Return type: | list(tuple(b_Symbol, boolean)) |
Raises: | general_unsupported_type if type of symbols is not supported. |
Save automaton to file in FSM format. Based on FSM man page: http://www2.research.att.com/~fsmtools/fsm/man4.html
Parameters: |
|
---|
Returns max id of symbols
Returns: | Max id of symbols. |
---|---|
Return type: | int |
Returns max id of states.
Returns: | Max id of states. |
---|---|
Return type: | int |
This method returns the id of the equivalent symbol if it is in the alphabet, otherwise throws exception unknown_symbol.
Parameter: | symbol (b_Symbol) – Symbol to be checked. |
---|---|
Returns: | Id of equivalent symbol in alphabet. |
Return type: | int |
Raises: | symbol_not_found if symbol is not in alphabet. |
This method returns True if symbol is in alphabet, otherwise returns False.
Parameter: | symbol (b_Symbol) – Symbol to be checked. |
---|---|
Returns: | True if symbol is in alphabet, otherwise returns False. |
Return type: | boolean |
Load automaton from file in FSM format. Based on FSM man page: http://www2.research.att.com/~fsmtools/fsm/man4.html . This method must be updated if a new symbol is added to Netbench. Raises an exception if an unknown symbol string type is found and the corresponding class cannot be determined.
Parameters: |
|
---|---|
Raises: | nfa_data_import_exception if an unknown symbol string type is found and the corresponding class cannot be determined. |
Return True if nfa_data structure is consistent, otherwise returns False. This method is intended for debugging purposes.
Parameter: | syndrome (dict(int->int)) – Dict of inconsistency syndromes. The dict should be passed in empty (a nonempty dict will be cleared). If syndrome is None, no syndrome is returned. Syndrome keys are ints describing the kind of inconsistency; the values are lists of the affected objects. |
---|---|
Returns: | True if nfa_data structure is consistent, otherwise returns False. |
Return type: | boolean |
Return True if nfa_data structure is empty, false otherwise.
Returns: | True if nfa_data structure is empty, false otherwise. |
---|---|
Return type: | boolean |
Load nfa_data from file (serialisation).
Parameter: | FileName (string) – Name of file from which nfa_data will be loaded. |
---|---|
Returns: | Object created from file is returned. |
Return type: | nfa_data |
Removes states from automaton. States can be list, tuple, set and frozen set of ids. States can also be an id. If state is final state, the state will be also removed from final.
Parameter: | states (list(int), tuple(int), set(int), frozenset(int) or int) – States id to be removed. |
---|---|
Raises: | general_unsupported_type if type of states is not supported. |
Removes symbols from automaton. Symbols can be list, tuple, set and frozen set of ids. Symbols can also be an id.
Parameter: | symbols (list(int), tuple(int), set(int), frozenset(int) or int) – Symbols id to be removed. |
---|---|
Raises: | general_unsupported_type if type of symbols is not supported. |
Removes transitions from automaton. Transitions can be list, tuple, set and frozen set of tuple(int, int, int). Transitions can also be a tuple(int, int, int).
Parameter: | transitions (list(tuple(int, int, int)), tuple(tuple(int, int, int)), set(tuple(int, int, int)), frozenset(tuple(int, int, int)) or tuple(int, int, int)) – Transitions to be removed. |
---|---|
Raises: | general_unsupported_type if type of transitions is not supported. |
Save nfa_data to file (serialisation).
Parameter: | FileName (string) – Name of file in which nfa_data will be saved. |
---|---|
Returns: | True if success, False otherwise. |
Return type: | boolean |
Save graphviz dot file, representing graphical structure of nfa_data.
Parameters: |
|
---|---|
Returns: | True if success, False otherwise. |
Return type: | boolean |
Bases: netbench.pattern_match.nfa_parser.nfa_parser
Class for parsing REs using the new C PCRE parser. For this class to work correctly, the path to Netbench must be set in the NETBENCHPATH system variable.
FORMAT of Automata file (MSFM2) version 2:
Parses the current line and returns the parsed NFA.
Returns: | Created automaton in nfa_data format. Returns None if failure happens. |
---|---|
Return type: | nfa_data or None |
Bases: netbench.pattern_match.b_symbol.b_Symbol
A base class to represent a k-char symbol. Chars can be char or char classes.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
Return True if two or more symbols from set_of_symbols can be accepted for the same text.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
Compute collision between self and compSymbol.
Parameter: | other (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Compute double stride using self and compSymbol. This method should be called only by double_stride method.
Parameters: |
|
---|---|
Returns: | New strided symbol. |
Return type: | sym_kchar |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded in string without whitespace chars. First char is used for symbol class specification and therefore is not part of the encoded symbol. Symbol class specification char for this symbol class is defined by b_symbol.io_mapper[“b_Sym_kchar”].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return supported types of symbols for current type of symbol.
Returns: | Supported types of symbols for current type of symbol. |
---|---|
Return type: | list(int) |
Creates symbol from its string representation compatible with FSM tools. See export method for more details.
Parameters: |
|
---|
Return True if symbol is empty. False if is not empty.
Returns: | True if symbol is empty. False if is not empty. |
---|---|
Return type: | boolean |
Module containing auxiliary functions for classes, scripts, etc.
Prints deprecation warning on stderr.
Parameters: |
|
---|
Parses the NETBENCHPATH environment variable and returns path to pattern match.
Raises: | no_netbench_path_variable() if environment variable NETBENCHPATH doesn’t exist. |
---|---|
Returns: | Path to Netbench base dir. |
Return type: | string |
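Reading the NETBENCHPATH variable can be sketched as follows. The function name, the injectable environ argument and the NoNetbenchPathVariable class (a stand-in for no_netbench_path_variable) are assumptions for this example, and the sketch returns the variable's value unchanged rather than parsing it.

```python
import os


class NoNetbenchPathVariable(Exception):
    """Hypothetical stand-in for no_netbench_path_variable."""


def get_netbench_path(environ=os.environ):
    """Return the value of NETBENCHPATH or raise if it is not set."""
    try:
        return environ["NETBENCHPATH"]
    except KeyError:
        raise NoNetbenchPathVariable("NETBENCHPATH is not set") from None
```

Passing the environment as an argument makes the function easy to exercise without modifying the real process environment.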
Determines from the file name suffix whether the file is a pickled object (.pkl) or a ruleset.
Parameter: | fileName (string) – Name of input file |
---|---|
Returns: | True if pickled object, otherwise return False. |
Return type: | boolean |
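A suffix check of this kind takes only a few lines. The function name is_pickled_file is hypothetical, and treating the suffix case-sensitively is an assumption.

```python
import os.path


def is_pickled_file(file_name):
    """Return True if file_name ends with the .pkl suffix."""
    # splitext splits "dir/dfa.pkl" into ("dir/dfa", ".pkl").
    return os.path.splitext(file_name)[1] == ".pkl"
```

Any file without the .pkl suffix is treated as a ruleset by this check.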
A base class for parsing of regular expressions.
Parses the current line and returns the parsed NFA.
Returns: | Created automaton in nfa_data format. Returns None if failure happens. |
---|---|
Return type: | nfa_data or None |
Returns position in ruleset.
Returns: | Position in ruleset. |
---|---|
Return type: | int |
This function is used to specify the input file and load the whole file into the input text attribute. The format of this file is simple: one regular expression on each line. Each regular expression (RE) must have the following form: /RE/M, where RE is the regular expression and M are PCRE modifiers.
Parameter: | filename (string) – Name of file. |
---|
Move to the specified line
Parameter: | line (int) – Line number. |
---|---|
Returns: | True if move was performed, Otherwise False is returned. |
Return type: | boolean |
Move to the next line (next regular expression)
Returns: | True if move was performed, Otherwise False is returned. |
---|---|
Return type: | boolean |
Returns number of lines.
Returns: | Number of lines. Each line corresponds to a single regular expression. |
---|---|
Return type: | int |
Set text to parse - can have multiple text lines. Format of the text is simple - one regular expression on each line. Regular expression (RE) must have following form: /RE/M - where RE is regular expression and M are PCRE modifiers.
Parameter: | input_text (string) – Regular expressions. |
---|
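The /RE/M line format can be split into its two parts with a short helper. This is a sketch, not the parser used by Netbench: the function names are hypothetical, and it assumes the modifiers never contain a slash, so the last slash on the line ends the expression.

```python
def split_rule(line):
    """Split one '/RE/M' line into (regular expression, modifiers)."""
    line = line.strip()
    if not line.startswith("/"):
        raise ValueError("rule must start with '/'")
    # The last slash closes the expression; everything after it is M.
    end = line.rfind("/")
    if end == 0:
        raise ValueError("rule has no closing '/'")
    return line[1:end], line[end + 1:]


def split_ruleset(input_text):
    """Split multi-line input text into a list of (RE, modifiers) pairs."""
    return [split_rule(line) for line in input_text.splitlines() if line.strip()]
```

Searching from the right keeps escaped slashes inside the expression intact, for example in /a\/b/.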
Bases: netbench.pattern_match.b_symbol.b_Symbol
Class for default symbol, which is used in Delay DFA.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
Returns True if def_symbol is present in set_of_symbols otherwise returns False. From definition this symbol can be in collision only with self.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
Compute collision with def_symbol. Collision can be only with other def_symbol.
Parameter: | other (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded in string without whitespace chars. First char is used for symbol class specification and therefore is not part of the encoded symbol. Symbol class specification char for this symbol class is defined by b_symbol.io_mapper[“DEF_SYMBOLS”].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Creates symbol from its string representation compatible with FSM tools. See export method for more details.
Parameters: |
|
---|
Return True if symbol is empty. False if is not empty.
Returns: | True if symbol is empty. False if is not empty. |
---|---|
Return type: | boolean |
A base class to represent a symbol.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
Return True if two or more symbols from set_of_symbols can be accepted for the same text.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
Compute double stride using self and compSymbol. This method should be called only by double_stride method.
Parameters: |
|
---|---|
Returns: | New strided symbol and unused symbols from local chars. |
Return type: | tuple(b_Symbol, set()) |
Double stride using self and compSymbol.
Parameters: |
|
---|---|
Returns: | New strided symbol and unused symbols from local chars. |
Return type: | tuple(b_Symbol, set()) |
Raises: | symbol_double_stride_exception if double stride can’t be computed. |
Returns symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded in string without whitespace chars. First char is used for symbol class specification and therefore is not part of the encoded symbol. Symbol class specification char should be in [0-9a-zA-Z] and must be unique.
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return symbol identification number.
Returns: | Symbol identification number. |
---|---|
Return type: | int |
TODO: agree on this at a meeting. Return supported types of symbols for current type of symbol and method.
Parameter: | method – Specifies method for which supported symbols are requested. |
---|---|
Returns: | Set of supported types. |
Return type: | set(int) |
Return symbol description for graph representation.
Returns: | Symbol description for graph representation. |
---|---|
Return type: | string |
Return type of symbol.
Returns: | Type of symbol. |
---|---|
Return type: | int |
Creates symbol from its string representation compatible with FSM tools. See export method for more details.
Parameters: |
|
---|
Return True if symbol is empty. False if is not empty.
Returns: | True if symbol is empty. False if is not empty. |
---|---|
Return type: | boolean |
Resolve collision with self and compSymbol.
Parameter: | compSymbol (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Raises: | symbol_resolve_collision_exception() if collision can’t be computed. |
Sets the id of symbol to new_id.
Parameter: | new_id (int) – Symbol identification value. |
---|
Sets the text description of symbol to text.
Parameter: | text (string) – Description of symbol. |
---|
Bases: netbench.pattern_match.b_symbol.b_Symbol
A base class to represent a string symbol.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
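For a string symbol, the accept contract amounts to a prefix check. The sketch below assumes the symbol is represented by a plain Python string; accept_string and AcceptError (a stand-in for accept_exception) are hypothetical names.

```python
class AcceptError(Exception):
    """Hypothetical stand-in for accept_exception."""


def accept_string(symbol_text, text):
    """Remove symbol_text from the start of text and return the remainder."""
    if not text.startswith(symbol_text):
        raise AcceptError("symbol is not at the beginning of the text")
    return text[len(symbol_text):]
```

Calling accept_string("GET", "GET /index.html") returns " /index.html", while a text that does not start with "GET" raises AcceptError.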
Return True if two or more symbols from set_of_symbols can be accepted for the same text.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded in string without whitespace chars. First char is used for symbol class specification and therefore is not part of the encoded symbol. Symbol class specification char for this symbol class is defined by b_symbol.io_mapper[“b_Sym_string”].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return supported types of symbols for current type of symbol.
Returns: | Supported types of symbols for current type of symbol. |
---|---|
Return type: | list(int) |
Creates symbol from its string representation compatible with FSM tools. See export method for more details.
Parameters: |
|
---|
Return True if symbol is empty. False if is not empty.
Returns: | True if symbol is empty. False if is not empty. |
---|---|
Return type: | boolean |
A base class for state representation.
Parameters: |
|
---|
Computes join of two states. Note that new state ID must be set after. Default value is -2.
Parameter: | other (b_State) – Second state. |
---|---|
Returns: | Joined states. |
Return type: | b_State |
Returns state identification number.
Returns: | State identification number |
---|---|
Return type: | int |
Returns the set of indexes of regular expressions that correspond to the final state. If an empty set is returned, the state is not final and does not represent any regular expression.
Returns: | Set of regular expression numbers corresponding to the state |
---|---|
Return type: | set(int) |
Returns supported types of states for current type of state.
Returns: | Returns supported types of states for current type of state. |
---|---|
Return type: | list(int) |
Returns text description for graph representation.
Returns: | Text description of state |
---|---|
Return type: | string |
Returns type of state.
Returns: | Returns type of state. |
---|---|
Return type: | int |
Returns true if the state is a final state.
Returns: | True if the state is a final state, False otherwise. |
---|---|
Return type: | boolean |
Joins using self and other state. Creates new state.
Parameter: | other (b_State) – Other state. |
---|---|
Returns: | New joined state. |
Return type: | b_State |
Raises: | state_join_exception() if join can’t be computed. |
Sets the id of state to new_id.
Parameter: | new_id (int) – New unique state identification number. |
---|
Sets the set of indexes of regular expressions that correspond to the final state. If an empty set is set, the state is not final and does not represent any regular expression.
Parameter: | new_rnum (Set of Int) – New set of regular expression numbers |
---|
Bases: netbench.pattern_match.b_symbol.b_Symbol
A base class to represent a char class symbol.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
Return True if two or more symbols from set_of_symbols can be accepted for the same text.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
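For char classes, a collision in the sense above means that two classes share at least one character. A minimal sketch follows, assuming each char class is modeled as a frozenset of characters rather than a b_Symbol object; the function name has_collision is hypothetical.

```python
from itertools import combinations


def has_collision(set_of_char_classes):
    """Return True if at least two char classes share a character."""
    # Two classes collide when their intersection is nonempty.
    return any(a & b for a, b in combinations(set_of_char_classes, 2))
```

For example, the classes [abc] and [cd] collide on "c", while [ab] and [cd] do not.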
Compute collision between self and compSymbol.
Parameter: | other (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Compute double stride using self and compSymbol. This method should be called only by double_stride method.
Parameters: |
|
---|---|
Returns: | New strided symbol. |
Return type: | sym_kchar |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded in string without whitespace chars. First char is used for symbol class specification and therefore is not part of the encoded symbol. Symbol class specification char for this symbol class is defined by b_symbol.io_mapper[“b_Sym_char_class”].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return supported types of symbols for current type of symbol.
Returns: | Supported types of symbols for current type of symbol. |
---|---|
Return type: | list(int) |
Return symbol description for graph representation.
Returns: | Symbol description for graph representation. |
---|---|
Return type: | string |
Creates symbol from its string representation compatible with FSM tools. See export method for more details.
Parameters: |
|
---|
Return True if symbol is empty. False if is not empty.
Returns: | True if symbol is empty. False if is not empty. |
---|---|
Return type: | boolean |
Bases: netbench.pattern_match.b_symbol.b_Symbol
A base class to represent a char symbol.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the accepted beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
Return True if two or more symbols from set_of_symbols can be accepted for the same text.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
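A collision check of this kind might look like the sketch below. It assumes that each symbol can be modelled as a frozenset of the characters it accepts, which is a simplification chosen for the example; the function name `collision` and this representation are not taken from the library.

```python
from itertools import combinations


def collision(symbol_char_sets):
    """Return True if any two symbols share at least one character.

    Each symbol is modelled as a frozenset of accepted characters;
    this representation is an assumption made for the sketch.
    """
    return any(a & b for a, b in combinations(symbol_char_sets, 2))


in_collision = collision({frozenset("abc"), frozenset("ad")})
no_collision = collision({frozenset("bc"), frozenset("d")})
```

The first call reports a collision because both symbols accept the character "a"; the second set of symbols is collision free.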
Compute collision between self and compSymbol.
Parameter: | other (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Compute double stride using self and compSymbol. This method should be called only by double_stride method.
Parameters: |
|
---|---|
Returns: | New strided symbol and unused symbols from local chars. |
Return type: | tuple(b_Symbol, list(set())) |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns a symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded as a string without whitespace characters. The first character specifies the symbol class and is therefore not part of the encoded symbol. The class specification character for this symbol class is defined by b_symbol.io_mapper["b_Sym_char"].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return supported types of symbols for current type of symbol.
Returns: | Supported types of symbols for current type of symbol. |
---|---|
Return type: | list(int) |
Creates a symbol from its string representation compatible with FSM tools. See the export method for more details.
Parameters: |
|
---|
Return True if the symbol is empty, False otherwise.
Returns: | True if the symbol is empty, False otherwise. |
---|---|
Return type: | boolean |
This module groups all exceptions of Netbench - Pattern Match.
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising the ALPHABET_COLLISION_FREE_ERROR exception if something is wrong with the “Alphabet collision free” flag of the automaton, for example if it does not exist or has an unexpected value.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising the COMPUTE_ERROR exception if something is wrong with the variable self._compute of the automaton, for example if it does not exist or has an unexpected value.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising the DETERMINISTIC_ERROR exception if something is wrong with the “Deterministic” flag of the automaton, for example if it does not exist or has an unexpected value.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising the MINIMAL_ERROR exception if something is wrong with the “Minimal” flag of the automaton, for example if it does not exist or has an unexpected value.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising exception if automaton is empty.
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising general NOT IMPLEMENTED Exceptions. This exception should be used when some method, class, function or feature is defined but not implemented yet.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: exceptions.Exception
Base class for all exceptions defined in pattern match part of the Netbench.
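Because every exception in this module derives from one base class, callers can catch the base class to handle any of them. The sketch below illustrates the idea in Python 3, where the built-in `Exception` takes the place of `exceptions.Exception`. The subclass name `symbol_id_collision` and the helpers `handle` and `failing` are hypothetical and exist only for this example.

```python
class general_pattern_exception(Exception):
    """Stand-in for the common base class (simplified body)."""


class symbol_id_collision(general_pattern_exception):
    """Hypothetical subclass carrying the id that caused the collision."""

    def __init__(self, symbol_id):
        super().__init__("symbol id collision: %d" % symbol_id)
        self.symbol_id = symbol_id


def handle(action):
    # Catching the base class covers every exception derived from it.
    try:
        action()
    except general_pattern_exception as error:
        return type(error).__name__
    return None


def failing():
    raise symbol_id_collision(7)


caught = handle(failing)
```

An unrelated error such as `ValueError` is not derived from the base class and therefore propagates out of `handle` unchanged.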
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising exception if an unsupported data type was passed to a method.
Parameters: |
|
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if the detected type of class in the import string doesn’t correspond to any class of the symbol.
Parameter: | msg (string) – String containing the detected type of class in import string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising exception if environment variable NETBENCHPATH is not set.
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if the file is not a Timbuk-formatted file.
Parameter: | msg (string) – String containing the detected type of class in import string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising exception if automaton is empty.
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising exception if automaton is not strided.
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising exception if the C-based implementation of the PCRE parser cannot be run or compiled. Version for PCRE parser.
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising state exception if unsupported join colour operation is required.
Parameter: | operation (string) – Requested operation. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising state exception if a state id collision occurred in the automaton.
Parameter: | state_id (int) – Id of the state, which caused the collision. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising state exception if join cannot be resolved. Neither state is able to compute join with other symbol.
Parameters: |
|
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol acceptance exceptions.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if double stride cannot be resolved. Neither symbol is able to compute double stride with other symbol.
Parameters: |
|
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if symbol equality cannot be resolved. Neither symbol is able to resolve equality with other symbol.
Parameters: |
|
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if a symbol id collision occurred in the automaton.
Parameter: | symbol_id (int) – Id of the symbol, which caused the collision. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if the detected type of class in the import string doesn’t correspond to the class of the symbol.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if the symbol has not been found in the alphabet during retrieval of its alphabet id.
Parameter: | msg (string) – String containing string description of the symbol. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if collision cannot be resolved. Neither symbol is able to resolve collision with other symbol.
Parameters: |
|
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising symbol exception if the string passed to the accept method is shorter than the length of the symbol.
Parameter: | msg (string) – Optional string describing the cause of the exception. Defaults to empty string. |
---|
Bases: netbench.pattern_match.pattern_exceptions.general_pattern_exception
Class for raising unknown parser exception. This exception is used when an unknown parser class name is passed to the parser metaclass, which wraps any parser under a single interface.
Parameter: | msg (string) – Name of the class that caused this exception. |
---|
Bases: netbench.pattern_match.b_state.b_State
State class with associated colours.
Parameters: |
|
---|
Computes the join of two states. Note that the ID of the new state must be set afterwards; its default value is -2.
Parameter: | other (b_State) – Second state. |
---|---|
Returns: | Joined state. |
Return type: | ColouredState |
Raises: | state_colour_operation_not_supported_exception if unsupported join method is set. |
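The way a join method could select how colours are combined is sketched below. The method names "union" and "intersection", the function `join_colours` and the `UnsupportedJoin` exception are assumptions made for this example; the join methods actually supported by the library are not listed in this documentation.

```python
class UnsupportedJoin(Exception):
    """Stand-in for state_colour_operation_not_supported_exception."""


def join_colours(colours_a, colours_b, join_method):
    # "union" and "intersection" are hypothetical method names chosen
    # for this sketch; the real join methods are not documented here.
    if join_method == "union":
        return colours_a | colours_b
    if join_method == "intersection":
        return colours_a & colours_b
    # Any other value is treated as an unsupported operation.
    raise UnsupportedJoin(join_method)


joined = join_colours({1, 2}, {2, 3}, "union")
```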
Returns colours of state.
Returns: | Colours of state. |
---|---|
Return type: | set(int) |
Returns join method of state.
Returns: | Join method of state. |
---|---|
Return type: | string |
Returns text description for graph representation.
Returns: | Text description of state |
---|---|
Return type: | string |
Sets colours of state.
Parameter: | colours (set(int)) – New colours of state. |
---|
Sets join method of state.
Parameter: | join_method (string) – New join method of state. |
---|
Base class for the pattern_match experiments. It defines only basic functions.
Reports the amount of logic consumed by the algorithm. The meaning of this value can differ depending on the approach (number of cores in a multicore system versus number of LUTs in an FPGA). For specific information, check the report function of the child class.
Returns: | Amount of logic consumed by the algorithm. |
---|---|
Return type: | Depends on the algorithm. |
Reports the amount of memory consumed by this algorithm. The returned number is in bytes.
Returns: | Amount of memory consumed. |
---|---|
Return type: | int |
This function will find patterns in the given string by the specified approach.
Parameter: | input_string (string) – String in which the patterns will be searched for. |
---|---|
Returns: | Bitmap of matched patterns. Match is indicated by 1, mismatch by 0. Number of fields in this array is equal to the count of patterns. |
Return type: | list(int) |
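The shape of the returned bitmap can be illustrated with a short sketch. It uses Python's standard `re` module instead of a Netbench matching approach, and the function name `search_bitmap` is hypothetical; only the result layout (one 0/1 entry per pattern) follows the description above.

```python
import re


def search_bitmap(patterns, input_string):
    """Return one 0/1 entry per pattern, in the order of `patterns`."""
    return [1 if re.search(p, input_string) else 0 for p in patterns]


bitmap = search_bitmap([r"ab+c", r"^x", r"\d{2}"], "zabbc42")
```

For this input the first and third patterns match and the second does not, so the bitmap has three fields with the values 1, 0 and 1.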
Bases: netbench.pattern_match.b_symbol.b_Symbol
A base class to represent an EOF (end of file or input) symbol.
Parameters: |
|
---|
If the symbol is at the beginning of the text, it is removed from the text and the remainder is returned. Otherwise accept_exception is raised.
Parameter: | text (string) – Text to be parsed. |
---|---|
Returns: | Text without the beginning. |
Return type: | string |
Raises: | accept_exception if the symbol is not at the beginning of the text. |
Return True if two or more symbols from set_of_symbols can be accepted for the same text.
Parameter: | set_of_symbols (set(b_Symbol)) – Set of symbols. |
---|---|
Returns: | True if at least two symbols are in collision, otherwise False is returned. |
Return type: | boolean |
Compute collision between self and compSymbol.
Parameter: | other (b_Symbol) – Other symbol. |
---|---|
Returns: | Resolved collision - changes to the symbols and new ones, if they are created. |
Return type: | tuple(set(b_Symbol), set(b_Symbol), set(b_Symbol)) |
Compute double stride using self and compSymbol. This method should be called only by double_stride method.
Parameters: |
|
---|---|
Returns: | New strided symbol and unused symbols from local chars. |
Return type: | tuple(b_Symbol, list(set())) |
Compute if two symbols (self and other) are equivalent.
Parameter: | other – Other symbol. |
---|---|
Returns: | True if the symbols are equivalent, otherwise returns False. |
Return type: | boolean |
Returns a symbol representation compatible with FSM tools - http://www2.research.att.com/~fsmtools/fsm/man4/fsm.5.html. The symbol is encoded as a string without whitespace characters. The first character specifies the symbol class and is therefore not part of the encoded symbol. The class specification character for this symbol class is defined by b_symbol.io_mapper["b_Sym_char"].
Returns: | Symbol representation compatible with FSM tools. |
---|---|
Return type: | string |
Return supported types of symbols for current type of symbol.
Returns: | Supported types of symbols for current type of symbol. |
---|---|
Return type: | list(int) |
Creates a symbol from its string representation compatible with FSM tools. See the export method for more details.
Parameters: |
|
---|
Return True if the symbol is empty, False otherwise.
Returns: | True if the symbol is empty, False otherwise. |
---|---|
Return type: | boolean |
Bases: netbench.pattern_match.b_ptrn_match.b_ptrn_match
A base class for the pattern matching approaches based on finite automaton.
This function is used to create automaton from set of regular expressions.
Parameter: | nfa_parser_class (nfa_parser) – An instance of the nfa_parser class. |
---|---|
Returns: | False if creation of automaton failed or True if creation was successful. |
Return type: | boolean |
This method sets _compute to False, and get_compute() will return False until compute() is called.
Creates char classes, if they can be created. Replaced transitions are removed. Unused symbols are removed after creation of char classes.
Flags: | Sets Alphabet collision free flag to False |
---|
This method sets _compute to False, and get_compute() will return False until compute() is called.
Create automaton from nfa_data object.
Parameters: |
|
---|
This method sets _compute to False, and get_compute() will return False until compute() is called.
Compute epsilon closure for selected state.
Parameters: |
|
---|---|
Returns: | Set containing epsilon closure for state. |
Return type: | set(int) |
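An epsilon closure is typically computed by following epsilon transitions until no new state is found. The sketch below assumes that the epsilon transitions are given as a dictionary from a state id to the set of directly reachable state ids; this layout and the function name are assumptions for the example, not the nfa_data structure.

```python
def epsilon_closure(eps_transitions, state):
    """Return all states reachable from `state` via epsilon moves only.

    `eps_transitions` maps a state id to the set of states reachable by
    one epsilon transition (a layout assumed for this sketch).
    """
    closure = {state}
    stack = [state]
    while stack:
        current = stack.pop()
        for target in eps_transitions.get(current, ()):
            if target not in closure:
                closure.add(target)
                stack.append(target)
    return closure


closure = epsilon_closure({0: {1}, 1: {2}, 2: {0}, 3: {4}}, 0)
```

The closure always contains the starting state itself, and the visited set keeps the loop finite even when the epsilon transitions form a cycle.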
Return nfa_data object from the automaton.
Parameter: | safe (boolean) – If True return deep copy, otherwise return reference. Default value is True. |
---|---|
Returns: | nfa_data object from the automaton. |
Return type: | nfa_data |
Warning: If safe=False is used, the result must be treated as read-only, because only a reference to the self._automaton object is returned. Modifying it can cause undefined behavior!
Returns the value of a flag. If the flag is not in the dict() of flags, the dict() lookup raises an exception, so call has_flag() before every get_flag().
Parameter: | flag (string) – String key (for example “Deterministic”) |
---|---|
Returns: | Value of flag. |
Return type: | Type depends on the flag |
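The recommended guard can be sketched with a plain dictionary. The module-level `flags` dictionary and the two free functions below are simplifications for illustration; in the library these are methods of the automaton, and the exact exception it raises is assumed here to be the `KeyError` of an ordinary dict lookup.

```python
flags = {"Deterministic": False, "Epsilon Free": True}


def has_flag(flag):
    # Reports only whether the key exists, not its value.
    return flag in flags


def get_flag(flag):
    # A missing key raises KeyError, the standard dict lookup error.
    return flags[flag]


# Guarded access: check for existence before reading the value.
value = get_flag("Deterministic") if has_flag("Deterministic") else None
missing = get_flag("Minimal") if has_flag("Minimal") else None
```

Note that a flag may exist and still hold the value False, which is why existence and value are queried separately.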
Returns the value of the _multilanguage attribute, which determines the behavior of the automata minimisation and reduction methods. If set to True, the methods preserve the relation between final states and the corresponding regular expressions (languages). If set to False, the methods work exactly according to their definitions: final states are joined if possible.
Returns: | Value of the _multilanguage attribute. |
---|---|
Return type: | boolean |
Computes set of nondeterministic states.
Parameter: | mapper (dict(int, dict(int, set(int)))) – Mapping between states and their transitions. If set to None the mapping is computed. |
---|---|
Returns: | Nondeterministic states. |
Return type: | set(int) |
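One way to read the mapper type is state id to symbol id to set of target state ids. Under that assumed interpretation, a state is nondeterministic when some symbol leads to more than one target, as in the sketch below. Epsilon transitions and colliding symbols are deliberately ignored here, and the function is an illustration rather than the library code.

```python
def nondeterministic_states(mapper):
    """Return states having a symbol that leads to more than one target."""
    return {
        state
        for state, by_symbol in mapper.items()
        if any(len(targets) > 1 for targets in by_symbol.values())
    }


# State 0 reaches both 1 and 2 on symbol 0, so it is nondeterministic.
mapper = {0: {0: {1, 2}}, 1: {1: {2}}, 2: {}}
result = nondeterministic_states(mapper)
```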
Detects if cycle exists in automaton.
Returns: | True if nfa_data contains a cycle. False otherwise. |
---|---|
Return type: | boolean |
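Cycle detection in a transition graph is commonly done with a depth-first search that looks for an edge back to a state on the current path. The following sketch assumes the automaton is reduced to a dictionary from a state id to its successor state ids; the names and the data layout are assumptions for the example.

```python
def has_cycle(successors, start):
    """Detect a cycle reachable from `start` using iterative DFS colouring."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {start: GREY}
    stack = [(start, iter(successors.get(start, ())))]
    while stack:
        state, children = stack[-1]
        for child in children:
            if colour.get(child, WHITE) == GREY:
                return True  # back edge to a state on the current path
            if colour.get(child, WHITE) == WHITE:
                colour[child] = GREY
                stack.append((child, iter(successors.get(child, ()))))
                break
        else:
            # All successors explored: the state is finished.
            colour[state] = BLACK
            stack.pop()
    return False


cyclic = has_cycle({0: {1}, 1: {2}, 2: {0}}, 0)
acyclic = has_cycle({0: {1, 2}, 1: {2}, 2: set()}, 0)
```

A state that is reached twice along different paths, such as state 2 in the second call, does not count as a cycle because it is already finished when it is seen again.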
Returns existence of flag, not value.
Parameter: | flag (string) – String key (for example “Deterministic”) |
---|---|
Returns: | True if flag exists. Otherwise returns False. |
Return type: | boolean |
Join two automata. Joins current automaton with second. Current automaton is modified.
Parameters: |
|
---|---|
Flags: | Sets flags Deterministic and Epsilon Free to False. |
This method sets _compute to False, and get_compute() will return False until compute() is called.
Reduce alphabet by character classes. This creates another type of char class compatible with DFA. The alphabet should contain only symbols of the b_sym_char class. This function reduces the alphabet into its equivalence classes and creates a minimal deterministic alphabet.
Flags: | Sets Alphabet collision free flag to True |
---|
This method sets _compute to False, and get_compute() will return False until compute() is called.
Remove char classes from automaton. Removed char classes are substituted with equivalent chars and corresponding transitions are added.
NOTE: This method is deprecated. Use remove_char_classes().
Remove char classes from automaton. Removed char classes are substituted with equivalent chars and corresponding transitions are added.
Flags: | Sets Alphabet collision free flag to True |
---|
This method sets _compute to False, and get_compute() will return False until compute() is called.
Remove all epsilon transitions from automaton. Also removes all isolated, unreachable and blind states.
Flags: | Sets flag Epsilon Free to True. |
---|
This method sets _compute to False, and get_compute() will return False until compute() is called.
Removes isolated, unreachable and blind states.
This method sets _compute to False, and get_compute() will return False until compute() is called.
Report consumed memory in bytes. Naive mapping algorithm is used (2D array).
Returns: | Returns number of bytes. |
---|---|
Return type: | int |
Report consumed memory in bytes. Optimal mapping algorithm is used (with oracle).
Returns: | Returns number of bytes. |
---|---|
Return type: | int |
Makes the alphabet collision free. Example: Alphabet {1:[“a”, “b”, “c”], 2:[“a”, “d”]}.
After resolve_alphabet() the alphabet will be {1:[“b”, “c”], 2:[“d”], 3:[“a”]}.
This method sets _compute to False, and get_compute() will return False until compute() is called.
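The alphabet example above can be reproduced with a small sketch. It assumes the alphabet is a dictionary from a symbol id to a set of characters and that shared characters move to newly numbered symbols; the function is an illustration of the idea only and does not update transitions as the real method has to.

```python
def resolve_alphabet(alphabet):
    """Split overlapping symbols so that every char belongs to one symbol."""
    # Record which symbol ids contain each character.
    owners = {}
    for symbol_id, chars in alphabet.items():
        for char in chars:
            owners.setdefault(char, set()).add(symbol_id)

    resolved = {}
    shared = {}  # group of owning ids -> chars shared by exactly that group
    for char, ids in owners.items():
        if len(ids) == 1:
            resolved.setdefault(next(iter(ids)), set()).add(char)
        else:
            shared.setdefault(frozenset(ids), set()).add(char)

    # Shared characters move to fresh symbols with new ids.
    next_id = max(alphabet, default=0) + 1
    for group in sorted(shared, key=sorted):
        resolved[next_id] = shared[group]
        next_id += 1
    return resolved


result = resolve_alphabet({1: {"a", "b", "c"}, 2: {"a", "d"}})
```

In the result, symbol 1 keeps "b" and "c", symbol 2 keeps "d", and the shared character "a" becomes the new symbol 3, so no character belongs to two symbols.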
This function will find patterns in the given string by the specified approach. This default version uses nfa_data for it. Approaches should reimplement this function.
Parameters: |
|
---|---|
Returns: | Bitmap of matched regular expressions. |
Return type: | list(int) |
Sets flag to the value.
Parameters: |
|
---|
Sets the value of the _multilanguage attribute, which determines the behavior of the automata minimisation and reduction methods. If set to True, the methods preserve the relation between final states and the corresponding regular expressions (languages). If set to False, the methods work exactly according to their definitions: final states are joined if possible.
Parameter: | value (boolean) – New value of the _multilanguage attribute. |
---|
Save graphviz dot file, representing graph of automaton.
Parameters: |
|
---|---|
Returns: | True if success, False otherwise. |
Return type: | boolean |
Transform the automaton to a 2-stride automaton. The new automaton accepts 2 chars per transition. If the automaton is already strided, a 2*stride automaton is created, which accepts 2*stride chars per cycle. Removes the Deterministic flag. The input automaton must be epsilon free.
Parameter: | all_chars (set(char)) – Set of all chars in the alphabet. If not set, it defaults to ASCII 0 - 255. |
---|---|
Flags: | Sets Strided Flag to True and sets Stride Flag to current stride. |
This method sets _compute to False, and get_compute() will return False until compute() is called.
This algorithm is based on article: Brodie et al. A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching, http://dx.doi.org/10.1109/ISCA.2006.7
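The core idea of striding, pairing each transition with every transition that can follow it, can be sketched as follows. This is a simplified illustration and not the algorithm from the cited article: it assumes transitions are plain (source, char, target) triples, and it ignores final states reached after an odd number of chars as well as the all_chars parameter.

```python
def double_stride(transitions):
    """Combine consecutive single-char transitions into 2-char transitions.

    `transitions` is a set of (source, char, target) triples.
    """
    # Index outgoing transitions by their source state.
    by_source = {}
    for source, char, target in transitions:
        by_source.setdefault(source, []).append((char, target))

    strided = set()
    for source, first, middle in transitions:
        # Extend every transition by each transition leaving its target.
        for second, target in by_source.get(middle, ()):
            strided.add((source, first + second, target))
    return strided


strided = double_stride({(0, "a", 1), (1, "b", 2), (1, "c", 0)})
```

Each resulting transition consumes two characters at once, for example the pair "a" then "b" becomes a single transition from state 0 to state 2 labelled "ab".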
Bases: netbench.pattern_match.b_automaton.b_Automaton
A base class for NFA automata.
Report consumed memory in bytes. Naive mapping algorithm is used (2D array). Basic algorithm for this variant of mapping is: M = |states| * |alphabet| * ceil(log(|states| + 1, 2) / 8) + cnt * ceil(log(|states| + 1, 2) / 8) where cnt is number of nondeterministic transitions.
Returns: | Returns number of bytes. |
---|---|
Return type: | int |
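The naive mapping formula can be evaluated directly, as in the sketch below. The function name and its parameters are hypothetical; the arguments stand for |states|, |alphabet| and cnt from the formula, and the numbers in the call are made-up sample values.

```python
import math


def naive_memory_bytes(states, alphabet, cnt):
    """Evaluate M for the naive 2D-array mapping."""
    # Bytes per stored state index, as given by the formula.
    pointer_bytes = math.ceil(math.log(states + 1, 2) / 8)
    return states * alphabet * pointer_bytes + cnt * pointer_bytes


memory = naive_memory_bytes(states=300, alphabet=256, cnt=10)
```

With 300 states each index needs 2 bytes, so the table takes 300 * 256 * 2 bytes and the 10 nondeterministic transitions add another 20 bytes.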
Report consumed memory in bytes. Optimal mapping algorithm is used (with oracle). Basic algorithm for this variant of mapping is: M = |transitions| * ceil(log(|states|, 2)+1 / 8)
Returns: | Returns number of bytes. |
---|---|
Return type: | int |