Phf_Dfa Documentation

This page contains the Phf_Dfa Package documentation.

The phf_dfa Module

class netbench.pattern_match.algorithms.phf_dfa.phf_dfa.PHF_DFA

Bases: netbench.pattern_match.b_dfa.b_dfa

Class for the DFA based on perfect hashing. Extension with faulty transition table is also implemented. To enable this extension, use method enable_faulty_transitions.

Based on:

Kastil, J., Korenek, J., Lengal, O.: Methodology for Fast Pattern Matching by Deterministic Finite Automaton with Perfect Hashing, In: 12th EUROMICRO Conference on Digital System Design DSD 2009, Patras, GR, IEEE CS, 2009, p. 823-289, ISBN 978-0-7695-3782-5 URL: http://www.fit.vutbr.cz/research/view_pub.php?id=9054

Kastil, J., Korenek, J.: High Speed Pattern Matching Algorithm Based on Deterministic Finite Automata with Faulty Transition Table, In: Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, La Jolla, US, ACM, 2010, p. 2, ISBN 978-1-4503-0379-8 URL: http://www.fit.vutbr.cz/research/view_pub.php?id=9380

This class uses bitstring module: http://code.google.com/p/python-bitstring/

compute()

This method computes DFA with Perfect hashing.

Note that PHF table generation may fail, depending on PHF class parameters (limit, ratio), so _compute may not be set to True.

decode_symbol(input, fromState=-1)

Compute the symbol accepting the beggining of input string.

Parameters:
  • input (string) – Input string.
  • fromState (int) – The active state of automaton. This would be required with nondeterministic alphabet (not implemented).
Returns:

ID of found symbol or -1 if no symbol was found and the remainder of input string.

Return type:

tuple(string, int)

disable_fallback_state()
Fallback state is disabled. _compute is set to False.
disable_faulty_transitions()
Disable faulty transition table. _compute is set to False.
enable_fallback_state(state=-1, warning=True)

This function sets the fallback state. This requires fully defined automaton! If state is not specified, the one with most incoming transitions is used. Warning about fully defined automaton is printed on stdout, unless the parameter warning is set to False. _compute is set to False.

Parameters:
  • state (int) – ID of state to be set as fallback state. All transitions to this state will be removed.
  • warning (boolean) – Disables the printing of warning about need of fully defined automaton.
enable_faulty_check()
Experimental method - enables generation of nonfaulty table.
enable_faulty_transitions(hash_size, compress_hash=None)

Enable faulty transition table. Uniform hash function is used to compress the transition representation. If small hash size is chosen, collisions and faults in pattern matching may occure. PHF table must be (re)generated after enabling faulty transitions (method generate_PHF_table() or compute()). This method sets _compute to False.

Parameters:
  • hash_size (int) – Size in bits of hash function output.
  • compress_hash (b_hash) – Instance of class implementing the compress hash function. In not specified, jenkins_compress class is created with output size of hash_size.
generate_PHF_table()

This method generates the PHF table containing the transition table.

PHF table has 4 columns:
  • 2 bits required by bdz algorithm

  • transition representation (state and symbol)

    (empty lines contain maximum symbol and state id, computed from symbol_bits and state_bits)

  • next state

  • compressed value of transition representation (faulty trans)

Returns:True if table was generated succesfully. False otherwise.
Return type:Boolean
get_alpha_num()
Number of symbols in alphabet in nfa_data object
get_state_num()
Number of states of nfa_data object
get_table_parameters()

This function returns 2-tuple of integers. First integer is the number of bit used for the representation of the state. The second integer is a number of bits representing symbol in the transition table. State and symbol together represents the key to the transition table.

Returns:Table parameters - number of bits used for the representation of state and symbol.
Return type:tuple(int, int)
get_trans_num()
Number of transitions of nfa_data object
print_conf_file(FileName)

Experimental code, intended only for internal use. This function creates text file containing binary representation of the transition table.

Parameter:FileName (string) – The name of output file.
remove_fallback_transitions()

Remove all transitions to given state in nfa_data _automaton1. _automaton is unchanged, so dfa algorithms used in compute (like determinization or minimize) are working. This method should only be called from compute() and it sets _compute to False.

Parameter:state (int) – ID of the fallback state. All transitions to this state will be removed.
report_memory_naive()

Report consumed memory in bytes. Naive mapping algorithm is used (2D array). Basic algorithm for this variant of mapping is: M = |states| * |alphabet| * ceil(log(|states| + 1, 2) / 8)

Returns:Returns number of bytes.
Return type:int
report_memory_optimal()

Report consumed memory in bytes. Optimal mapping algorithm is used (with oracle). Basic algorithm for this variant of mapping is: M = |transitions| * ceil(log(|states|, 2) / 8)

Returns:Returns number of bytes.
Return type:int
report_memory_real()

Report consumed memory in bytes - the size of PHF table.

Returns:Size of PHF table in bytes.
Return type:int
search(input_string)

Search for the pattern in the given string.

Parameters:
  • input_string – Input string.
  • input_string – string
Returns:

Bitmap of matched regular expressions.

Return type:

list(int)

set_PHF_class(phf_class)

Set the PHF class for PHF table generation. _compute is set to False.

Parameter:phf_class (bdz()) – Class implementing the perfect hash function
set_table_parameters(bit_nums)

This function sets the number of bits representing the state and symbol in the transition table. The parameter is a 2-tuple of integers. The first one sets the size of the state representation and the second position represents the size of symbol. NOTE: number of states/symbols must be less than 2 ** state_bits/symbol_bits !

Parameter:bit_nums (tuple(int, int)) – Table parameters - number of bits used for the representation of state and symbol.
validate_transition(tran)

Validate given transition. If faulty transitions are enabled, validation is made based on the compressed value and the result may be faulty. In that case, counter self.bad_transitions is incremented and collision is stored in dictionary self.collisions.

Parameter:tran (tuple(string, string)) – Representation of transition to be validated.
Returns:True if the transition is valid in this automaton. Otherwise False is returned.
Return type:boolean

The test_phf_dfa Module

class netbench.pattern_match.algorithms.phf_dfa.test_phf_dfa.test_PHF_DFA(methodName='runTest')

Bases: unittest.TestCase

This class tests implementation of the algorithm PHF_DFA.

test___init__()
__init__()
test_compute()
compute()
test_decode_symbol()
decode_symbol()
test_disable_fallback_state()
disable_fallback_state()
test_disable_faulty_transitions()
disable_faulty_transitions()
test_enable_fallback_state()
enable_fallback_state()
test_enable_faulty_transitions()
enable_faulty_transitions()
test_generate_PHF_table()
generate_PHF_table()
test_get_alpha_num()
get_alpha_num()
test_get_state_num()
get_state_num()
test_get_table_parameters()
get_table_parameters
test_get_trans_num()
get_trans_num()
test_remove_fallback_transitions()
remove_fallback_transitions()
test_report_memory_naive()
report_memory_naive()
test_report_memory_optimal()
report_memory_optimal()
test_report_memory_real()
report_memory_real()
search()
test_set_PHF_class()
set_PHF_class()
test_set_table_parameters()
set_table_parameters()
test_validate_transition()
validate_transition()

Table Of Contents

Previous topic

History_Counting_Fa Documentation

Next topic

Clark_Nfa Documentation

This Page