AhoCorasick
PHP implementation of the Aho-Corasick string search algorithm
AhoCorasick\MultiStringMatcher Class Reference

Represents a finite state machine that can find all occurrences of a set of search keywords in a body of text.

Public Member Functions

 __construct (array $searchKeywords)
 Constructor.
 
 getKeywords ()
 Accessor for the search keywords.
 
 nextState ( $currentState, $inputChar)
 Map the current state and input character to the next state.
 
 searchIn ( $text)
 Locate the search keywords in some text.
 

Protected Member Functions

 computeYesTransitions ()
 Get the state transitions which the string-matching automaton shall make as it advances through input text.
 
 computeNoTransitions ()
 Get the state transitions which the string-matching automaton shall make when a partial match proves false.
 

Protected Attributes

 $searchKeywords = []
 
 $numStates = 1
 
 $outputs = []
 
 $noTransitions = []
 
 $yesTransitions = []
 

Detailed Description

Represents a finite state machine that can find all occurrences of a set of search keywords in a body of text.

The time it takes to construct the finite state machine is proportional to the sum of the lengths of the search keywords. Once constructed, the machine can locate all occurrences of all search keywords in a body of text in a single pass, making exactly one state transition per input character.

This is an implementation of the Aho-Corasick string matching algorithm.

Alfred V. Aho and Margaret J. Corasick, "Efficient string matching: an aid to bibliographic search", CACM, 18(6):333-340, June 1975.

http://xlinux.nist.gov/dads/HTML/ahoCorasick.html

Constructor & Destructor Documentation

◆ __construct()

AhoCorasick\MultiStringMatcher::__construct ( array $searchKeywords)

Constructor.

Parameters
string[] $searchKeywords The set of keywords to be matched.

Reimplemented in AhoCorasick\MultiStringReplacer.

Member Function Documentation

◆ computeYesTransitions()

AhoCorasick\MultiStringMatcher::computeYesTransitions ( )
protected

Get the state transitions which the string-matching automaton shall make as it advances through input text.

Constructs a directed tree with a root node which represents the initial state of the string-matching automaton and from which a path exists which spells out each search keyword.
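
The tree described here is a trie keyed on characters. A minimal sketch of such a construction follows. The function name `buildYesTransitions` and the shape of its return value are hypothetical, chosen for illustration; they are not confirmed to match this class's internals.

```php
<?php
/**
 * Hypothetical helper (not part of the documented API): build a keyword trie.
 *
 * @param string[] $keywords
 * @return array [ transitions, outputs, number of states ]
 */
function buildYesTransitions(array $keywords): array
{
    $yes = [[]];     // $yes[$state][$char] => next state; state 0 is the root
    $outputs = [];   // $outputs[$state] => keyword spelled by the path to $state
    $numStates = 1;  // the root always exists

    foreach ($keywords as $keyword) {
        $state = 0;
        $length = strlen($keyword);
        for ($i = 0; $i < $length; $i++) {
            $ch = $keyword[$i];
            if (!isset($yes[$state][$ch])) {
                // No edge for this character yet: allocate a new state.
                $yes[$state][$ch] = $numStates;
                $yes[$numStates] = [];
                $numStates++;
            }
            $state = $yes[$state][$ch];
        }
        $outputs[$state] = $keyword;
    }

    return [$yes, $outputs, $numStates];
}

[$yes, $outputs, $numStates] = buildYesTransitions(['he', 'she', 'hers']);
```

Because 'he' and 'hers' share a prefix, they share the first two edges out of the root, so the number of states grows with the total keyword length at most.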

◆ getKeywords()

AhoCorasick\MultiStringMatcher::getKeywords ( )

Accessor for the search keywords.

Returns
string[] Search keywords.

◆ nextState()

AhoCorasick\MultiStringMatcher::nextState ( $currentState, $inputChar )

Map the current state and input character to the next state.

Parameters
int $currentState The current state of the string-matching automaton.
string $inputChar The character the string-matching automaton is currently processing.
Returns
int The state the automaton should transition to.
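
One plausible way to implement such a mapping is to try the forward ("yes") transition first and fall back along the "no" transitions until some state accepts the character. The sketch below assumes that approach. The function `sketchNextState` and the hand-built tables for the keywords 'he' and 'she' are hypothetical and are not taken from this class's source.

```php
<?php
// Hand-built tables for the keywords 'he' and 'she' (illustrative assumption).
// States: 0 = root, 1 = "h", 2 = "he", 3 = "s", 4 = "sh", 5 = "she".
$yes = [
    0 => ['h' => 1, 's' => 3],
    1 => ['e' => 2],
    2 => [],
    3 => ['h' => 4],
    4 => ['e' => 5],
    5 => [],
];
// Fallback state for each non-root state: the state for the longest proper
// suffix of the state's string that is also a prefix of some keyword.
$no = [1 => 0, 2 => 0, 3 => 0, 4 => 1, 5 => 2];

/**
 * Hypothetical sketch: map (state, character) to the next state.
 */
function sketchNextState(array $yes, array $no, int $state, string $ch): int
{
    while (true) {
        if (isset($yes[$state][$ch])) {
            return $yes[$state][$ch]; // forward transition exists
        }
        if ($state === 0) {
            return 0; // no keyword starts with this character
        }
        $state = $no[$state]; // partial match proved false: fall back
    }
}
```

For example, from state 4 ("sh") the character 'e' advances to state 5 ("she"), while the character 'h' falls back through state 1 to the root and then advances to state 1 ("h").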

◆ searchIn()

AhoCorasick\MultiStringMatcher::searchIn ( $text)

Locate the search keywords in some text.

Parameters
string $text The string to search in.
Returns
array[] An array of matches. Each match is a vector containing an integer offset and the matched keyword.
Example:
$keywords = new MultiStringMatcher( array( 'ore', 'hell' ) );
$keywords->searchIn( 'She sells sea shells by the sea shore.' );
// result: array( array( 15, 'hell' ), array( 34, 'ore' ) )
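
A short sketch of consuming the return value may help. It hard-codes the result shown in the example rather than calling the library, and it assumes each offset is a zero-based byte position, which holds for the documented ASCII example.

```php
<?php
$text = 'She sells sea shells by the sea shore.';

// Result copied from the documented example, not produced by running the library here.
$matches = [[15, 'hell'], [34, 'ore']];

$lines = [];
foreach ($matches as [$offset, $keyword]) {
    // Check that the keyword really sits at the reported offset.
    $found = substr($text, $offset, strlen($keyword));
    $status = ($found === $keyword) ? 'ok' : 'mismatch';
    $lines[] = sprintf('%s at %d (%s)', $keyword, $offset, $status);
}

echo implode("\n", $lines), "\n";
```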

The documentation for this class was generated from the following file: