PHP lexer code snarfed from the CVS tree for the lamplib project at http://sourceforge.net/projects/lamplib. This project is administered by Markus Baker, Harry Fuecks and Matt Mitchell, and the project code is in the public domain. Thanks, guys!
Copyright: Markus Baker, Harry Fuecks and Matt Mitchell
License: Public Domain (http://sourceforge.net/projects/lamplib)
File Size: 443 lines (16 kb)
Included or required: 0 times
Referenced: 0 times
Includes or requires: 0 files
ParallelRegex:: (6 methods):
__construct()
ParallelRegex()
addPattern()
match()
_getCompoundedRegex()
_getPerlMatchingFlags()
StateStack:: (5 methods):
__construct()
StateStack()
getCurrent()
enter()
leave()
Lexer:: (11 methods):
__construct()
Lexer()
addPattern()
addEntryPattern()
addExitPattern()
addSpecialPattern()
mapHandler()
parse()
_dispatchTokens()
_invokeParser()
_reduce()
Class: ParallelRegex
Compounded regular expression. Any of ...
__construct($case)
Constructor. Starts with no patterns.
param: bool $case True for case sensitive, false ...
ParallelRegex($case)
Old syntax of class constructor. Deprecated in PHP 7.
addPattern($pattern, $label = true)
Adds a pattern with an optional label.
param: string $pattern Perl style regex, but ( and ) ...
param: string $label Label of regex to be returned ...
match($subject, &$match)
Attempts to match all patterns at once against a string.
param: string $subject String to match against.
param: string $match First matched portion of ...
return: bool True on success.
_getCompoundedRegex()
Compounds the patterns into a single regular expression separated with the "or" operator. Caches the regex. Will automatically escape (, ) and / tokens.
_getPerlMatchingFlags()
Accessor for Perl regex mode flags to use.
return: string Flags as string.
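The compounding described for _getCompoundedRegex() and match() can be illustrated with a minimal, self-contained sketch. The class name SketchParallelRegex, the choice of regex flags, and the decision to return the label from match() are assumptions made for this example; it is not the lamplib source.

```php
<?php
// Illustrative sketch only; SketchParallelRegex is a hypothetical name.
final class SketchParallelRegex
{
    private $patterns = [];
    private $labels = [];
    private $caseSensitive;
    private $compound = null; // cached compound regex

    public function __construct(bool $caseSensitive)
    {
        $this->caseSensitive = $caseSensitive;
    }

    public function addPattern(string $pattern, $label = true): void
    {
        $this->patterns[] = $pattern;
        $this->labels[] = $label;
        $this->compound = null; // a new pattern invalidates the cache
    }

    // Assumption: returns the label of the pattern that matched, or false.
    public function match(string $subject, &$match)
    {
        $match = '';
        if ($this->patterns === []) {
            return false;
        }
        if (preg_match($this->compoundedRegex(), $subject, $groups) !== 1) {
            return false;
        }
        $match = $groups[0];
        // Capture group i + 1 wraps pattern i; the group equal to the
        // whole match identifies the pattern that fired.
        foreach ($this->labels as $i => $label) {
            if (isset($groups[$i + 1]) && $groups[$i + 1] === $match) {
                return $label;
            }
        }
        return true;
    }

    private function compoundedRegex(): string
    {
        if ($this->compound === null) {
            $wrapped = [];
            foreach ($this->patterns as $pattern) {
                // Escape (, ) and / so each pattern contributes exactly one
                // group. Patterns with pre-escaped parentheses are not handled.
                $safe = str_replace(['/', '(', ')'], ['\/', '\(', '\)'], $pattern);
                $wrapped[] = '(' . $safe . ')';
            }
            $flags = $this->caseSensitive ? '' : 'i';
            $this->compound = '/' . implode('|', $wrapped) . '/' . $flags;
        }
        return $this->compound;
    }
}

$regex = new SketchParallelRegex(false);
$regex->addPattern('[0-9]+', 'number');
$regex->addPattern('[a-z]+', 'word');
$label = $regex->match('  Hello 42', $match);
// $label is 'word' and $match is 'Hello'
```

Because every pattern is wrapped in its own capture group, a single preg_match() call is enough to find both the earliest token and the pattern responsible for it.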
Class: StateStack
States for a stack machine.
__construct($start)
Constructor. Starts in named state.
param: string $start Starting state name.
StateStack($start)
Old syntax of class constructor. Deprecated in PHP 7.
getCurrent()
Accessor for current state.
return: string State as string.
enter($state)
Adds a state to the stack and sets it to be the current state.
param: string $state New state.
leave()
Leaves the current state and reverts to the previous one.
return: bool False if we drop off ...
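The enter/leave behaviour described above amounts to a small stack of state names. The following is a minimal sketch of that idea; SketchStateStack is a hypothetical name, and the rule that leave() refuses to pop the starting state and returns false instead is an assumption, since the description of the return value is cut off here.

```php
<?php
// Illustrative sketch only; SketchStateStack is a hypothetical name.
final class SketchStateStack
{
    private $stack;

    public function __construct(string $start)
    {
        $this->stack = [$start];
    }

    public function getCurrent(): string
    {
        return $this->stack[count($this->stack) - 1];
    }

    public function enter(string $state): void
    {
        $this->stack[] = $state;
    }

    // Assumption: leaving the starting state is refused and reported as false.
    public function leave(): bool
    {
        if (count($this->stack) <= 1) {
            return false;
        }
        array_pop($this->stack);
        return true;
    }
}

$states = new SketchStateStack('accept');
$states->enter('quoted');
$inner = $states->getCurrent(); // 'quoted'
$left = $states->leave();       // true, back in 'accept'
$bottom = $states->leave();     // false, nothing left to revert to
```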
Class: Lexer
__construct(&$parser, $start = "accept", $case = false)
Sets up the lexer for case-insensitive matching by default.
param: object $parser Handling strategy by ...
param: string $start Starting handler.
param: bool $case True for case sensitive.
Lexer(&$parser, $start = "accept", $case = false)
Old syntax of class constructor. Deprecated in PHP 7.
addPattern($pattern, $mode = "accept")
Adds a token search pattern for a particular parsing mode. The pattern does not change the current mode.
param: string $pattern Perl style regex, but ( and ) ...
param: string $mode Should only apply this ...
addEntryPattern($pattern, $mode, $new_mode)
Adds a pattern that will enter a new parsing mode. Useful for entering parentheses, strings, tags, etc.
param: string $pattern Perl style regex, but ( and ) ...
param: string $mode Should only apply this ...
param: string $new_mode Change parsing to this new ...
addExitPattern($pattern, $mode)
Adds a pattern that will exit the current mode and re-enter the previous one.
param: string $pattern Perl style regex, but ( and ) ...
param: string $mode Mode to leave.
addSpecialPattern($pattern, $mode, $special)
Adds a pattern that has a special mode. Acts as an entry and exit pattern in one go.
param: string $pattern Perl style regex, but ( and ) ...
param: string $mode Should only apply this ...
param: string $special Use this mode for this one token.
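One way to picture the four registration methods is as entries in a per-mode rule table, where each entry pairs a pattern with the mode change it requests. The sketch below is only an illustration: SketchModeTable and its action encoding (null for "stay", a mode name for entry, "_exit" for exit, and an underscore-prefixed name for a special mode) are assumptions made for this example and are not confirmed by the listing.

```php
<?php
// Illustrative sketch only; SketchModeTable and its action encoding are
// assumptions made for this example.
final class SketchModeTable
{
    // mode name => list of [pattern, action]
    public $rules = [];

    // Plain token: the mode stays the same (action null).
    public function addPattern(string $pattern, string $mode = 'accept'): void
    {
        $this->rules[$mode][] = [$pattern, null];
    }

    // Entry token: the action names the mode to enter.
    public function addEntryPattern(string $pattern, string $mode, string $newMode): void
    {
        $this->rules[$mode][] = [$pattern, $newMode];
    }

    // Exit token: "_exit" asks for a return to the previous mode.
    public function addExitPattern(string $pattern, string $mode): void
    {
        $this->rules[$mode][] = [$pattern, '_exit'];
    }

    // Special token: handled in $special for this one token only.
    public function addSpecialPattern(string $pattern, string $mode, string $special): void
    {
        $this->rules[$mode][] = [$pattern, '_' . $special];
    }
}

$table = new SketchModeTable();
$table->addPattern('[a-z]+');                         // plain token in "accept"
$table->addEntryPattern('"', 'accept', 'quoted');     // opening quote
$table->addExitPattern('"', 'quoted');                // closing quote
$table->addSpecialPattern('<br>', 'accept', 'break'); // one-token mode
```

Keeping the rules grouped by mode means that only the patterns registered for the current mode need to be tried against the input.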
mapHandler($mode, $handler)
Adds a mapping from a mode to another handler.
param: string $mode Mode to be remapped.
param: string $handler New target handler.
parse($raw)
Splits the page text into tokens. Will fail if the handlers report an error or if no content is consumed. If successful, then each unparsed and parsed token invokes a call to the held listener.
param: string $raw Raw HTML text.
return: bool True on success, else false.
_dispatchTokens($unmatched, $matched, $mode = false)
Sends the matched token and any leading unmatched text to the parser, changing the lexer to a new mode if one is listed.
param: string $unmatched Unmatched leading portion.
param: string $matched Actual token match.
param: string $mode Mode after match. The "_exit" ...
return: bool False if there was any error ...
_invokeParser($content, $is_match)
Calls the parser method named after the current mode. Empty content will be ignored.
param: string $content Text parsed.
param: string $is_match Token is recognised rather ...
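The handler lookup described for mapHandler() and _invokeParser() can be sketched as a call to a dynamically named method on the parser object. SketchDispatcher, SketchRecorder and invoke() are hypothetical names, and returning true for ignored empty content is an assumption of this example.

```php
<?php
// Illustrative sketch only; all class and method names are hypothetical.
final class SketchRecorder
{
    public $calls = [];

    public function accept(string $content, bool $isMatch): bool
    {
        $this->calls[] = ['accept', $content, $isMatch];
        return true;
    }

    public function text(string $content, bool $isMatch): bool
    {
        $this->calls[] = ['text', $content, $isMatch];
        return true;
    }
}

final class SketchDispatcher
{
    private $parser;
    private $handlers = [];

    public function __construct($parser)
    {
        $this->parser = $parser;
    }

    public function mapHandler(string $mode, string $handler): void
    {
        $this->handlers[$mode] = $handler;
    }

    public function invoke(string $mode, string $content, bool $isMatch): bool
    {
        if ($content === '') {
            return true; // empty content is ignored (assumed to count as success)
        }
        // Use the remapped handler if one exists, else the mode's own name.
        $handler = $this->handlers[$mode] ?? $mode;
        return (bool) $this->parser->$handler($content, $isMatch);
    }
}

$parser = new SketchRecorder();
$dispatcher = new SketchDispatcher($parser);
$dispatcher->mapHandler('quoted', 'text');
$dispatcher->invoke('accept', 'say', true);
$dispatcher->invoke('quoted', 'hi there', false);
$dispatcher->invoke('quoted', '', false); // ignored
// $parser->calls now holds two entries: one for accept(), one for text()
```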
_reduce(&$raw)
Tries to match a chunk of text and, if successful, removes the recognised chunk and any leading unparsed data. Empty strings will not be matched.
param: string $raw The subject to parse. This is the ...
return: bool|array Three item list of unparsed ...
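Taken together, parse() and _reduce() describe a loop that repeatedly finds the earliest token for the current mode, hands over any leading unmatched text, and then consumes the token. The sketch below illustrates that loop under stated assumptions: sketchParse() is a hypothetical function, it returns a token list instead of calling a parser, the rule encoding (null for "stay", "_exit" to pop, otherwise a mode to enter) is invented for this example, and special one-token modes are left out for brevity.

```php
<?php
// Illustrative sketch only; sketchParse() and the rule encoding are hypothetical.
// $rules: mode => list of [delimited regex, action]; action is null (stay),
// '_exit' (pop the mode stack) or the name of a mode to enter.
// Returns a list of [mode, text, isMatch] triples.
function sketchParse(string $raw, array $rules, string $start = 'accept'): array
{
    $stack = [$start];
    $tokens = [];
    while ($raw !== '') {
        $mode = $stack[count($stack) - 1];
        $best = null;
        // Find the earliest non-empty match among the current mode's patterns.
        foreach ($rules[$mode] ?? [] as [$regex, $action]) {
            if (preg_match($regex, $raw, $m, PREG_OFFSET_CAPTURE) === 1
                && $m[0][0] !== ''
                && ($best === null || $m[0][1] < $best[1])) {
                $best = [$m[0][0], $m[0][1], $action];
            }
        }
        if ($best === null) {
            break; // no more tokens in this mode
        }
        [$text, $offset, $action] = $best;
        if ($offset > 0) {
            $tokens[] = [$mode, substr($raw, 0, $offset), false]; // unmatched text
        }
        // Remove the leading unparsed data and the recognised chunk.
        $raw = substr($raw, $offset + strlen($text));
        if ($action === '_exit') {
            $tokens[] = [$mode, $text, true];
            if (count($stack) > 1) {
                array_pop($stack);
            }
        } elseif ($action !== null) {
            $stack[] = $action;
            $tokens[] = [$action, $text, true];
        } else {
            $tokens[] = [$mode, $text, true];
        }
    }
    if ($raw !== '') {
        $tokens[] = [$stack[count($stack) - 1], $raw, false]; // trailing text
    }
    return $tokens;
}

$rules = [
    'accept' => [['/"/', 'quoted']],
    'quoted' => [['/"/', '_exit']],
];
$tokens = sketchParse('say "hi" now', $rules);
// Five tokens: 'say ' (accept), '"' (quoted), 'hi' (quoted), '"' (quoted), ' now' (accept)
```

Reporting the closing quote under the mode being left, rather than the mode being returned to, is a choice made for this sketch.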