Parsing and lexing

Introduction

Single-state lexer class. Lexemes can be defined on the fly. If a lexer instance is meant to be used with Parle\Parser, the token IDs must be taken from that parser. Otherwise, arbitrary token IDs can be supplied. When multiple states are not required, this lexer can offer a performance advantage over Parle\RLexer. Note that Parle\RParser is not compatible with this lexer.
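
As an illustration of standalone use, the following minimal sketch tokenizes a comma-separated list of integers. It assumes the parle extension is loaded. The token ids ID_NUMBER and ID_COMMA are arbitrary values chosen for this example, and using Parle\Token::SKIP as the id of the whitespace rule to discard whitespace is an assumption about the extension's behavior.

```php
<?php
use Parle\Lexer;
use Parle\Token;

// Hypothetical ids: any numbers work when the lexer is used standalone.
const ID_NUMBER = 1;
const ID_COMMA  = 2;

$lexer = new Lexer;
$lexer->push('[0-9]+', ID_NUMBER);
$lexer->push(',', ID_COMMA);
$lexer->push('\s+', Token::SKIP); // assumption: rules with this id are skipped
$lexer->build();                  // rule set is final from here on

$lexer->consume("10, 20,30");

// Collect [id, value] pairs until end of input or an unknown token.
$tokens = [];
$lexer->advance();
$tok = $lexer->getToken();
while (Token::EOI != $tok->id && Token::UNKNOWN != $tok->id) {
    $tokens[] = [$tok->id, $tok->value];
    $lexer->advance();
    $tok = $lexer->getToken();
}

foreach ($tokens as [$id, $value]) {
    echo $id, " => ", $value, "\n";
}
```

Note that advance() has to be called before the first getToken() call, because consume() only hands over the input.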

Class synopsis

Parle\Lexer

class Parle\Lexer {

/* Constants */

const integer Parle\Lexer::ICASE = 1 ;

const integer Parle\Lexer::DOT_NOT_LF = 2 ;

const integer Parle\Lexer::DOT_NOT_CRLF = 4 ;

const integer Parle\Lexer::SKIP_WS = 8 ;

const integer Parle\Lexer::MATCH_ZERO_LEN = 16 ;

/* Properties */

public boolean $bol = FALSE ;

public integer $flags = 0 ;

public integer $state = 0 ;

public integer $marker = 0 ;

public integer $cursor = 0 ;

/* Methods */

public void advance ( void )

public void build ( void )

public void callout ( int $id , callable $callback )

public void consume ( string $data )

public void dump ( void )

public Parle\Token getToken ( void )

public void insertMacro ( string $name , string $regex )

public void push ( string $regex , int $id )

public void reset ( int $pos )

}

Predefined Constants

Parle\Lexer::ICASE

Parle\Lexer::DOT_NOT_LF

Parle\Lexer::DOT_NOT_CRLF

Parle\Lexer::SKIP_WS

Parle\Lexer::MATCH_ZERO_LEN

Properties

bol
Start of input flag.

flags
Lexer flags.

state
Current lexer state, readonly.

marker
Position of the latest token match, readonly.

cursor
Current input offset, readonly.

Parle\Lexer::advance

Process next lexer rule

Description

public void Parle\Lexer::advance ( void )

Processes the next rule and prepares the resulting token data.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Lexer::build

Finalize the lexer rule set

Description

public void Parle\Lexer::build ( void )

Finalizes the rules previously added with Parle\Lexer::push. This method must be called after all the necessary rules have been pushed. The rule set then becomes read-only and lexing can begin.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Lexer::callout

Define token callback

Description

public void Parle\Lexer::callout ( int $id , callable $callback )

Define a callback to be invoked once the lexer encounters a particular token.

Parameters

id
Token id.

callback
Callable to be invoked. The callable doesn't receive any arguments and its return value is ignored.

Return Values

No value is returned.

Parle\Lexer::consume

Pass the data for processing

Description

public void Parle\Lexer::consume ( string $data )

Consume the data for lexing.

Parameters

data
Data to be lexed.

Return Values

No value is returned.

Parle\Lexer::dump

Dump the state machine

Description

public void Parle\Lexer::dump ( void )

Dump the current state machine to stdout.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Lexer::getToken

Retrieve the current token

Description

public Parle\Token Parle\Lexer::getToken ( void )

Retrieve the current token.

Parameters

This function has no parameters.

Return Values

Returns an instance of Parle\Token.

Parle\Lexer::insertMacro

Insert regex macro

Description

public void Parle\Lexer::insertMacro ( string $name , string $regex )

Insert a regex macro that can later be used as a shortcut and included in other regular expressions.

Parameters

name
Name of the macro.

regex
Regular expression.

Return Values

No value is returned.
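
The following sketch shows how macros might be reused across rules. It assumes the parle extension is loaded and that a macro is referenced inside a regular expression as {NAME}, which is the convention of the underlying lexertl library and is not stated on this page. The macro names and token ids are hypothetical.

```php
<?php
use Parle\Lexer;
use Parle\Token;

$lexer = new Lexer;

// Macros must be inserted before the rules that use them.
$lexer->insertMacro("DIGIT", "[0-9]");
$lexer->insertMacro("HEX", "[0-9a-fA-F]");

// Assumption: {NAME} expands to the macro's regular expression.
$lexer->push("0x{HEX}+", 1);   // hypothetical id for hexadecimal literals
$lexer->push("{DIGIT}+", 2);   // hypothetical id for decimal literals
$lexer->push('\s+', Token::SKIP);
$lexer->build();

$lexer->consume("0xff 42");

$ids = [];
$lexer->advance();
$tok = $lexer->getToken();
while (Token::EOI != $tok->id && Token::UNKNOWN != $tok->id) {
    $ids[] = $tok->id;
    echo $tok->id, " => ", $tok->value, "\n";
    $lexer->advance();
    $tok = $lexer->getToken();
}
```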

Parle\Lexer::push

Add a lexer rule

Description

public void Parle\Lexer::push ( string $regex , int $id )

Push a pattern for lexeme recognition.

Parameters

regex
Regular expression used for token matching.

id
Token id. If the lexer instance is meant to be used standalone, this can be an arbitrary number. If the lexer instance is going to be passed to the parser, it has to be an id returned by Parle\Parser::tokenId.

Return Values

No value is returned.

Parle\Lexer::reset

Reset lexer

Description

public void Parle\Lexer::reset ( int $pos )

Reset lexing, optionally supplying the desired offset.

Parameters

pos
Reset position.

Return Values

No value is returned.

Introduction

Multistate lexer class. Lexemes can be defined on the fly. If a lexer instance is meant to be used with Parle\RParser, the token IDs must be taken from that parser. Otherwise, arbitrary token IDs can be supplied. Note that Parle\Parser is not compatible with this lexer.

Class synopsis

Parle\RLexer

class Parle\RLexer {

/* Constants */

const integer Parle\RLexer::ICASE = 1 ;

const integer Parle\RLexer::DOT_NOT_LF = 2 ;

const integer Parle\RLexer::DOT_NOT_CRLF = 4 ;

const integer Parle\RLexer::SKIP_WS = 8 ;

const integer Parle\RLexer::MATCH_ZERO_LEN = 16 ;

/* Properties */

public boolean $bol = FALSE ;

public integer $flags = 0 ;

public integer $state = 0 ;

public integer $marker = 0 ;

public integer $cursor = 0 ;

/* Methods */

public void advance ( void )

public void build ( void )

public void callout ( int $id , callable $callback )

public void consume ( string $data )

public void dump ( void )

public Parle\Token getToken ( void )

public void insertMacro ( string $name , string $regex )

public void push ( string $regex , int $id )

public void push ( string $state , string $regex , int $id , string $newState )

public void push ( string $state , string $regex , string $newState )

public int pushState ( string $state )

public void reset ( int $pos )

}

Predefined Constants

Parle\RLexer::ICASE

Parle\RLexer::DOT_NOT_LF

Parle\RLexer::DOT_NOT_CRLF

Parle\RLexer::SKIP_WS

Parle\RLexer::MATCH_ZERO_LEN

Properties

bol
Start of input flag.

flags
Lexer flags.

state
Current lexer state, readonly.

marker
Position of the latest token match, readonly.

cursor
Current input offset, readonly.

Parle\RLexer::advance

Process next lexer rule

Description

public void Parle\RLexer::advance ( void )

Processes the next rule and prepares the resulting token data.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\RLexer::build

Finalize the lexer rule set

Description

public void Parle\RLexer::build ( void )

Finalizes the rules previously added with Parle\RLexer::push. This method must be called after all the necessary rules have been pushed. The rule set then becomes read-only and lexing can begin.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\RLexer::callout

Define token callback

Description

public void Parle\RLexer::callout ( int $id , callable $callback )

Define a callback to be invoked once the lexer encounters a particular token.

Parameters

id
Token id.

callback
Callable to be invoked. The callable doesn't receive any arguments and its return value is ignored.

Return Values

No value is returned.

Parle\RLexer::consume

Pass the data for processing

Description

public void Parle\RLexer::consume ( string $data )

Consume the data for lexing.

Parameters

data
Data to be lexed.

Return Values

No value is returned.

Parle\RLexer::dump

Dump the state machine

Description

public void Parle\RLexer::dump ( void )

Dump the current state machine to stdout.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\RLexer::getToken

Retrieve the current token

Description

public Parle\Token Parle\RLexer::getToken ( void )

Retrieve the current token.

Parameters

This function has no parameters.

Return Values

Returns an instance of Parle\Token.

Parle\RLexer::insertMacro

Insert regex macro

Description

public void Parle\RLexer::insertMacro ( string $name , string $regex )

Insert a regex macro that can later be used as a shortcut and included in other regular expressions.

Parameters

name
Name of the macro.

regex
Regular expression.

Return Values

No value is returned.

Parle\RLexer::push

Add a lexer rule

Description

public void Parle\RLexer::push ( string $regex , int $id )

public void Parle\RLexer::push ( string $state , string $regex , int $id , string $newState )

public void Parle\RLexer::push ( string $state , string $regex , string $newState )

Push a pattern for lexeme recognition.

A 'start state' and an 'exit state' can be specified by using a suitable signature.

Parameters

regex
Regular expression used for token matching.

id
Token id. If the lexer instance is meant to be used standalone, this can be an arbitrary number. If the lexer instance is going to be passed to the parser, it has to be an id returned by Parle\RParser::tokenId.

state
State name. If '*' is used as the start state, the rule is applied to all lexer states.

newState
Name of the state the lexer is in after the rule was applied.

If '.' is specified as the exit state, the lexer state is unchanged when that rule matches. An exit state with '>' before the name means push. Use the signature without id either for continuation or to start matching when a continuation or recursion is required.

If '<' is specified as the exit state, it means pop. In that case, the signature containing the id can be used to identify the match. Note that even when an id is specified, the rule only finishes once all the previous pushes have been popped.

Return Values

No value is returned.
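
The state-aware signatures can be sketched with a lexer that treats a C-style comment as a single token. This is an illustration under stated assumptions, not a definitive recipe: it assumes the parle extension is loaded and that the default start state is named INITIAL, as in the underlying lexertl library. The state name COMMENT and the token ids are hypothetical.

```php
<?php
use Parle\RLexer;
use Parle\Token;

$lexer = new RLexer;
$lexer->pushState("COMMENT");

// Assumption: "INITIAL" is the name of the default start state.
// No id: entering the comment only starts a match that is continued below.
$lexer->push("INITIAL", '"/*"', "COMMENT");
// Inside the comment, keep consuming; '.' leaves the state unchanged.
$lexer->push("COMMENT", '[^*]+|.', ".");
// The closing delimiter completes the match with id 1 and returns to INITIAL.
$lexer->push("COMMENT", '"*/"', 1, "INITIAL");

// Ordinary rules for the start state.
$lexer->push("INITIAL", '[a-z]+', 2, ".");
$lexer->push('\s+', Token::SKIP);
$lexer->build();

$lexer->consume("foo /* note */ bar");

$tokens = [];
$lexer->advance();
$tok = $lexer->getToken();
while (Token::EOI != $tok->id && Token::UNKNOWN != $tok->id) {
    $tokens[] = [$tok->id, $tok->value];
    echo $tok->id, " => ", $tok->value, "\n";
    $lexer->advance();
    $tok = $lexer->getToken();
}
```

The rules without an id do not produce tokens of their own, so the comment is only reported once the rule carrying the id matches.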

Parle\RLexer::pushState

Push a new start state

Description

public int Parle\RLexer::pushState ( string $state )

This lexer type can have more than one state machine. This allows you to lex different tokens depending on context, thus allowing simple parsing to take place. Once a state is pushed, it can be used with a suitable Parle\RLexer::push signature variant.

Parameters

state
Name of the state.

Return Values

Parle\RLexer::reset

Reset lexer

Description

public void Parle\RLexer::reset ( int $pos )

Reset lexing, optionally supplying the desired offset.

Parameters

pos
Reset position.

Return Values

No value is returned.

Introduction

Parser class. Rules can be defined on the fly. Once finalized, a Parle\Lexer instance is required to deliver the token stream.

Class synopsis

Parle\Parser

class Parle\Parser {

/* Constants */

const integer Parle\Parser::ACTION_ERROR = 0 ;

const integer Parle\Parser::ACTION_SHIFT = 1 ;

const integer Parle\Parser::ACTION_REDUCE = 2 ;

const integer Parle\Parser::ACTION_GOTO = 3 ;

const integer Parle\Parser::ACTION_ACCEPT = 4 ;

const integer Parle\Parser::ERROR_SYNTAX = 0 ;

const integer Parle\Parser::ERROR_NON_ASSOCIATIVE = 1 ;

const integer Parle\Parser::ERROR_UNKNOWN_TOKEN = 2 ;

/* Properties */

public integer $action = 0 ;

public integer $reduceId = 0 ;

/* Methods */

public void advance ( void )

public void build ( void )

public void consume ( string $data , Parle\Lexer $lexer )

public void dump ( void )

public Parle\ErrorInfo errorInfo ( void )

public void left ( string $tok )

public void nonassoc ( string $tok )

public void precedence ( string $tok )

public int push ( string $name , string $rule )

public void reset ([ int $tokenId ] )

public void right ( string $tok )

public string sigil ( int $idx )

public void token ( string $tok )

public int tokenId ( string $tok )

public string trace ( void )

public bool validate ( string $data , Parle\Lexer $lexer )

}

Predefined Constants

Parle\Parser::ACTION_ERROR

Parle\Parser::ACTION_SHIFT

Parle\Parser::ACTION_REDUCE

Parle\Parser::ACTION_GOTO

Parle\Parser::ACTION_ACCEPT

Parle\Parser::ERROR_SYNTAX

Parle\Parser::ERROR_NON_ASSOCIATIVE

Parle\Parser::ERROR_UNKNOWN_TOKEN

Properties

action
Current parser action that matches one of the action class constants, readonly.

reduceId
Grammar rule id just processed in the reduce action. The value corresponds either to a token or to a production id. Readonly.

Parle\Parser::advance

Process next parser rule

Description

public void Parle\Parser::advance ( void )

Process next parser rule.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Parser::build

Finalize the grammar rules

Description

public void Parle\Parser::build ( void )

Any tokens and grammar rules previously added are finalized. The rule set becomes readonly and the parser is ready to start.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Parser::consume

Consume the data for processing

Description

public void Parle\Parser::consume ( string $data , Parle\Lexer $lexer )

Consume the data for parsing.

Parameters

data
Data to be parsed.

lexer
A lexer object containing the lexing rules prepared for the particular grammar.

Return Values

No value is returned.

Parle\Parser::dump

Dump the grammar

Description

public void Parle\Parser::dump ( void )

Dump the current grammar to stdout.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Parser::errorInfo

Retrieve the error information

Description

public Parle\ErrorInfo Parle\Parser::errorInfo ( void )

Retrieve the error information in case Parle\Parser::action holds the error action.

Parameters

This function has no parameters.

Return Values

Returns an instance of Parle\ErrorInfo.

Parle\Parser::left

Declare a token with left-associativity

Description

public void Parle\Parser::left ( string $tok )

Declare a terminal with left associativity.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\Parser::nonassoc

Declare a token with no associativity

Description

public void Parle\Parser::nonassoc ( string $tok )

Declare a terminal that cannot appear more than once in a row.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\Parser::precedence

Declare a precedence rule

Description

public void Parle\Parser::precedence ( string $tok )

Declares a precedence rule for a fictitious terminal symbol. This rule can later be used in specific grammar rules.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\Parser::push

Add a grammar rule

Description

public int Parle\Parser::push ( string $name , string $rule )

Push a grammar rule. The production id returned can be used later in the parsing process to identify the rule matched.

Parameters

name
Rule name.

rule
The rule to be added. The syntax is Bison compatible.

Return Values

Returns an integer representing the rule index.

Parle\Parser::reset

Reset parser state

Description

public void Parle\Parser::reset ([ int $tokenId ] )

Reset parser state using the given token id.

Parameters

tokenId
Token id.

Return Values

No value is returned.

Parle\Parser::right

Declare a token with right-associativity

Description

public void Parle\Parser::right ( string $tok )

Declare a terminal with right associativity.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\Parser::sigil

Retrieve a matching part of a rule

Description

public string Parle\Parser::sigil ( int $idx )

Retrieve a part of the match by a rule. This method is equivalent to the pseudo variable functionality in Bison.

Parameters

idx
Match index, zero based.

Return Values

Returns a string with the matched part.
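
To show how sigil, reduceId and the production ids returned by push might work together, the following sketch evaluates a sum of integers. It assumes the parle extension is loaded. The grammar, the variable names and the use of Parle\Token::SKIP to discard whitespace are choices made for this example only.

```php
<?php
use Parle\Lexer;
use Parle\Parser;
use Parle\Stack;
use Parle\Token;

$parser = new Parser;
$parser->token("INTEGER");
$parser->left("'+'");
$parser->push("start", "exp");
$addId = $parser->push("exp", "exp '+' exp"); // keep ids to recognize reductions
$intId = $parser->push("exp", "INTEGER");
$parser->build();

// The lexer takes its token ids from the parser.
$lexer = new Lexer;
$lexer->push("[+]", $parser->tokenId("'+'"));
$lexer->push("[0-9]+", $parser->tokenId("INTEGER"));
$lexer->push('\s+', Token::SKIP);
$lexer->build();

$stack = new Stack;
$parser->consume("1 + 2 + 39", $lexer);

while (Parser::ACTION_ERROR != $parser->action
    && Parser::ACTION_ACCEPT != $parser->action) {
    if (Parser::ACTION_REDUCE == $parser->action) {
        switch ($parser->reduceId) {
            case $intId:
                // sigil(0) is the text matched by the first symbol of the rule.
                $stack->push((int) $parser->sigil(0));
                break;
            case $addId:
                $right = $stack->top;
                $stack->pop();
                $left = $stack->top;
                $stack->pop();
                $stack->push($left + $right);
                break;
        }
    }
    $parser->advance();
}

$result = $stack->top;
echo $result, "\n";
```

Keeping intermediate values on a Parle\Stack mirrors how a Bison action would combine its pseudo variables into a result.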

Parle\Parser::token

Declare a token

Description

public void Parle\Parser::token ( string $tok )

Declare a terminal to be used in the grammar.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\Parser::tokenId

Get token id

Description

public int Parle\Parser::tokenId ( string $tok )

Retrieve the id of the named token.

Parameters

tok
Name of the token as used in Parle\Parser::token.

Return Values

Returns an integer representing the token id.

Parle\Parser::trace

Trace the parser operation

Description

public string Parle\Parser::trace ( void )

Retrieve the current parser operation description. This can be especially useful for studying the parser and optimizing the grammar.

Parameters

This function has no parameters.

Return Values

Returns a string with the trace information.

Parle\Parser::validate

Validate input

Description

public bool Parle\Parser::validate ( string $data , Parle\Lexer $lexer )

Validate an input string. The string is parsed internally, so this method is useful for quick input validation.

Parameters

data
String to be validated.

lexer
A lexer object containing the lexing rules prepared for the particular grammar.

Return Values

Returns a boolean indicating whether the input conforms to the defined rules.
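
A minimal sketch of quick validation, assuming the parle extension is loaded. The grammar for a comma-separated list of integers, the token names INTEGER and COMMA, and the use of Parle\Token::SKIP to discard whitespace are illustrative choices, not part of the API description above.

```php
<?php
use Parle\Lexer;
use Parle\Parser;
use Parle\Token;

$parser = new Parser;
$parser->token("INTEGER");
$parser->token("COMMA");
$parser->push("start", "list");
$parser->push("list", "INTEGER");
$parser->push("list", "list COMMA INTEGER");
$parser->build();

$lexer = new Lexer;
$lexer->push("[0-9]+", $parser->tokenId("INTEGER"));
$lexer->push(",", $parser->tokenId("COMMA"));
$lexer->push('\s+', Token::SKIP);
$lexer->build();

// validate() parses internally, so no advance() loop is needed.
$ok  = $parser->validate("1, 2, 3", $lexer);
$bad = $parser->validate("1,, 3", $lexer);

var_dump($ok, $bad);
```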

Introduction

Parser class. Rules can be defined on the fly. Once finalized, a Parle\RLexer instance is required to deliver the token stream.

Class synopsis

Parle\RParser

class Parle\RParser {

/* Constants */

const integer Parle\RParser::ACTION_ERROR = 0 ;

const integer Parle\RParser::ACTION_SHIFT = 1 ;

const integer Parle\RParser::ACTION_REDUCE = 2 ;

const integer Parle\RParser::ACTION_GOTO = 3 ;

const integer Parle\RParser::ACTION_ACCEPT = 4 ;

const integer Parle\RParser::ERROR_SYNTAX = 0 ;

const integer Parle\RParser::ERROR_NON_ASSOCIATIVE = 1 ;

const integer Parle\RParser::ERROR_UNKNOWN_TOKEN = 2 ;

/* Properties */

public integer $action = 0 ;

public integer $reduceId = 0 ;

/* Methods */

public void advance ( void )

public void build ( void )

public void consume ( string $data , Parle\RLexer $rlexer )

public void dump ( void )

public Parle\ErrorInfo errorInfo ( void )

public void left ( string $tok )

public void nonassoc ( string $tok )

public void precedence ( string $tok )

public int push ( string $name , string $rule )

public void reset ([ int $tokenId ] )

public void right ( string $tok )

public string sigil ([ int $idx ] )

public void token ( string $tok )

public int tokenId ( string $tok )

public string trace ( void )

public bool validate ( string $data , Parle\RLexer $lexer )

}

Predefined Constants

Parle\RParser::ACTION_ERROR

Parle\RParser::ACTION_SHIFT

Parle\RParser::ACTION_REDUCE

Parle\RParser::ACTION_GOTO

Parle\RParser::ACTION_ACCEPT

Parle\RParser::ERROR_SYNTAX

Parle\RParser::ERROR_NON_ASSOCIATIVE

Parle\RParser::ERROR_UNKNOWN_TOKEN

Properties

action
Current parser action that matches one of the action class constants, readonly.

reduceId
Grammar rule id just processed in the reduce action. The value corresponds either to a token or to a production id. Readonly.

Parle\RParser::advance

Process next parser rule

Description

public void Parle\RParser::advance ( void )

Process next parser rule.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\RParser::build

Finalize the grammar rules

Description

public void Parle\RParser::build ( void )

Any tokens and grammar rules previously added are finalized. The rule set becomes readonly and the parser is ready to start.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\RParser::consume

Consume the data for processing

Description

public void Parle\RParser::consume ( string $data , Parle\RLexer $rlexer )

Consume the data for parsing.

Parameters

data
Data to be parsed.

rlexer
A lexer object containing the lexing rules prepared for the particular grammar.

Return Values

No value is returned.

Parle\RParser::dump

Dump the grammar

Description

public void Parle\RParser::dump ( void )

Dump the current grammar to stdout.

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\RParser::errorInfo

Retrieve the error information

Description

public Parle\ErrorInfo Parle\RParser::errorInfo ( void )

Retrieve the error information in case Parle\RParser::action holds the error action.

Parameters

This function has no parameters.

Return Values

Returns an instance of Parle\ErrorInfo.

Parle\RParser::left

Declare a token with left-associativity

Description

public void Parle\RParser::left ( string $tok )

Declare a terminal with left associativity.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\RParser::nonassoc

Declare a token with no associativity

Description

public void Parle\RParser::nonassoc ( string $tok )

Declare a terminal that cannot appear more than once in a row.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\RParser::precedence

Declare a precedence rule

Description

public void Parle\RParser::precedence ( string $tok )

Declares a precedence rule for a fictitious terminal symbol. This rule can later be used in specific grammar rules.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\RParser::push

Add a grammar rule

Description

public int Parle\RParser::push ( string $name , string $rule )

Push a grammar rule. The production id returned can be used later in the parsing process to identify the rule matched.

Parameters

name
Rule name.

rule
The rule to be added. The syntax is Bison compatible.

Return Values

Returns an integer representing the rule index.

Parle\RParser::reset

Reset parser state

Description

public void Parle\RParser::reset ([ int $tokenId ] )

Reset parser state using the given token id.

Parameters

tokenId
Token id.

Return Values

No value is returned.

Parle\RParser::right

Declare a token with right-associativity

Description

public void Parle\RParser::right ( string $tok )

Declare a terminal with right associativity.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\RParser::sigil

Retrieve a matching part of a rule

Description

public string Parle\RParser::sigil ([ int $idx ] )

Retrieve a part of the match by a rule. This method is equivalent to the pseudo variable functionality in Bison.

Parameters

idx
Match index, zero based.

Return Values

Returns a string with the matched part.

Parle\RParser::token

Declare a token

Description

public void Parle\RParser::token ( string $tok )

Declare a terminal to be used in the grammar.

Parameters

tok
Token name.

Return Values

No value is returned.

Parle\RParser::tokenId

Get token id

Description

public int Parle\RParser::tokenId ( string $tok )

Retrieve the id of the named token.

Parameters

tok
Name of the token as used in Parle\RParser::token.

Return Values

Returns an integer representing the token id.

Parle\RParser::trace

Trace the parser operation

Description

public string Parle\RParser::trace ( void )

Retrieve the current parser operation description. This can be especially useful for studying the parser and optimizing the grammar.

Parameters

This function has no parameters.

Return Values

Returns a string with the trace information.

Parle\RParser::validate

Validate input

Description

public bool Parle\RParser::validate ( string $data , Parle\RLexer $lexer )

Validate an input string. The string is parsed internally, so this method is useful for quick input validation.

Parameters

data
String to be validated.

lexer
A lexer object containing the lexing rules prepared for the particular grammar.

Return Values

Returns a boolean indicating whether the input conforms to the defined rules.

Introduction

Parle\Stack is a LIFO stack. Elements are inserted and removed at one end only.

Class synopsis

Parle\Stack

class Parle\Stack {

/* Properties */

public boolean $empty = TRUE ;

public integer $size = 0 ;

public mixed $top ;

/* Methods */

public void pop ( void )

public void push ( mixed $item )

}

Properties

empty
Whether the stack is empty, readonly.

size
Stack size, readonly.

top
Element on the top of the stack.

Parle\Stack::pop

Pop an item from the stack

Description

public void Parle\Stack::pop ( void )

Parameters

This function has no parameters.

Return Values

No value is returned.

Parle\Stack::push

Push an item into the stack

Description

public void Parle\Stack::push ( mixed $item )

Parameters

item
Variable to be pushed.

Return Values

No value is returned.
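
Because pop returns no value, the top element has to be read through the top property before it is removed. A minimal sketch, assuming the parle extension is loaded (the variable names are illustrative):

```php
<?php
use Parle\Stack;

$stack = new Stack;
$stack->push("first");
$stack->push("second");

$sizeBefore = $stack->size;  // number of elements currently on the stack

// Read the top element first, then remove it.
$top = $stack->top;
$stack->pop();

$next = $stack->top;
$stack->pop();

$isEmpty = $stack->empty;

var_dump($sizeBefore, $top, $next, $isEmpty);
```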

Introduction

This class represents a token. The lexer returns instances of this class.

Class synopsis

Parle\Token

class Parle\Token {

/* Constants */

const integer Parle\Token::EOI = 0 ;

const integer Parle\Token::UNKNOWN = -1 ;

const integer Parle\Token::SKIP = -2 ;

/* Properties */

public integer $id ;

public string $value ;

/* Methods */

}

Properties

id
Token id.

value
Token value.

Predefined Constants

Parle\Token::EOI
End of input token id.

Parle\Token::UNKNOWN
Unknown token id.

Parle\Token::SKIP
Skip token id.

Introduction

The class represents detailed error information as supplied by Parle\Parser::errorInfo.

Class synopsis

Parle\ErrorInfo

class Parle\ErrorInfo {

/* Properties */

public integer $id ;

public integer $position ;

public mixed $token ;

/* Methods */

}

Properties

id
Error id.

position
Position in the input, where the error occurred.

token
The Parle\Token related to the error if applicable, otherwise NULL.
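
The properties above might be inspected as in the following sketch, which feeds an invalid input to a small list grammar and builds an error message. It assumes the parle extension is loaded. The grammar, the token names and the message format are made up for this illustration.

```php
<?php
use Parle\Lexer;
use Parle\Parser;
use Parle\Token;

$parser = new Parser;
$parser->token("INTEGER");
$parser->token("COMMA");
$parser->push("start", "list");
$parser->push("list", "INTEGER");
$parser->push("list", "list COMMA INTEGER");
$parser->build();

$lexer = new Lexer;
$lexer->push("[0-9]+", $parser->tokenId("INTEGER"));
$lexer->push(",", $parser->tokenId("COMMA"));
$lexer->push('\s+', Token::SKIP);
$lexer->build();

// Two integers without a comma do not match the grammar.
$parser->consume("1 2", $lexer);
while (Parser::ACTION_ERROR != $parser->action
    && Parser::ACTION_ACCEPT != $parser->action) {
    $parser->advance();
}

$message = null;
if (Parser::ACTION_ERROR == $parser->action) {
    $err = $parser->errorInfo();
    switch ($err->id) {
        case Parser::ERROR_UNKNOWN_TOKEN:
            $kind = "unknown token";
            break;
        case Parser::ERROR_NON_ASSOCIATIVE:
            $kind = "non-associative token";
            break;
        default:
            $kind = "syntax error";
    }
    $message = "$kind at position {$err->position}";
    // The token is only available for some error kinds, so check it first.
    if ($err->token instanceof Token) {
        $message .= " near '{$err->token->value}'";
    }
    echo $message, "\n";
}
```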

Introduction

Class synopsis

Parle\LexerException

class Parle\LexerException extends Exception implements Throwable {

/* Inherited properties */

protected string $message ;

protected int $code ;

protected string $file ;

protected int $line ;

/* Methods */

/* Inherited methods */

final public string Exception::getMessage ( void )

final public Throwable Exception::getPrevious ( void )

final public int Exception::getCode ( void )

final public string Exception::getFile ( void )

final public int Exception::getLine ( void )

final public array Exception::getTrace ( void )

final public string Exception::getTraceAsString ( void )

public string Exception::__toString ( void )

final private void Exception::__clone ( void )

}

Introduction

Class synopsis

Parle\ParserException

class Parle\ParserException extends Exception implements Throwable {

/* Inherited properties */

protected string $message ;

protected int $code ;

protected string $file ;

protected int $line ;

/* Methods */

/* Inherited methods */

final public string Exception::getMessage ( void )

final public Throwable Exception::getPrevious ( void )

final public int Exception::getCode ( void )

final public string Exception::getFile ( void )

final public int Exception::getLine ( void )

final public array Exception::getTrace ( void )

final public string Exception::getTraceAsString ( void )

public string Exception::__toString ( void )

final private void Exception::__clone ( void )

}