Struct xxcalc::tokenizer::Tokenizer
pub struct Tokenizer { /* fields omitted */ }
Tokenizer performs the very first step of parsing a mathematical
expression into Tokens. These tokens can then be processed by
a TokensProcessor.
Tokenizer is a state machine that can be reused multiple
times. Internally it stores a buffer of Tokens, which is
reused between calls without requesting new memory from
the operating system. If a Tokenizer lives long enough, this
behaviour can greatly reduce the time spent on memory allocation.
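The buffer reuse described above can be illustrated with a minimal, self-contained sketch. `ReusableSplitter` is a hypothetical stand-in written for this illustration, not the actual Tokenizer implementation; it only relies on the fact that `Vec::clear` keeps the allocated capacity.

```rust
/// Hypothetical stand-in for a processor that owns a reusable buffer.
struct ReusableSplitter {
    buffer: Vec<char>,
}

impl ReusableSplitter {
    fn new() -> Self {
        ReusableSplitter { buffer: Vec::with_capacity(10) }
    }

    /// Clears the buffer (keeping its allocation) and refills it with
    /// the non-whitespace characters of `line`.
    fn process(&mut self, line: &str) -> &[char] {
        self.buffer.clear(); // length becomes 0, capacity is retained
        self.buffer.extend(line.chars().filter(|c| !c.is_whitespace()));
        &self.buffer
    }
}

fn main() {
    let mut splitter = ReusableSplitter::new();

    // The first call may grow the buffer beyond its initial capacity.
    splitter.process("1 + 2 * 3 + 4 * 5 + 6");
    let grown = splitter.buffer.capacity();

    // The second, shorter input fits into the existing allocation.
    assert_eq!(splitter.process("7 - 8").len(), 3);
    assert_eq!(splitter.buffer.capacity(), grown);
}
```

The longer the owning value lives, the more calls share a single allocation, which is the saving the paragraph above refers to.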
Examples
    // Import paths for StringProcessor and Token are assumed to be
    // at the crate root; adjust them if the crate layout differs.
    use xxcalc::tokenizer::Tokenizer;
    use xxcalc::{StringProcessor, Token};

    let mut tokenizer = Tokenizer::default();

    {
        let tokens = tokenizer.process("2.0+2");
        assert_eq!(tokens[0], (0, Token::Number(2.0)));
        assert_eq!(tokens[1], (3, Token::Operator('+')));
        assert_eq!(tokens[2], (4, Token::Number(2.0)));
    }

    {
        // The same tokenizer (and its buffer) is reused here.
        let tokens = tokenizer.process("x+log10(100)+x");
        assert_eq!(tokens[0], (0, Token::Identifier(0)));
        assert_eq!(tokens.identifiers[0], "x");
        assert_eq!(tokens[1], (1, Token::Operator('+')));
        assert_eq!(tokens[2], (2, Token::Identifier(1)));
        assert_eq!(tokens.identifiers[1], "log10");
        assert_eq!(tokens[3], (7, Token::BracketOpening));
        assert_eq!(tokens[4], (8, Token::Number(100.0)));
        assert_eq!(tokens[5], (11, Token::BracketClosing));
        assert_eq!(tokens[6], (12, Token::Operator('+')));
        // A repeated identifier maps to the same index.
        assert_eq!(tokens[7], (13, Token::Identifier(0)));
    }
Trait Implementations
impl Default for Tokenizer
Creates a new default Tokenizer.
Such a tokenizer is optimized (but not limited) for values up to 10 characters and up to 10 tokens. These are only the default capacities; the underlying storage grows dynamically when an expression needs more.
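A short sketch of what a default capacity means in practice, using only the standard library: the capacity is a starting size, not a limit. The helper `fill_with_hint` is hypothetical and exists only for this illustration.

```rust
/// Creates a String with a starting capacity of `hint` bytes, pushes
/// `text` into it and returns (length, capacity) afterwards.
fn fill_with_hint(hint: usize, text: &str) -> (usize, usize) {
    let mut value = String::with_capacity(hint);
    value.push_str(text); // reallocates transparently if `text` is longer
    (value.len(), value.capacity())
}

fn main() {
    // A 16-character value still fits: the buffer grows past the hint.
    let (len, capacity) = fill_with_hint(10, "3.14159265358979");
    assert_eq!(len, 16);
    assert!(capacity >= len);
}
```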
impl StringProcessor for Tokenizer
This is the main processing unit of the tokenizer. It takes a string expression and, using a state machine, creates a list of tokens representing that string.
This tokenizer supports floating point numbers in traditional
and scientific notation (as well as shorthand point notation),
text identifiers and operators such as +, -, *, /, ^ and =.
Parentheses () and the comma , are supported too.
Whitespace is always skipped; unrecognized characters
are wrapped into an Unknown token.
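To make the idea of a state-machine tokenizer concrete, here is a simplified, self-contained sketch. It is not the crate's implementation: the `Tok` and `State` types and the `tokenize` function are hypothetical, identifiers are stored as strings rather than indices, and scientific notation, signed numbers and implicit multiplication are deliberately left out.

```rust
use std::iter::once;

/// Hypothetical token type for this sketch only.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Number(f64),
    Identifier(String),
    Operator(char),
    BracketOpening,
    BracketClosing,
    Separator,
    Unknown(char),
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    General,
    Number,
    Identifier,
}

fn tokenize(line: &str) -> Vec<(usize, Tok)> {
    let mut out = Vec::with_capacity(10);
    let mut value = String::with_capacity(10);
    let mut state = State::General;
    let mut start = 0;

    // A trailing space guarantees that a pending value is flushed.
    for (pos, c) in line.chars().chain(once(' ')).enumerate() {
        // Decide which state the current character belongs to.
        let next = match (state, c) {
            (State::Identifier, c) if c.is_alphanumeric() => State::Identifier,
            (_, c) if c.is_ascii_digit() || c == '.' => State::Number,
            (_, c) if c.is_alphabetic() => State::Identifier,
            _ => State::General,
        };

        // Leaving Number or Identifier emits the accumulated value.
        if next != state {
            match state {
                State::Number => match value.parse::<f64>() {
                    Ok(n) => out.push((start, Tok::Number(n))),
                    // Malformed number: report its first character.
                    Err(_) => {
                        let first = value.chars().next().unwrap_or('?');
                        out.push((start, Tok::Unknown(first)));
                    }
                },
                State::Identifier => out.push((start, Tok::Identifier(value.clone()))),
                State::General => {}
            }
            value.clear();
            start = pos;
            state = next;
        }

        match state {
            State::General => match c {
                '+' | '-' | '*' | '/' | '^' | '=' => out.push((pos, Tok::Operator(c))),
                '(' => out.push((pos, Tok::BracketOpening)),
                ')' => out.push((pos, Tok::BracketClosing)),
                ',' => out.push((pos, Tok::Separator)),
                c if c.is_whitespace() => {} // whitespace is skipped
                other => out.push((pos, Tok::Unknown(other))),
            },
            // Number and Identifier characters are accumulated.
            _ => value.push(c),
        }
    }
    out
}

fn main() {
    let tokens = tokenize("2.0+2");
    assert_eq!(tokens[0], (0, Tok::Number(2.0)));
    assert_eq!(tokens[1], (3, Tok::Operator('+')));
    assert_eq!(tokens[2], (4, Tok::Number(2.0)));

    // Unrecognized characters become Unknown tokens.
    assert_eq!(tokenize("2 # 3")[1], (2, Tok::Unknown('#')));
}
```

Each token carries the position at which it started, mirroring the `(position, token)` pairs returned by the real tokenizer.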
Signed numbers are detected when the sign cannot be mistaken
for the operator + or -. Implicit multiplication before an
identifier or a parenthesis is replaced with explicit
multiplication using the * operator.
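The two rules above can be sketched as follows. The exact conditions used by the tokenizer are not spelled out here, so this sketch makes two assumptions: a + or - counts as a sign only at the start of the expression or directly after an operator or an opening bracket, and a * is inserted only when a number is directly followed by an identifier or an opening bracket. The `Tok` type and both functions are hypothetical.

```rust
/// Hypothetical token type for this sketch only.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Number(f64),
    Identifier(String),
    Operator(char),
    BracketOpening,
    BracketClosing,
}

/// Assumed rule: a '+' or '-' is a sign (not an operator) when nothing
/// precedes it, or when it follows an operator or an opening bracket.
fn is_sign_position(prev: Option<&Tok>) -> bool {
    matches!(prev, None | Some(Tok::Operator(_)) | Some(Tok::BracketOpening))
}

/// Assumed rule: a number directly followed by an identifier or an
/// opening bracket gets an explicit '*' inserted between the two.
fn make_multiplication_explicit(tokens: &[Tok]) -> Vec<Tok> {
    let mut out: Vec<Tok> = Vec::with_capacity(tokens.len());
    for tok in tokens {
        let after_number = matches!(out.last(), Some(Tok::Number(_)));
        let starts_factor = matches!(tok, Tok::Identifier(_) | Tok::BracketOpening);
        if after_number && starts_factor {
            out.push(Tok::Operator('*'));
        }
        out.push(tok.clone());
    }
    out
}

fn main() {
    // "-2" at the start: the minus is a sign.
    assert!(is_sign_position(None));
    // "3-2": the minus follows a number, so it is an operator.
    assert!(!is_sign_position(Some(&Tok::Number(3.0))));

    // "2(x)" becomes "2*(x)".
    let input = vec![
        Tok::Number(2.0),
        Tok::BracketOpening,
        Tok::Identifier("x".to_string()),
        Tok::BracketClosing,
    ];
    let output = make_multiplication_explicit(&input);
    assert_eq!(output.len(), 5);
    assert_eq!(output[1], Tok::Operator('*'));
}
```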
Extending
New features can be added to the tokenizer in two ways: by
embedding this tokenizer into a new one that replaces Unknown
tokens with other tokens, or by implementing a TokensProcessor
that takes the output of this tokenizer and replaces Unknown
tokens, or some combination of tokens, with other ones.
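A minimal sketch of the second approach: a post-processing pass that rewrites selected Unknown tokens. It is written as a plain function over a hypothetical `Tok` type and does not implement the crate's TokensProcessor trait, whose exact signature is not shown on this page. Mapping × and ÷ to operators is only an example of a possible extension.

```rust
/// Hypothetical token type for this sketch only.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Number(f64),
    Operator(char),
    Unknown(char),
}

/// Replaces Unknown tokens holding '×' or '÷' with the matching
/// operators and leaves every other token untouched.
fn replace_unicode_operators(tokens: &mut [(usize, Tok)]) {
    for (_, tok) in tokens.iter_mut() {
        *tok = match *tok {
            Tok::Unknown('×') => Tok::Operator('*'),
            Tok::Unknown('÷') => Tok::Operator('/'),
            _ => continue,
        };
    }
}

fn main() {
    // Token list as a tokenizer might produce it for "6÷3".
    let mut tokens = vec![
        (0, Tok::Number(6.0)),
        (1, Tok::Unknown('÷')),
        (2, Tok::Number(3.0)),
    ];
    replace_unicode_operators(&mut tokens);
    assert_eq!(tokens[1], (1, Tok::Operator('/')));
}
```

Positions are preserved, so later error reporting can still point at the original character.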
State machine
The complete, hand-designed state machine used by this
StringProcessor can be seen in the image below: