| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
This commit increases the maximum number of symbol positions per pattern to 2^15 (= 32,768).
When the limit is exceeded, the parse method returns an error.
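The limit check described above can be sketched as follows. This is only an illustration: `positionAllocator`, `allocate`, and `errTooManyPositions` are hypothetical names, not identifiers taken from the parser; only the 2^15 limit comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// maxSymbolPositions mirrors the limit stated in the commit message: 2^15.
const maxSymbolPositions = 1 << 15 // 32,768

// errTooManyPositions is a hypothetical error value; the real name may differ.
var errTooManyPositions = errors.New("too many symbol positions")

// positionAllocator is a hypothetical helper that hands out 1-based positions.
type positionAllocator struct {
	next int
}

// allocate returns the next position, or an error once the limit is exceeded.
func (a *positionAllocator) allocate() (int, error) {
	if a.next >= maxSymbolPositions {
		return 0, errTooManyPositions
	}
	a.next++
	return a.next, nil
}

func main() {
	a := &positionAllocator{}
	// The first 2^15 allocations succeed.
	for i := 0; i < maxSymbolPositions; i++ {
		if _, err := a.allocate(); err != nil {
			fmt.Println("unexpected error:", err)
			return
		}
	}
	// One more allocation exceeds the limit and yields an error.
	_, err := a.allocate()
	fmt.Println(err)
}
```

Returning an error rather than panicking lets the caller report which pattern is too large.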
|
|
|
|
|
| |
* Add test cases for the parse method.
* Fix the parser to pass the cases.
|
|
|
|
|
| |
The compile command writes its logs to the maleeni-compile.log file.
When you use compiler.Compile(), you can choose whether the lexer writes logs.
|
|
|
|
| |
[^a-z] matches any character that is not in the range a-z.
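To illustrate the semantics only, the sketch below uses Go's standard `regexp` package as a stand-in; it is not maleeni's API, and `matchesNotLower` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// notLower uses Go's standard regexp package, not maleeni, to show the
// meaning of an inverse bracket expression. The anchors restrict the match
// to exactly one character.
var notLower = regexp.MustCompile(`^[^a-z]$`)

// matchesNotLower reports whether s is a single character outside a-z.
func matchesNotLower(s string) bool {
	return notLower.MatchString(s)
}

func main() {
	for _, s := range []string{"A", "0", "a", "z"} {
		fmt.Printf("%q: %v\n", s, matchesNotLower(s))
	}
}
```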
|
|
|
|
|
|
| |
* Remove token field from symbolNode
* Simplify notation of nested nodes
* Simplify arguments of newSymbolNode()
|
|
|
|
|
| |
* a+ matches 'a' one or more times. This is equivalent to aa*.
* a? matches 'a' zero or one time.
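The equivalence between a+ and aa* can be checked with a short sketch. It uses Go's standard `regexp` package as a stand-in for the semantics and does not call maleeni; the variable names are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

// These patterns use Go's standard regexp package, not maleeni.
var (
	plus     = regexp.MustCompile(`^a+$`)  // one or more 'a'
	expanded = regexp.MustCompile(`^aa*$`) // the equivalent expansion
	option   = regexp.MustCompile(`^a?$`)  // zero or one 'a'
)

func main() {
	for _, s := range []string{"", "a", "aa", "b"} {
		fmt.Printf("%q: a+=%v aa*=%v a?=%v\n",
			s, plus.MatchString(s), expanded.MatchString(s), option.MatchString(s))
	}
}
```

For every input, the a+ and aa* columns agree, which is what makes it possible to treat a+ as shorthand for aa*.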
|
| |
|
|
|
|
|
|
|
|
|
| |
The dot symbol matches any single character. When the dot symbol appears, the parser generates an AST matching all of the well-formed UTF-8 byte sequences.
References:
* https://www.unicode.org/versions/Unicode13.0.0/ch03.pdf#G7404
* Table 3-6. UTF-8 Bit Distribution
* Table 3-7. Well-Formed UTF-8 Byte Sequences
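As a rough sketch of what Table 3-7 encodes, the byte ranges can be written down as data and checked directly. The names `byteRange`, `wellFormed`, and `isWellFormedChar` are hypothetical and are not taken from the parser; the ranges themselves follow Table 3-7 of the Unicode Standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteRange is an inclusive range of byte values.
type byteRange struct{ lo, hi byte }

// wellFormed lists the rows of Table 3-7 (Well-Formed UTF-8 Byte Sequences).
// Each row is a sequence of per-byte ranges.
var wellFormed = [][]byteRange{
	{{0x00, 0x7F}},
	{{0xC2, 0xDF}, {0x80, 0xBF}},
	{{0xE0, 0xE0}, {0xA0, 0xBF}, {0x80, 0xBF}},
	{{0xE1, 0xEC}, {0x80, 0xBF}, {0x80, 0xBF}},
	{{0xED, 0xED}, {0x80, 0x9F}, {0x80, 0xBF}},
	{{0xEE, 0xEF}, {0x80, 0xBF}, {0x80, 0xBF}},
	{{0xF0, 0xF0}, {0x90, 0xBF}, {0x80, 0xBF}, {0x80, 0xBF}},
	{{0xF1, 0xF3}, {0x80, 0xBF}, {0x80, 0xBF}, {0x80, 0xBF}},
	{{0xF4, 0xF4}, {0x80, 0x8F}, {0x80, 0xBF}, {0x80, 0xBF}},
}

// isWellFormedChar reports whether b is exactly one well-formed UTF-8 sequence.
func isWellFormedChar(b []byte) bool {
	for _, seq := range wellFormed {
		if len(seq) != len(b) {
			continue
		}
		ok := true
		for i, r := range seq {
			if b[i] < r.lo || b[i] > r.hi {
				ok = false
				break
			}
		}
		if ok {
			return true
		}
	}
	return false
}

func main() {
	// Cross-check against the standard library for every valid rune.
	buf := make([]byte, utf8.UTFMax)
	for r := rune(0); r <= utf8.MaxRune; r++ {
		if !utf8.ValidRune(r) {
			continue // skip surrogates
		}
		n := utf8.EncodeRune(buf, r)
		if !isWellFormedChar(buf[:n]) {
			fmt.Printf("mismatch at U+%04X\n", r)
			return
		}
	}
	fmt.Println("all valid runes are accepted")
	fmt.Println(isWellFormedChar([]byte{0xC0, 0x80}))       // overlong encoding
	fmt.Println(isWellFormedChar([]byte{0xED, 0xA0, 0x80})) // surrogate
}
```

An AST for the dot symbol could plausibly be an alternation over these nine rows, with each row a concatenation of byte ranges, though the source does not spell out the exact shape.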
|
|
The compiler takes a lexical specification expressed by regular expressions and generates a DFA accepting the tokens.
The regular expressions support the following operators: concatenation, alternation, repetition, and grouping.
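A minimal sketch of how these operators might be represented as an AST is shown below. All type and method names are hypothetical, and the `nullable` computation is borrowed from the textbook position-based DFA construction; the source does not confirm that the compiler is structured this way. Grouping needs no node of its own because it only affects how the tree is shaped.

```go
package main

import "fmt"

// node is a hypothetical AST interface; the compiler's real types may differ.
type node interface {
	// nullable reports whether the node can match the empty string.
	nullable() bool
	String() string
}

type symbol struct{ b byte }           // a single byte
type concat struct{ left, right node } // concatenation: left then right
type alt struct{ left, right node }    // alternation: left or right
type repeat struct{ child node }       // repetition: zero or more times

func (s symbol) nullable() bool { return false }
func (c concat) nullable() bool { return c.left.nullable() && c.right.nullable() }
func (a alt) nullable() bool    { return a.left.nullable() || a.right.nullable() }
func (r repeat) nullable() bool { return true }

func (s symbol) String() string { return string(rune(s.b)) }
func (c concat) String() string { return c.left.String() + c.right.String() }
func (a alt) String() string    { return "(" + a.left.String() + "|" + a.right.String() + ")" }
func (r repeat) String() string { return "(" + r.child.String() + ")*" }

func main() {
	// The pattern (a|b)*c built by hand.
	ast := concat{repeat{alt{symbol{'a'}, symbol{'b'}}}, symbol{'c'}}
	fmt.Println(ast, ast.nullable())
}
```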
|