| Date | Commit message | Author | Files | Lines |
| 2021-04-30 | Add character property expression (Meet RL1.2 of UTS #18 partially)•••\p{property name=property value} matches a character that has the property.
When the property name is General_Category, it can be omitted.
That is, \p{Letter} equals \p{General_Category=Letter}.
Currently, only General_Category is supported.
This feature partially meets RL1.2 of UTS #18.
RL1.2 Properties: https://unicode.org/reports/tr18/#RL1.2
| Ryo Nihei | 10 | -27/+4748 |
| 2021-04-24 | Add code point expression (Meet RL1.1 of UTS #18)•••\u{hex string} matches the character whose code point is represented by the hex string.
For instance, \u{3042} matches hiragana あ (U+3042). The hex string must have 4 or 6 digits.
This feature meets RL1.1 of UTS #18.
RL1.1 Hex Notation: https://unicode.org/reports/tr18/#RL1.1
| Ryo Nihei | 6 | -18/+512 |
| 2021-04-17 | Add validation of lexical specs and improve error messages | Ryo Nihei | 6 | -75/+174 |
| 2021-04-17 | Change the lexical specs of regexp and define concrete syntax error values•••* Make the lexer treat ']' as an ordinary character in default mode
* Define values of the syntax error type that represents error information concretely
| Ryo Nihei | 7 | -446/+603 |
| 2021-04-12 | Increase the maximum number of symbol positions per pattern•••This commit increases the maximum number of symbol positions per pattern to 2^15 (= 32,768).
When the limit is exceeded, the parse method returns an error.
| Ryo Nihei | 5 | -29/+139 |
| 2021-04-11 | Fix the grammar the parser accepts•••* Add cases that test the parse method.
* Fix the parser to pass the cases.
| Ryo Nihei | 6 | -98/+1192 |
| 2021-04-08 | Add logging to compile command•••The compile command writes logs to the maleeni-compile.log file.
When you use compiler.Compile(), you can choose whether the lexer writes logs or not.
| Ryo Nihei | 4 | -49/+133 |
| 2021-04-06 | Print the result of the lex command in JSON format•••* Print the result of the lex command in JSON format.
* Print the EOF token.
| Ryo Nihei | 3 | -140/+185 |
| 2021-04-01 | Add logical inverse expression•••[^a-z] matches any character that is not in the range a-z.
| Ryo Nihei | 7 | -37/+786 |
| 2021-03-07 | Pass values of the error type to panic()•••parser.parse() expects recover() to return a value of the error type, so panic() must be given one.
| Ryo Nihei | 1 | -2/+2 |
| 2021-02-25 | Refactoring•••* Remove token field from symbolNode
* Simplify notation of nested nodes
* Simplify arguments of newSymbolNode()
| Ryo Nihei | 5 | -502/+351 |
| 2021-02-24 | Add range expression•••[a-z] matches any one character from a to z. The order of the characters depends on Unicode code points.
| Ryo Nihei | 4 | -9/+977 |
| 2021-02-20 | Add + and ? operators•••* a+ matches 'a' one or more times. This is equivalent to aa*.
* a? matches 'a' zero or one time.
| Ryo Nihei | 6 | -21/+117 |
| 2021-02-17 | Fix computation of last positions | Ryo Nihei | 2 | -0/+122 |
| 2021-02-16 | Add logging to lex command•••The lex command writes logs to the maleeni-lex.log file.
When you generate a lexer using driver.NewLexer(), you can choose whether the lexer writes logs or not.
| Ryo Nihei | 3 | -5/+126 |
| 2021-02-16 | Add CLI | Ryo Nihei | 6 | -0/+433 |
| 2021-02-16 | Add types of lexical specifications•••The APIs of the compiler and driver packages use these types. Because the CompiledLexSpec struct that a lexer takes contains the kind names of the lexical specification entries, the lexer sets them on tokens.
| Ryo Nihei | 5 | -90/+133 |
| 2021-02-14 | Add bracket expression matching specified character•••The bracket expression matches any single character specified in it. Inside a bracket expression, special characters such as . and * are handled as ordinary characters.
| Ryo Nihei | 4 | -9/+127 |
| 2021-02-14 | Add dot symbol matching any single character•••The dot symbol matches any single character. When the dot symbol appears, the parser generates an AST matching all of the well-formed UTF-8 byte sequences.
References:
* https://www.unicode.org/versions/Unicode13.0.0/ch03.pdf#G7404
* Table 3-6. UTF-8 Bit Distribution
* Table 3-7. Well-Formed UTF-8 Byte Sequences
| Ryo Nihei | 7 | -21/+201 |
| 2021-02-14 | Add driver•••The driver takes a DFA and an input text and generates a lexer. The lexer tokenizes the input text according to the lexical specification that the DFA expresses.
| Ryo Nihei | 2 | -0/+309 |
| 2021-02-14 | Add compiler•••The compiler takes a lexical specification expressed by regular expressions and generates a DFA accepting the tokens.
The operators you can use in the regular expressions are concatenation, alternation, repetition, and grouping.
| Ryo Nihei | 9 | -0/+1268 |
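
The 2021-04-24 entry says that \u{hex string} matches the character whose code point the hex string names, and that the hex string must have 4 or 6 digits. The rule can be illustrated with a minimal Go sketch. `parseCodePoint` is a hypothetical helper written for this example and is not taken from the maleeni source. The upper-bound check against `unicode.MaxRune` is an assumption, since the log does not say how out-of-range values are handled.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode"
)

// parseCodePoint is a hypothetical helper, not part of maleeni.
// It converts the hex string found inside \u{...} to a rune and
// accepts only 4 or 6 hex digits, as the commit message describes.
func parseCodePoint(hex string) (rune, error) {
	if len(hex) != 4 && len(hex) != 6 {
		return 0, fmt.Errorf("hex string must have 4 or 6 digits: %q", hex)
	}
	// ParseUint rejects signs and any non-hex character.
	v, err := strconv.ParseUint(hex, 16, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid hex string %q: %w", hex, err)
	}
	// Assumption: values above U+10FFFF are rejected.
	if v > unicode.MaxRune {
		return 0, fmt.Errorf("code point out of range: U+%X", v)
	}
	return rune(v), nil
}

func main() {
	r, err := parseCodePoint("3042")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%c U+%04X\n", r, r) // prints "あ U+3042"
}
```

Under this sketch a 3-digit string such as `304` is rejected for its length, and `110000` is rejected because it lies above U+10FFFF.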