path: root/driver/lexer_test.go
Date        Commit message  Author  Files  Lines
2021-05-13  Rename fields of driver.Token  Ryo Nihei  1  -1/+1
2021-05-12  Use go fmt instead of gofmt  Ryo Nihei  1  -1/+1
2021-05-11  Add --compression-level option to compile command  Ryo Nihei  6  -45/+119
    --compression-level specifies a compression level. The default value is 2.
2021-05-11  Fix a text representation of an error token  Ryo Nihei  2  -22/+51
    This commit fixes a bug that caused the second and subsequent characters
    of the text representation of an error token to be missing.
2021-05-10  Update README and godoc  Ryo Nihei  2  -8/+227
2021-05-08  Change package structure  Ryo Nihei  6  -7/+5
    The executable can be installed using `go install ./cmd/maleeni`.
2021-05-08  Add --break-on-error option to lex command  Ryo Nihei  2  -3/+9
    When the --break-on-error option is given, lexical analysis stops
    immediately with exit status 1 as soon as an error token appears.
2021-05-08  Add CLI options  Ryo Nihei  4  -56/+117
2021-05-07  Change type of accepting_states to slice  Ryo Nihei  3  -5/+9
2021-05-07  Add transition table compressor  Ryo Nihei  6  -18/+431
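    A transition table compressor of this kind can be sketched with
    run-length encoding: DFA rows are mostly long runs of the same next
    state, so storing (value, count) pairs shrinks them. This is only an
    illustrative scheme under that assumption, not maleeni's actual format.

    ```go
    package main

    import "fmt"

    // compressRLE run-length encodes one row of a transition table.
    func compressRLE(row []int) (values, counts []int) {
    	for i := 0; i < len(row); {
    		j := i
    		for j < len(row) && row[j] == row[i] {
    			j++
    		}
    		values = append(values, row[i])
    		counts = append(counts, j-i)
    		i = j
    	}
    	return values, counts
    }

    // lookup reads entry k back out of the compressed form.
    func lookup(values, counts []int, k int) int {
    	for i, c := range counts {
    		if k < c {
    			return values[i]
    		}
    		k -= c
    	}
    	return -1 // out of range
    }

    func main() {
    	row := []int{0, 0, 0, 5, 5, 2, 2, 2, 2}
    	v, c := compressRLE(row)
    	fmt.Println(v, c, lookup(v, c, 4))
    }
    ```

    The trade-off is that lookup becomes a scan over the runs; a real
    compressor would pair this with an index or a different scheme per
    compression level.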
2021-05-05  Remove Peek* functions  Ryo Nihei  2  -86/+0
2021-05-04  Improve performance of the symbolPositionSet  Ryo Nihei  4  -63/+98
    When using a map to represent a set, performance degrades due to the
    increased number of calls of runtime.mapassign. Especially when the
    number of symbols is large, as in compiling a pattern that contains
    character properties like \p{Letter}, adding elements to the set alone
    may take several tens of seconds of CPU time. Therefore, this commit
    solves this problem by changing the representation of the set from a map
    to an array.
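    The map-to-array change can be sketched as a sorted-slice set: inserts
    shift elements instead of calling runtime.mapassign, and iteration and
    hashing touch contiguous memory. posSet and its methods are illustrative
    stand-ins, not maleeni's actual symbolPositionSet.

    ```go
    package main

    import (
    	"fmt"
    	"sort"
    )

    // posSet keeps symbol positions as a sorted, deduplicated slice.
    type posSet struct{ elems []int }

    // add inserts p at its sorted position if it is not already present.
    func (s *posSet) add(p int) {
    	i := sort.SearchInts(s.elems, p)
    	if i < len(s.elems) && s.elems[i] == p {
    		return // already in the set
    	}
    	s.elems = append(s.elems, 0)
    	copy(s.elems[i+1:], s.elems[i:])
    	s.elems[i] = p
    }

    // contains uses binary search on the sorted slice.
    func (s *posSet) contains(p int) bool {
    	i := sort.SearchInts(s.elems, p)
    	return i < len(s.elems) && s.elems[i] == p
    }

    func main() {
    	var s posSet
    	for _, p := range []int{3, 1, 3, 2} {
    		s.add(p)
    	}
    	fmt.Println(s.elems) // stays sorted and deduplicated
    }
    ```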
2021-05-04  Add lex mode  Ryo Nihei  4  -211/+504
    Lex mode is a feature that separates transition tables per mode. The
    lexer starts from the initial state indicated by the `initial_state`
    field and transitions between modes according to the `push` and `pop`
    fields. The initial state is always `default`; currently, maleeni
    doesn't provide the ability to change the initial state. You can specify
    the modes of each lex entry using the `modes` field. When the mode isn't
    indicated explicitly, an entry has the `default` mode.
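    The push/pop transitions described above amount to a stack of mode
    names. This is a minimal sketch of that stack, assuming the field names
    from the spec; the surrounding lexer machinery is omitted.

    ```go
    package main

    import "fmt"

    // modeStack tracks the lexer's current mode. The lexer starts in
    // "default"; a matched entry's push field enters a new mode, and its
    // pop field returns to the previous one.
    type modeStack struct{ modes []string }

    func newModeStack() *modeStack {
    	return &modeStack{modes: []string{"default"}}
    }

    func (m *modeStack) current() string { return m.modes[len(m.modes)-1] }

    func (m *modeStack) push(mode string) { m.modes = append(m.modes, mode) }

    func (m *modeStack) pop() {
    	if len(m.modes) > 1 { // never pop the initial default mode
    		m.modes = m.modes[:len(m.modes)-1]
    	}
    }

    func main() {
    	m := newModeStack()
    	m.push("string") // e.g. an opening quote with push: string
    	fmt.Println(m.current())
    	m.pop() // e.g. the closing quote with pop: true
    	fmt.Println(m.current())
    }
    ```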
2021-05-02  Generate an invalid token from incomplete input  Ryo Nihei  1  -0/+5
    When the lexer's buffer has unaccepted data and reads the EOF, the lexer
    treats the buffered data as an invalid token.
2021-05-02  Fix parser to recognize property expressions in bracket expressions  Ryo Nihei  2  -0/+14
2021-05-02  Improve compilation time a little  Ryo Nihei  3  -174/+269
    A pattern like \p{Letter} generates an AST with many symbols
    concatenated by alt operators, which results in a large number of symbol
    positions in one state of the DFA. Such a pattern increases the
    compilation time. This commit improves the compilation time somewhat:
    - To avoid calling astNode#first and astNode#last recursively, memoize
      their results.
    - Use a byte sequence that symbol positions are encoded into as a hash
      value, to avoid using the fmt.Fprintf function.
    - Implement a sort function for symbol positions instead of using the
      sort.Slice function.
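    The memoization in the first bullet can be illustrated with a generic
    sketch: compute a recursive attribute once, cache it on the node, and
    reuse it. The node type and the attribute here are simplified stand-ins
    for astNode#first/#last, not maleeni's code.

    ```go
    package main

    import "fmt"

    // node is a toy AST node carrying a memoized recursive attribute.
    type node struct {
    	children []*node
    	leafSize int
    	memoSize *int // cached result of size(); nil until computed
    }

    // size counts leaves. Without the memo field it would be recomputed on
    // every call, which is what made repeated first/last traversals costly.
    func (n *node) size() int {
    	if n.memoSize != nil {
    		return *n.memoSize
    	}
    	s := n.leafSize
    	for _, c := range n.children {
    		s += c.size()
    	}
    	n.memoSize = &s
    	return s
    }

    func main() {
    	leaf := &node{leafSize: 1}
    	root := &node{children: []*node{leaf, leaf, {leafSize: 1}}}
    	fmt.Println(root.size()) // second and later calls hit the cache
    }
    ```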
2021-04-30  Add character property expression (Meet RL1.2 of UTS #18 partially)  Ryo Nihei  10  -27/+4748
    \p{property name=property value} matches a character that has the
    property. When the property name is General_Category, it can be omitted;
    that is, \p{Letter} equals \p{General_Category=Letter}. Currently, only
    General_Category is supported. This feature partially meets RL1.2 of
    UTS #18.
    RL1.2 Properties: https://unicode.org/reports/tr18/#RL1.2
2021-04-24  Add code point expression (Meet RL1.1 of UTS #18)  Ryo Nihei  6  -18/+512
    \u{hex string} matches the character that has the code point represented
    by the hex string. For instance, \u{3042} matches hiragana あ (U+3042).
    The hex string must have 4 or 6 digits. This feature meets RL1.1 of
    UTS #18.
    RL1.1 Hex Notation: https://unicode.org/reports/tr18/#RL1.1
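    Interpreting the hex string of a \u{...} expression can be sketched as
    below: enforce the 4-or-6-digit rule from the commit message and reject
    values beyond the Unicode code space. parseCodePoint is an illustrative
    helper, not maleeni's parser (surrogate-range checks are omitted).

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"strconv"
    	"unicode/utf8"
    )

    // parseCodePoint converts the hex string of \u{hex string} to a rune.
    func parseCodePoint(hex string) (rune, error) {
    	if len(hex) != 4 && len(hex) != 6 {
    		return 0, errors.New("hex string must have 4 or 6 digits")
    	}
    	v, err := strconv.ParseUint(hex, 16, 32)
    	if err != nil {
    		return 0, err
    	}
    	if v > utf8.MaxRune {
    		return 0, errors.New("code point out of Unicode range")
    	}
    	return rune(v), nil
    }

    func main() {
    	r, _ := parseCodePoint("3042")
    	fmt.Printf("%c\n", r) // あ (U+3042)
    }
    ```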
2021-04-17  Add validation of lexical specs and improve error messages  Ryo Nihei  6  -75/+174
2021-04-17  Change the lexical specs of regexp and define concrete syntax error values  Ryo Nihei  7  -446/+603
    * Make the lexer treat ']' as an ordinary character in default mode.
    * Define values of the syntax error type that represent error
      information concretely.
2021-04-12  Increase the maximum number of symbol positions per pattern  Ryo Nihei  5  -29/+139
    This commit increases the maximum number of symbol positions per pattern
    to 2^15 (= 32,768). When the limit is exceeded, the parse method returns
    an error.
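    A 2^15 cap like this typically follows from packing a position into a
    16-bit field with one bit reserved for a flag; that packing rationale is
    an assumption here, not taken from the commit. The limit check itself can
    be sketched as:

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    )

    // maxSymbolPositions is the cap from the commit message: 2^15 = 32,768.
    const maxSymbolPositions = 1 << 15

    // allocPosition hands out the next symbol position, erroring out once
    // the per-pattern limit is exceeded (positions are 0-based here).
    func allocPosition(next int) (int, error) {
    	if next >= maxSymbolPositions {
    		return 0, errors.New("pattern has too many symbol positions")
    	}
    	return next, nil
    }

    func main() {
    	p, err := allocPosition(32767) // last valid position
    	fmt.Println(p, err)
    	_, err = allocPosition(32768) // one past the limit
    	fmt.Println(err)
    }
    ```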
2021-04-11  Fix the grammar the parser accepts  Ryo Nihei  6  -98/+1192
    * Add test cases for the parse method.
    * Fix the parser to pass the cases.
2021-04-08  Add logging to compile command  Ryo Nihei  4  -49/+133
    The compile command writes logs out to the maleeni-compile.log file.
    When you use compiler.Compile(), you can choose whether the compiler
    writes logs or not.
2021-04-06  Print the result of the lex command in JSON format  Ryo Nihei  3  -140/+185
    * Print the result of the lex command in JSON format.
    * Print the EOF token.
2021-04-01  Add logical inverse expression  Ryo Nihei  7  -37/+786
    [^a-z] matches any character that is not in the range a-z.
2021-03-07  Pass values in error type to panic()  Ryo Nihei  1  -2/+2
    Because parser.parse() expects recover() to return a value of the error
    type, apply this change.
2021-02-25  Refactoring  Ryo Nihei  5  -502/+351
    * Remove the token field from symbolNode.
    * Simplify the notation of nested nodes.
    * Simplify the arguments of newSymbolNode().
2021-02-24  Add range expression  Ryo Nihei  4  -9/+977
    [a-z] matches any one character from a to z. The order of the characters
    depends on Unicode code points.
2021-02-20  Add + and ? operators  Ryo Nihei  6  -21/+117
    * a+ matches 'a' one or more times. This is equivalent to aa*.
    * a? matches 'a' zero or one time.
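    The equivalences above can be checked against the standard library's
    regexp package, which implements the same semantics for + and ?.
    maleeni's engine is a separate implementation; stdlib regexp is used
    here only to illustrate the behavior.

    ```go
    package main

    import (
    	"fmt"
    	"regexp"
    )

    var (
    	plus      = regexp.MustCompile(`^a+$`)  // one or more 'a'
    	desugared = regexp.MustCompile(`^aa*$`) // the aa* desugaring
    	opt       = regexp.MustCompile(`^a?$`)  // zero or one 'a'
    )

    func main() {
    	for _, s := range []string{"", "a", "aa", "aaa"} {
    		fmt.Printf("%-5q a+=%v aa*=%v a?=%v\n",
    			s, plus.MatchString(s), desugared.MatchString(s), opt.MatchString(s))
    	}
    }
    ```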
2021-02-17  Fix computation of last positions  Ryo Nihei  2  -0/+122
2021-02-16  Add logging to lex command  Ryo Nihei  3  -5/+126
    The lex command writes logs out to the maleeni-lex.log file. When you
    generate a lexer using driver.NewLexer(), you can choose whether the
    lexer writes logs or not.
2021-02-16  Add CLI  Ryo Nihei  6  -0/+433
2021-02-16  Add types of lexical specifications  Ryo Nihei  5  -90/+133
    The APIs of the compiler and driver packages use these types. Because
    the CompiledLexSpec struct that a lexer takes contains the kind names of
    the lexical specification entries, the lexer can set them on tokens.
2021-02-14  Add bracket expression matching specified character  Ryo Nihei  4  -9/+127
    The bracket expression matches any single character specified in it.
    Inside the bracket expression, special characters like ., *, and so on
    are also handled as ordinary characters.
2021-02-14  Add dot symbol matching any single character  Ryo Nihei  7  -21/+201
    The dot symbol matches any single character. When the dot symbol
    appears, the parser generates an AST matching all of the well-formed
    UTF-8 byte sequences.
    References:
    * https://www.unicode.org/versions/Unicode13.0.0/ch03.pdf#G7404
      * Table 3-6. UTF-8 Bit Distribution
      * Table 3-7. Well-Formed UTF-8 Byte Sequences
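    The standard library's unicode/utf8 package applies the same Table 3-7
    rules, so it can illustrate which byte sequences an AST built from those
    tables would accept: truncated and overlong encodings are rejected.

    ```go
    package main

    import (
    	"fmt"
    	"unicode/utf8"
    )

    func main() {
    	cases := []struct {
    		name  string
    		bytes []byte
    	}{
    		{"ASCII 'a'", []byte{0x61}},
    		{"3-byte あ (U+3042)", []byte{0xE3, 0x81, 0x82}},
    		{"truncated あ", []byte{0xE3, 0x81}},
    		{"overlong '/' (C0 AF)", []byte{0xC0, 0xAF}},
    	}
    	for _, c := range cases {
    		// utf8.Valid reports whether the bytes are well-formed UTF-8.
    		fmt.Println(c.name, utf8.Valid(c.bytes))
    	}
    }
    ```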
2021-02-14  Add driver  Ryo Nihei  2  -0/+309
    The driver takes a DFA and an input text and generates a lexer. The
    lexer tokenizes the input text according to the lexical specification
    that the DFA expresses.
2021-02-14  Add compiler  Ryo Nihei  9  -0/+1268
    The compiler takes a lexical specification expressed by regular
    expressions and generates a DFA accepting the tokens. Operators that you
    can use in the regular expressions are concatenation, alternation,
    repeat, and grouping.