tre - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	Make character properties unavailable in bracket expressions	Ryo Nihei	2021-12-11	5	-33/+105
\|
*	Simplify process that generates UTF-8 byte sequences from a code point range	Ryo Nihei	2021-12-11	1	-1/+1
\|
*	Use new parser and DFA compiler	Ryo Nihei	2021-12-10	15	-5140/+154
\|
*	Add a new DFA compiler that generates DFA from a set of CPTree	Ryo Nihei	2021-12-10	6	-0/+1402
\|
*	Add a new parser that constructs a tree representing characters as code points, not byte sequences	Ryo Nihei	2021-12-10	7	-0/+3134
\|
*	Move UTF8-related processes to utf8 package	Ryo Nihei	2021-12-01	2	-702/+128
\|
*	Make contributory properties unavailable except for internal use	Ryo Nihei	2021-11-28	2	-1/+62
\|	This change follows [UAX #44 5.13 Property APIs].
\|
\|	> The following subtypes of Unicode character properties should generally not be exposed in APIs,
\|	> except in limited circumstances. They may not be useful, particularly in public API collections,
\|	> and may instead prove misleading to the users of such API collections.
\|	>
\|	> * Contributory properties are not recommended for public APIs.
\|	> ...
\|
\|	https://unicode.org/reports/tr44/#Property_APIs
*	Move all UCD-related processes to ucd package	Ryo Nihei	2021-11-27	4	-4777/+5
\|
*	Support Alphabetic property (Meet RL1.2 of UTS #18 partially)	Ryo Nihei	2021-11-26	3	-1/+420
\|
*	Make character properties available in an inverse expression (Make [^\p{...}] available)	Ryo Nihei	2021-11-25	1	-0/+4
\|
*	Support Lowercase and Uppercase property (Meet RL1.2 of UTS #18 partially)	Ryo Nihei	2021-11-25	4	-21/+153
\|
*	Support White_Space property (Meet RL1.2 of UTS #18 partially)	Ryo Nihei	2021-11-24	4	-25/+110
\|
*	Fix key of generalCategoryCodePoints map	Ryo Nihei	2021-11-23	1	-696/+696
\| \| \| \|	Use the abbreviation `cn` of the general category value `unassigned` as a key of the `generalCategoryCodePoints` map.
*	Remove --debug option from compile command	Ryo Nihei	2021-09-23	1	-36/+1
\|
*	Keep the order of AST nodes constant	Ryo Nihei	2021-09-22	4	-20/+50
\|
*	Add name field to the lexical specification	Ryo Nihei	2021-09-18	2	-0/+4
\|
*	Change APIs	Ryo Nihei	2021-08-01	7	-70/+96
\|	Change fields of tokens, results of lexical analysis, as follows:
\|	- Rename: mode -> mode_id
\|	- Rename: kind_id -> mode_kind_id
\|	- Add: kind_id
\|
\|	The kind ID is unique across all modes, but the mode kind ID is unique only within a mode.
\|
\|	Change fields of a transition table as follows:
\|	- Rename: initial_mode -> initial_mode_id
\|	- Rename: modes -> mode_names
\|	- Rename: kinds -> kind_names
\|	- Rename: specs[].kinds -> specs[].kind_names
\|	- Rename: specs[].dfa.initial_state -> specs[].dfa.initial_state_id
\|
\|	Change public types defined in the spec package as follows:
\|	- Rename: LexModeNum -> LexModeID
\|	- Rename: LexKind -> LexKindName
\|	- Add: LexKindID
\|	- Add: StateID
*	Add unique kind IDs to tokens	Ryo Nihei	2021-08-01	1	-0/+38
\|
*	Allow duplicate names between fragments and non-fragments	Ryo Nihei	2021-05-27	1	-0/+103
\|
*	Add fragment expression	Ryo Nihei	2021-05-25	8	-49/+440
\| \| \| \|	A fragment entry is defined by an entry whose `fragment` field is `true`, and is referenced by a fragment expression (`\f{...}`).
*	Fix the initial state number	Ryo Nihei	2021-05-19	1	-1/+5
\| \| \| \|	Since 0 represents an invalid value in a transition table, assign a number greater than or equal to 1 to states.
*	Use go fmt instead of gofmt	Ryo Nihei	2021-05-12	1	-1/+1
\|
*	Add --compression-level option to compile command	Ryo Nihei	2021-05-11	1	-7/+57
\| \| \| \|	--compression-level specifies a compression level. The default value is 2.
*	Change package structure	Ryo Nihei	2021-05-08	1	-1/+1
\| \| \| \|	The executable can be installed using `go install ./cmd/maleeni`.
*	Add CLI options	Ryo Nihei	2021-05-08	1	-3/+3
\|
*	Change type of accepting_states to slice	Ryo Nihei	2021-05-07	1	-2/+6
\|
*	Add transition table compressor	Ryo Nihei	2021-05-07	2	-9/+60
\|
*	Improve performance of the symbolPositionSet	Ryo Nihei	2021-05-04	4	-63/+98
\| \| \| \| \| \| \| \| \| \|	When using a map to represent a set, performance degrades due to the increased number of calls of runtime.mapassign. Especially when the number of symbols is large, as in compiling a pattern that contains character properties like \p{Letter}, adding elements to the set alone may take several tens of seconds of CPU time. Therefore, this commit solves this problem by changing the representation of the set from map to array.
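As a rough illustration of the idea in this commit message, the following Go sketch stores a set of positions in a slice instead of a map, so that adding an element is a plain append. The names `symbolPosition`, `positionSet`, `add`, and `normalize` are hypothetical and are not taken from the maleeni source; `sort.Slice` is used here only for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// symbolPosition is a hypothetical stand-in for the real position type.
type symbolPosition uint16

// positionSet keeps its elements in a slice rather than a map, so add() is
// a plain append and never goes through runtime.mapassign.
type positionSet struct {
	items []symbolPosition
}

// add appends a position; duplicates are removed later by normalize.
func (s *positionSet) add(p symbolPosition) {
	s.items = append(s.items, p)
}

// normalize sorts the slice and drops duplicates in place.
func (s *positionSet) normalize() {
	sort.Slice(s.items, func(i, j int) bool { return s.items[i] < s.items[j] })
	out := s.items[:0]
	for _, p := range s.items {
		if len(out) == 0 || p != out[len(out)-1] {
			out = append(out, p)
		}
	}
	s.items = out
}

func main() {
	s := &positionSet{}
	for _, p := range []symbolPosition{3, 1, 3, 2} {
		s.add(p)
	}
	s.normalize()
	fmt.Println(s.items)
}
```

The trade-off in this sketch is that membership is only well defined after `normalize` has run, which suits a build-then-read workload.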
*	Add lex mode	Ryo Nihei	2021-05-04	1	-2/+82
\|	Lex mode is a feature that separates transition tables by mode. The lexer starts from the initial state indicated by the `initial_state` field and transitions between modes according to the `push` and `pop` fields. The initial state is always `default`; currently, maleeni doesn't provide a way to change it.
\|
\|	You can specify the modes of each lex entry using the `modes` field. When no mode is specified explicitly, the entry belongs to the `default` mode.
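The push/pop behaviour described above can be pictured as a stack of mode names. The Go sketch below is only an illustration under stated assumptions: the `entry` and `modeStack` types, their fields, and the rule that a pop is applied before a push are hypothetical, not maleeni's actual implementation.

```go
package main

import "fmt"

// entry is a hypothetical, simplified lex entry. It belongs to some modes and
// may push a new mode or pop the current one when it matches.
type entry struct {
	kind  string
	modes []string
	push  string
	pop   bool
}

// modeStack tracks the current lex mode; the bottom is always "default".
type modeStack struct {
	modes []string
}

func newModeStack() *modeStack {
	return &modeStack{modes: []string{"default"}}
}

// current returns the mode on top of the stack.
func (s *modeStack) current() string {
	return s.modes[len(s.modes)-1]
}

// apply updates the stack after an entry has matched.
// Assumption: pop is applied before push, and "default" is never popped.
func (s *modeStack) apply(e entry) {
	if e.pop && len(s.modes) > 1 {
		s.modes = s.modes[:len(s.modes)-1]
	}
	if e.push != "" {
		s.modes = append(s.modes, e.push)
	}
}

// effectiveModes returns the entry's modes, falling back to "default"
// when none are given explicitly.
func effectiveModes(e entry) []string {
	if len(e.modes) == 0 {
		return []string{"default"}
	}
	return e.modes
}

func main() {
	s := newModeStack()
	opening := entry{kind: "string_open", push: "string"}
	closing := entry{kind: "string_close", modes: []string{"string"}, pop: true}

	s.apply(opening)
	fmt.Println(s.current())
	s.apply(closing)
	fmt.Println(s.current())
	fmt.Println(effectiveModes(opening))
}
```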
*	Fix parser to recognize property expressions in bracket expressions	Ryo Nihei	2021-05-02	2	-0/+14
\|
*	Improve compilation time a little	Ryo Nihei	2021-05-02	3	-174/+269
\|	A pattern like \p{Letter} generates an AST with many symbols concatenated by alt operators, which results in a large number of symbol positions in one state of the DFA. Such a pattern increases the compilation time. This commit improves the compilation time a little.
\|
\|	- To avoid calling astNode#first and astNode#last recursively, memoize their results.
\|	- Use the byte sequence that symbol positions are encoded to as a hash value, to avoid using the fmt.Fprintf function.
\|	- Implement a sort function for symbol positions instead of using the sort.Slice function.
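To make the second point concrete, here is a minimal Go sketch of using an encoded byte sequence as a map key instead of a formatted string. The function name `positionsKey`, the 16-bit position width, and the big-endian encoding are assumptions made for illustration only.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// positionsKey encodes already-sorted positions as a string of big-endian
// 2-byte values. Equal sets of positions yield equal keys, and no
// fmt-based formatting is involved.
func positionsKey(sorted []uint16) string {
	buf := make([]byte, 2*len(sorted))
	for i, p := range sorted {
		binary.BigEndian.PutUint16(buf[2*i:], p)
	}
	return string(buf)
}

func main() {
	// states maps a set of positions (one DFA state) to a state number.
	states := map[string]int{}
	a := []uint16{1, 2, 300}
	b := []uint16{1, 2, 300}

	states[positionsKey(a)] = 1
	_, seen := states[positionsKey(b)]
	fmt.Println(seen)
}
```

The key is only canonical if the positions are sorted first, which is why the encoding and the sort function go together.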
*	Add character property expression (Meet RL1.2 of UTS #18 partially)	Ryo Nihei	2021-04-30	8	-27/+4440
\|	\p{property name=property value} matches a character that has the property. When the property name is General_Category, it can be omitted. That is, \p{Letter} equals \p{General_Category=Letter}. Currently, only General_Category is supported.
\|
\|	This feature meets RL1.2 of UTS #18 partially.
\|
\|	RL1.2 Properties: https://unicode.org/reports/tr18/#RL1.2
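The rule that an omitted property name defaults to General_Category can be sketched as follows. `parseProperty` is a hypothetical helper written for this illustration; it only splits the text between the braces and does not validate names or values.

```go
package main

import (
	"fmt"
	"strings"
)

// parseProperty splits the contents of \p{...} into a property name and a
// property value. When no name is given, General_Category is assumed.
func parseProperty(body string) (name, value string) {
	if i := strings.Index(body, "="); i >= 0 {
		return body[:i], body[i+1:]
	}
	return "General_Category", body
}

func main() {
	for _, body := range []string{"Letter", "General_Category=Letter"} {
		name, value := parseProperty(body)
		fmt.Printf("\\p{%s} -> name=%s value=%s\n", body, name, value)
	}
}
```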
*	Add code point expression (Meet RL1.1 of UTS #18)	Ryo Nihei	2021-04-24	5	-16/+477
\|	\u{hex string} matches a character that has the code point represented by the hex string. For instance, \u{3042} matches hiragana あ (U+3042). The hex string must have 4 or 6 digits.
\|
\|	This feature meets RL1.1 of UTS #18.
\|
\|	RL1.1 Hex Notation: https://unicode.org/reports/tr18/#RL1.1
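A minimal Go sketch of the digit-count rule, assuming a hypothetical helper `parseCodePoint` that receives the text between the braces. The upper bound check against U+10FFFF is an addition of this sketch and is not stated in the commit message.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseCodePoint converts the hex string of \u{...} into a rune.
// The string must have exactly 4 or 6 hex digits.
func parseCodePoint(hex string) (rune, error) {
	if len(hex) != 4 && len(hex) != 6 {
		return 0, fmt.Errorf("hex string must have 4 or 6 digits: %q", hex)
	}
	n, err := strconv.ParseUint(hex, 16, 32)
	if err != nil {
		return 0, err
	}
	// Assumption of this sketch: reject values beyond the Unicode range.
	if n > 0x10FFFF {
		return 0, fmt.Errorf("code point out of range: U+%X", n)
	}
	return rune(n), nil
}

func main() {
	r, err := parseCodePoint("3042")
	fmt.Println(string(r), err)

	_, err = parseCodePoint("42")
	fmt.Println(err)
}
```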
*	Add validation of lexical specs and improve error messages	Ryo Nihei	2021-04-17	1	-2/+8
\|
*	Change the lexical specs of regexp and define concrete syntax error values	Ryo Nihei	2021-04-17	5	-425/+568
\|	* Make the lexer treat ']' as an ordinary character in default mode
\|	* Define values of the syntax error type that represents error information concretely
*	Increase the maximum number of symbol positions per pattern	Ryo Nihei	2021-04-12	5	-29/+139
\| \| \| \| \|	This commit increases the maximum number of symbol positions per pattern to 2^15 (= 32,768). When the limit is exceeded, the parse method returns an error.
*	Fix grammar the parser accepts	Ryo Nihei	2021-04-11	6	-98/+1192
\|	* Add test cases for the parse method.
\|	* Fix the parser to pass the cases.
*	Add logging to compile command	Ryo Nihei	2021-04-08	3	-47/+108
\| \| \| \| \|	compile command writes logs out to the maleeni-compile.log file. When you use compiler.Compile(), you can choose whether the lexer writes logs or not.
*	Add logical inverse expression	Ryo Nihei	2021-04-01	6	-34/+766
\| \| \| \|	[^a-z] matches any character that is not in the range a-z.
*	Pass values in error type to panic()	Ryo Nihei	2021-03-07	1	-2/+2
\| \| \| \|	Because parser.parse() expects recover() to return a value of the error type, apply this change.
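The pattern this commit relies on can be sketched in Go as follows. The `parse` function and `errSyntax` value here are hypothetical stand-ins, not the project's code; the point is that the deferred function can only convert the recovered value to an error if an error was passed to panic().

```go
package main

import (
	"errors"
	"fmt"
)

// errSyntax is a hypothetical error value used for illustration.
var errSyntax = errors.New("syntax error")

// parse panics with an error value on failure and converts the panic back
// into a returned error in a deferred function.
func parse(src string) (retErr error) {
	defer func() {
		if r := recover(); r != nil {
			err, ok := r.(error)
			if !ok {
				// Not an error value, so it did not come from this parser.
				panic(r)
			}
			retErr = err
		}
	}()

	if src == "" {
		// Pass an error, not a string, so the type assertion above succeeds.
		panic(errSyntax)
	}
	return nil
}

func main() {
	fmt.Println(parse(""))
	fmt.Println(parse("a"))
}
```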
*	Refactoring	Ryo Nihei	2021-02-25	5	-502/+351
\|	* Remove token field from symbolNode
\|	* Simplify notation of nested nodes
\|	* Simplify arguments of newSymbolNode()
*	Add range expression	Ryo Nihei	2021-02-24	3	-8/+717
\| \| \| \|	[a-z] matches any one character from a to z. The order of the characters depends on Unicode code points.
*	Add + and ? operators	Ryo Nihei	2021-02-20	5	-16/+82
\|	* a+ matches 'a' one or more times. This is equivalent to aa*.
\|	* a? matches 'a' zero or one time.
*	Fix computation of last positions	Ryo Nihei	2021-02-17	2	-0/+122
\|
*	Add types of lexical specifications	Ryo Nihei	2021-02-16	2	-11/+27
\| \| \| \|	APIs of the compiler and driver packages use these types. Because the CompiledLexSpec struct that a lexer takes holds the kind names of the lexical specification entries, the lexer sets them on tokens.
*	Add bracket expression matching specified character	Ryo Nihei	2021-02-14	3	-9/+109
\| \| \| \|	The bracket expression matches any single character specified in it. In the bracket expression, the special characters like ., *, and so on are also handled as normal characters.
*	Add dot symbol matching any single character	Ryo Nihei	2021-02-14	6	-20/+158
\|	The dot symbol matches any single character. When the dot symbol appears, the parser generates an AST matching all of the well-formed UTF-8 byte sequences.
\|
\|	References:
\|	* https://www.unicode.org/versions/Unicode13.0.0/ch03.pdf#G7404
\|	* Table 3-6. UTF-8 Bit Distribution
\|	* Table 3-7. Well-Formed UTF-8 Byte Sequences
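For reference, the byte ranges of Table 3-7 can be written down as data and checked directly. The Go sketch below is an illustration only: `byteRange`, `wellFormed`, and `matchesOneChar` are hypothetical names, and the real parser builds an AST from these ranges rather than scanning a table.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteRange is an inclusive range of byte values.
type byteRange struct{ lo, hi byte }

// wellFormed lists the rows of Table 3-7 (Well-Formed UTF-8 Byte Sequences).
var wellFormed = [][]byteRange{
	{{0x00, 0x7F}},                                           // U+0000..U+007F
	{{0xC2, 0xDF}, {0x80, 0xBF}},                             // U+0080..U+07FF
	{{0xE0, 0xE0}, {0xA0, 0xBF}, {0x80, 0xBF}},               // U+0800..U+0FFF
	{{0xE1, 0xEC}, {0x80, 0xBF}, {0x80, 0xBF}},               // U+1000..U+CFFF
	{{0xED, 0xED}, {0x80, 0x9F}, {0x80, 0xBF}},               // U+D000..U+D7FF
	{{0xEE, 0xEF}, {0x80, 0xBF}, {0x80, 0xBF}},               // U+E000..U+FFFF
	{{0xF0, 0xF0}, {0x90, 0xBF}, {0x80, 0xBF}, {0x80, 0xBF}}, // U+10000..U+3FFFF
	{{0xF1, 0xF3}, {0x80, 0xBF}, {0x80, 0xBF}, {0x80, 0xBF}}, // U+40000..U+FFFFF
	{{0xF4, 0xF4}, {0x80, 0x8F}, {0x80, 0xBF}, {0x80, 0xBF}}, // U+100000..U+10FFFF
}

// matchesOneChar reports whether b is exactly one well-formed UTF-8
// sequence, which is what the dot symbol is meant to match.
func matchesOneChar(b []byte) bool {
	for _, row := range wellFormed {
		if len(row) != len(b) {
			continue
		}
		ok := true
		for i, r := range row {
			if b[i] < r.lo || b[i] > r.hi {
				ok = false
				break
			}
		}
		if ok {
			return true
		}
	}
	return false
}

func main() {
	samples := [][]byte{[]byte("a"), []byte("あ"), {0xC0, 0x80}, {0xED, 0xA0, 0x80}}
	for _, b := range samples {
		// utf8.Valid is printed alongside as a cross-check.
		fmt.Printf("% X: table=%v utf8.Valid=%v\n", b, matchesOneChar(b), utf8.Valid(b))
	}
}
```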
*	Add compiler	Ryo Nihei	2021-02-14	8	-0/+1265
	The compiler takes a lexical specification expressed by regular expressions and generates a DFA accepting the tokens. Operators that you can use in the regular expressions are concatenation, alternation, repeat, and grouping.