diff options
| author | EuAndreh <eu@euandre.org> | 2025-07-16 17:37:25 -0300 |
|---|---|---|
| committer | EuAndreh <eu@euandre.org> | 2025-07-16 17:55:51 -0300 |
| commit | d7247c4e860d407248633874e24c65328c0f375b (patch) | |
| tree | 54e403bb53f43688fd17587a7dcc5bc8deb5a6e1 /Makefile | |
| parent | Build NFA nodes for "." and "class" metacharacters (diff) | |
| download | paca-d7247c4e860d407248633874e24c65328c0f375b.tar.gz paca-d7247c4e860d407248633874e24c65328c0f375b.tar.xz | |
Compress character class when compiling NFA.
Do not change any observable behaviour outside of `characterClass()`, as
the new output is 100% semantically compatible, but faster and most
importantly, much smaller.
* src/paca.mjs
(characterClass): Leverage `compressCharacterRanges()` when processing
the given set. Also use the same compressed range to filter out set
matches when they are already within the range.
(compareRange): Add function that sorts based on the first numeric
element of the range. When they're equal, use the second element as
the tie breaker.
(compressRangeStep): Add function that takes ranges in sorted order
tries to mush them together when the start of the second (`from`) is
contained by the first (`curr[1]`). Since the ranges are given to
us sorted, we already know that the start of the second is greater
than or equal to the start of the first. When this is the case, we
pick the largest ending to merge the ranges, otherwise we just place
them one next to the other, sequentially.
(compressCharacterRanges): Add function that sorts the ranges and
reduces them with `compressRangeStep()`. Here we have to give an
empty array as the initial value to prevent `compressRangeStep()` to
be given ranges as `[m, n]`, instead of either `[]` or `[[m, n]]`.
This is also why there's a check for `!curr` in
`compressRangeStep()` - to plug the other side of this adaptation.
(inRange): Add function that checks if the given character is
contained by any of the ranges of the given object. At the end,
instead of looking through every `from`, all we need is a single
match, so we use `.some()` instead.
* tests/paca.mjs
(test_characterClass): Add test case for a range that collapses many
literal character matches and many ranges into a single range.
(test_{compareRange,compressRangeStep,compressCharacterRanges,inRange):
Add routine sample-based test cases.
Diffstat (limited to 'Makefile')
0 files changed, 0 insertions, 0 deletions
