| Commit message | Author | Age | Files | Lines |
| |
The character class `[a-z]`, and especially the wildcard `.`, aren't
operators: they represent themselves, with their own special
semantics, and they take no operands. So instead of having the
"operator" type behave in two ways, with and without arguments, we
introduce a new type, the "meta" character. Like the literal
character, the metacharacter represents itself and takes no
argument. We also must not taint the precedence parsing of operators
with special conditions for "." and "class", since they should behave
just like literal characters: be pushed directly onto the stack.
As of now, there are only 2 metacharacters: "class" and ".".
* src/paca.mjs
(operatorChars): Remove "." from the set of operator characters.
(classStateStep): Return `{ meta: "class" }` instead of
`{ operator: "class" }`.
(isMeta): Add equivalent to `isTransition()` and `isOperator()`.
(opFor, tokenizeRegexStep): Add new `opFor()` function for classifying
a given character, choosing between an operator, a metacharacter and
a literal character, and use this function in the body of
`tokenizeRegexStep()`.
(PRECEDENCE): Remove early entry of precedence values for "class" and
".".
(toPostfixStep): Instead of checking that a character is a literal
before pushing it onto the stack, check that it isn't an operator,
i.e. that it isn't an object with the `operator` attribute.
* tests/paca.mjs
(test_isOperator): Remove test case for ".", as it is no longer
considered an operator.
(classStateStep): Update the expected value from
`{ operator: "class" }` to `{ meta: "class" }`.
(test_toPostfixStep, test_toPostfix): Add test cases for meta
characters.
(test_OPERATOR_FNS): BONUS - Use direct assignment to reset the array
to an empty value instead of `arr.splice(0)`.
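A minimal sketch of the three-way classification described above, not the actual code in src/paca.mjs. The contents of `operatorChars`, the `metaChars` set, the `hasKey()` and `pushedDirectly()` helpers, and the choice of bare strings for literals are all assumptions made for illustration; only the `{ operator: ... }` and `{ meta: ... }` token shapes come from this message.

```javascript
// Hypothetical character sets; the real ones in src/paca.mjs may differ.
const operatorChars = new Set(["*", "+", "?", "|", "(", ")"]);
const metaChars = new Set(["."]);

// Classify one character: operator, metacharacter, or plain literal.
// Assumption: literals are kept as bare strings.
const opFor = char =>
  operatorChars.has(char) ? { operator: char } :
  metaChars.has(char) ? { meta: char } :
  char;

// Hypothetical helper: true when `token` is an object carrying `key`.
const hasKey = (token, key) =>
  typeof token === "object" && token !== null && key in token;

const isOperator = token => hasKey(token, "operator");
const isMeta = token => hasKey(token, "meta");

// Everything that is not an operator skips precedence handling and
// would be pushed directly, literals and metacharacters alike.
const pushedDirectly = token => !isOperator(token);
```

With this shape, `pushedDirectly(opFor("."))` and `pushedDirectly(opFor("a"))` are both true, while `pushedDirectly(opFor("*"))` is false, so the precedence logic never needs a special case for "." or "class".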
|
| |
* src/paca.mjs
(escapingStateStep): Return an error when escaping non-metacharacters.
This way, when an escape like \d (syntax for [0-9]) is eventually
recognized, its behaviour will not silently change from a no-op
escape of "d" to matching digits.
(operatorChars, isOperator): Hoist both of these up before their usage
in `escapingStateStep()`.
* tests/paca.mjs
(test_isOperator): Hoist its definition and position inside the
`runTests([...])` array to match src/paca.mjs.
(test_escapingStateStep): Adjust existing cases and add test case for
good/bad escapes.
(test_tokenizeRegexStep): Fix the bad starting escape, which broke
because it was escaping a non-metacharacter.
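The rule can be illustrated with a small standalone sketch. The `escapable` set, the `escapeChar()` name and the `{ ok }` / `{ error }` return shapes are hypothetical; the real `escapingStateStep()` works on the tokenizer state and likely has a different signature.

```javascript
// Hypothetical set of characters that may follow a backslash.
const escapable = new Set([
  "\\", ".", "*", "+", "?", "|", "(", ")", "[", "]", "{", "}", "^", "$",
]);

// Accept escapes of metacharacters only.  Anything else is rejected, so
// that a future meaning for e.g. \d cannot silently change behaviour.
const escapeChar = char =>
  escapable.has(char)
    ? { ok: char }
    : { error: `bad escape: \\${char}` };
```

Under these assumptions `escapeChar(".")` yields the literal ".", while `escapeChar("d")` yields an error instead of a no-op escape of "d".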
|
| |
* src/paca.mjs
(isTransition): Add new function as an improved version of the raw
usage of `stateTransitionOperators`, equivalent to `isAnchor()` and
`isOperator()`.
(operatorChars, isOperator): Add new static set `operatorChars` as
backing data of `isOperator()`, instead of ad-hoc conditional in its
implementation. Also now add the `.` character as an operator by
including it in the `operatorChars` set.
(tokenizeRegexStep): Use the new `isTransition()` function instead of
checking the set directly. Also tweak ternary to fit in 80 columns.
(PRECEDENCE): Add `.` operator with lowest precedence, as it is not
really operating on anything, and is instead a target to be operated
on.
* tests/paca.mjs
(test_isTransition): Add obligatory test cases.
(test_isOperator): Include test case for `.` wildcard operator.
|
| |
* src/paca.mjs
(ANCHOR_FNS): Add simple handlers for ^ and $ anchors, that only look
for the position of the character in the pattern as validation
during tokenization.
(isAnchor): Add simple boolean function to identify anchor characters.
(tokenizeRegexStep): Include check if character `isAnchor()`, and call
the appropriate `ANCHOR_FNS[char]` when true.
* tests/paca.mjs
(test_ANCHOR_FNS): Add test with 4 cases - 2 for success and 2 for
errors for ^ and $.
(test_isAnchor): Add obligatory simple test cases.
(test_tokenizeRegexStep): Include test case for tokenizing patterns
with character class.
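A rough sketch of position-only anchor validation, for illustration. The handler signature `(pattern, index)`, the `{ anchor }` / `{ error }` return shapes and the exact positions accepted (first character for ^, last character for $) are assumptions, not taken from src/paca.mjs.

```javascript
// Hypothetical handlers keyed by anchor character.  Each one only looks
// at where the anchor sits in the pattern.
const ANCHOR_FNS = {
  "^": (pattern, index) =>
    index === 0
      ? { anchor: "^" }
      : { error: `"^" not at the start of the pattern (index ${index})` },
  "$": (pattern, index) =>
    index === pattern.length - 1
      ? { anchor: "$" }
      : { error: `"$" not at the end of the pattern (index ${index})` },
};

const isAnchor = char => char === "^" || char === "$";

// Validate the character at `index` if it is an anchor; return null
// for every other character.
const checkAnchor = (pattern, index) =>
  isAnchor(pattern[index]) ? ANCHOR_FNS[pattern[index]](pattern, index) : null;
```

This mirrors the 2 success and 2 error cases mentioned for `test_ANCHOR_FNS`: each anchor is accepted in exactly one position and rejected elsewhere.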
|
| |
|
|
| |
position in runTests
|
| | |
|
| | |
|
| |
* src/paca.mjs
(escapingStateStep): Use `shouldConcat()` instead of only checking if
we're on the last char. We abuse it a bit by passing `null` as the
first argument, since it is being escaped.
(nonConcatOperators, shouldConcat): Hoist the definition of both above
`escapingStateStep()`, so that they're defined before being used.
* tests/paca.mjs (test_shouldConcat): Add test case where `null` is
explicitly passed as the first argument.
|
| |
* src/paca.mjs (classStateStep): New function equivalent to
`rangeStateStep()` for character class expressions. For now it knows
how to handle escaping ([abc\-_]), simple ranges ([a-z]), negation
([^abc]) and the hyphen literal as the first char ([-a-z_]).
* tests/paca.mjs (test_classStateStep): New test entry has a test case
for each scenario described above.
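The four scenarios can be sketched with a one-pass parser over the text between the brackets. This is only an illustration: the real `classStateStep()` is a per-character state step, and the `parseClass()` name and the `{ negated, literals, ranges }` result shape are hypothetical.

```javascript
// Parse the body of a character class, i.e. the text between "[" and "]".
const parseClass = body => {
  const negated = body.startsWith("^");
  const chars = [...(negated ? body.slice(1) : body)];
  const literals = new Set();
  const ranges = [];
  let i = 0;
  while (i < chars.length) {
    const char = chars[i];
    if (char === "\\" && i + 1 < chars.length) {
      // Escaped character: always taken literally.
      literals.add(chars[i + 1]);
      i += 2;
    } else if (chars[i + 1] === "-" && i + 2 < chars.length) {
      // "a-z": a range from the current character to the one after "-".
      ranges.push([char, chars[i + 2]]);
      i += 3;
    } else {
      // Plain literal, including a hyphen in the first position.
      literals.add(char);
      i += 1;
    }
  }
  return { negated, literals, ranges };
};
```

For example, `parseClass("-a-z_")` treats the leading hyphen and the underscore as literals and "a-z" as a single range, while `parseClass("^abc")` sets `negated` and collects three literals.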
|
| | |
|
| |
* tests/paca.mjs (test_rangeStateStep): Add first test case, for when we
find a closing "}" when no comma was seen.
|
| |
* src/paca.mjs (numFromDigits): Move it before the *StateStep
functions, as it is now used in `rangeStateStep()`. Instead of
leaving it defined after its usage, move it up.
* tests/paca.mjs: Do the same hoisting to the import of the
`numFromDigits` name, to the definition of `test_numFromDigits` and
its inclusion in the order of the call to `runTests()`.
|
| |
With the introduction of "[", we start writing the code to parse
character class expressions, e.g. [a-z0-9]. The `context` key will
contain a `set` with all the literal characters that were found, and all
the ranges too. For parsing the ranges, a `range` key equivalent to the
one for the {m,n} range is used. Despite the superficially similar
syntax, its logic, semantics and implementation will be different.
* src/paca.mjs (TRANSITION_FNS) <"[">: Add new transition function for
handling the start of a character class expression.
* tests/paca.mjs (TRANSITION_FNS): Add a single test entry that
exercises the unconditional body of the function.
|
| |
* src/paca.mjs (TRANSITION_FNS): Add trailing underscore to ignored
arguments, even though it breaks the name of the `_state` and
`context` destructuring arguments.
* tests/paca.mjs (test_TRANSITION_FNS): Add new test function with a
single case for each transition character. Since these transitions
are unconditional and contain no logic, this single sample test is
enough to cover for all of its behaviour.
|