git mv src/content/* src/content/en/

author: EuAndreh <eu@euandre.org> 2025-04-18 02:17:12 -0300
committer: EuAndreh <eu@euandre.org> 2025-04-18 02:48:42 -0300
commit: 020c1e77489b772f854bb3288b9c8d2818a6bf9d (patch)
tree: 142aec725a52162a446ea7d947cb4347c9d573c9 /src/content/en/tils/2021
parent: Makefile: Remove security.txt.gz (diff)
download: euandre.org-020c1e77489b772f854bb3288b9c8d2818a6bf9d.tar.gz
euandre.org-020c1e77489b772f854bb3288b9c8d2818a6bf9d.tar.xz
7 files changed, 685 insertions, 0 deletions
diff --git a/src/content/en/tils/2021/01/12/curl-awk-emails.adoc b/src/content/en/tils/2021/01/12/curl-awk-emails.adoc
new file mode 100644
index 0000000..d432da2
--- /dev/null
+++ b/src/content/en/tils/2021/01/12/curl-awk-emails.adoc
@@ -0,0 +1,148 @@
+= Awk snippet: send email to multiple recipients with cURL
+
+:neomutt: https://neomutt.org/
+:found-out-article: https://blog.edmdesigner.com/send-email-from-linux-command-line/
+:curl: https://curl.se/
+
+As I experimented with {neomutt}[Neomutt], I wanted to keep being able to
+enqueue emails for sending later, like in my previous setup, so that I didn't
+rely on having an internet connection.
+
+My requirements for the `sendmail` command were:
+
+. store the email in a file, and send it later;
+. send from different addresses, using different SMTP servers.
+
+I couldn't find an MTA that could accomplish that, but I was able to quickly
+write a solution.
+
+The first part was the easiest: store the email in a file:
+
+[source,sh]
+----
+# ~/.config/mutt/muttrc:
+set sendmail=~/bin/enqueue-email.sh
+
+# ~/bin/enqueue-email.sh:
+#!/bin/sh -eu
+
+cat - > "$HOME/mbsync/my-queued-emails/$(date -Is)"
+----
+
+Now that I had the email file stored locally, I needed a program to send the
+email from the file, so that I could create a cronjob like:
+
+[source,sh]
+----
+for f in ~/mbsync/my-queued-emails/*; do
+  ~/bin/dispatch-email.sh "$f" && rm "$f"
+done
+----
+
+The `dispatch-email.sh` would have to look at the `From:` header and decide
+which SMTP server to use.  As I {found-out-article}[found out] that {curl}[curl]
+supports SMTP and is able to send emails, this is what I ended up with:
+
+[source,sh]
+----
+#!/bin/sh -eu
+
+F="$1"
+
+rcpt="$(awk '
+  match($0, /^(To|Cc|Bcc): (.*)$/, m) {
+    split(m[2], tos, ",")
+    for (i in tos) {
+      print "--mail-rcpt " tos[i]
+    }
+  }
+'  "$F")"
+
+if grep -qE '^From: .*<addr@server1\.org>$' "$F"; then
+  curl                                                      \
+    -s                                                      \
+    --url smtp://smtp.server1.org:587                       \
+    --ssl-reqd                                              \
+    --mail-from addr@server1.org                            \
+    $rcpt                                                   \
+    --user 'addr@server1.org:my-long-and-secure-passphrase' \
+    --upload-file "$F"
+elif grep -qE '^From: .*<addr@server2\.org>$' "$F"; then
+  curl                                                      \
+    -s                                                      \
+    --url smtp://smtp.server2.org:587                       \
+    --ssl-reqd                                              \
+    --mail-from addr@server2.org                            \
+    $rcpt                                                   \
+    --user 'addr@server2.org:my-long-and-secure-passphrase' \
+    --upload-file "$F"
+else
+  echo 'Bad "From: " address'
+  exit 1
+fi
+----
+
+Most of the curl flags used are self-explanatory, except for `$rcpt`.
+
+curl connects to the SMTP server, but doesn't set the recipient address by
+looking at the message.  My solution was to generate the curl flags, store them
+in `$rcpt` and use it unquoted to leverage shell word splitting.
+
+To me, the most interesting part was building the `$rcpt` flags.  My first
+instinct was to try grep, but it can't print just a capture group of a regex.
+As I started to turn towards sed, I envisioned needing something else to loop
+over the sed output, and I then moved to Awk.
+
+In the short Awk snippet, 3 things were new to me: `match(...)`, `split(...)`
+and `for () {}`.  The only other function I had ever used was `gsub(...)`, but
+these new ones felt similar enough that I could almost guess their behaviour
+and arguments.  `match(...)` stores the matches of a regex in the given array
+positionally (its three-argument form is a GNU Awk extension), and
+`split(...)` stores the chunks in the given array.
+
+I even did it incrementally:
+
+[source,sh]
+----
+$ H='To: to@example.com, to2@example.com\nCc: cc@example.com, cc2@example.com\nBcc: bcc@example.com,bcc2@example.com\n'
+$ printf "$H" | awk '/^To: .*$/ { print $0 }'
+To: to@example.com, to2@example.com
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { print m }'
+awk: cmd. line:1: (FILENAME=- FNR=1) fatal: attempt to use array `m' in a scalar context
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { print m[0] }'
+To: to@example.com, to2@example.com
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { print m[1] }'
+to@example.com, to2@example.com
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { split(m[1], tos, " "); print tos }'
+awk: cmd. line:1: (FILENAME=- FNR=1) fatal: attempt to use array `tos' in a scalar context
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { split(m[1], tos, " "); print tos[0] }'
+
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { split(m[1], tos, " "); print tos[1] }'
+to@example.com,
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { split(m[1], tos, " "); print tos[2] }'
+to2@example.com
+$ printf "$H" | awk 'match($0, /^To: (.*)$/, m) { split(m[1], tos, " "); print tos[3] }'
+----
+
+(This isn't the verbatim interactive session, but a cleaned version to make it
+more readable.)
+
+At this point, I realized I needed a for loop over the `tos` array, and I moved
+the Awk snippet into the `~/bin/dispatch-email.sh`.  I liked the final thing:
+
+[source,awk]
+----
+match($0, /^(To|Cc|Bcc): (.*)$/, m) {
+  split(m[2], tos, ",")
+  for (i in tos) {
+    print "--mail-rcpt " tos[i]
+  }
+}
+----
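The three-argument `match()` used above is a GNU Awk extension, and the iteration order of `for (i in tos)` is unspecified.  As a minimal sketch, assuming the same `To`/`Cc`/`Bcc` header layout, the same flags could probably be generated with POSIX Awk features only.  The `rcpt_flags` helper name and the sample message are hypothetical, made up for this illustration:

```shell
# Hypothetical helper: print one "--mail-rcpt ADDRESS" line per recipient found
# in the To/Cc/Bcc headers of the message file given as $1.
rcpt_flags() {
  awk '
    /^$/ { exit }                     # headers end at the first empty line
    /^(To|Cc|Bcc): / {
      line = $0
      sub(/^[^:]*: /, "", line)       # drop the header name
      n = split(line, tos, ",")
      for (i = 1; i <= n; i++) {      # numeric loop keeps the original order
        gsub(/^ +| +$/, "", tos[i])   # trim spaces around each address
        print "--mail-rcpt " tos[i]
      }
    }
  ' "$1"
}

# Sample message made up for this example.
msg="$(mktemp)"
printf 'To: to@example.com, to2@example.com\nCc: cc@example.com\nSubject: hi\n\nbody\n' > "$msg"
rcpt_flags "$msg"
rm "$msg"
```

Stopping at the first empty line keeps body text that resembles a header from being picked up, and trimming removes the space left after each comma.  Folded (multi-line) headers are not handled by this sketch.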
+
+As I learn more about Awk, I feel that it is undervalued, as many people turn
+to Perl or other programming languages when Awk suffices.  The advantage is
+pretty clear: writing programs that run on any POSIX system, without extra
+dependencies required.
+
+Coding to the standards is underrated.
diff --git a/src/content/en/tils/2021/01/17/posix-shebang.adoc b/src/content/en/tils/2021/01/17/posix-shebang.adoc
new file mode 100644
index 0000000..5cf0695
--- /dev/null
+++ b/src/content/en/tils/2021/01/17/posix-shebang.adoc
@@ -0,0 +1,58 @@
+= POSIX sh and shebangs
+
+:awk-1: link:../../../2020/12/15/shellcheck-repo.html
+:awk-2: link:../12/curl-awk-emails.html
+
+As I {awk-1}[keep moving] {awk-2}[towards POSIX], I'm in the process of
+migrating all my Bash scripts to POSIX sh.
+
+As I dropped `[[`, arrays and other Bashisms, I was left staring at the first
+line of every script, wondering what to do: what is the POSIX sh equivalent of
+`#!/usr/bin/env bash`?  I already knew that POSIX says nothing about shebangs,
+and that the portable way to call a POSIX sh script is `sh script.sh`, but
+I didn't know what to do with that first line.
+
+What I had previously was:
+
+[source,sh]
+----
+#!/usr/bin/env bash
+set -Eeuo pipefail
+cd "$(dirname "${BASH_SOURCE[0]}")"
+----
+
+Obviously, the `$BASH_SOURCE` would be gone, and I would have to adapt some of
+my scripts to not rely on the script location.  The `-E` and `-o pipefail`
+options were also gone, and would be replaced by nothing.
+
+I converted all of them to:
+
+[source,sh]
+----
+#!/bin/sh -eu
+----
+
+I moved the `-eu` options to the shebang line itself, striving for conciseness.
+But as I changed callers from `./script.sh` to `sh script.sh`, things started to
+fail.  Some tests that should fail reported errors, but didn't exit with a
+non-zero status.
+
+My first reaction was to revert to `./script.sh`, but the POSIX bug I caught is
+a strong strain, and when I went back to it, I figured that the callers were
+missing some flags.  Specifically, `sh -eu script.sh`.
+
+Then it clicked: when running with `sh script.sh`, the shebang line with the sh
+options is ignored, as it is a comment!
+
+Which means that the shebang most friendly with POSIX is:
+
+[source,sh]
+----
+#!/bin/sh
+set -eu
+----
+
+. when running via `./script.sh`, if the system has an executable at `/bin/sh`,
+  it will be used to run the script;
+. when running via `sh script.sh`, the sh options are no longer ignored.
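A minimal way to see the difference, as a sketch: the two throwaway scripts below (file names made up for this demonstration) differ only in where `-eu` lives, and both are run through `sh file`:

```shell
# Assumption: a scratch directory created only for this demonstration.
demo_dir="$(mktemp -d)"

cat > "$demo_dir/in-shebang.sh" <<'EOF'
#!/bin/sh -eu
false
echo 'still running'
EOF

cat > "$demo_dir/in-body.sh" <<'EOF'
#!/bin/sh
set -eu
false
echo 'still running'
EOF

# The shebang is only a comment here, so -e is never enabled.
shebang_out="$(sh "$demo_dir/in-shebang.sh")" && shebang_status=0 || shebang_status=$?
# `set -eu` is part of the script body, so it applies however the script is run.
body_out="$(sh "$demo_dir/in-body.sh")" && body_status=0 || body_status=$?

printf 'in-shebang: status=%s output=%s\n' "$shebang_status" "$shebang_out"
printf 'in-body:    status=%s output=%s\n' "$body_status" "$body_out"
rm -r "$demo_dir"
```

The first script runs past the failing `false`, prints `still running` and exits with status 0, while the second one stops at `false` with a non-zero status and prints nothing.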
+
+TIL.
diff --git a/src/content/en/tils/2021/04/24/cl-generic-precedence.adoc b/src/content/en/tils/2021/04/24/cl-generic-precedence.adoc
new file mode 100644
index 0000000..541afb0
--- /dev/null
+++ b/src/content/en/tils/2021/04/24/cl-generic-precedence.adoc
@@ -0,0 +1,149 @@
+= Common Lisp argument precedence order parameterization of a generic function
+
+When CLOS dispatches a method, it picks the most specific method definition for
+the argument list:
+
+[source,lisp]
+----
+
+* (defgeneric a-fn (x))
+#<STANDARD-GENERIC-FUNCTION A-FN (0) {5815ACB9}>
+
+* (defmethod a-fn (x) :default-method)
+#<STANDARD-METHOD A-FN (T) {581DB535}>
+
+* (defmethod a-fn ((x number)) :a-number)
+#<STANDARD-METHOD A-FN (NUMBER) {58241645}>
+
+* (defmethod a-fn ((x (eql 1))) :number-1)
+#<STANDARD-METHOD A-FN ((EQL 1)) {582A7D75}>
+
+* (a-fn nil)
+:DEFAULT-METHOD
+
+* (a-fn "1")
+:DEFAULT-METHOD
+
+* (a-fn 0)
+:A-NUMBER
+
+* (a-fn 1)
+:NUMBER-1
+----
+
+CLOS uses a similar logic when choosing the method from parent classes, when
+multiple ones are available:
+
+[source,lisp]
+----
+* (defclass class-a () ())
+
+#<STANDARD-CLASS CLASS-A {583E0B25}>
+* (defclass class-b () ())
+
+#<STANDARD-CLASS CLASS-B {583E7F6D}>
+* (defgeneric another-fn (obj))
+
+#<STANDARD-GENERIC-FUNCTION ANOTHER-FN (0) {583DA749}>
+* (defmethod another-fn ((obj class-a)) :class-a)
+; Compiling LAMBDA (.PV-CELL. .NEXT-METHOD-CALL. OBJ):
+; Compiling Top-Level Form:
+
+#<STANDARD-METHOD ANOTHER-FN (CLASS-A) {584523C5}>
+* (defmethod another-fn ((obj class-b)) :class-b)
+; Compiling LAMBDA (.PV-CELL. .NEXT-METHOD-CALL. OBJ):
+; Compiling Top-Level Form:
+
+#<STANDARD-METHOD ANOTHER-FN (CLASS-B) {584B8895}>
+----
+
+Given the above definitions, when inheriting from `class-a` and `class-b`, the
+order of inheritance matters:
+
+[source,lisp]
+----
+* (defclass class-a-coming-first (class-a class-b) ())
+#<STANDARD-CLASS CLASS-A-COMING-FIRST {584BE6AD}>
+
+* (defclass class-b-coming-first (class-b class-a) ())
+#<STANDARD-CLASS CLASS-B-COMING-FIRST {584C744D}>
+
+* (another-fn (make-instance 'class-a-coming-first))
+:CLASS-A
+
+* (another-fn (make-instance 'class-b-coming-first))
+:CLASS-B
+----
+
+Combining the order of inheritance with generic functions with multiple
+arguments, CLOS has to make a choice of how to pick a method given two competing
+definitions, and its default strategy is prioritizing from left to right:
+
+[source,lisp]
+----
+* (defgeneric yet-another-fn (obj1 obj2))
+#<STANDARD-GENERIC-FUNCTION YET-ANOTHER-FN (0) {584D9EC9}>
+
+* (defmethod yet-another-fn ((obj1 class-a) obj2) :first-arg-specialized)
+#<STANDARD-METHOD YET-ANOTHER-FN (CLASS-A T) {5854269D}>
+
+* (defmethod yet-another-fn (obj1 (obj2 class-b)) :second-arg-specialized)
+#<STANDARD-METHOD YET-ANOTHER-FN (T CLASS-B) {585AAAAD}>
+
+* (yet-another-fn (make-instance 'class-a) (make-instance 'class-b))
+:FIRST-ARG-SPECIALIZED
+----
+
+CLOS has to make a choice between the first and the second definition of
+`yet-another-fn`, but left-to-right is just its default.  What if we want the
+choice to be based on the second argument, instead of the first?
+
+For that, we use the `:argument-precedence-order` option when declaring a
+generic function:
+
+[source,lisp]
+----
+* (defgeneric yet-another-fn (obj1 obj2) (:argument-precedence-order obj2 obj1))
+#<STANDARD-GENERIC-FUNCTION YET-ANOTHER-FN (2) {584D9EC9}>
+
+* (yet-another-fn (make-instance 'class-a) (make-instance 'class-b))
+:SECOND-ARG-SPECIALIZED
+----
+
+I liked that the `:argument-precedence-order` option exists.  We shouldn't have
+to change the arguments from `(obj1 obj2)` to `(obj2 obj1)` just to make CLOS
+pick the method that we want.  We can configure its default behaviour if
+desired, and keep the order of arguments however it best fits the generic
+function.
+
+== Comparison with Clojure
+
+Clojure has an equivalent, when using `defmulti`.
+
+Since when declaring a multi-method with `defmulti` we must define the dispatch
+function, Clojure uses it to pick the method definition.  Since the dispatch
+function is required, there is no need for a default behaviour, such as
+left-to-right.
+
+== Conclusion
+
+Making the argument precedence order configurable for generic functions but not
+for class definitions makes a lot of sense.
+
+When declaring a class, we can choose the precedence order, and that is about
+it.  But when defining a generic function, the order of arguments is more
+important to the function semantics, and the argument precedence being
+left-to-right is just the default behaviour.
+
+One shouldn't change the order of arguments of a generic function for the sake
+of tailoring it to the CLOS priority ranking algorithm, but doing it for a class
+definition is just fine.
+
+TIL.
+
+== References
+
+:clos-wiki: https://en.wikipedia.org/wiki/Object-Oriented_Programming_in_Common_Lisp
+
+. {clos-wiki}[Object-Oriented Programming in Common Lisp: A Programmer's Guide
+  to CLOS], by Sonja E. Keene
diff --git a/src/content/en/tils/2021/04/24/clojure-autocurry.adoc b/src/content/en/tils/2021/04/24/clojure-autocurry.adoc
new file mode 100644
index 0000000..a2c2835
--- /dev/null
+++ b/src/content/en/tils/2021/04/24/clojure-autocurry.adoc
@@ -0,0 +1,135 @@
+= Clojure auto curry
+:sort: 1
+:updatedat: 2021-04-27
+
+:defcurry-orig: https://lorettahe.github.io/clojure/2016/09/22/clojure-auto-curry
+
+Here's a simple macro defined by {defcurry-orig}[Loretta He] to create Clojure
+functions that are curried on all arguments, relying on Clojure's multi-arity
+support:
+
+[source,clojure]
+----
+(defmacro defcurry
+  [name args & body]
+  (let [partials (map (fn [n]
+                        `(~(subvec args 0 n) (partial ~name ~@(take n args))))
+                      (range 1 (count args)))]
+    `(defn ~name
+       (~args ~@body)
+       ~@partials)))
+----
+
+A naive `add` definition, alongside its usage and macroexpansion:
+
+[source,clojure]
+----
+user=> (defcurry add
+         [a b c d e]
+         (+ a b c d e))
+#'user/add
+
+user=> (add 1)
+#object[clojure.core$partial$fn__5857 0x2c708440 "clojure.core$partial$fn__5857@2c708440"]
+
+user=> (add 1 2 3 4)
+#object[clojure.core$partial$fn__5863 0xf4c0e4e "clojure.core$partial$fn__5863@f4c0e4e"]
+
+user=> ((add 1) 2 3 4 5)
+15
+
+user=> (((add 1) 2 3) 4 5)
+15
+
+user=> (use 'clojure.pprint)
+nil
+
+user=> (pprint
+        (macroexpand
+         '(defcurry add
+            [a b c d e]
+            (+ a b c d e))))
+(def
+ add
+ (clojure.core/fn
+  ([a b c d e] (+ a b c d e))
+  ([a] (clojure.core/partial add a))
+  ([a b] (clojure.core/partial add a b))
+  ([a b c] (clojure.core/partial add a b c))
+  ([a b c d] (clojure.core/partial add a b c d))))
+nil
+----
+
+This simplistic `defcurry` definition doesn't support optional parameters,
+multi-arity, `&` rest arguments, docstrings, etc., but it could certainly evolve
+to do so.
+
+I like how `defcurry` is so short, and delegates the responsibility for the
+multi-arity logic to Clojure's built-in multi-arity support.  Simple and
+elegant.
+
+Same Clojure as before, now with auto-currying via macros.
+
+== Comparison with Common Lisp
+
+My attempt at writing an equivalent for Common Lisp gives me:
+
+[source,lisp]
+----
+(defun partial (fn &rest args)
+  (lambda (&rest args2)
+    (apply fn (append args args2))))
+
+(defun curry-n (n func)
+  (cond ((< n 0) (error "Too many arguments"))
+        ((zerop n) (funcall func))
+        (t (lambda (&rest rest)
+             (curry-n (- n (length rest))
+                      (apply #'partial func rest))))))
+
+(defmacro defcurry (name args &body body)
+  `(defun ,name (&rest rest)
+     (let ((func (lambda ,args ,@body)))
+       (curry-n (- ,(length args) (length rest))
+                (apply #'partial func rest)))))
+----
+
+Without built-in multi-arity support, we have to do more work, like tracking the
+number of arguments consumed so far.  We also have to write `#'partial`
+ourselves.  That is, without depending on any library, sticking to ANSI
+Common Lisp.
+
+The usage is pretty similar:
+
+[source,lisp]
+----
+* (defcurry add (a b c d e)
+    (+ a b c d e))
+ADD
+
+* (add 1)
+#<FUNCTION (LAMBDA (&REST REST) :IN CURRY-N) {100216419B}>
+
+* (funcall (add 1) 2 3 4)
+#<FUNCTION (LAMBDA (&REST REST) :IN CURRY-N) {100216537B}>
+
+* (funcall (add 1) 2 3 4 5)
+15
+
+* (funcall (funcall (add 1) 2 3) 4 5)
+15
+
+* (macroexpand-1
+    '(defcurry add (a b c d e)
+       (+ a b c d e)))
+(DEFUN ADD (&REST REST)
+  (LET ((FUNC (LAMBDA (A B C D E) (+ A B C D E))))
+    (CURRY-N (- 5 (LENGTH REST)) (APPLY #'PARTIAL FUNC REST))))
+T
+----
+
+This also requires `funcall`s, since we return a `lambda` that doesn't live in
+the function namespace.
+
+Like the Clojure one, it doesn't support optional parameters, `&rest` rest
+arguments, docstrings, etc., but it also could evolve to do so.
diff --git a/src/content/en/tils/2021/04/24/scm-nif.adoc b/src/content/en/tils/2021/04/24/scm-nif.adoc
new file mode 100644
index 0000000..2ea8a6f
--- /dev/null
+++ b/src/content/en/tils/2021/04/24/scm-nif.adoc
@@ -0,0 +1,61 @@
+= Three-way conditional for number signs on Lisp
+:categories: lisp scheme common-lisp
+:sort: 2
+:updatedat: 2021-08-14
+
+:on-lisp: https://www.paulgraham.com/onlisptext.html
+:sicp: https://mitpress.mit.edu/sites/default/files/sicp/index.html
+
+A useful macro from Paul Graham's {on-lisp}[On Lisp] book:
+
+[source,lisp]
+----
+(defmacro nif (expr pos zero neg)
+  (let ((g (gensym)))
+    `(let ((,g ,expr))
+       (cond ((plusp ,g) ,pos)
+             ((zerop ,g) ,zero)
+             (t ,neg)))))
+----
+
+After I looked at this macro, I started seeing opportunities to use it in many
+places, and yet I didn't see anyone else using it.
+
+The latest example I can think of is section 1.3.3 of {sicp}[Structure and
+Interpretation of Computer Programs], which I was reading recently:
+
+[source,scheme]
+----
+(define (search f neg-point pos-point)
+  (let ((midpoint (average neg-point pos-point)))
+    (if (close-enough? neg-point pos-point)
+        midpoint
+        (let ((test-value (f midpoint)))
+          (cond ((positive? test-value)
+                 (search f neg-point midpoint))
+                ((negative? test-value)
+                 (search f midpoint pos-point))
+                (else midpoint))))))
+----
+
+Not that the book should introduce such a macro this early, but I couldn't avoid
+feeling bothered by not using the `nif` macro, which could even remove the need
+for the intermediate `test-value` variable:
+
+[source,scheme]
+----
+(define (search f neg-point pos-point)
+  (let ((midpoint (average neg-point pos-point)))
+    (if (close-enough? neg-point pos-point)
+        midpoint
+        (nif (f midpoint)
+             (search f neg-point midpoint)
+             midpoint
+             (search f midpoint pos-point)))))
+----
+
+It also avoids `cond`'s extra clunky parentheses for grouping, which are
+unnecessary but built-in.
+
+As a macro, I personally feel it tilts the balance towards expressiveness despite
+its extra cognitive load.
diff --git a/src/content/en/tils/2021/07/23/git-tls-gpg.adoc b/src/content/en/tils/2021/07/23/git-tls-gpg.adoc
new file mode 100644
index 0000000..f198c2b
--- /dev/null
+++ b/src/content/en/tils/2021/07/23/git-tls-gpg.adoc
@@ -0,0 +1,45 @@
+= GPG verification of Git repositories without TLS
+
+:empty:
+:git-protocol: https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols#_the_git_protocol
+:remembering: https://euandreh.xyz/remembering/
+
+For online Git repositories that use the {git-protocol}[Git Protocol] for
+serving code, you can use GPG to handle authentication, if you have the
+committer's public key.
+
+Here's how I'd verify that I've cloned an authentic version of
+{remembering}[remembering]footnote:not-available[
+  Funnily enough, not available anymore via the Git Protocol, now only with
+  HTTPS.
+]:
+
+[source,sh]
+----
+$ wget -qO- https://euandre.org/public.asc | gpg --import -
+gpg: key 81F90EC3CD356060: "EuAndreh <eu@euandre.org>" not changed
+gpg: Total number processed: 1
+gpg:              unchanged: 1
+$ pushd `mktemp -d`
+$ git clone git://euandreh.xyz/remembering .
+$ git verify-commit HEAD
+gpg: Signature made Sun 27 Jun 2021 16:50:21 -03
+gpg:                using RSA key 5BDAE9B8B2F6C6BCBB0D6CE581F90EC3CD356060
+gpg: Good signature from "EuAndreh <eu@euandre.org>" [ultimate]
+----
+
+On the first line we import the public key (funnily enough, available via
+HTTPS), and after cloning the code via the insecure `git://` protocol, we use
+`git verify-commit` to check the signature.
+
+The verification is successful, and we can see that the public key from the
+signature matches the fingerprint of the imported one.  However,
+`git verify-commit` doesn't have an option to specify which public key you want
+to verify the commit against.  This means that if a MITM attack happens, the
+attacker could serve a malicious repository with commits signed by any other key
+in your keyring, and you'd have to check the key fingerprint by yourself.  That
+would need to happen for subsequent fetches, too.
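One way to do that check by hand is to compare the fingerprint of the signing key against a value obtained out of band.  A minimal sketch, assuming a Git recent enough to support the `%G?` and `%GF` pretty-format placeholders; the `signed_by` helper is hypothetical, and the fingerprint is the one from the session above:

```shell
# Fingerprint obtained out of band (here: the one shown in the session above).
expected='5BDAE9B8B2F6C6BCBB0D6CE581F90EC3CD356060'

# Hypothetical helper: succeed only if commit $1 carries a good signature
# made by the key whose fingerprint is $2.
signed_by() {
  # %G? is the signature status: "G" is good, "U" is good with unknown validity.
  # %GF is the fingerprint of the key that made the signature.
  info="$(git log -1 --format='%G? %GF' "$1")" || return 1
  case "$info" in
    "G $2" | "U $2") return 0 ;;
    *) return 1 ;;
  esac
}
```

After each clone or fetch, something like `signed_by HEAD "$expected" || echo 'unexpected signer' >&2` would flag a tip commit signed by any other key.  It only inspects the commit it is given, so older history would need its own checks.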
+
+Even though this is possible, it is not very convenient, and certainly very
+brittle.  Despite the fact that the Git Protocol is much faster, it being harder
+to make secure is a big downside.
diff --git a/src/content/en/tils/2021/08/11/js-bigint-reviver.adoc b/src/content/en/tils/2021/08/11/js-bigint-reviver.adoc
new file mode 100644
index 0000000..98ee79b
--- /dev/null
+++ b/src/content/en/tils/2021/08/11/js-bigint-reviver.adoc
@@ -0,0 +1,89 @@
+= Encoding and decoding JavaScript BigInt values with reviver
+:updatedat: 2021-08-13
+
+:reviver-fn: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/JSON/parse#using_the_reviver_parameter
+:bigint: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt
+:json-rfc: https://datatracker.ietf.org/doc/html/rfc8259
+
+`JSON.parse()` accepts a second parameter: a {reviver-fn}[`reviver()` function].
+It is a function that can be used to transform the `JSON` values as they're
+being parsed.
+
+As it turns out, when combined with JavaScript's {bigint}[`BigInt`] type, you
+can parse and encode JavaScript `BigInt` numbers via JSON:
+
+[source,javascript]
+----
+const bigIntReviver = (_, value) =>
+    typeof value === "string" && value.match(/^-?[0-9]+n$/)
+        ? BigInt(value.slice(0, value.length - 1))
+        : value;
+----
+
+I chose to interpret strings that contain only digits, optionally preceded by a
+minus sign and followed by an `n` suffix, as `BigInt` values, similar to how
+JavaScript interprets `123` (a number) differently from `123n` (a `bigint`).
+
+We do those checks before constructing the `BigInt` to avoid throwing needless
+exceptions and catching them on the parsing function, as this could easily
+become a bottleneck when parsing large JSON values.
+
+In order to do the full roundtrip, we now only need the `toJSON()` counterpart:
+
+[source,javascript]
+----
+BigInt.prototype.toJSON = function() {
+    return this.toString() + "n";
+};
+----
+
+With both `bigIntReviver` and `toJSON` defined, we can now successfully parse
+and encode JavaScript objects with `BigInt` values transparently:
+
+[source,javascript]
+----
+const s = `[
+    null,
+    true,
+    false,
+    -1,
+    3.14,
+    "a string",
+    { "a-number": "-123" },
+    { "a-bigint": "-123n" }
+]`;
+
+const parsed = JSON.parse(s, bigIntReviver);
+const s2 = JSON.stringify(parsed);
+
+console.log(parsed);
+console.log(s2);
+
+console.log(typeof parsed[6]["a-number"])
+console.log(typeof parsed[7]["a-bigint"])
+----
+
+The output of the above is:
+
+[source,javascript]
+----
+[
+  null,
+  true,
+  false,
+  -1,
+  3.14,
+  'a string',
+  { 'a-number': '-123' },
+  { 'a-bigint': -123n }
+]
+[null,true,false,-1,3.14,"a string",{"a-number":"-123"},{"a-bigint":"-123n"}]
+string
+bigint
+----
+
+If you're on a web browser, you can probably try copying and pasting the above
+code into the console right now, as is.
+
+Even though {json-rfc}[`JSON`] doesn't include a `BigInt` number type, encoding
+and decoding them as strings is quite trivial in JavaScript.