diff options
Diffstat (limited to 'src/content/blog/2020/08')
-rw-r--r-- | src/content/blog/2020/08/10/guix-srht.adoc | 128 | ||||
-rw-r--r-- | src/content/blog/2020/08/31/database-i-wish-i-had.adoc | 299 |
2 files changed, 0 insertions, 427 deletions
diff --git a/src/content/blog/2020/08/10/guix-srht.adoc b/src/content/blog/2020/08/10/guix-srht.adoc deleted file mode 100644 index a89e86e..0000000 --- a/src/content/blog/2020/08/10/guix-srht.adoc +++ /dev/null @@ -1,128 +0,0 @@ -= Guix inside sourcehut builds.sr.ht CI -:updatedat: 2020-08-19 - -:nixos: https://man.sr.ht/builds.sr.ht/compatibility.md#nixos -:guix: https://guix.gnu.org/ -:binary-inst: https://guix.gnu.org/manual/en/guix.html#Binary-Installation -:shell-inst: https://git.savannah.gnu.org/cgit/guix.git/plain/etc/guix-install.sh - -After the release of the {nixos}[NixOS images in builds.sr.ht] and much usage of -it, I also started looking at {guix}[Guix] and wondered if I could get it on the -awesome builds.sr.ht service. - -The Guix manual section on the {binary-inst}[binary installation] is very -thorough, and even a {shell-inst}[shell installer script] is provided, but it is -built towards someone installing Guix on their personal computer, and relies -heavily on interactive input. - -I developed the following set of scripts that I have been using for some time to -run Guix tasks inside builds.sr.ht jobs. First, `install-guix.sh`: - -[source,sh] ----- -#!/usr/bin/env bash -set -x -set -Eeuo pipefail - -VERSION='1.0.1' -SYSTEM='x86_64-linux' -BINARY="guix-binary-${VERSION}.${SYSTEM}.tar.xz" - -cd /tmp -wget "https://ftp.gnu.org/gnu/guix/${BINARY}" -tar -xf "${BINARY}" - -sudo mv var/guix /var/ -sudo mv gnu / -sudo mkdir -p ~root/.config/guix -sudo ln -fs /var/guix/profiles/per-user/root/current-guix ~root/.config/guix/current - -GUIX_PROFILE="$(echo ~root)/.config/guix/current" -source "${GUIX_PROFILE}/etc/profile" - -groupadd --system guixbuild -for i in $(seq -w 1 10); -do - useradd -g guixbuild \ - -G guixbuild \ - -d /var/empty \ - -s "$(command -v nologin)" \ - -c "Guix build user ${i}" --system \ - "guixbuilder${i}"; -done - -mkdir -p /usr/local/bin -cd /usr/local/bin -ln -s /var/guix/profiles/per-user/root/current-guix/bin/guix . -ln -s /var/guix/profiles/per-user/root/current-guix/bin/guix-daemon . - -guix archive --authorize < ~root/.config/guix/current/share/guix/ci.guix.gnu.org.pub ----- - -Almost all of it is taken directly from the {binary-inst}[binary installation] -section from the manual, with the interactive bits stripped out: after -downloading and extracting the Guix tarball, we create some symlinks, add -guixbuild users and authorize the `ci.guix.gnu.org.pub` signing key. - -After installing Guix, we perform a `guix pull` to update Guix inside -`start-guix.sh`: - -[source,sh] ----- -#!/usr/bin/env bash -set -x -set -Eeuo pipefail - -sudo guix-daemon --build-users-group=guixbuild & -guix pull -guix package -u -guix --version ----- - -Then we can put it all together in a sample `.build.yml` configuration file I'm -using myself: - -[source,yaml] ----- -image: debian/stable -packages: - - wget -sources: - - https://git.sr.ht/~euandreh/songbooks -tasks: - - install-guix: | - cd ./songbooks/ - ./scripts/install-guix.sh - ./scripts/start-guix.sh - echo 'sudo guix-daemon --build-users-group=guixbuild &' >> ~/.buildenv - echo 'export PATH="${HOME}/.config/guix/current/bin${PATH:+:}$PATH"' >> ~/.buildenv - - tests: | - cd ./songbooks/ - guix environment -m build-aux/guix.scm -- make check - - docs: | - cd ./songbooks/ - guix environment -m build-aux/guix.scm -- make publish-dist ----- - -We have to add the `guix-daemon` to `~/.buildenv` so it can be started on every -following task run. Also, since we used `wget` inside `install-guix.sh`, we had -to add it to the images package list. - -After the `install-guix` task, you can use Guix to build and test your project, -or run any `guix environment --ad-hoc my-package -- my script` :) - -== Improvements - -:repository: https://git.sr.ht/~sircmpwn/builds.sr.ht - -When I originally created this code I had a reason why to have both a `sudo` -call for `sudo ./scripts/install-guix.sh` and `sudo` usages inside -`install-guix.sh` itself. I couldn't figure out why (it feels like my past self -was a bit smarter 😬), but it feels ugly now. If it is truly required I could -add an explanation for it, or remove this entirely in favor of a more elegant -solution. - -I could also contribute the Guix image upstream to builds.sr.ht, but there -wasn't any build or smoke tests in the original {repository}[repository], so I -wasn't inclined to make something that just ``works on my machine'' or add a -maintainence burden to the author. I didn't look at it again recently, though. diff --git a/src/content/blog/2020/08/31/database-i-wish-i-had.adoc b/src/content/blog/2020/08/31/database-i-wish-i-had.adoc deleted file mode 100644 index 7f010b9..0000000 --- a/src/content/blog/2020/08/31/database-i-wish-i-had.adoc +++ /dev/null @@ -1,299 +0,0 @@ -= The database I wish I had -:categories: mediator -:updatedat: 2020-09-03 - -:empty: -:values-talk: https://vimeo.com/230142234 -:haskell-startup: https://www.youtube.com/watch?v=ZR3Jirqk6W8 - -I watched the talk "{values-talk}[Platform as a Reflection of Values: Joyent, -Node.js and beyond]" by Bryan Cantrill, and I think he was able to put into -words something I already felt for some time: if there's no piece of software -out there that reflects your values, it's time for you to build that -software{empty}footnote:talk-time[ - At the very end, at time 29:49. When talking about the draft of this article - with a friend, he noted that Bryan O'Sullivan (a different Bryan) says a - similar thing on his talk "{haskell-startup}[Running a startup on Haskell]", - at time 4:15. -]. - -I kind of agree with what he said, because this is already happening to me. I -long for a database with a certain set of values, and for a few years I was just -waiting for someone to finally write it. After watching his talk, Bryan is -saying to me: "time to stop waiting, and start writing it yourself". - -So let me try to give an overview of such database, and go over its values. - -== Overview - -I want a database that allows me to create decentralized client-side -applications that can sync data. - -The best one-line description I can give right now is: - -____ -It's sort of like PouchDB, Git, Datomic, SQLite and Mentat. -____ - -A more descriptive version could be: - -____ -An embedded, immutable, syncable relational database. -____ - -Let's go over what I mean by each of those aspects one by one. - -=== Embedded - -:sqlite: https://sqlite.org/index.html -:sqlite-whentouse: https://sqlite.org/whentouse.html -:pouchdb: https://pouchdb.com/ -:couchdb: https://couchdb.apache.org/ -:mentat: https://github.com/mozilla/mentat -:pouchdb-adapters: https://pouchdb.com/adapters.html -:datomic-storage-services: https://docs.datomic.com/on-prem/storage.html -:sqlite-amalgamation: https://www.sqlite.org/amalgamation.html -:pointed-out: https://news.ycombinator.com/item?id=24338881 - -I think the server-side database landscape is diverse and mature enough for my -needs (even though I end up choosing SQLite most of the time), and what I'm -after is a database to be embedded on client-side applications itself, be it -desktop, browser, mobile, _etc._ - -The purpose of such database is not to keep some local cache of data in case of -lost connectivity: we have good solutions for that already. It should serve as -the source of truth, and allow the application to work on top of it. - -{sqlite}[*SQLite*] is a great example of that: it is a very powerful relational -database that runs {sqlite-whentouse}[almost anywhere]. What I miss from it -that SQLite doesn't provide is the ability to run it on the browser: even though -you could compile it to WebAssembly, [line-through]#it assumes a POSIX -filesystem that would have to be -emulated#{empty}footnote:posix-sqlite[ - It was {pointed-out}[pointed out to me] that SQLite doesn't assume the - existence of a POSIX filesystem, as I wrongly stated. Thanks for the - correction. -pass:[</p><p>] - This makes me consider it as a storage backend all by itself. I initially - considered having an SQLite storage backend as one implementation of the POSIX - filesystem storage API that I mentioned. My goal was to rely on it so I could - validate the correctness of the actual implementation, given SQLite's - robustness. -pass:[</p><p>] - However it may even better to just use SQLite, and get an ACID backend without - recreating a big part of SQLite from scratch. In fact, both Datomic and - PouchDB didn't create an storage backend for themselves, they just plugged on - what already existed and already worked. I'm beginning to think that it would - be wiser to just do the same, and drop entirely the from scratch - implementation that I mentioned. -pass:[</p><p>] - That's not to say that adding an IndexedDB compatibility layer to SQLite would - be enough to make it fit the other requirements I mention on this page. SQLite - still is an implementation of a update-in-place, SQL, table-oriented database. - It is probably true that cherry-picking the relevant parts of SQLite (like - storage access, consistency, crash recovery, parser generator, *etc.*) and - leaving out the unwanted parts (SQL, tables, threading, *etc.*) would be - better than including the full SQLite stack, that's simply an optimization. - Both could even coexist, if desired. -pass:[</p><p>] - SQLite would have to be treated similarly to how Datomic treats SQL databases: - instead of having a table for each entities, spread attributes over the - tables, *etc.*, it treats SQL databases as a key-value storage so it doesn't - have to re-implement interacting with the disk that other databases do well. -pass:[</p><p>] - The tables would contain blocks of binary data, so there isn't a difference on - how the SQLite storage backend behaves and how the IndexedDB storage backend - behaves, much like how Datomic works the same regardless of the storage - backend, same for PouchDB. -pass:[</p><p>] - I welcome corrections on what I said above, too. -]. - -{pouchdb}[*PouchDB*] is another great example: it's a full reimplementation of -{couchdb}[CouchDB] that targets JavaScript environments, mainly the browser and -Node.js. However I want a tool that can be deployed anywhere, and not limit its -applications to places that already have a JavaScript runtime environment, or -force the developer to bundle a JavaScript runtime environment with their -application. This is true for GTK+ applications, command line programs, Android -apps, _etc._ - -{mentat}[*Mentat*] was an interesting project, but its reliance on SQLite makes -it inherit most of the downsides (and benefits too) of SQLite itself. - -Having such a requirement imposes a different approach to storage: we have to -decouple the knowledge about the intricacies of storage from the usage of -storage itself, so that a module (say query processing) can access storage -through an API without needing to know about its implementation. This allows -the database to target a POSIX filesystems storage API and an IndexedDB storage -API, and make the rest of the code agnostic about storage. PouchDB has such -mechanism (called {pouchdb-adapters}[adapters]) and Datomic has them too (called -{datomic-storage-services}[storage services]). - -This would allow the database to adapt to where it is embedded: when targeting -the browser the IndexedDB storage API would provide the persistence layer that -the database requires, and similarly the POSIX filesystem storage API would -provide the persistence layer when targeting POSIX systems (like desktops, -mobile, _etc._). - -But there's also an extra restriction that comes from by being embedded: it -needs to provide and embeddable artifact, most likely a binary library object -that exposes a C compatible FFI, similar to {sqlite-amalgamation}[how SQLite -does]. Bundling a full runtime environment is possible, but doesn't make it a -compelling solution for embedding. This rules out most languages, and leaves -us with C, Rust, Zig, and similar options that can target POSIX systems and -WebAssembly. - -=== Immutable - -:datomic: https://www.datomic.com/ -:day-of-datomic: https://vimeo.com/116315075 -:git: https://git-scm.com/ -:sqlite-limits: https://sqlite.org/limits.html -:datomic-no-history: https://docs.datomic.com/cloud/best.html#nohistory-for-high-churn - -Being immutable means that only new information is added, no in-place update -ever happens, and nothing is ever deleted. - -Having an immutable database presents us with similar trade-offs found in -persistent data structures, like lack of coordination when doing reads, caches -being always coherent, and more usage of space. - -{datomic}[*Datomic*] is the go to database example of this: it will only add -information (datoms) and allows you to query them in a multitude of ways. -Stuart Halloway calls it "accumulate-only" over -"append-only"{empty}footnote:accumulate-only[ - Video "{day-of-datomic}[Day of Datomic Part 2]" on Datomic's information - model, at time 12:28. -]: - -____ -It's accumulate-only, it is not append-only. So append-only, most people when -they say that they're implying something physical about what happens. -____ - -Also a database can be append-only and overwrite existing information with new -information, by doing clean-ups of "stale" data. I prefer to adopt the -"accumulate-only" naming and approach. - -{git}[*Git*] is another example of this: new commits are always added on top of -the previous data, and it grows by adding commits instead of replacing existing -ones. - -Git repositories can only grow in size, and that is not only an acceptable -condition, but also one of the reasons to use it. - -All this means that no in-place updates happens on data, and the database will -be much more concerned about how compact and efficiently it stores data than how -fast it does writes to disk. Being embedded, the storage limitation is either -a) how much storage the device has or b) how much storage was designed for the -application to consume. So even though the database could theoretically operate -with hundreds of TBs, a browser page or mobile application wouldn't have access -to this amount of storage. SQLite even {sqlite-limits}[says] that it does -support approximately 280 TBs of data, but those limits are untested. - -The upside of keeping everything is that you can have historical views of your -data, which is very powerful. This also means that applications should turn -this off when not -relevant{empty}footnote:no-history[ - Similar to {datomic-no-history}[Datomic's `:db/noHistory`]. -]. - -=== Syncable - -:3-way-merge: https://en.wikipedia.org/wiki/Merge_(version_control) -:git-remote-gcrypt: https://spwhitton.name/tech/code/git-remote-gcrypt/ - -This is a frequent topic when talking about offline-first solutions. When -building applications that: - -* can fully work offline, -* stores data, -* propagates that data to other application instances, - -then you'll need a conflict resolution strategy to handle all the situations -where different application instances disagree. Those application instances -could be a desktop and a browser version of the same application, or the same -mobile app in different devices. - -A three-way merge seems to be the best approach, on top of which you could add -application specific conflict resolution functions, like: - -* pick the change with higher timestamp; -* if one change is a delete, pick it; -* present the diff on the screen and allow the user to merge them. - -Some databases try to make this "easy", by choosing a strategy for you, but I've -found that different applications require different conflict resolution -strategies. Instead, the database should leave this up to the user to decide, -and provide tools for them to do it. - -{3-way-merge}[*Three-way merges in version control*] are the best example, -performing automatic merges when possible and asking the user to resolve -conflicts when they appear. - -The unit of conflict for a version control system is a line of text. The -database equivalent would probably be a single attribute, not a full entity or a -full row. - -Making all the conflict resolution logic be local should allow the database to -have encrypted remotes similar to how {git-remote-gcrypt}[git-remote-gcrypt] -adds this functionality to Git. This would enable users to sync the application -data across devices using an untrusted intermediary. - -=== Relational - -:datomic-datalog: https://docs.datomic.com/on-prem/query.html -:datomic-model: https://docs.datomic.com/cloud/whatis/data-model.html#datoms - -I want the power of relational queries on the client applications. - -Most of the arguments against traditional table-oriented relational databases -are related to write performance, but those don't apply here. The bottlenecks -for client applications usually aren't write throughput. Nobody is interested -in differentiating between 1 MB/s or 10 MB/s when you're limited to 500 MB -total. - -The relational model of the database could either be based on SQL and tables -like in SQLite, or maybe {datomic-datalog}[datalog] and {datomic-model}[datoms] -like in Datomic. - -== From aspects to values - -Now let's try to translate the aspects above into values, as suggested by Bryan -Cantrill. - -=== Portability - -Being able to target so many different platforms is a bold goal, and the -embedded nature of the database demands portability to be a core value. - -=== Integrity - -When the local database becomes the source of truth of the application, it must -provide consistency guarantees that enables applications to rely on it. - -=== Expressiveness - -The database should empower applications to slice and dice the data in any way -it wants to. - -== Next steps - -Since I can't find any database that fits these requirements, I've finally come -to terms with doing it myself. - -It's probably going to take me a few years to do it, and making it portable -between POSIX and IndexedDB will probably be the biggest challenge. I got -myself a few books on databases to start. - -I wonder if I'll ever be able to get this done. - -== External links - -:reddit: https://old.reddit.com/r/programming/comments/ijwz5b/the_database_i_wish_i_had/ -:lobsters: https://lobste.rs/s/m9vkg4/database_i_wish_i_had -:hn: https://news.ycombinator.com/item?id=24337244 -:list: https://lists.sr.ht/~euandreh/public-inbox/%3C010101744a592b75-1dce9281-f0b8-4226-9d50-fd2c7901fa72-000000%40us-west-2.amazonses.com%3E - -See discussions on {reddit}[Reddit], {lobsters}[lobsters], {hn}[HN] and {list}[a -lengthy email exchange]. |