Diffstat (limited to 'src/content/blog/2020')
-rw-r--r--  src/content/blog/2020/08/31/database-i-with-i-had.adoc     4
-rw-r--r--  src/content/blog/2020/10/19/feature-flags.adoc           315
-rw-r--r--  src/content/blog/2020/10/20/wrong-interviewing.adoc      302
-rw-r--r--  src/content/blog/2020/11/07/diy-bugs.adoc                122
-rw-r--r--  src/content/blog/2020/11/08/paradigm-shift-review.adoc   163
-rw-r--r--  src/content/blog/2020/11/12/database-parsers-trees.adoc  227
-rw-r--r--  src/content/blog/2020/11/14/local-first-review.adoc      218
7 files changed, 661 insertions, 690 deletions
diff --git a/src/content/blog/2020/08/31/database-i-with-i-had.adoc b/src/content/blog/2020/08/31/database-i-with-i-had.adoc
index fdcb56c..7533c8a 100644
--- a/src/content/blog/2020/08/31/database-i-with-i-had.adoc
+++ b/src/content/blog/2020/08/31/database-i-with-i-had.adoc
@@ -62,8 +62,8 @@ the source of truth, and allow the application to work on top of it.
{sqlite}[*SQLite*] is a great example of that: it is a very powerful relational
database that runs {sqlite-whentouse}[almost anywhere]. What I miss, and what
SQLite doesn't provide, is the ability to run it in the browser: even though
-you could compile it to WebAssembly, [line-through]*it assumes a POSIX
-filesystem that would have to be emulated*[multiblock footnote omitted].
+you could compile it to WebAssembly, [line-through]#it assumes a POSIX
+filesystem that would have to be emulated#[multiblock footnote omitted FIXME].
{pouchdb}[*PouchDB*] is another great example: it's a full reimplementation of
{couchdb}[CouchDB] that targets JavaScript environments, mainly the browser and
diff --git a/src/content/blog/2020/10/19/feature-flags.adoc b/src/content/blog/2020/10/19/feature-flags.adoc
index c62c2d1..c9adc8a 100644
--- a/src/content/blog/2020/10/19/feature-flags.adoc
+++ b/src/content/blog/2020/10/19/feature-flags.adoc
@@ -1,305 +1,304 @@
----
-title: "Feature flags: differences between backend, frontend and mobile"
-date: 2020-10-19
-updated_at: 2020-11-03
-layout: post
-lang: en
-ref: feature-flags-differences-between-backend-frontend-and-mobile
-eu_categories: presentation
----
-
-*This article is derived from a [presentation][presentation] on the same
-subject.*
-
-When discussing about feature flags, I find that their
-costs and benefits are often well exposed and addressed. Online articles like
-"[Feature Toggle (aka Feature Flags)][feature-flags-article]" do a great job of
+= Feature flags: differences between backend, frontend and mobile
+
+:empty:
+:slides: link:../../../../slides/2020/10/19/feature-flags.html
+:fowler-article: https://martinfowler.com/articles/feature-toggles.html
+
+_This article is derived from a {slides}[presentation] on the same subject._
+
+When discussing feature flags, I find that their costs and benefits are
+often well exposed and addressed. Online articles like
+"{fowler-article}[Feature Toggle (aka Feature Flags)]" do a great job of
explaining them in detail, giving great general guidance on how to apply
techniques to adopt them.
However, the weight of those costs and benefits applies differently on backend,
-frontend or mobile, and those differences aren't covered. In fact, many of them
+frontend or mobile, and those differences aren't covered. In fact, many of them
stop making sense, or the decision of adopting a feature flag or not may change
depending on the environment.
In this article I try to make the distinction between environments and how
- feature flags apply to them, with some final best practices I've acquired when
- using them in production.
+feature flags apply to them, with some final best practices I've acquired when
+using them in production.
-[presentation]: {% link _slides/2020-10-19-rollout-feature-flag-experiment-operational-toggle.slides %}
-[feature-flags-article]: https://martinfowler.com/articles/feature-toggles.html
+== Why feature flags
-## Why feature flags
+:atlassian-cicd: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment
Feature flags in general tend to be cited on the context of
-[continuous deployment][cd]:
+{atlassian-cicd}[continuous deployment]:
-> A: With continuous deployment, you deploy to production automatically
+____
+A: With continuous deployment, you deploy to production automatically
-> B: But how do I handle deployment failures, partial features, *etc.*?
+B: But how do I handle deployment failures, partial features, _etc._?
-> A: With techniques like canary, monitoring and alarms, feature flags, *etc.*
+A: With techniques like canary, monitoring and alarms, feature flags, _etc._
+____
-Though adopting continuous deployment doesn't force you to use feature
-flags, it creates a demand for it. The inverse is also true: using feature flags
-on the code points you more obviously to continuous deployment. Take the
-following code sample for example, that we will reference later on the article:
+Though adopting continuous deployment doesn't force you to use feature flags, it
+creates a demand for them. The inverse is also true: using feature flags in the
+code points you more obviously to continuous deployment. Take the following
+code sample, which we will reference later in the article:
-```javascript
+[source,javascript]
+----
function processTransaction() {
- validate();
- persist();
- // TODO: add call to notifyListeners()
+ validate();
+ persist();
+ // TODO: add call to notifyListeners()
}
-```
+----
While being developed, being tested for suitability or something similar,
-`notifyListeners()` may not be included in the code at once. So instead of
+`notifyListeners()` may not be included in the code at once. So instead of
keeping it on a separate, long-lived branch, a feature flag can decide when the
new, partially implemented function will be called:
-```javascript
+[source,javascript]
+----
function processTransaction() {
- validate();
- persist();
- if (featureIsEnabled("activate-notify-listeners")) {
- notifyListeners();
- }
+ validate();
+ persist();
+ if (featureIsEnabled("activate-notify-listeners")) {
+ notifyListeners();
+ }
}
-```
+----
This allows your code to include `notifyListeners()`, and decide when to call it
-at runtime. For the price of extra things around the code, you get more
+at runtime. For the price of some extra complexity around the code, you get more
dynamicity.
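+For illustration, here is a minimal sketch of what the `featureIsEnabled()`
+helper used above could look like, assuming the flag rules already live in an
+in-memory map (the map contents and rule shape are invented for this example):
+[source,javascript]
+----
+// Illustrative in-memory flag store; a real setup would load this from
+// configuration or from a flag service when the application starts.
+const flags = new Map([
+  ["activate-notify-listeners", { enabled: true }],
+]);
+
+function featureIsEnabled(name) {
+  const flag = flags.get(name);
+  // Unknown flags default to "off", so a half-finished feature is never
+  // turned on by accident.
+  return flag !== undefined && flag.enabled;
+}
+----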
So the fundamental question to ask yourself when considering adding a feature
flag should be:
-> Am I willing to pay with code complexity to get dynamicity?
+____
+Am I willing to pay with code complexity to get dynamicity?
+____
-It is true that you can make the management of feature flags as
-straightforward as possible, but having no feature flags is simpler than having
-any. What you get in return is the ability to parameterize the behaviour of the
-application at runtime, without doing any code changes.
+It is true that you can make the management of feature flags as straightforward
+as possible, but having no feature flags is simpler than having any. What you
+get in return is the ability to parameterize the behaviour of the application at
+runtime, without making any code changes.
Sometimes this added complexity may tilt the balance towards not using a feature
flag, and sometimes the flexibility of changing behaviour at runtime is
-absolutely worth the added complexity. This can vary a lot by code base, feature, but
-fundamentally by environment: its much cheaper to deploy a new version of a
-service than to release a new version of an app.
+absolutely worth the added complexity. This can vary a lot by code base and
+feature, but fundamentally by environment: it's much cheaper to deploy a new
+version of a service than to release a new version of an app.
So the question of which environment is being targeted is key when reasoning
about costs and benefits of feature flags.
-[cd]: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment
+== Control over the environment
-## Control over the environment
+:fdroid: https://f-droid.org/
+:bad-apple: https://www.paulgraham.com/apple.html
The key differentiator that makes the trade-offs apply differently is how much
control you have over the environment.
-When running a **backend** service, you usually are paying for the servers
-themselves, and can tweak them as you wish. This means you have full control do
-to code changes as you wish. Not only that, you decide when to do it, and for
+When running a *backend* service, you usually are paying for the servers
+themselves, and can tweak them as you wish. This means you have full control to
+make code changes as needed. Not only that, you decide when to do it, and for
how long the transition will last.
-On the **frontend** you have less control: even though you can choose to make a
-new version available any time you wish, you can't force[^force] clients to
-immediately switch to the new version. That means that a) clients could skip
-upgrades at any time and b) you always have to keep backward and forward
-compatibility in mind.
+On the *frontend* you have less control: even though you can choose to make a
+new version available any time you wish, you can't
+force{empty}footnote:force[
+ Technically you could force a reload with JavaScript using
+ `window.location.reload()`, but that not only is invasive and impolite, but
+ also gives you the illusion that you have control over the client when you
+ actually don't: clients with disabled JavaScript would be immune to such
+ tactics.
+] clients to immediately switch to the new version. That means that a) clients
+could skip upgrades at any time and b) you always have to keep backward and
+forward compatibility in mind.
Even though I'm mentioning frontend directly, it applies to other environments
with similar characteristics: desktop applications, command-line programs,
-*etc*.
+_etc_.
-On **mobile** you have even less control: app stores need to allow your app to
-be updated, which could bite you when least desired. Theoretically you could
-make you APK available on third party stores like [F-Droid][f-droid], or even
-make the APK itself available for direct download, which would give you the same
+On *mobile* you have even less control: app stores need to allow your app to be
+updated, which could bite you when least desired. Theoretically you could make
+your APK available on third party stores like {fdroid}[F-Droid], or even make the
+APK itself available for direct download, which would give you the same
characteristics of a frontend application, but that happens less often.
-On iOS you can't even do that. You have to get Apple's blessing on every single
-update. Even though we already know that is a [bad idea][apple] for over a
-decade now, there isn't a way around it. This is where you have the least
+On iOS you can't even do that. You have to get Apple's blessing on every single
+update. Even though we have known that is a {bad-apple}[bad idea] for over a
+decade now, there isn't a way around it. This is where you have the least
control.
In practice, the amount of control you have will change how much you value
-dynamicity: the less control you have, the more valuable it is. In other words,
+dynamicity: the less control you have, the more valuable it is. In other words,
having a dynamic flag on the backend may or may not be worth it since you could
always update the code immediately after, but on iOS it is basically always
worth it.
-[f-droid]: https://f-droid.org/
-[^force]: Technically you could force a reload with JavaScript using
- `window.location.reload()`, but that not only is invasive and impolite, but
- also gives you the illusion that you have control over the client when you
- actually don't: clients with disabled JavaScript would be immune to such
- tactics.
+== Rollout
-[apple]: http://www.paulgraham.com/apple.html
+:kubernetes-deployment: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment
+:play-store-rollout: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en
+:app-store-rollout: https://help.apple.com/app-store-connect/#/dev3d65fcee1
-## Rollout
-
-A rollout is used to *roll out* a new version of software.
+A rollout is used to _roll out_ a new version of software.
They are usually short-lived, being relevant as long as the new code is being
-deployed. The most common rule is percentages.
+deployed. The most common rule is percentages.
-On the **backend**, it is common to find it on the deployment infrastructure
-itself, like canary servers, blue/green deployments,
-[a kubernetes deployment rollout][k8s], *etc*. You could do those manually, by
-having a dynamic control on the code itself, but rollbacks are cheap enough that
-people usually do a normal deployment and just give some extra attention to the
-metrics dashboard.
+On the *backend*, it is common to find it on the deployment infrastructure
+itself, like canary servers, blue/green deployments, {kubernetes-deployment}[a
+kubernetes deployment rollout], _etc_. You could do those manually, by having a
+dynamic control on the code itself, but rollbacks are cheap enough that people
+usually do a normal deployment and just give some extra attention to the metrics
+dashboard.
Any time you see a blue/green deployment, there is a rollout happening: most
likely a load balancer is starting to direct traffic to the new server, until
-reaching 100% of the traffic. Effectively, that is a rollout.
+reaching 100% of the traffic. Effectively, that is a rollout.
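+As a sketch of such a manual, dynamic percentage control (not tied to any
+specific infrastructure; the function names below are made up for the example),
+a deterministic hash of a stable identifier keeps each user on the same side of
+the rollout across requests:
+[source,javascript]
+----
+// Deterministically bucket an identifier into the range [0, 100).
+function bucketOf(identifier) {
+  let hash = 0;
+  for (const char of identifier) {
+    hash = (hash * 31 + char.charCodeAt(0)) % 100;
+  }
+  return hash;
+}
+
+// percentage: how much of the traffic should get the new version.
+function isInRollout(userId, percentage) {
+  return bucketOf(userId) < percentage;
+}
+----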
-On the **frontend**, you can selectively pick which user's will be able to
-download the new version of a page. You could use geographical region, IP,
+On the *frontend*, you can selectively pick which users will be able to
+download the new version of a page. You could use geographical region, IP,
cookie or something similar to make this decision.
-CDN propagation delays and people not refreshing their web
-pages are also rollouts by themselves, since old and new versions of the
-software will coexist.
+CDN propagation delays and people not refreshing their web pages are also
+rollouts by themselves, since old and new versions of the software will coexist.
-On **mobile**, the Play Store allows you to perform
-fine-grained [staged rollouts][staged-rollouts], and the App Store allows you to
-perform limited [phased releases][phased-releases].
+On *mobile*, the Play Store allows you to perform fine-grained
+{play-store-rollout}[staged rollouts], and the App Store allows you to perform
+limited {app-store-rollout}[phased releases].
Both on Android and iOS, it is the user who actually downloads the new version.
In summary: since you control the servers on the backend, you can do rollouts at
-will, and those are often found automated away in base infrastructure. On the
+will, and those are often found automated away in base infrastructure. On the
frontend and on mobile, there are ways to make new versions available, but users
may not download them immediately, and many different versions of the software
end up coexisting.
-[k8s]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment
-[staged-rollouts]: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en
-[phased-releases]: https://help.apple.com/app-store-connect/#/dev3d65fcee1
-
-## Feature flag
+== Feature flag
-A feature flag is a *flag* that tells the application on runtime to turn on or
-off a given *feature*. That means that the actual production code will have more
-than one possible code paths to go through, and that a new version of a feature
-coexists with the old version. The feature flag tells which part of the code to
-go through.
+A feature flag is a _flag_ that tells the application on runtime to turn on or
+off a given _feature_. That means that the actual production code will have
+more than one possible code paths to go through, and that a new version of a
+feature coexists with the old version. The feature flag tells which part of the
+code to go through.
They are usually medium-lived, being relevant as long as the new code is being
-developed. The most common rules are percentages, allow/deny lists, A/B groups
+developed. The most common rules are percentages, allow/deny lists, A/B groups
and client version.
-On the **backend**, those are useful for things that have a long development
-cycle, or that needs to done by steps. Consider loading the feature flag rules
-in memory when the application starts, so that you avoid querying a database
-or an external service for applying a feature flag rule and avoid flakiness on
-the result due to intermittent network failures.
+On the *backend*, those are useful for things that have a long development
+cycle, or that need to be done in steps. Consider loading the feature flag rules
+in memory when the application starts, so that you avoid querying a database or
+an external service for applying a feature flag rule and avoid flakiness on the
+result due to intermittent network failures.
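+A rough sketch of that startup loading, assuming the rules come from a plain
+`feature-flags.json` file (the file name and rule shape are assumptions of the
+example, not a prescription):
+[source,javascript]
+----
+const fs = require("fs");
+
+let rules = {};
+
+// Called once when the application starts, so that evaluating a flag later
+// never has to hit a database or an external service.
+function loadFeatureFlagRules() {
+  rules = JSON.parse(fs.readFileSync("feature-flags.json", "utf8"));
+}
+
+function ruleFor(flagName) {
+  // Unknown flags fall back to a rule that keeps the feature off.
+  return rules[flagName] ?? { enabled: false };
+}
+
+loadFeatureFlagRules();
+----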
-Since on the **frontend** you don't control when to update the client software,
+Since on the *frontend* you don't control when to update the client software,
you're left with applying the feature flag rule on the server, and exposing the
-value through an API for maximum dynamicity. This could be in the frontend code
+value through an API for maximum dynamicity. This could be in the frontend code
itself, and fallback to a "just refresh the page"/"just update to the latest
version" strategy for less dynamic scenarios.
-On **mobile** you can't even rely on a "just update to the latest version"
+On *mobile* you can't even rely on a "just update to the latest version"
strategy, since the code for the app could be updated to a new feature and be
-blocked on the store. Those cases aren't recurrent, but you should always assume
-the store will deny updates on critical moments so you don't find yourself with
-no cards to play. That means the only control you actually have is via
-the backend, by parameterizing the runtime of the application using the API. In
-practice, you should always have a feature flag to control any relevant piece of
-code. There is no such thing as "too small code change for a feature flag". What
-you should ask yourself is:
-
-> If the code I'm writing breaks and stays broken for around a month, do I care?
+blocked on the store. Those cases aren't recurrent, but you should always
+assume the store will deny updates on critical moments so you don't find
+yourself with no cards to play. That means the only control you actually have
+is via the backend, by parameterizing the runtime of the application using the
+API. In practice, you should always have a feature flag to control any relevant
+piece of code. There is no such thing as "too small code change for a feature
+flag". What you should ask yourself is:
+
+____
+If the code I'm writing breaks and stays broken for around a month, do I care?
+____
If you're doing an experimental screen, or something that will have a very small
-impact you might answer "no" to the above question. For everything else, the
+impact you might answer "no" to the above question. For everything else, the
answer will be "yes": bug fixes, layout changes, refactoring, new screen,
-filesystem/database changes, *etc*.
+filesystem/database changes, _etc_.
-## Experiment
+== Experiment
An experiment is a feature flag where you care about the analytical value of the
-flag, and how it might impact user's behaviour. A feature flag with analytics.
+flag, and how it might impact users' behaviour. A feature flag with analytics.
They are also usually medium-lived, being relevant as long as the new code is
-being developed. The most common rule is A/B test.
+being developed. The most common rule is A/B test.
-On the **backend**, an experiment rely on an analytical environment that will
-pick the A/B test groups and distributions, which means those can't be held in
-memory easily. That also means that you'll need a fallback value in case
-fetching the group for a given customer fails.
+On the *backend*, an experiment relies on an analytical environment that will pick
+the A/B test groups and distributions, which means those can't be held in memory
+easily. That also means that you'll need a fallback value in case fetching the
+group for a given customer fails.
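+A hedged sketch of that fallback, assuming a hypothetical
+`analyticsService.groupFor()` call that may fail over the network (the service
+and the experiment name are placeholders):
+[source,javascript]
+----
+// "control" keeps the current behaviour whenever the assignment is unknown.
+const FALLBACK_GROUP = "control";
+
+async function experimentGroup(analyticsService, customerId) {
+  try {
+    return await analyticsService.groupFor("new-checkout-flow", customerId);
+  } catch (error) {
+    // An intermittent network failure should not break the feature; log it
+    // and fall back to the control group.
+    console.error("failed to fetch experiment group", error);
+    return FALLBACK_GROUP;
+  }
+}
+----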
-On the **frontend** and on **mobile** they are no different from feature flags.
+On the *frontend* and on *mobile* they are no different from feature flags.
-## Operational toggle
+== Operational toggle
An operational toggle is like a system-level manual circuit breaker, where you
-turn on/off a feature, fail over the load to a different server, *etc*. They are
-useful switches to have during an incident.
+turn on/off a feature, fail over the load to a different server, _etc_. They
+are useful switches to have during an incident.
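+As an illustration, such a toggle can guard the call to a dependency and fail
+over to a secondary one; the toggle name and the two servers below are invented
+for the sketch, and `featureIsEnabled()` is the helper sketched earlier:
+[source,javascript]
+----
+function fetchQuotes(primaryServer, backupServer) {
+  // Flipped by hand during an incident, e.g. when the primary is degraded.
+  if (featureIsEnabled("ops-failover-quotes-to-backup")) {
+    return backupServer.fetchQuotes();
+  }
+  return primaryServer.fetchQuotes();
+}
+----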
They are usually long-lived, being relevant as long as the code is in
-production. The most common rule is percentages.
+production. The most common rule is percentages.
They can be feature flags that are promoted to operational toggles on the
-**backend**, or may be purposefully put in place preventively or after a
+*backend*, or may be purposefully put in place preventively or after a
postmortem analysis.
-On the **frontend** and on **mobile** they are similar to feature flags, where
-the "feature" is being turned on and off, and the client interprets this value
-to show if the "feature" is available or unavailable.
+On the *frontend* and on *mobile* they are similar to feature flags, where the
+"feature" is being turned on and off, and the client interprets this value to
+show if the "feature" is available or unavailable.
-## Best practices
+== Best practices
-### Prefer dynamic content
+=== Prefer dynamic content
Even though feature flags give you more dynamicity, they're still somewhat
manual: you have to create one for a specific feature and change it by hand.
If you find yourself manually updating a feature flag every other day, or
-tweaking the percentages frequently, consider making it fully dynamic. Try
+tweaking the percentages frequently, consider making it fully dynamic. Try
using a dataset that is generated automatically, or computing the content on the
fly.
Say you have a configuration screen with a list of options and sub-options, and
-you're trying to find how to better structure this list. Instead of using a
-feature flag for switching between 3 and 5 options, make it fully dynamic. This
+you're trying to find how to better structure this list. Instead of using a
+feature flag for switching between 3 and 5 options, make it fully dynamic. This
way you'll be able to perform other tests that you didn't plan, and get more
flexibility out of it.
-### Use the client version to negotiate feature flags
+=== Use the client version to negotiate feature flags
After effectively finishing a feature, the old code that coexisted with the new
one will be deleted, and all traces of the transition will vanish from the code
-base. However if you just remove the feature flags from the API, all of the old
+base. However, if you just remove the feature flags from the API, all of the old
versions of clients that relied on that value to show the new feature will
downgrade to the old feature.
This means that you should avoid deleting client-facing feature flags, and
retire them instead: use the client version to decide when the feature is
stable, and return `true` for every client with a version greater or equal to
-that. This way you can stop thinking about the feature flag, and you don't break
-or downgrade clients that didn't upgrade past the transition.
+that. This way you can stop thinking about the feature flag, and you don't
+break or downgrade clients that didn't upgrade past the transition.
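+A sketch of that retirement rule on the server side, assuming the client sends
+a semver-like version with each request (the version number below is a
+placeholder):
+[source,javascript]
+----
+// Version at which the feature became permanently enabled.
+const STABLE_SINCE = [2, 14, 0];
+
+function retiredFlagIsEnabled(clientVersion) {
+  const parts = clientVersion.split(".").map(Number);
+  for (let i = 0; i < STABLE_SINCE.length; i++) {
+    if ((parts[i] || 0) > STABLE_SINCE[i]) return true;
+    if ((parts[i] || 0) < STABLE_SINCE[i]) return false;
+  }
+  // Versions greater than or equal to STABLE_SINCE always get the feature.
+  return true;
+}
+----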
-### Beware of many nested feature flags
+=== Beware of many nested feature flags
Nested flags combine exponentially.
Pick strategic entry points or transitions eligible for feature flags, and
beware of their nesting.
-### Include feature flags in the development workflow
+=== Include feature flags in the development workflow
Add feature flags to the list of things to think about during whiteboarding, and
deleting/retiring a feature flag at the end of the development.
-### Always rely on a feature flag on the app
+=== Always rely on a feature flag on the app
-Again, there is no such thing "too small for a feature flag". Too many feature
-flags is a good problem to have, not the opposite. Automate the process of
+Again, there is no such thing as "too small for a feature flag". Too many feature
+flags is a good problem to have, not the opposite. Automate the process of
creating a feature flag to lower its cost.
diff --git a/src/content/blog/2020/10/20/wrong-interviewing.adoc b/src/content/blog/2020/10/20/wrong-interviewing.adoc
index 9cdfefb..89f93b8 100644
--- a/src/content/blog/2020/10/20/wrong-interviewing.adoc
+++ b/src/content/blog/2020/10/20/wrong-interviewing.adoc
@@ -1,51 +1,49 @@
----
-title: How not to interview engineers
-date: 2020-10-20
-updated_at: 2020-10-24
-layout: post
-lang: en
-ref: how-not-to-interview-engineers
----
-This is a response to Slava's
-"[How to interview engineers][how-to-interview-engineers]" article. I initially
-thought it was a satire, [as have others][poes-law-comment], but he has
-[doubled down on it][slava-on-satire]:
-
-> (...) Some parts are slightly exaggerated for sure, but the essay isn't meant
-> as a joke.
+= How not to interview engineers
+
+:bad-article: https://defmacro.substack.com/p/how-to-interview-engineers
+:satire-comment: https://defmacro.substack.com/p/how-to-interview-engineers/comments#comment-599996
+:double-down: https://twitter.com/spakhm/status/1315754730740617216
+:poes-law: https://en.wikipedia.org/wiki/Poe%27s_law
+:hn-comment-1: https://news.ycombinator.com/item?id=24757511
+
+This is a response to Slava's "{bad-article}[How to interview engineers]"
+article. I initially thought it was a satire, {satire-comment}[as have others],
+but he has {double-down}[doubled down on it]:
+
+____
+(...) Some parts are slightly exaggerated for sure, but the essay isn't meant as
+a joke.
+____
That being true, he completely misses the point on how to improve hiring, and
-proposes a worse alternative on many aspects. It doesn't qualify as provocative,
-it is just wrong.
+proposes a worse alternative on many aspects. It doesn't qualify as
+provocative, it is just wrong.
I was comfortable taking it as a satire, and I would just ignore the whole thing
if it wasn't (except for the technical memo part), but friends of mine
-considered it to be somewhat reasonable. This is a adapted version of parts of
-the discussions we had, risking becoming a gigantic showcase of
-[Poe's law][poes-law-wiki].
+considered it to be somewhat reasonable. This is an adapted version of parts of
+the discussions we had, risking becoming a gigantic showcase of {poes-law}[Poe's
+law].
In this piece, I will argue against his view, and propose an alternative
approach to improve hiring.
It is common to find people saying how broken technical hiring is, as well put
-in words by a phrase on [this comment][hn-satire]:
+in words by a phrase on {hn-comment-1}[this comment]:
-> Everyone loves to read and write about how developer interviewing is flawed,
-> but no one wants to go out on a limb and make suggestions about how to improve
-> it.
+____
+Everyone loves to read and write about how developer interviewing is flawed, but
+no one wants to go out on a limb and make suggestions about how to improve it.
+____
I guess Slava was trying not to fall into this trap, and make a suggestion on how
to improve instead, which all went terribly wrong.
-[how-to-interview-engineers]: https://defmacro.substack.com/p/how-to-interview-engineers
-[poes-law-comment]: https://defmacro.substack.com/p/how-to-interview-engineers/comments#comment-599996
-[slava-on-satire]: https://twitter.com/spakhm/status/1315754730740617216
-[poes-law-wiki]: https://en.wikipedia.org/wiki/Poe%27s_law
-[hn-satire]: https://news.ycombinator.com/item?id=24757511
+== What not to do
-## What not to do
+=== Time candidates
-### Time candidates
+:hammock-driven-talk: https://www.youtube.com/watch?v=f84n5oFoZBc
Timing the candidate shows up on the "talent" and "judgment" sections, and they
are both bad ideas for the same reason: programming is not a performance.
@@ -55,270 +53,280 @@ psychologists.
For a pianist, their state of mind during concerts is crucial: they not only
must be able to deal with stage anxiety, but to become really successful they
-will have to learn how to exploit it. The time window of the concert is what
+will have to learn how to exploit it. The time window of the concert is what
people practice thousands of hours for, and it is what defines one's career,
since how well all the practice went is irrelevant to the nature of the
-profession. Being able to leverage stage anxiety is an actual goal of them.
+profession. Being able to leverage stage anxiety is an actual goal of theirs.
That is also applicable to athletes, where the execution during a competition
makes them sink or swim, regardless of how all the training was.
-The same cannot be said about composers, though. They are more like book
+The same cannot be said about composers, though. They are more like book
writers, where the value is not on very few moments with high adrenaline, but on
-the aggregate over hours, days, weeks, months and years. A composer may have a
+the aggregate over hours, days, weeks, months and years. A composer may have a
deadline to finish a song in five weeks, but it doesn't really matter if it is
done on a single night, every morning between 6 and 9, at the very last week, or
-any other way. No rigid time structure applies, only whatever fits best to the
+any other way. No rigid time structure applies, only whatever fits best to the
composer.
Programming is more like composing than doing a concert, which is another way of
-saying that programming is not a performance. People don't practice algorithms
+saying that programming is not a performance. People don't practice algorithms
for months to keep them at their fingertips, so that finally in a single
afternoon they can sit down and write everything at once in a rigid 4 hours
window, and launch it immediately after.
Instead, software is built iteratively, by making small additions, then
-refactoring the implementation, fixing bugs, writing a lot at once, *etc*.
-all while they get a firmer grasp of the problem, stop to think about it, come
-up with new ideas, *etc*.
+refactoring the implementation, fixing bugs, writing a lot at once, _etc_. all
+while they get a firmer grasp of the problem, stop to think about it, come up
+with new ideas, _etc_.
Some specifically plan for including spaced pauses, and call it
-"[Hammock Driven Development][hammock-driven-development]", which is just
-artist's "creative idleness" for hackers.
+"{hammock-driven-talk}[Hammock Driven Development]", which is just artist's
+"creative idleness" for hackers.
Unless you're hiring for a live coding group, a competitive programming team, or
a professional live demoer, timing the candidate that way is more harmful than
-useful. This type of timing doesn't find good programmers, it finds performant
+useful. This type of timing doesn't find good programmers, it finds performant
programmers, which isn't the same thing, and you'll end up with people who can
do great work on small problems but who might be unable to deal with big
-problems, and loose those who can very well handle huge problems, slowly. If you
-are lucky you'll get performant people who can also handle big problems on the
-long term, but maybe not.
+problems, and lose those who can very well handle huge problems, slowly. If
+you are lucky you'll get performant people who can also handle big problems on
+the long term, but maybe not.
An incident is as close to a "performance" as it gets, and yet it is still
-dramatically different. Surely it is a high stress scenario, but while people
+dramatically different. Surely it is a high stress scenario, but while people
are trying to find a root cause and solve the problem, only the downtime itself
-is visible to the exterior. It is like being part of the support staff backstage
-during a play: even though execution matters, you're still not on the spot.
-During an incident you're doing debugging in anger rather than live coding.
+is visible to the exterior. It is like being part of the support staff
+backstage during a play: even though execution matters, you're still not on the
+spot. During an incident you're doing debugging in anger rather than live
+coding.
-Although giving a candidate the task to write a "technical memo" has
-potential to get a measure of the written communication skills of someone, doing
-so in a hard time window also misses the point for the same reasons.
+Although giving a candidate the task to write a "technical memo" has potential
+to get a measure of the written communication skills of someone, doing so in a
+hard time window also misses the point for the same reasons.
-[hammock-driven-development]: https://www.youtube.com/watch?v=f84n5oFoZBc
+=== Pay attention to typing speed
-### Pay attention to typing speed
+:dijkstra-typing: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD512.html
+:speech-to-text: https://www.youtube.com/watch?v=Mz3JeYfBTcY
+:j-lang: https://www.jsoftware.com/#/
Typing speed is never the bottleneck of a programmer, no matter how great
they are.
-As [Dijkstra said][dijkstra-typing]:
+As {dijkstra-typing}[Dijkstra said]:
-> But programming, when stripped of all its circumstantial irrelevancies, boils
-> down to no more and no less than very effective thinking so as to avoid
-> unmastered complexity, to very vigorous separation of your many different
-> concerns.
+____
+But programming, when stripped of all its circumstantial irrelevancies, boils
+down to no more and no less than very effective thinking so as to avoid
+unmastered complexity, to very vigorous separation of your many different
+concerns.
+____
In other words, programming is not about typing, it is about thinking.
Otherwise, the way to get those star programmers that can't type fast enough a
-huge productivity boost is to give them a touch typing course. If they are so
+huge productivity boost is to give them a touch typing course. If they are so
productive with typing speed being a limitation, imagine what they could
accomplish if they had razor sharp touch typing skills?
Also, why stop there? A good touch typist can do 90 WPM (words per minute), and
a great one can do 120 WPM, but with a stenography keyboard they get to 200
-WPM+. That is double the productivity! Why not try
-[speech-to-text][perl-out-loud]? Make them all use [J][j-lang] so they all need
-to type less! How come nobody thought of that?
+WPM+. That is double the productivity! Why not try
+{speech-to-text}[speech-to-text]? Make them all use {j-lang}[J] so they all need
+to type less! How come nobody thought of that?
And if someone couldn't solve the programming puzzle in the given time window,
but could come back the following day with an implementation that is not only
faster, but uses less memory, was simpler to understand and easier to read than
anybody else? You'd be losing that person too.
-[dijkstra-typing]: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/EWD512.html
-[j-lang]: https://www.jsoftware.com/#/
-[perl-out-loud]: https://www.youtube.com/watch?v=Mz3JeYfBTcY
+=== IQ
-### IQ
+:determination-article: https://www.paulgraham.com/determination.html
+:scihub-article: https://sci-hub.do/https://psycnet.apa.org/doiLanding?doi=10.1037%2F1076-8971.6.1.33
-For "building an extraordinary team at a hard technology startup", intelligence
-is not the most important, [determination is][pg-determination].
+For "building an extraordinary team at a hard technology startup",
+intelligence is not the most important,
+{determination-article}[determination is].
-And talent isn't "IQ specialized for engineers". IQ itself isn't a measure of how
-intelligent someone is. Ever since Alfred Binet with Théodore Simon started to
-formalize what would become IQ tests years later, they already acknowledged
+And talent isn't "IQ specialized for engineers". IQ itself isn't a measure of
+how intelligent someone is. Ever since Alfred Binet and Théodore Simon started
+to formalize what would become IQ tests years later, they already acknowledged
limitations of the technique for measuring intelligence, which is
-[still true today][scihub-paper].
+{scihub-article}[still true today].
So having a high IQ tells only how smart people are for a particular aspect of
-intelligence, which is not representative of programming. There are numerous
+intelligence, which is not representative of programming. There are numerous
aspects of programming that are covered by IQ measurement: how to name variables
and functions, how to create models which are compatible with schema evolution,
how to make the system dynamic for runtime parameterization without making it
fragile, how to measure and observe performance and availability, how to pick
-between acquiring and paying technical debt, *etc*.
+between acquiring and paying technical debt, _etc_.
Not to mention everything else that a programmer does that is not purely
-programming. Saying high IQ correlates with great programming is a stretch, at
+programming. Saying high IQ correlates with great programming is a stretch, at
best.
-[pg-determination]: http://www.paulgraham.com/determination.html
-[scihub-paper]: https://sci-hub.do/https://psycnet.apa.org/doiLanding?doi=10.1037%2F1076-8971.6.1.33
-
-### Ditch HR
+=== Ditch HR
Slava tangentially picks on HR, and I will digress on that a bit:
-> A good rule of thumb is that if a question could be asked by an intern in HR,
-> it's a non-differential signaling question.
+____
+A good rule of thumb is that if a question could be asked by an intern in HR,
+it's a non-differential signaling question.
+____
-Stretching it, this is a rather snobbish view of HR. Why is it that an intern in
-HR can't make signaling questions? Could the same be said of an intern in
+Stretching it, this is a rather snobbish view of HR. Why is it that an intern
+in HR can't ask signaling questions? Could the same be said of an intern in
engineering?
-In other words: is the question not signaling because the one
-asking is from HR, or because the one asking is an intern? If the latter, than
-he's just arguing that interns have no place in interviewing, but if the former
-than he was picking on HR.
+In other words: is the question not signaling because the one asking is from HR,
+or because the one asking is an intern? If the latter, then he's just arguing
+that interns have no place in interviewing, but if the former then he was
+picking on HR.
Extrapolating that, it is common to find people who don't value HR's work, and
only see them as inferiors doing unpleasant work, and who aren't capable enough
-(or *smart* enough) to learn programming.
+(or _smart_ enough) to learn programming.
-This is equivalent to people who work primarily on backend, and see others working on
-frontend struggling and say: "isn't it just building views and showing them on
-the browser? How could it possibly be that hard? I bet I could do it better,
-with 20% of code". As you already know, the answer to it is "well, why don't you
-go do it, then?".
+This is equivalent to people who work primarily on backend, and see others
+working on frontend struggling and say: "isn't it just building views and
+showing them on the browser? How could it possibly be that hard? I bet I could
+do it better, with 20% of the code". As you already know, the answer to it is
+"well, why don't you go do it, then?".
This sense of superiority ignores the fact that HR has actual professionals
-doing actual hard work, not unlike programmers. If HR is inferior and so easy,
+doing actual hard work, not unlike programmers. If HR is inferior and so easy,
why not automate everything away and get rid of a whole department?
I don't attribute this world view to Slava, this is only an extrapolation of a
snippet of the article.
-### Draconian mistreating of candidates
+=== Draconian mistreating of candidates
+
+:bad-apple: https://www.paulgraham.com/apple.html
+:be-good: https://www.paulgraham.com/good.html
If I found out that people employed theatrics in my interview so that I could
feel I've "earned the privilege to work at your company", I would quit.
If your moral compass is so broken that you are comfortable mistreating me while
I'm a candidate, I immediately assume you will also mistreat me as an employee,
-and that the company is not a good place to work, as
-[evil begets stupidity][evil-begets-stupidity]:
-
-> But the other reason programmers are fussy, I think, is that evil begets
-> stupidity. An organization that wins by exercising power starts to lose the
-> ability to win by doing better work. And it's not fun for a smart person to
-> work in a place where the best ideas aren't the ones that win. I think the
-> reason Google embraced "Don't be evil" so eagerly was not so much to impress
-> the outside world as to inoculate themselves against arrogance.
+and that the company is not a good place to work, as {bad-apple}[evil begets
+stupidity]:
+
+____
+But the other reason programmers are fussy, I think, is that evil begets
+stupidity. An organization that wins by exercising power starts to lose the
+ability to win by doing better work. And it's not fun for a smart person to
+work in a place where the best ideas aren't the ones that win. I think the
+reason Google embraced "Don't be evil" so eagerly was not so much to impress the
+outside world as to inoculate themselves against arrogance.
+____
Paul Graham goes beyond "don't be evil" with a better motto:
-"[be good][pg-be-good]".
+"{be-good}[be good]".
Abusing the asymmetric nature of an interview to increase the chance that the
-candidate will accept the offer is, well, abusive. I doubt a solid team can
+candidate will accept the offer is, well, abusive. I doubt a solid team can
actually be built on such poor foundations, surrounded by such evil measures.
And if you really want to give engineers "the measure of whoever they're going
to be working with", there are plenty of reasonable ways of doing it that don't
include performing fake interviews.
-[pg-be-good]: http://www.paulgraham.com/good.html
-[evil-begets-stupidity]: http://www.paulgraham.com/apple.html
-
-### Personality tests
+=== Personality tests
Personality tests around the world need to be a) translated, b) adapted and c)
-validated. Even though a given test may be applicable and useful in a country,
+validated. Even though a given test may be applicable and useful in a country,
this doesn't imply it will work for other countries.
Not only do tests usually come with translation guidelines, but also their
applicability needs to be validated again after the translation and adaptation
is done to see if the test still measures what it is supposed to.
-That is also true within the same language. If a test is shown to work in
-England, it may not work in New Zealand, in spite of both speaking english. The
+That is also true within the same language. If a test is shown to work in
+England, it may not work in New Zealand, in spite of both speaking English. The
difference in cultural context is influential to the point of invalidating a
test, making it no longer usable.
-Irregardless of the validity of the proposed "big five" personality test,
-saying "just use attributes x, y and z this test and you'll be fine" is a rough
+Regardless of the validity of the proposed "big five" personality test, saying
+"just use attributes x, y and z of this test and you'll be fine" is a rough
simplification, much like saying "just use Raft for distributed systems, after
all it has been proven to work". It shows he throws all of that background away.
Even applying personality tests is itself not a trivial task, and
psychologists do need special training to become able to effectively apply one.
-### More cargo culting
+=== More cargo culting
+
+:cult: https://calteches.library.caltech.edu/51/2/CargoCult.htm
+:cult-archived: https://web.archive.org/web/20201003090303/https://calteches.library.caltech.edu/51/2/CargoCult.htm
He calls the ill-defined "industry standard" to be cargo-culting, but his
proposal isn't sound enough to not become one.
-Even if the ideas were good, they aren't solid enough, or based on solid
-enough things to make them stand out by themselves. Why is it that talent,
-judgment and personality are required to determine the fitness of a good
-candidate? Why not 2, 5, or 20 things? Why those specific 3? Why is talent
-defined like that? Is it just because he found talent to be like that?
+Even if the ideas were good, they aren't solid enough, or based on solid enough
+things to make them stand out by themselves. Why is it that talent, judgment
+and personality are required to determine the fitness of a good candidate? Why
+not 2, 5, or 20 things? Why those specific 3? Why is talent defined like that?
+Is it just because he found talent to be like that?
Isn't that definitionally also
-[cargo-culting][cargo-culting][^cargo-culting-archive]? Isn't he just repeating
-whatever he found to work form him, without understanding why?
+{cult}[cargo-culting]footnote:cargo-cult[
+ {cult-archived}[Archived version].
+]? Isn't he just repeating whatever he found to work for him, without
+understanding why?
What Feynman proposes is actually the opposite:
-> In summary, the idea is to try to give **all** of the information to help others
-> to judge the value of your contribution; not just the information that leads
-> to judgment in one particular direction or another.
+____
+In summary, the idea is to try to give *all* of the information to help others
+to judge the value of your contribution; not just the information that leads to
+judgment in one particular direction or another.
+____
What Slava did was just another form of cargo culting, but this was one that he
believed to work.
-[cargo-culting]: http://calteches.library.caltech.edu/51/2/CargoCult.htm
-[^cargo-culting-archive]: [Archived version](https://web.archive.org/web/20201003090303/http://calteches.library.caltech.edu/51/2/CargoCult.htm).
-
-## What to do
+== What to do
I will not give you a list of things that "worked for me, thus they are
-correct". I won't either critique the current "industry standard", nor what I've
-learned from interviewing engineers.
+correct". I won't either critique the current "industry standard", nor what
+I've learned from interviewing engineers.
Instead, I'd like to invite you to learn from history, and from what other
professionals have to teach us.
Programming isn't an odd profession, where everything about it is different from
-anything else. It is just another episode in the "technology" series, which has
-seasons since before recorded history. It may be an episode where things move a
+anything else. It is just another episode in the "technology" series, which has
+seasons since before recorded history. It may be an episode where things move a
bit faster, but it is fundamentally the same.
-So here is the key idea: what people did *before* software engineering?
+So here is the key idea: what did people do _before_ software engineering?
-What hiring is like for engineers in other areas? Don't civil, electrical and
+What is hiring like for engineers in other areas? Don't civil, electrical and
other types of engineering exist for much, much longer than software engineering
-does? What have those centuries of accumulated experience thought the world
+does? What have those centuries of accumulated experience taught the world
about technical hiring?
What studies were performed on the different success rate of interviewing
-strategies? What have they done right and what have they done wrong?
+strategies? What have they done right and what have they done wrong?
What is the purpose of HR? Why do they even exist? Do we need them, and if so,
-what for? What is the value they bring, since everybody insist on building an HR
-department in their companies? Is the existence of HR another form of cargo
+what for? What is the value they bring, since everybody insists on building an
+HR department in their companies? Is the existence of HR another form of cargo
culting?
What is industrial and organizational psychology? What is that field of study?
What do they specialize in? What have they learned since the discipline
-appeared? What have they done right and wrong over history? Is is the current
-academic consensus on that area? What is a hot debate topic in academia on that
-area? What is the current bleeding edge of research? What can they teach us
-about hiring? What can they teach us about technical hiring?
+appeared? What have they done right and wrong over history? What is the current
+academic consensus in that area? What is a hot debate topic in academia in that
+area? What is the current bleeding edge of research? What can they teach us
+about hiring? What can they teach us about technical hiring?
-## Conclusion
+== Conclusion
If all I've said makes me a "no hire" in the proposed framework, I'm really
glad.
diff --git a/src/content/blog/2020/11/07/diy-bugs.adoc b/src/content/blog/2020/11/07/diy-bugs.adoc
index b1dd117..0f561c1 100644
--- a/src/content/blog/2020/11/07/diy-bugs.adoc
+++ b/src/content/blog/2020/11/07/diy-bugs.adoc
@@ -1,79 +1,67 @@
----
-
-title: DIY an offline bug tracker with text files, Git and email
-
-date: 2020-11-07
-
-updated_at: 2021-08-14
-
-layout: post
-
-lang: en
-
-ref: diy-an-offline-bug-tracker-with-text-files-git-and-email
-
----
-
-When [push comes to shove][youtube-dl-takedown-notice], the operational aspects
-of governance of a software project matter a lot. And everybody likes to chime
-in with their alternative of how to avoid single points of failure in project
+= DIY an offline bug tracker with text files, Git and email
+
+:attack-on-ytdl: https://github.com/github/dmca/blob/master/2020/10/2020-10-23-RIAA.md
+:list-discussions: https://sourcehut.org/blog/2020-10-29-how-mailing-lists-prevent-censorship/
+:docs-in-repo: https://podcast.writethedocs.org/2017/01/25/episode-3-trends/
+:ci-in-notes: link:../../../../tils/2020/11/30/git-notes-ci.html
+:todos-mui: https://man.sr.ht/todo.sr.ht/#email-access
+:git-bug-bridges: https://github.com/MichaelMure/git-bug#bridges
+
+When {attack-on-ytdl}[push comes to shove], the operational aspects of
+governance of a software project matter a lot. And everybody likes to chime in
+with their alternative of how to avoid single points of failure in project
governance, just like I'm doing right now.
The most valuable assets of a project are:
-1. source code
-2. discussions
-3. documentation
-4. builds
-5. tasks and bugs
+. source code
+. discussions
+. documentation
+. builds
+. tasks and bugs
-For **source code**, Git and other DVCS solve that already: everybody gets a
-full copy of the entire source code.
+For *source code*, Git and other DVCS solve that already: everybody gets a full
+copy of the entire source code.
If your code forge is compromised, moving it to a new one takes a couple of
-minutes, if there isn't a secondary remote serving as mirror already. In this
+minutes, if there isn't a secondary remote serving as mirror already. In this
case, no action is required.
-If you're having your **discussions** by email,
-"[taking this archive somewhere else and carrying on is effortless][sourcehut-ml]".
+If you're having your *discussions* by email, "{list-discussions}[taking this
+archive somewhere else and carrying on is effortless]".
Besides, make sure to backup archives of past discussions so that the history is
also preserved when this migration happens.
-The **documentation** should
-[live inside the repository itself][writethedocs-in-repo][^writethedocs-in-repo],
-so that not only it gets first class treatment, but also gets distributed to
-everybody too. Migrating the code to a new forge already migrates the
+The *documentation* should {docs-in-repo}[live inside the repository
+itself]footnote:writethedocs-in-repo[
+ Described as "the ultimate marriage of the two". Starts at time 31:50.
+], so that it not only gets first-class treatment, but also gets distributed to
+everybody too. Migrating the code to a new forge already migrates the
documentation with it.
-[^writethedocs-in-repo]: Described as "the ultimate marriage of the two". Starts
- at time 31:50.
-
-As long as you keep the **builds** vendor neutral, the migration should only
+As long as you keep the *builds* vendor neutral, the migration should only
involve adapting how you call your `tests.sh` from the format that
-`provider-1.yml` uses to the format that `provider-2.yml` accepts.
-It isn't valuable to carry the build history with the project, as this data
-quickly decays in value as weeks and months go by, but for simple text logs
-[using Git notes] may be just enough, and they would be replicated with the rest
-of the repository.
-
-[using Git notes]: {% link _tils/2020-11-30-storing-ci-data-on-git-notes.md %}
-
-But for **tasks and bugs** many rely on a vendor-specific service, where you
-register and manage those issues via a web browser. Some provide an
-[interface for interacting via email][todos-srht-email] or an API for
-[bridging local bugs with vendor-specific services][git-bug-bridges]. But
+`provider-1.yml` uses to the format that `provider-2.yml` accepts. It isn't
+valuable to carry the build history with the project, as this data quickly
+decays in value as weeks and months go by, but for simple text logs
+{ci-in-notes}[using Git notes] may be just enough, and they would be replicated
+with the rest of the repository.
+
+But for *tasks and bugs* many rely on a vendor-specific service, where
+you register and manage those issues via a web browser. Some provide an
+{todos-mui}[interface for interacting via email] or an API for
+{git-bug-bridges}[bridging local bugs with vendor-specific services]. But
they're all layers around the service that disguise the fact that it is a central
-point of failure, which when compromised would lead to data loss. When push comes
-to shove, you'd loose data.
+point of failure, which when compromised would lead to data loss. When push
+comes to shove, you'd lose data.
-[youtube-dl-takedown-notice]: https://github.com/github/dmca/blob/master/2020/10/2020-10-23-RIAA.md
-[sourcehut-ml]: https://sourcehut.org/blog/2020-10-29-how-mailing-lists-prevent-censorship/
-[writethedocs-in-repo]: https://podcast.writethedocs.org/2017/01/25/episode-3-trends/
-[todos-srht-email]: https://man.sr.ht/todo.sr.ht/#email-access
-[git-bug-bridges]: https://github.com/MichaelMure/git-bug#bridges
+== Alternative: text files, Git and email
-## Alternative: text files, Git and email
+:todos-example: https://euandre.org/git/remembering/tree/TODOs.md?id=3f727802cb73ab7aa139ca52e729fd106ea916d0
+:todos-script: https://euandre.org/git/remembering/tree/aux/workflow/TODOs.sh?id=3f727802cb73ab7aa139ca52e729fd106ea916d0
+:todos-html: https://euandreh.xyz/remembering/TODOs.html
+:fossil-tickets: https://fossil-scm.org/home/doc/trunk/www/bugtheory.wiki
Why not do the same as documentation, and move tasks and bugs into the
repository itself?
@@ -81,28 +69,24 @@ repository itself?
It requires no extra tool to be installed, and fits right in the already
existing workflow for source code and documentation.
-I like to keep a [`TODOs.md`] file at the repository top-level, with
-two relevant sections: "tasks" and "bugs". Then when building the documentation
-I'll just [generate an HTML file from it], and [publish] it alongside the static
-website. All that is done on the main branch.
+I like to keep a {todos-example}[`TODOs.md`] file at the repository top-level,
+with two relevant sections: "tasks" and "bugs". Then when building the
+documentation I'll just {todos-script}[generate an HTML file from it], and
+{todos-html}[publish] it alongside the static website. All that is done on the
+main branch.
Any issue discussions happen on the mailing list, and a reference to a
-discussion could be added to the ticket itself later on. External contributors
+discussion could be added to the ticket itself later on. External contributors
can file tickets by sending a patch.
The good thing about this solution is that it works for 99% of projects out
there.
-For the other 1%, having Fossil's "[tickets][fossil-tickets]" could be an
+For the other 1%, having Fossil's "{fossil-tickets}[tickets]" could be an
alternative, but you may not want to migrate your project to Fossil to get those
niceties.
Even though I keep a `TODOs.md` file on the main branch, you can have a `tasks`
- branch with a `task-n.md` file for each task, or any other way you like.
+branch with a `task-n.md` file for each task, or any other way you like.
These tools are familiar enough that you can adjust them to fit your workflow.
-
-[`TODOs.md`]: https://euandre.org/git/remembering/tree/TODOs.md?id=3f727802cb73ab7aa139ca52e729fd106ea916d0
-[generate an HTML file from it]: https://euandre.org/git/remembering/tree/aux/workflow/TODOs.sh?id=3f727802cb73ab7aa139ca52e729fd106ea916d0
-[publish]: https://euandreh.xyz/remembering/TODOs.html
-[fossil-tickets]: https://fossil-scm.org/home/doc/trunk/www/bugtheory.wiki
diff --git a/src/content/blog/2020/11/08/paradigm-shift-review.adoc b/src/content/blog/2020/11/08/paradigm-shift-review.adoc
index c98c131..dd31f87 100644
--- a/src/content/blog/2020/11/08/paradigm-shift-review.adoc
+++ b/src/content/blog/2020/11/08/paradigm-shift-review.adoc
@@ -1,164 +1,153 @@
----
+= The Next Paradigm Shift in Programming - video review
-title: The Next Paradigm Shift in Programming - video review
+:reviewed-video: https://www.youtube.com/watch?v=6YbK8o9rZfI
-date: 2020-11-08
+This is a review with comments of "{reviewed-video}[The Next Paradigm Shift in
+Programming]", by Richard Feldman.
-layout: post
-
-lang: en
-
-ref: the-next-paradigm-shift-in-programming-video-review
-
-eu_categories: video review
-
----
-
-This is a review with comments of
-"[The Next Paradigm Shift in Programming][video-link]", by Richard Feldman.
-
-This video was *strongly* suggested to me by a colleague. I wanted to discuss it
-with her, and when drafting my response I figured I could publish it publicly
+This video was _strongly_ suggested to me by a colleague. I wanted to discuss
+it with her, and when drafting my response I figured I could publish it publicly
instead.
Before anything else, let me just be clear: I really like the talk, and I think
-Richard is a great public speaker. I've watched several of his talks over the
+Richard is a great public speaker. I've watched several of his talks over the
years, and I feel I've followed his career at a distance, with much respect.
This isn't a piece criticizing him personally, and I agree with almost
-everything he said. These are just some comments but also nitpicks on a few
+everything he said. These are just some comments, and some nitpicks, on a few
topics I think he missed, or that I view differently.
-[video-link]: https://www.youtube.com/watch?v=6YbK8o9rZfI
+== Structured programming
-## Structured programming
+:forgotten-art-video: https://www.youtube.com/watch?v=SFv8Wm2HdNM
-The historical overview at the beginning is very good. In fact, the very video I
-watched previously was about structured programming!
+The historical overview at the beginning is very good. In fact, the very video
+I watched previously was about structured programming!
-Kevlin Henney on
-"[The Forgotten Art of Structured Programming][structured-programming]" does a
-deep-dive on the topic of structured programming, and how on his view it is
-still hidden in our code, when we do a `continue` or a `break` in some ways.
-Even though it is less common to see an explicit `goto` in code these days, many
-of the original arguments of Dijkstra against explicit `goto`s is applicable to
-other constructs, too.
+Kevlin Henney on "{forgotten-art-video}[The Forgotten Art of Structured
+Programming]" does a deep-dive on the topic of structured programming, and how
+on his view it is still hidden in our code, when we do a `continue` or a `break`
+in some ways. Even though it is less common to see an explicit `goto` in code
+these days, many of the original arguments of Dijkstra against explicit `goto`s
+is applicable to other constructs, too.
-This is a very mature view, and I like how he goes beyond the
-"don't use `goto`s" heuristic and proposes and a much more nuanced understanding
-of what "structured programming" means.
+This is a very mature view, and I like how he goes beyond the "don't use
+`goto`s" heuristic and proposes and a much more nuanced understanding of what
+"structured programming" means.
In a few minutes, Richard is able to condense most of the significant bits of
-Kevlin's talk in a didactical way. Good job.
+Kevlin's talk in a didactical way. Good job.
-[structured-programming]: https://www.youtube.com/watch?v=SFv8Wm2HdNM
+== OOP like a distributed system
-## OOP like a distributed system
+:joe-oop: https://www.infoq.com/interviews/johnson-armstrong-oop/
+:rich-hickey-oop: https://www.youtube.com/watch?v=ROor6_NGIWU
-Richard extrapolates Alan Kay's original vision of OOP, and he concludes that
-it is more like a distributed system that how people think about OOP these days.
+Richard extrapolates Alan Kay's original vision of OOP, and he concludes that it
+is more like a distributed system than how people think about OOP these days.
But he then states that this is a rather bad idea, and we shouldn't pursue it,
given that distributed systems are known to be hard.
-However, his extrapolation isn't really impossible, bad or an absurd. In fact,
-it has been followed through by Erlang. Joe Armstrong used to say that
-"[Erlang might the only OOP language][erlang-oop]", since it actually adopted
-this paradigm.
+However, his extrapolation isn't really impossible, bad or absurd. In fact,
+it has been followed through by Erlang. Joe Armstrong used to say that
+"{joe-oop}[Erlang might be the only OOP language]", since it actually adopted
+this paradigm.
-But Erlang is a functional language. So this "OOP as a distributed system" view
+But Erlang is a functional language. So this "OOP as a distributed system" view
is more about designing systems in the large than programs in the small.
There is a switch of levels in this comparison I'm making, as can be done with
any language or paradigm: you can have a functional-like system that is built
with an OOP language (like a compiler, that given the same input will produce
-the same output), or an OOP-like system that is built with a functional language
-(Rich Hickey calls it
-"[OOP in the large][langsys]"[^the-language-of-the-system]).
+the same output), or an OOP-like system that is built with a functional
+language (Rich Hickey calls it "{rich-hickey-oop}[OOP in the
+large]"footnote:langsys[
+ From 24:05 to 27:45.
+]).
So this jump from in-process paradigm to distributed paradigm is rather a big
one, and I don't think he can argue that OOP has anything to say about
-software distribution across nodes. You can still have Erlang actors that run
+software distribution across nodes. You can still have Erlang actors that run
independently and send messages to each other without a network between them.
Any OTP application deployed on a single node effectively works like that.
-I think he went a bit too far with this extrapolation. Even though I agree it is
-a logical a fair one, it isn't evidently bad as he painted. I would be fine
-working with a single-node OTP application and seeing someone call it "a *real*
+I think he went a bit too far with this extrapolation. Even though I agree it
+is a logical and fair one, it isn't as evidently bad as he painted it. I would
+be fine working with a single-node OTP application and seeing someone call it
+"a _real_"
OOP program".
-[erlang-oop]: https://www.infoq.com/interviews/johnson-armstrong-oop/
-[langsys]: https://www.youtube.com/watch?v=ROor6_NGIWU
-[^the-language-of-the-system]: From 24:05 to 27:45.
+== First class immutability
-## First class immutability
+:immer: https://sinusoid.es/immer/
+:immutable-js: https://immutable-js.github.io/immutable-js/
-I agree with his view of languages moving towards the functional paradigm.
-But I think you can narrow down the "first-class immutability" feature he points
-out as present on modern functional programming languages to "first-class
-immutable data structures".
+I agree with his view of languages moving towards the functional paradigm. But
+I think you can narrow down the "first-class immutability" feature he points out
+as present on modern functional programming languages to "first-class immutable
+data structures".
I wouldn't categorize a language as "supporting functional programming style"
-without a library for functional data structures it. By discipline you can avoid
-side-effects, write pure functions as much as possible, and pass functions as
-arguments around is almost every language these days, but if when changing an
+without a library for functional data structures in it. By discipline you can
+avoid side-effects, write pure functions as much as possible, and pass functions
+as arguments around in almost every language these days, but if changing an
element of a vector mutates things in-place, that is still not functional
programming.
To avoid that, you end-up needing to make clones of objects to pass to a
-function, using freezes or other workarounds. All those cases are when the
+function, using freezes or other workarounds. All those cases are where the
underlying mix of OOP and functional programming fails.
There are some languages with third-party libraries that provide functional data
-structures, like [immer][immer] for C++, or [ImmutableJS][immutablejs] for
+structures, like {immer}[immer] for C++, or {immutable-js}[ImmutableJS] for
JavaScript.
But functional programming is more easily achievable in languages that have them
built-in, like Erlang, Elm and Clojure.
-[immer]: https://sinusoid.es/immer/
-[immutablejs]: https://immutable-js.github.io/immutable-js/
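+
+As a small illustration of this difference (my own example, not one from the
+talk), Clojure's built-in vectors are persistent: "changing" an element returns
+a new value and leaves the original untouched.
+
+[source,clojure]
+----
+;; assoc returns a *new* vector; the original value is never mutated.
+(def before [1 2 3])
+(def after  (assoc before 0 99))
+
+before ;; => [1 2 3]
+after  ;; => [99 2 3]
+----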
+== Managed side-effects
-## Managed side-effects
+:redux: https://redux.js.org/
+:re-frame: https://github.com/Day8/re-frame
His proposal of adopting managed side-effects as a first-class language concept
is really intriguing.
-This is something you can achieve with a library, like [Redux][redux] for JavaScript or
-[re-frame][re-frame] for Clojure.
+This is something you can achieve with a library, like {redux}[Redux] for
+JavaScript or {re-frame}[re-frame] for Clojure.
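+
+As an illustration of the pattern (a toy sketch of my own in Clojure, mimicking
+what re-frame does rather than using its actual API): event handlers stay pure
+and only _describe_ side-effects as data, and a separate interpreter performs
+them.
+
+[source,clojure]
+----
+;; Pure event handler: given the current state and an event, return a
+;; description of what should happen instead of doing it. The :http effect and
+;; its fields are made up for this example.
+(defn handle-event [db [event payload]]
+  (case event
+    :save-note {:db   (assoc db :saving? true)
+                :http {:method :post, :url "/notes", :body payload}}
+    {:db db}))
+
+;; The only impure part: a small interpreter for the described effects.
+;; Swapping it out (e.g. in tests) requires no change to the handlers.
+(defn run-effects! [state {:keys [db http]}]
+  (when db (reset! state db))
+  (when http (println "pretend HTTP call:" http)))
+
+(def state (atom {}))
+(run-effects! state (handle-event @state [:save-note {:text "hello"}]))
+----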
I haven't worked with a language with managed side-effects at scale, and I don't
-feel this is a problem with Clojure or Erlang. But is this me finding a flaw in
-his argument or not acknowledging a benefit unknown to me? This is a provocative
-question I ask myself.
+feel this is a problem with Clojure or Erlang. But is this me finding a flaw in
+his argument or not acknowledging a benefit unknown to me? This is a
+provocative question I ask myself.
Also all FP languages with managed side-effects I know are statically-typed, and
-all dynamically-typed FP languages I know don't have managed side-effects baked in.
+all dynamically-typed FP languages I know don't have managed side-effects baked
+in.
-[redux]: https://redux.js.org/
-[re-frame]: https://github.com/Day8/re-frame
+== What about declarative programming?
-## What about declarative programming?
+:tarpit-article: https://curtclifton.net/papers/MoseleyMarks06a.pdf
-In "[Out of the Tar Pit][tar-pit]", B. Moseley and P. Marks go beyond his view
-of functional programming as the basis, and name a possible "functional
-relational programming" as an even better solution. They explicitly call out
+In "{tarpit-article}[Out of the Tar Pit]", B. Moseley and P. Marks go beyond his
+view of functional programming as the basis, and name a possible "functional
+relational programming" as an even better solution. They explicitly call out
some flaws in most of the modern functional programming languages, and instead
pick declarative programming as an even better starting paradigm.
If the next paradigm shift is towards functional programming, will the following
shift be towards declarative programming?
-[tar-pit]: http://curtclifton.net/papers/MoseleyMarks06a.pdf
+== Conclusion
-## Conclusion
+:simple-made-easy: https://www.infoq.com/presentations/Simple-Made-Easy/
Beyond all Richard said, I also often hear people bring up functional
programming when talking about utilizing all cores of a computer, and how FP can
help with that. Rich Hickey makes a great case for single-process FP in his
famous talk
-"[Simple Made Easy][simple-made-easy]".
-
-[simple-made-easy]: https://www.infoq.com/presentations/Simple-Made-Easy/
+"{simple-made-easy}[Simple Made Easy]".
-<!-- I find this conclusion too short, and it doesn't revisits the main points -->
-<!-- presented on the body of the article. I won't rewrite it now, but it would be an -->
-<!-- improvement to extend it to do so. -->
+////
+I find this conclusion too short, and it doesn't revisit the main points
+presented in the body of the article. I won't rewrite it now, but it would be
+an improvement to extend it to do so.
+////
diff --git a/src/content/blog/2020/11/12/database-parsers-trees.adoc b/src/content/blog/2020/11/12/database-parsers-trees.adoc
index 1870fad..eed785b 100644
--- a/src/content/blog/2020/11/12/database-parsers-trees.adoc
+++ b/src/content/blog/2020/11/12/database-parsers-trees.adoc
@@ -1,99 +1,92 @@
= Durable persistent trees and parser combinators - building a database
-date: 2020-11-12
-
-updated_at: 2021-02-09
-
-layout: post
-
-lang: en
-
-ref: durable-persistent-trees-and-parser-combinators-building-a-database
-
-eu_categories: mediator
-
----
+:empty:
+:db-article: link:../../08/31/database-i-wish-i-had.html
I've received messages with some frequency from people wanting to know if
-I've made any progress on the database project
-[I've written about]({% link _articles/2020-08-31-the-database-i-wish-i-had.md %}).
+I've made any progress on the database project {db-article}[I've written about].
There are a few areas where I've made progress, and here's a public post on it.
== Proof-of-concept: DAG log
+:mediator-permalink: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n1
+
The main thing I wanted to validate with a concrete implementation was the
concept of modeling a DAG on a sequence of datoms.
-The notion of a *datom* is a rip-off from Datomic, which models data with time
-aware *facts*, which come from RDF. RDF's fact is a triple of
+The notion of a _datom_ is a rip-off from Datomic, which models data with time
+aware _facts_, which come from RDF. RDF's fact is a triple of
subject-predicate-object, and Datomic's datoms add a time component to it:
subject-predicate-object-time, A.K.A. entity-attribute-value-transaction:
-```clojure
+[source,clojure]
+----
[[person :likes "pizza" 0 true]
[person :likes "bread" 1 true]
[person :likes "pizza" 1 false]]
-```
+----
-The above datoms say:
-- at time 0, `person` like pizza;
-- at time 1, `person` stopped liking pizza, and started to like bread.
+The above datoms say:
+
+* at time 0, `person` likes pizza;
+* at time 1, `person` stopped liking pizza, and started to like bread.
Datomic ensures total consistency of this ever growing log by having a single
writer, the transactor, that will enforce it when writing.
In order to support disconnected clients, I needed a way to allow multiple
-writers, and I chose to do it by making the log not a list, but a
-directed acyclic graph (DAG):
+writers, and I chose to do it by making the log not a list, but a directed
+acyclic graph (DAG):
-```clojure
+[source,clojure]
+----
[[person :likes "pizza" 0 true]
[0 :parent :db/root 0 true]
[person :likes "bread" 1 true]
[person :likes "pizza" 1 false]
[1 :parent 0 1 true]]
-```
+----
The extra datoms above add more information to build the directionality to the
log, and instead of a single consistent log, the DAG could have multiple leaves
that coexist, much like how different Git branches can have different "latest"
commits.
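+
+To make the "multiple leaves" idea more concrete, here is a minimal sketch of
+my own (not code from the project) that computes the tips of the DAG above,
+assuming `person` is replaced by a keyword so the snippet runs on its own:
+
+[source,clojure]
+----
+(require '[clojure.set :as set])
+
+(def datoms
+  [[:person :likes "pizza" 0 true]
+   [0 :parent :db/root 0 true]
+   [:person :likes "bread" 1 true]
+   [:person :likes "pizza" 1 false]
+   [1 :parent 0 1 true]])
+
+(defn leaves
+  "Transactions that no other transaction points to as a :parent, much like
+  branch tips in Git."
+  [datoms]
+  (let [parent-datoms (filter #(= :parent (second %)) datoms)
+        children      (set (map first parent-datoms))
+        parents       (set (map #(nth % 2) parent-datoms))]
+    (set/difference children parents)))
+
+(leaves datoms)
+;; => #{1}
+----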
-In order to validate this idea, I started with a Clojure implementation. The
+In order to validate this idea, I started with a Clojure implementation. The
goal was not to write the actual final code, but to make a proof-of-concept that
would allow me to test and stretch the idea itself.
-This code [already exists][clj-poc], but is yet fairly incomplete:
+This code {mediator-permalink}[already exists], but is still fairly incomplete:
-- the building of the index isn't done yet (with some
- [commented code][clj-poc-index] on the next step to be implemented)
-- the indexing is extremely inefficient, with [more][clj-poc-o2-0]
- [than][clj-poc-o2-1] [one][clj-poc-o2-2] occurrence of `O²` functions;
-- no query support yet.
+:commented-code: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n295
+:more: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n130
+:than: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n146
+:one: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n253
-[clj-poc]: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n1
-[clj-poc-index]: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n295
-[clj-poc-o2-0]: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n130
-[clj-poc-o2-1]: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n146
-[clj-poc-o2-2]: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n253
+* the building of the index isn't done yet (with some {commented-code}[commented
+  code] on the next step to be implemented);
+* the indexing is extremely inefficient, with {more}[more] {than}[than]
+  {one}[one] occurrence of `O(n²)` functions;
+* no query support yet.
-== Top-down *and* bottom-up
+== Top-down _and_ bottom-up
However, as time passed and I started looking at what the final implementation
would look like, I started to consider keeping the PoC around.
The top-down approach (Clojure PoC) was in fact helping guide me with the
bottom-up, and I now have "promoted" the Clojure PoC into a "reference
-implementation". It should now be a finished implementation that says what the
+implementation". It should now be a finished implementation that says what the
expected behaviour is, and the actual code should match the behaviour.
The good thing about a reference implementation is that it has no performance or
-resources boundary, so if it ends up being 1000x slower and using 500× more
-memory, it should be find. The code can be also 10x or 100x simpler, too.
+resources boundary, so if it ends up being 1000× slower and using 500× more
+memory, it should be fine. The code can also be 10× or 100× simpler, too.
== Top-down: durable persistent trees
+:pavlo-videos: https://www.youtube.com/playlist?list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi
+:db-book: https://www.databass.dev/
+
In promoting the PoC into a reference implementation, this top-down approach now
needs to go beyond doing everything in memory, and the index data structure now
needs to be disk-based.
@@ -102,115 +95,118 @@ Roughly speaking, most storage engines out there are based either on B-Trees or
LSM Trees, or some variations of those.
But when building an immutable database, update-in-place B-Trees aren't an
-option, as it doesn't accommodate keeping historical views of the tree. LSM Trees
-may seem a better alternative, but duplication on the files with compaction are
-also ways to delete old data which is indeed useful for a historical view.
+option, as they don't accommodate keeping historical views of the tree. LSM
+Trees may seem a better alternative, but compacting the duplication across
+files is also a way of deleting old data, and that old data is exactly what a
+historical view needs.
I think the thing I'm after is a mix of a Copy-on-Write B-Tree, which would keep
historical versions with the write IO cost amortization of memtables of LSM
-Trees. I don't know of any B-Tree variant out there that resembles this, so I'll
-call it "Flushing Copy-on-Write B-Tree".
+Trees. I don't know of any B-Tree variant out there that resembles this, so
+I'll call it "Flushing Copy-on-Write B-Tree".
I haven't written any code for this yet, so all I have is a high-level view of
what it will look like:
-1. like Copy-on-Write B-Trees, changing a leaf involves creating a new leaf and
- building a new path from root to the leaf. The upside is that writes a lock
+. like Copy-on-Write B-Trees, changing a leaf involves creating a new leaf and
+ building a new path from root to the leaf. The upside is that writes are lock
free, and no coordination is needed between readers and writers, ever;
-
-2. the downside is that a single leaf update means at least `H` new nodes that
- will have to be flushed to disk, where `H` is the height of the tree. To avoid
- that, the writer creates these nodes exclusively on the in-memory memtable, to
- avoid flushing to disk on every leaf update;
-
-3. a background job will consolidate the memtable data every time it hits X MB,
+. the downside is that a single leaf update means at least `H` new nodes that
+ will have to be flushed to disk, where `H` is the height of the tree. To
+ avoid that, the writer creates these nodes exclusively on the in-memory
+ memtable, to avoid flushing to disk on every leaf update;
+. a background job will consolidate the memtable data every time it hits X MB,
and persist it to disk, amortizing the cost of the Copy-on-Write B-Tree;
-
-4. readers than will have the extra job of getting the latest relevant
+. readers then will have the extra job of getting the latest relevant
disk-resident value and merge it with the memtable data.
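+
+Since there is no code for this yet, below is only a rough sketch of my own of
+the write and read paths described in the list above, in Clojure, with a plain
+`sorted-map` standing in for the actual on-disk B-Tree pages:
+
+[source,clojure]
+----
+(def flush-threshold 1000)              ; stand-in for the "X MB" limit
+
+(def memtable (atom (sorted-map)))      ; in-memory, holds the recent writes
+(def disk     (atom (sorted-map)))      ; stand-in for the persisted tree
+
+;; background job: persist the memtable, amortizing the disk writes
+(defn flush! []
+  (swap! disk merge @memtable)
+  (reset! memtable (sorted-map)))
+
+;; writers only touch the memtable, so a leaf update doesn't hit the disk
+(defn write! [k v]
+  (swap! memtable assoc k v)
+  (when (>= (count @memtable) flush-threshold)
+    (flush!)))
+
+;; readers merge the memtable view with the latest disk-resident value
+(defn read-value [k]
+  (get @memtable k (get @disk k)))
+----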
-The key difference to existing Copy-on-Write B-Trees is that the new trees
-are only periodically written to disk, and the intermediate values are kept in
-memory. Since no node is ever updated, the page utilization is maximum as it
+The key difference to existing Copy-on-Write B-Trees is that the new trees are
+only periodically written to disk, and the intermediate values are kept in
+memory. Since no node is ever updated, the page utilization is maximum as it
doesn't need to keep space for future inserts and updates.
And the key difference to existing LSM Trees is that no compaction is run:
-intermediate values are still relevant as the database grows. So this leaves out
-tombstones and value duplication done for write performance.
+intermediate values are still relevant as the database grows. So this leaves
+out tombstones and value duplication done for write performance.
One can delete intermediate index values to reclaim space, but no data is lost
-on the process, only old B-Tree values. And if the database ever comes back to
+in the process, only old B-Tree values. And if the database ever comes back to
that point (like when doing a historical query), the B-Tree will have to be
-rebuilt from a previous value. After all, the database *is* a set of datoms, and
-everything else is just derived data.
+rebuilt from a previous value. After all, the database _is_ a set of datoms,
+and everything else is just derived data.
Right now I'm still reading about other data structures that storage engines
use, and I'll start implementing the "Flushing Copy-on-Write B-Tree" as I learn
-more[^learn-more-db] and mature it more.
-
-[^learn-more-db]: If you are interested in learning more about this too, the
- very best two resources on this subject are Andy Pavlo's
- "[Intro to Database Systems](https://www.youtube.com/playlist?list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi)"
- course and Alex Petrov's "[Database Internals](https://www.databass.dev/)" book.
+more{empty}footnote:learn-more-db[
+ If you are interested in learning more about this too, the very best two
+ resources on this subject are Andy Pavlo's "{pavlo-videos}[Intro to Database
+ Systems]" course and Alex Petrov's "{db-book}[Database Internals]" book.
+] and mature it more.
== Bottom-up: parser combinators and FFI
+:cbindgen: https://github.com/eqrion/cbindgen
+:cbindgen-next: https://blog.eqrion.net/future-directions-for-cbindgen/
+:syn-crate: https://github.com/dtolnay/syn
+:libedn: https://euandre.org/git/libedn/
+
I chose Rust as it has the best WebAssembly tooling support.
My goal is not to build a Rust database, but a database that happens to be in
-Rust. In order to reach client platforms, the primary API is the FFI one.
+Rust. In order to reach client platforms, the primary API is the FFI one.
I'm not very happy with current tools for exposing Rust code via FFI to the
-external world: they either mix C with C++, which I don't want to do, or provide
-no access to the intermediate representation of the FFI, which would be useful
-for generating binding for any language that speaks FFI.
+external world: they either mix C with C++, which I don't want to do, or
+provide no access to the intermediate representation of the FFI, which would be
+useful for generating bindings for any language that speaks FFI.
-I like better the path that the author of [cbindgen][cbindgen-crate]
-crate [proposes][rust-ffi]: emitting an data representation of the Rust C API
+I like better the path that the author of the {cbindgen}[cbindgen] crate
+{cbindgen-next}[proposes]: emitting a data representation of the Rust C API
(the author calls it a `ffi.json` file), and then building transformers from the
-data representation to the target language. This way you could generate a C API
-*and* the node-ffi bindings for JavaScript automatically from the Rust code.
+data representation to the target language. This way you could generate a C API
+_and_ the node-ffi bindings for JavaScript automatically from the Rust code.
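+
+To make the idea more tangible, here is a toy transformer of my own in Clojure;
+the shape of the FFI data below is entirely made up, and not the format that
+the cbindgen author proposes:
+
+[source,clojure]
+----
+(require '[clojure.string :as str])
+
+;; Hypothetical, made-up FFI description; a real `ffi.json` would be richer.
+(def ffi-description
+  [{:name "db_open"  :returns "db_t*" :args [["path" "const char*"]]}
+   {:name "db_close" :returns "void"  :args [["db" "db_t*"]]}])
+
+;; One transformer: IR in, C declarations out. Another transformer over the
+;; same data could emit the node-ffi bindings instead.
+(defn c-declaration [{:keys [name returns args]}]
+  (format "%s %s(%s);" returns name
+          (str/join ", " (map (fn [[arg-name arg-type]]
+                                (str arg-type " " arg-name))
+                              args))))
+
+(map c-declaration ffi-description)
+;; => ("db_t* db_open(const char* path);" "void db_close(db_t* db);")
+----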
So the first thing to be done before moving on is an FFI exporter that doesn't
mix C and C++, and generates said `ffi.json`, and then build a few transformers
that take this `ffi.json` and generate the language bindings, be it C, C++,
-JavaScript, TypeScript, Kotlin, Swift, Dart, *etc*[^ffi-langs].
-
-[^ffi-langs]: Those are, specifically, the languages I'm more interested on. My
- goal is supporting client applications, and those languages are the most
- relevant for doing so: C for GTK, C++ for Qt, JavaScript and TypeScript for
- Node.js and browser, Kotlin for Android and Swing, Swift for iOS, and Dart
- for Flutter.
+JavaScript, TypeScript, Kotlin, Swift, Dart,
+_etc_footnote:ffi-langs[
+ Those are, specifically, the languages I'm more interested in. My goal is
+ supporting client applications, and those languages are the most relevant for
+ doing so: C for GTK, C++ for Qt, JavaScript and TypeScript for Node.js and
+ browser, Kotlin for Android and Swing, Swift for iOS, and Dart for Flutter.
+].
I think the best way to get there is by taking the existing code for cbindgen,
-which uses the [syn][syn-crate] crate to parse the Rust code[^rust-syn], and
-adapt it to emit the metadata.
-
-[^rust-syn]: The fact that syn is an external crate to the Rust compiler points
- to a big warning: procedural macros are not first class in Rust. They are
- just like Babel plugins in JavaScript land, with the extra shortcoming that
- there is no specification for the Rust syntax, unlike JavaScript.
-
- As flawed as this may be, it seems to be generally acceptable and adopted,
- which works against building a solid ecosystem for Rust.
-
- The alternative that rust-ffi implements relies on internals of the Rust
- compiler, which isn't actually worst, just less common and less accepted.
-
-I've started a fork of cbindgen: ~~x-bindgen~~[^x-bindgen]. Right now it is
-just a copy of cbindgen verbatim, and I plan to remove all C and C++ emitting
-code from it, and add a IR emitting code instead.
-
-[^x-bindgen]: *EDIT*: now archived, the experimentation was fun. I've started to move more towards C, so this effort became deprecated.
+which uses the {syn-crate}[syn] crate to parse the Rust
+code{empty}footnote:rust-syn[
+ The fact that syn is an external crate to the Rust compiler points to a big
+ warning: procedural macros are not first class in Rust. They are just like
+ Babel plugins in JavaScript land, with the extra shortcoming that there is no
+ specification for the Rust syntax, unlike JavaScript.
+FIXME
+ As flawed as this may be, it seems to be generally acceptable and adopted,
+ which works against building a solid ecosystem for Rust.
+FIXME
+ The alternative that rust-ffi implements relies on internals of the Rust
+ compiler, which isn't actually worse, just less common and less accepted.
+], and adapt it to emit the metadata.
+
+I've started a fork of cbindgen:
+[line-through]#x-bindgen#{empty}footnote:x-bindgen[
+ _EDIT_: now archived, the experimentation was fun. I've started to move more
+ towards C, so this effort became deprecated.
+]. Right now it is just a copy of cbindgen verbatim, and I plan to remove all C
+and C++ emitting code from it, and add IR emitting code instead.
When I started working on x-bindgen, I realized I didn't know what to look for in
-a header file, as I haven't written any C code in many years. So as I was
-writing [libedn][libedn-repo], I didn't know how to build a good C API to
-expose. So I tried porting the code to C, and right now I'm working on building
-a *good* C API for a JSON parser using parser combinators:
-~~ParsecC~~ [^parsecc].
-
-[^parsecc]: *EDIT*: now also archived.
+a header file, as I hadn't written any C code in many years. So as I was
+writing {libedn}[libedn], I didn't know how to build a good C API to expose. So
+I tried porting the code to C, and right now I'm working on building a _good_ C
+API for a JSON parser using parser combinators:
+[line-through]#ParsecC#{empty}footnote:parsecc[
+ _EDIT_: now also archived.
+].
After "finishing" ParsecC I'll have a good notion of what a good C API is, and
I'll have a better direction towards how to expose code from libedn to other
@@ -219,11 +215,6 @@ languages, and work on x-bindgen then.
What both libedn and ParsecC are missing right now are proper error reporting,
and property-based testing for libedn.
-[cbindgen-crate]: https://github.com/eqrion/cbindgen
-[syn-crate]: https://github.com/dtolnay/syn
-[rust-ffi]: https://blog.eqrion.net/future-directions-for-cbindgen/
-[libedn-repo]: https://euandre.org/git/libedn/
-
== Conclusion
I've learned a lot already, and I feel the journey I'm on is worth going
diff --git a/src/content/blog/2020/11/14/local-first-review.adoc b/src/content/blog/2020/11/14/local-first-review.adoc
index c24095a..0dd3bea 100644
--- a/src/content/blog/2020/11/14/local-first-review.adoc
+++ b/src/content/blog/2020/11/14/local-first-review.adoc
@@ -1,23 +1,15 @@
= Local-First Software: You Own Your Data, in spite of the Cloud - article review
-date: 2020-11-14
+:empty:
+:presentation: link:../../../../slides/2020/11/14/local-first.html
+:reviewed-article: https://martin.kleppmann.com/papers/local-first.pdf
-layout: post
+_This article is derived from a {presentation}[presentation] given at a Papers
+We Love meetup on the same subject._
-lang: en
-
-ref: local-first-software-you-own-your-data-in-spite-of-the-cloud-article-review
-
-eu_categories: presentation,article review
-
----
-
-*This article is derived from a [presentation][presentation] given at a Papers
-We Love meetup on the same subject.*
-
-This is a review of the article
-"[Local-First Software: You Own Your Data, in spite of the Cloud][article-pdf]",
-by M. Kleppmann, A. Wiggins, P. Van Hardenberg and M. F. McGranaghan.
+This is a review of the article "{reviewed-article}[Local-First Software: You
+Own Your Data, in spite of the Cloud]", by M. Kleppmann, A. Wiggins, P. Van
+Hardenberg and M. F. McGranaghan.
== Offline-first, local-first
@@ -27,34 +19,34 @@ client, and there are conflict resolution algorithms that reconcile data created
on different instances.
Sometimes I see confusion with this idea and "client-side", "offline-friendly",
-"syncable", etc. I have myself used this terms, also.
+"syncable", etc. I have myself used this terms, also.
The term "offline-first", however, already exists, and it conveys almost
-all of that meaning. In my view, "local-first" doesn't extend "offline-first" in
-any aspect, rather it gives a well-defined meaning to it instead. I could say
-that "local-first" is just "offline-first", but with 7 well-defined ideals
+all of that meaning. In my view, "local-first" doesn't extend "offline-first"
+in any aspect, rather it gives a well-defined meaning to it instead. I could
+say that "local-first" is just "offline-first", but with 7 well-defined ideals
instead of community best practices.
It is a step forward, and given the number of times I've seen the paper shared
around I think there's a chance people will prefer saying "local-first" in
-*lieu* of "offline-first" from now on.
-
-[presentation]: {% link _slides/2020-11-14-on-local-first-beyond-the-crdt-silver-bullet.slides %}
-[article-pdf]: https://martin.kleppmann.com/papers/local-first.pdf
+_lieu_ of "offline-first" from now on.
== Software licenses
In a footnote of the 7th ideal ("You Retain Ultimate Ownership and Control"),
the authors say:
-> In our opinion, maintaining control and ownership of data does not mean that
-> the software must necessarily be open source. (...) as long as it does not
-> artificially restrict what users can do with their files.
+____
+In our opinion, maintaining control and ownership of data does not mean that the
+software must necessarily be open source. (...) as long as it does not
+artificially restrict what users can do with their files.
+____
They give examples of artificial restrictions, like this artificial restriction
I've come up with:
-```bash
+[source,bash]
+----
#!/bin/sh
TODAY=$(date +%s)
@@ -66,23 +58,24 @@ if [ $TODAY -ge $LICENSE_EXPIRATION ]; then
fi
echo $((2 + 2))
-```
+----
Now when using this very useful program:
-```bash
+[source,bash]
+----
# today
$ ./useful-adder.sh
4
# tomorrow
$ ./useful-adder.sh
License expired!
-```
+----
This is obviously an intentional restriction, and it goes against the 5th ideal
-("The Long Now"). This software would only be useful as long as the embedded
-license expiration allowed. Sure you could change the clock on the computer, but
-there are many other ways that this type of intentional restriction is in
+("The Long Now"). This software would only be useful as long as the embedded
+license expiration allowed. Sure you could change the clock on the computer,
+but there are many other ways that this type of intentional restriction is in
conflict with that ideal.
However, what about unintentional restrictions? What if a piece of software had an equal
@@ -90,7 +83,8 @@ or similar restriction, and stopped working after days pass? Or what if the
programmer added a constant to make the development simpler, and this led to
unintentionally restricting the user?
-```bash
+[source,bash]
+----
# today
$ useful-program
# ...useful output...
@@ -98,81 +92,86 @@ $ useful-program
# tomorrow, with more data
$ useful-program
ERROR: Panic! Stack overflow!
-```
+----
Just as easily as I can come up with ways to intentionally restrict users, I can
-do the same for unintentionally restrictions. A program can stop working for a
+do the same for unintentional restrictions. A program can stop working for a
variety of reasons.
-If it stops working due do, say, data growth, what are the options? Reverting to
-an earlier backup, and making it read-only? That isn't really a "Long Now", but
-rather a "Long Now as long as the software keeps working as expected".
+If it stops working due to, say, data growth, what are the options? Reverting
+to an earlier backup, and making it read-only? That isn't really a "Long Now",
+but rather a "Long Now as long as the software keeps working as expected".
The point is: if the software isn't free, "The Long Now" isn't achievable
-without a lot of wishful thinking. Maybe the authors were trying to be more
-friendly towards business who don't like free software, but in doing so they've proposed
-a contradiction by reconciling "The Long Now" with proprietary software.
+without a lot of wishful thinking. Maybe the authors were trying to be more
+friendly towards businesses that don't like free software, but in doing so they've
+proposed a contradiction by reconciling "The Long Now" with proprietary
+software.
-It isn't the same as saying that any free software achieves that ideal,
-either. The license can still be free, but the source code can become
-unavailable due to cloud rot. Or maybe the build is undocumented, or the build
-tools had specific configuration that one has to guess. A piece of free
-software can still fail to achieve "The Long Now". Being free doesn't guarantee
-it, just makes it possible.
+It isn't the same as saying that any free software achieves that ideal, either.
+The license can still be free, but the source code can become unavailable due to
+cloud rot. Or maybe the build is undocumented, or the build tools had specific
+configuration that one has to guess. A piece of free software can still fail to
+achieve "The Long Now". Being free doesn't guarantee it, just makes it
+possible.
A colleague has challenged my view, arguing that the software doesn't really
-need to be free, as long as there is an specification of the file format. This
+need to be free, as long as there is a specification of the file format. This
way if the software stops working, the format can still be processed by other
-programs. But this doesn't apply in practice: if you have a document that you
+programs. But this doesn't apply in practice: if you have a document that you
write to, and software stops working, you still want to write to the document.
An external tool that navigates the content and shows it to you won't allow you
to keep writing, and when it does that tool is now starting to re-implement the
software.
An open specification could serve as a blueprint to other implementations,
-making the data format more friendly to reverse-engineering. But the
-re-implementation still has to exist, at which point the original software failed
-to achieve "The Long Now".
+making the data format more friendly to reverse-engineering. But the
+re-implementation still has to exist, at which point the original software
+failed to achieve "The Long Now".
It is less bad, but still not quite there yet.
== Denial of existing solutions
-When describing "Existing Data Storage and Sharing Models", on a
-footnote[^devil] the authors say:
+:distgit: https://drewdevault.com/2018/07/23/Git-is-already-distributed.html
-[^devil]: This is the second aspect that I'm picking on the article from a
- footnote. I guess the devil really is on the details.
+When describing "Existing Data Storage and Sharing Models", on a
+footnote{empty}footnote:devil[
+ This is the second aspect that I'm picking on the article from a footnote. I
+ guess the devil really is on the details.
+] the authors say:
-> In principle it is possible to collaborate without a repository service,
-> e.g. by sending patch files by email, but the majority of Git users rely
-> on GitHub.
+____
+In principle it is possible to collaborate without a repository service, e.g. by
+sending patch files by email, but the majority of Git users rely on GitHub.
+____
The authors go to great lengths to talk about the usability of cloud apps, and even
point to research they've done on it, but they've missed learning more from
local-first solutions that already exist.
Say the automerge CRDT proves to be even more useful than what everybody
-imagined. Say someone builds a local-first repository service using it. How will
-it change anything of the Git/GitHub model? What is different about it that
+imagined. Say someone builds a local-first repository service using it. How
+will it change anything of the Git/GitHub model? What is different about it that
prevents people in the future from writing a paper saying:
-> In principle it is possible to collaborate without a repository service,
-> e.g. by using automerge and platform X,
-> but the majority of Git users rely on GitHub.
+____
+In principle it is possible to collaborate without a repository service, e.g. by
+using automerge and platform X, but the majority of Git users rely on GitHub.
+____
How is this any better?
-If it is already [possible][git-local-first] to have a local-first development
-workflow, why don't people use it? Is it just fashion, or there's a fundamental
-problem with it? If so, what is it, and how to avoid it?
+If it is already {distgit}[possible] to have a local-first development workflow,
+why don't people use it? Is it just fashion, or is there a fundamental problem
+with it? If so, what is it, and how can it be avoided?
If sending patches by email is perfectly possible but out of fashion, why even
-talk about Git/GitHub? Isn't this a problem that people are putting themselves
-in? How can CRDTs possibly prevent people from doing that?
+talk about Git/GitHub? Isn't this a problem that people are putting themselves
+in? How can CRDTs possibly prevent people from doing that?
My impression is that the authors envision a better future, where development is
-fully decentralized unlike today, and somehow CRDTs will make that happen. If
+fully decentralized unlike today, and somehow CRDTs will make that happen. If
more people think this way, "CRDT" is next in line to join the buzzword list that
solves everything, like "containers", "blockchain" or "machine learning".
@@ -180,56 +179,56 @@ Rather than picturing an imaginary service that could be described like
"GitHub+CRDTs" and people would adopt it, I'd rather better understand why
people don't do it already, since Git is built to work like that.
-[git-local-first]: https://drewdevault.com/2018/07/23/Git-is-already-distributed.html
-
== Ditching of web applications
+:pouchdb: https://pouchdb.com/
+:instant-apps: https://developer.android.com/topic/google-play-instant
+
The authors put web applications in a worse position for building local-first
applications, claiming that:
-> (...) the architecture of web apps remains fundamentally server-centric.
-> Offline support is an afterthought in most web apps, and the result is
-> accordingly fragile.
+____
+(...) the architecture of web apps remains fundamentally server-centric.
+Offline support is an afterthought in most web apps, and the result is
+accordingly fragile.
+____
Well, I disagree.
The problem isn't inherent to the web platform, but instead how people use it.
-I have myself built offline-first applications, leveraging IndexedDB, App Cache,
-*etc*. I wanted to build an offline-first application on the web, and so I did.
+I have myself built offline-first applications, leveraging IndexedDB, App
+Cache, _etc_. I wanted to build an offline-first application on the web, and so
+I did.
-In fact, many people choose [PouchDB][pouchdb] *because* of that, since it is a
-good tool for offline-first web applications. The problem isn't really the
+In fact, many people choose {pouchdb}[PouchDB] _because_ of that, since it is a
+good tool for offline-first web applications. The problem isn't really the
technology, but how much people want their application to be local-first.
-Contrast it with Android [Instant Apps][instant-apps], where applications are
-sent to the phone in small parts. Since this requires an internet connection to
+Contrast it with Android {instant-apps}[Instant Apps], where applications are
+sent to the phone in small parts. Since this requires an internet connection to
move from one part of the app bundle to another, a subset of the app isn't
local-first, despite being an app.
-The point isn't the technology, but how people are using it. Local-first web
+The point isn't the technology, but how people are using it. Local-first web
applications are perfectly possible, just like non-local-first native
applications are possible.
-[pouchdb]: https://pouchdb.com/
-[instant-apps]: https://developer.android.com/topic/google-play-instant
-
== Costs are underrated
I think the costs of "old-fashioned apps" over "cloud apps" are underrated,
mainly regarding storage, and that these costs can vary a lot by application.
Say a person writes online articles for their personal website, and puts
-everything into Git. Since there isn't supposed to be any collaboration, all
-of the relevant ideals of local-first are achieved.
+everything into Git. Since there isn't supposed to be any collaboration, all of
+the relevant ideals of local-first are achieved.
-Now another person creates videos instead of articles. They could try keeping
+Now another person creates videos instead of articles. They could try keeping
everything local, but after some time the storage usage fills the entire disk.
This person's local-first setup would be much more complex, and would cost much
more in maintenance, backup and storage.
Even though both have similar needs, a local-first video repository is much more
-demanding. So the local-first thinking here isn't "just keep everything local",
+demanding. So the local-first thinking here isn't "just keep everything local",
but "how much time and money am I willing to spend to keep everything local".
The convenience of "cloud apps" becomes so attractive that many don't even have
@@ -245,22 +244,22 @@ If I were the one making the list of ideals, I wouldn't focus so much on
real-time collaboration.
Even though seamless collaboration is desired, it being real-time depends on the
-network being available for that. But ideal 3 states that
-"The Network is Optional", so real-time collaboration is also optional.
+network being available for that. But ideal 3 states that "The Network is
+Optional", so real-time collaboration is also optional.
The fundamentals of a local-first system should enable real-time collaboration
when the network is available, but shouldn't focus on it.
In many places when discussing applications being offline, it is common for me
-to find people saying that their application works
-"even on a plane, subway or elevator". That is a reflection of when said
-developers have to deal with networks being unavailable.
+to find people saying that their application works "even on a plane, subway or
+elevator". That is a reflection of when said developers have to deal with
+networks being unavailable.
But this leaves out a big chunk of the world where internet connection is
intermittent, or only works every other day or only once a week, or stops
-working when it rains, *etc*. For this audience, living without network
-connectivity isn't such a discrete moment in time, but part of every day life. I
-like the fact that the authors acknowledge that.
+working when it rains, _etc_. For this audience, living without network
+connectivity isn't such a discrete moment in time, but part of everyday life.
+I like the fact that the authors acknowledge that.
When discussing "working offline", I'd rather keep this type of person in mind,
then the subset of people who are offline when on the elevator will naturally be
@@ -268,31 +267,32 @@ included.
== On CRDTs and developer experience
+:archived-article: https://web.archive.org/web/20130116163535/https://labs.oracle.com/techrep/1994/smli_tr-94-29.pdf
+
When discussing developer experience, the authors bring up some questions to be
answered further, like:
-> For an app developer, how does the use of a CRDT-based data layer compare to
-> existing storage layers like a SQL database, a filesystem, or CoreData? Is a
-> distributed system harder to write software for?
+____
+For an app developer, how does the use of a CRDT-based data layer compare to
+existing storage layers like a SQL database, a filesystem, or CoreData? Is a
+distributed system harder to write software for?
+____
That is an easy one: yes.
-A distributed system *is* harder to write software for, being a distributed
+A distributed system _is_ harder to write software for, being a distributed
system.
Adding a large layer of data structures and algorithms will make it more complex
-to write software for, naturally. And if trying to make this layer transparent
+to write software for, naturally. And trying to make this layer transparent
to the programmer, so they can pretend that layer doesn't exist, is a bad idea,
as RPC frameworks have tried, and failed.
-See "[A Note on Distributed Computing][note-dist-comp]" for a critique on RPC
+See "{archived-article}[A Note on Distributed Computing]" for a critique on RPC
frameworks trying to make the network invisible, which I think also applies
equally to making the CRDTs layer invisible.
-[rmi-wiki]: https://en.wikipedia.org/wiki/Java_remote_method_invocation
-[note-dist-comp]: https://web.archive.org/web/20130116163535/http://labs.oracle.com/techrep/1994/smli_tr-94-29.pdf
-
-## Conclusion
+== Conclusion
I liked the article a lot, as it took the "offline-first" philosophy and ran
with it.