Diffstat (limited to 'src/content/blog/2020/11')
-rw-r--r-- | src/content/blog/2020/11/07/diy-bugs.adoc | 93 | ||||
-rw-r--r-- | src/content/blog/2020/11/08/paradigm-shift-review.adoc | 154 | ||||
-rw-r--r-- | src/content/blog/2020/11/12/database-parsers-trees.adoc | 226 | ||||
-rw-r--r-- | src/content/blog/2020/11/14/local-first-review.adoc | 305 |
4 files changed, 0 insertions, 778 deletions
diff --git a/src/content/blog/2020/11/07/diy-bugs.adoc b/src/content/blog/2020/11/07/diy-bugs.adoc deleted file mode 100644 index 8ab7953..0000000 --- a/src/content/blog/2020/11/07/diy-bugs.adoc +++ /dev/null @@ -1,93 +0,0 @@ -= DIY an offline bug tracker with text files, Git and email -:updatedat: 2021-08-14 - -:attack-on-ytdl: https://github.com/github/dmca/blob/master/2020/10/2020-10-23-RIAA.md -:list-discussions: https://sourcehut.org/blog/2020-10-29-how-mailing-lists-prevent-censorship/ -:docs-in-repo: https://podcast.writethedocs.org/2017/01/25/episode-3-trends/ -:ci-in-notes: link:../../../../tils/2020/11/30/git-notes-ci.html -:todos-mui: https://man.sr.ht/todo.sr.ht/#email-access -:git-bug-bridges: https://github.com/MichaelMure/git-bug#bridges - -When {attack-on-ytdl}[push comes to shove], the operational aspects of -governance of a software project matter a lot. And everybody likes to chime in -with their alternative of how to avoid single points of failure in project -governance, just like I'm doing right now. - -The most valuable assets of a project are: - -. source code -. discussions -. documentation -. builds -. tasks and bugs - -For *source code*, Git and other DVCS solve that already: everybody gets a full -copy of the entire source code. - -If your code forge is compromised, moving it to a new one takes a couple of -minutes, if there isn't a secondary remote serving as mirror already. In this -case, no action is required. - -If you're having your *discussions* by email, "{list-discussions}[taking this -archive somewhere else and carrying on is effortless]". - -Besides, make sure to backup archives of past discussions so that the history is -also preserved when this migration happens. - -The *documentation* should {docs-in-repo}[live inside the repository -itself]footnote:writethedocs-in-repo[ - Described as "the ultimate marriage of the two". Starts at time 31:50. 
-], so that it not only gets first-class treatment, but also gets distributed to -everybody. Migrating the code to a new forge already migrates the -documentation with it. - -As long as you keep the *builds* vendor neutral, the migration should only -involve adapting how you call your `tests.sh` from the format that -`provider-1.yml` uses to the format that `provider-2.yml` accepts. It isn't -valuable to carry the build history with the project, as this data quickly -decays in value as weeks and months go by, but for simple text logs -{ci-in-notes}[using Git notes] may be just enough, and they would be replicated -with the rest of the repository. - -But for *tasks and bugs* many rely on a vendor-specific service, where -you register and manage those issues via a web browser. Some provide an -{todos-mui}[interface for interacting via email] or an API for -{git-bug-bridges}[bridging local bugs with vendor-specific services]. But -they're all layers around the service that disguise the fact that it is a central -point of failure, which when compromised would lead to data loss. When push -comes to shove, you'd lose data. - -== Alternative: text files, Git and email - -:todos-example: https://euandre.org/git/remembering/tree/TODOs.md?id=3f727802cb73ab7aa139ca52e729fd106ea916d0 -:todos-script: https://euandre.org/git/remembering/tree/aux/workflow/TODOs.sh?id=3f727802cb73ab7aa139ca52e729fd106ea916d0 -:todos-html: https://euandreh.xyz/remembering/TODOs.html -:fossil-tickets: https://fossil-scm.org/home/doc/trunk/www/bugtheory.wiki - -Why not do the same as documentation, and move tasks and bugs into the -repository itself? - -It requires no extra tool to be installed, and fits right in the already -existing workflow for source code and documentation. - -I like to keep a {todos-example}[`TODOs.md`] file at the repository top-level, -with two relevant sections: "tasks" and "bugs". 
Then when building the -documentation I'll just {todos-script}[generate an HTML file from it], and -{todos-html}[publish] it alongside the static website. All that is done on the -main branch. - -Any issues discussions are done in the mailing list, and a reference to a -discussion could be added to the ticket itself later on. External contributors -can file tickets by sending a patch. - -The good thing about this solution is that it works for 99% of projects out -there. - -For the other 1%, having Fossil's "{fossil-tickets}[tickets]" could be an -alternative, but you may not want to migrate your project to Fossil to get those -niceties. - -Even though I keep a `TODOs.md` file on the main branch, you can have a `tasks` -branch with a `task-n.md` file for each task, or any other way you like. - -These tools are familiar enough that you can adjust it to fit your workflow. diff --git a/src/content/blog/2020/11/08/paradigm-shift-review.adoc b/src/content/blog/2020/11/08/paradigm-shift-review.adoc deleted file mode 100644 index 1110085..0000000 --- a/src/content/blog/2020/11/08/paradigm-shift-review.adoc +++ /dev/null @@ -1,154 +0,0 @@ -= The Next Paradigm Shift in Programming - video review -:categories: video-review - -:reviewed-video: https://www.youtube.com/watch?v=6YbK8o9rZfI - -This is a review with comments of "{reviewed-video}[The Next Paradigm Shift in -Programming]", by Richard Feldman. - -This video was _strongly_ suggested to me by a colleague. I wanted to discuss -it with her, and when drafting my response I figured I could publish it publicly -instead. - -Before anything else, let me just be clear: I really like the talk, and I think -Richard is a great public speaker. I've watched several of his talks over the -years, and I feel I've followed his career at a distance, with much respect. -This isn't a piece criticizing him personally, and I agree with almost -everything he said. 
These are just some comments, and also nitpicks, on a few -topics I think he missed, or that I view differently. - -== Structured programming - -:forgotten-art-video: https://www.youtube.com/watch?v=SFv8Wm2HdNM - -The historical overview at the beginning is very good. In fact, the very video -I watched previously was about structured programming! - -Kevlin Henney on "{forgotten-art-video}[The Forgotten Art of Structured -Programming]" does a deep-dive on the topic of structured programming, and how -in his view it is still hidden in our code, when we do a `continue` or a `break` -in some ways. Even though it is less common to see an explicit `goto` in code -these days, many of the original arguments of Dijkstra against explicit `goto`s -are applicable to other constructs, too. - -This is a very mature view, and I like how he goes beyond the "don't use -`goto`s" heuristic and proposes a much more nuanced understanding of what -"structured programming" means. - -In a few minutes, Richard is able to condense most of the significant bits of -Kevlin's talk in a didactic way. Good job. - -== OOP like a distributed system - -:joe-oop: https://www.infoq.com/interviews/johnson-armstrong-oop/ -:rich-hickey-oop: https://www.youtube.com/watch?v=ROor6_NGIWU - -Richard extrapolates Alan Kay's original vision of OOP, and he concludes that it -is more like a distributed system than how people think about OOP these days. -But he then states that this is a rather bad idea, and we shouldn't pursue it, -given that distributed systems are known to be hard. - -However, his extrapolation isn't really impossible, bad or absurd. In fact, -it has been followed through by Erlang. Joe Armstrong used to say that -"{joe-oop}[Erlang might be the only OOP language]", since it actually adopted this -paradigm. - -But Erlang is a functional language. So this "OOP as a distributed system" view -is more about designing systems in the large than programs in the small. 
- -There is a switch of levels in this comparison I'm making, as can be done with -any language or paradigm: you can have a functional-like system that is built -with an OOP language (like a compiler, that given the same input will produce -the same output), or an OOP-like system that is built with a functional -language (Rich Hickey calls it "{rich-hickey-oop}[OOP in the -large]"footnote:langsys[ - From 24:05 to 27:45. -]). - -So this jump from in-process paradigm to distributed paradigm is rather a big -one, and I don't think he can argue that OOP has anything to say about -software distribution across nodes. You can still have Erlang actors that run -independently and send messages to each other without a network between them. -Any OTP application deployed on a single node effectively works like that. - -I think he went a bit too far with this extrapolation. Even though I agree it -is a logical and fair one, it isn't as evidently bad as he painted. I would be fine -working with a single-node OTP application and seeing someone call it "a _real_ -OOP program". - -== First class immutability - -:immer: https://sinusoid.es/immer/ -:immutable-js: https://immutable-js.github.io/immutable-js/ - -I agree with his view of languages moving towards the functional paradigm. But -I think you can narrow down the "first-class immutability" feature he points out -as present in modern functional programming languages to "first-class immutable -data structures". - -I wouldn't categorize a language as "supporting functional programming style" -without a library for functional data structures in it. By discipline you can -avoid side-effects, write pure functions as much as possible, and pass functions -around as arguments in almost every language these days, but if changing an -element of a vector mutates things in-place, that is still not functional -programming. 
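The distinction between in-place mutation and an immutable update can be illustrated with a small sketch. It uses Python's built-in `list` and `tuple` purely for illustration (the language is my choice, not the author's), and the function names `set_in_place` and `set_persistent` are hypothetical.

```python
def set_in_place(vector, index, value):
    """Mutate the caller's list: every holder of `vector` sees the change."""
    vector[index] = value
    return vector


def set_persistent(vector, index, value):
    """Return a new tuple and leave the original untouched."""
    return vector[:index] + (value,) + vector[index + 1:]


mutable = [1, 2, 3]
alias = mutable                 # a second reference to the same list
set_in_place(mutable, 0, 99)    # `alias` changes too

immutable = (1, 2, 3)
updated = set_persistent(immutable, 0, 99)  # `immutable` is unchanged

print(alias, immutable, updated)
```

The tuple version copies the whole sequence on every update, which libraries such as immer or ImmutableJS avoid through structural sharing. Without such a data structure, the in-place behaviour of the first function is what one gets by default.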
- -To avoid that, you end-up needing to make clones of objects to pass to a -function, using freezes or other workarounds. All those cases are when the -underlying mix of OOP and functional programming fail. - -There are some languages with third-party libraries that provide functional data -structures, like {immer}[immer] for C++, or {immutable-js}[ImmutableJS] for -JavaScript. - -But functional programming is more easily achievable in languages that have them -built-in, like Erlang, Elm and Clojure. - -== Managed side-effects - -:redux: https://redux.js.org/ -:re-frame: https://github.com/Day8/re-frame - -His proposal of adopting managed side-effects as a first-class language concept -is really intriguing. - -This is something you can achieve with a library, like {redux}[Redux] for -JavaScript or {re-frame}[re-frame] for Clojure. - -I haven't worked with a language with managed side-effects at scale, and I don't -feel this is a problem with Clojure or Erlang. But is this me finding a flaw in -his argument or not acknowledging a benefit unknown to me? This is a -provocative question I ask myself. - -Also all FP languages with managed side-effects I know are statically-typed, and -all dynamically-typed FP languages I know don't have managed side-effects baked -in. - -== What about declarative programming? - -:tarpit-article: https://curtclifton.net/papers/MoseleyMarks06a.pdf - -In "{tarpit-article}[Out of the Tar Pit]", B. Moseley and P. Marks go beyond his -view of functional programming as the basis, and name a possible "functional -relational programming" as an even better solution. They explicitly call out -some flaws in most of the modern functional programming languages, and instead -pick declarative programming as an even better starting paradigm. - -If the next paradigm shift is towards functional programming, will the following -shift be towards declarative programming? 
- -== Conclusion - -:simple-made-easy: https://www.infoq.com/presentations/Simple-Made-Easy/ - -Beyond all Richard said, I also hear often bring up functional programming when -talking about utilizing all cores of a computer, and how FP can help with that. - -Rich Hickey makes a great case for single-process FP on his famous talk -"{simple-made-easy}[Simple Made Easy]". - -//// -I find this conclusion too short, and it doesn't revisits the main points -presented on the body of the article. I won't rewrite it now, but it would be -an improvement to extend it to do so. -//// diff --git a/src/content/blog/2020/11/12/database-parsers-trees.adoc b/src/content/blog/2020/11/12/database-parsers-trees.adoc deleted file mode 100644 index 47595e8..0000000 --- a/src/content/blog/2020/11/12/database-parsers-trees.adoc +++ /dev/null @@ -1,226 +0,0 @@ -= Durable persistent trees and parser combinators - building a database -:categories: mediator -:updatedat: 2021-02-09 - -:empty: -:db-article: link:../../08/31/database-i-wish-i-had.html - -I've received with certain frequency messages from people wanting to know if -I've made any progress on the database project {db-article}[I've written about]. - -There are a few areas where I've made progress, and here's a public post on it. - -== Proof-of-concept: DAG log - -:mediator-permalink: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n1 - -The main thing I wanted to validate with a concrete implementation was the -concept of modeling a DAG on a sequence of datoms. - -The notion of a _datom_ is a rip-off from Datomic, which models data with time -aware _facts_, which come from RDF. RDF's fact is a triple of -subject-predicate-object, and Datomic's datoms add a time component to it: -subject-predicate-object-time, A.K.A. 
entity-attribute-value-transaction: - -[source,clojure] ----- -[[person :likes "pizza" 0 true] - [person :likes "bread" 1 true] - [person :likes "pizza" 1 false]] ----- - -The above datoms say: - at time 0, `person` like pizza; - at time 1, `person` -stopped liking pizza, and started to like bread. - -Datomic ensures total consistency of this ever growing log by having a single -writer, the transactor, that will enforce it when writing. - -In order to support disconnected clients, I needed a way to allow multiple -writers, and I chose to do it by making the log not a list, but a directed -acyclic graph (DAG): - -[source,clojure] ----- -[[person :likes "pizza" 0 true] - [0 :parent :db/root 0 true] - [person :likes "bread" 1 true] - [person :likes "pizza" 1 false] - [1 :parent 0 1 true]] ----- - -The extra datoms above add more information to build the directionality to the -log, and instead of a single consistent log, the DAG could have multiple leaves -that coexist, much like how different Git branches can have different "latest" -commits. - -In order to validate this idea, I started with a Clojure implementation. The -goal was not to write the actual final code, but to make a proof-of-concept that -would allow me to test and stretch the idea itself. 
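The DAG-on-datoms idea described above can be sketched in a few lines. This is a minimal Python illustration under my own assumptions, not the Clojure proof-of-concept: the tuple layout mirrors the datoms shown earlier, the string `"db/root"` stands in for `:db/root`, the third writer's datoms are made-up data, and the helper names `parents_of` and `leaves` are hypothetical.

```python
# Datoms as (entity, attribute, value, transaction, added?) tuples,
# mirroring the example above. "db/root" stands in for :db/root.
ROOT = "db/root"

datoms = [
    ("person", "likes", "pizza", 0, True),
    (0, "parent", ROOT, 0, True),
    ("person", "likes", "bread", 1, True),
    ("person", "likes", "pizza", 1, False),
    (1, "parent", 0, 1, True),
    # A second writer that also starts from transaction 0 (hypothetical data).
    ("person", "likes", "pasta", 2, True),
    (2, "parent", 0, 2, True),
]


def parents_of(datoms):
    """Map each transaction to the set of its parent transactions."""
    parents = {}
    for entity, attribute, value, _tx, added in datoms:
        if attribute == "parent" and added:
            parents.setdefault(entity, set()).add(value)
    return parents


def leaves(datoms):
    """Transactions that no other transaction names as a parent."""
    parents = parents_of(datoms)
    referenced = set().union(*parents.values()) if parents else set()
    return sorted(tx for tx in parents if tx not in referenced)


print(leaves(datoms))
```

With the two concurrent writers above, transactions 1 and 2 are both leaves, much like two Git branches that diverge from a common commit. None of this is taken from the Clojure proof-of-concept mentioned above, which is the actual implementation.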
- -This code {mediator-permalink}[already exists], but is still fairly incomplete: - -:commented-code: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n295 -:more: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n130 -:than: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n146 -:one: https://euandre.org/git/mediator/tree/src/core/clojure/src/mediator.clj?id=db4a727bc24b54b50158827b34502de21dbf8948#n253 - -* the building of the index isn't done yet (with some {commented-code}[commented - code] on the next step to be implemented); -* the indexing is extremely inefficient, with {more}[more] {than}[than] - {one}[one] occurrence of `O(n²)` functions; -* no query support yet. - -== Top-down _and_ bottom-up - -However, as time passed and I started looking at what the final implementation -would look like, I started to consider keeping the PoC around. - -The top-down approach (Clojure PoC) was in fact helping guide me with the -bottom-up, and I now have "promoted" the Clojure PoC into a "reference -implementation". It should now be a finished implementation that says what the -expected behaviour is, and the actual code should match the behaviour. - -The good thing about a reference implementation is that it has no performance or -resource constraints, so if it ends up being 1000× slower and using 500× more -memory, it should be fine. The code can also be 10× or 100× simpler. - -== Top-down: durable persistent trees - -:pavlo-videos: https://www.youtube.com/playlist?list=PLSE8ODhjZXjbohkNBWQs_otTrBTrjyohi -:db-book: https://www.databass.dev/ - -In promoting the PoC into a reference implementation, this top-down approach now -needs to go beyond doing everything in memory, and the index data structure now -needs to be disk-based. 
- -Roughly speaking, most storage engines out there are based either on B-Trees or -LSM Trees, or some variations of those. - -But when building an immutable database, update-in-place B-Trees aren't an -option, as they don't accommodate keeping historical views of the tree. LSM -Trees may seem a better alternative, but compaction, which removes duplicated -entries from the files, is also a way to delete old data that is in fact useful for a -historical view. - -I think the thing I'm after is a mix of a Copy-on-Write B-Tree, which would keep -historical versions, with the write IO cost amortization of the memtables of LSM -Trees. I don't know of any B-Tree variant out there that resembles this, so -I'll call it "Flushing Copy-on-Write B-Tree". - -I haven't written any code for this yet, so all I have is a high-level view of -what it will look like: - -. like Copy-on-Write B-Trees, changing a leaf involves creating a new leaf and - building a new path from root to the leaf. The upside is that writes are lock - free, and no coordination is needed between readers and writers, ever; -. the downside is that a single leaf update means at least `H` new nodes that - will have to be flushed to disk, where `H` is the height of the tree. To - avoid that, the writer creates these nodes exclusively on the in-memory - memtable, to avoid flushing to disk on every leaf update; -. a background job will consolidate the memtable data every time it hits X MB, - and persist it to disk, amortizing the cost of the Copy-on-Write B-Tree; -. readers then will have the extra job of getting the latest relevant - disk-resident value and merging it with the memtable data. - -The key difference to existing Copy-on-Write B-Trees is that the new trees are -only periodically written to disk, and the intermediate values are kept in -memory. Since no node is ever updated, the page utilization is maximal as it -doesn't need to keep space for future inserts and updates. 
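The steps above can be sketched roughly as follows. This is only an illustration under several stated assumptions, since the text says no code exists yet: it uses a persistent binary search tree instead of a real B-Tree, a node count instead of the "X MB" threshold, a plain list as a stand-in for the disk, and a synchronous flush instead of a background job. All names (`Node`, `insert`, `FlushingTree`, `get`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node: updates create new nodes instead of mutating."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root, key, value, created):
    """Return a new root, copying only the nodes on the path to `key`.

    Every node created along the way is appended to `created`.
    """
    if root is None:
        node = Node(key, value)
    elif key < root.key:
        node = Node(root.key, root.value,
                    insert(root.left, key, value, created), root.right)
    elif key > root.key:
        node = Node(root.key, root.value,
                    root.left, insert(root.right, key, value, created))
    else:
        node = Node(key, value, root.left, root.right)
    created.append(node)
    return node


def get(root, key):
    """Look up `key` in the version of the tree rooted at `root`."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


class FlushingTree:
    """Buffers newly created nodes in memory and persists them in batches."""

    def __init__(self, flush_threshold):
        self.root = None
        self.memtable = []   # nodes created since the last flush
        self.disk = []       # stand-in for pages persisted to disk
        self.flush_threshold = flush_threshold
        self.flushes = 0

    def put(self, key, value):
        self.root = insert(self.root, key, value, self.memtable)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Persist every buffered node at once, amortizing the write cost."""
        self.disk.extend(self.memtable)
        self.memtable.clear()
        self.flushes += 1


tree = FlushingTree(flush_threshold=4)
tree.put(2, "b")
v1 = tree.root       # keep a handle on this historical version
tree.put(1, "a")
tree.put(3, "c")     # the buffer reaches the threshold here and is flushed

print(get(v1, 1), get(tree.root, 1), tree.flushes, len(tree.disk))
```

Old roots such as `v1` stay readable after later writes, which is the historical-view property the text is after. The reader-side merge of disk-resident and memtable data is not modelled here, because in this in-memory sketch readers simply follow object references.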
- -And the key difference to existing LSM Trees is that no compaction is run: -intermediate values are still relevant as the database grows. So this leaves -out tombstones and value duplication done for write performance. - -One can delete intermediate index values to reclaim space, but no data is lost -on the process, only old B-Tree values. And if the database ever comes back to -that point (like when doing a historical query), the B-Tree will have to be -rebuilt from a previous value. After all, the database _is_ a set of datoms, -and everything else is just derived data. - -Right now I'm still reading about other data structures that storage engines -use, and I'll start implementing the "Flushing Copy-on-Write B-Tree" as I learn -more{empty}footnote:learn-more-db[ - If you are interested in learning more about this too, the very best two - resources on this subject are Andy Pavlo's "{pavlo-videos}[Intro to Database - Systems]" course and Alex Petrov's "{db-book}[Database Internals]" book. -] and mature it more. - -== Bottom-up: parser combinators and FFI - -:cbindgen: https://github.com/eqrion/cbindgen -:cbindgen-next: https://blog.eqrion.net/future-directions-for-cbindgen/ -:syn-crate: https://github.com/dtolnay/syn -:libedn: https://euandre.org/git/libedn/ - -I chose Rust as it has the best WebAssembly tooling support. - -My goal is not to build a Rust database, but a database that happens to be in -Rust. In order to reach client platforms, the primary API is the FFI one. - -I'm not very happy with current tools for exposing Rust code via FFI to the -external world: they either mix C with C++, which I don't want to do, or -provide no access to the intermediate representation of the FFI, which would be -useful for generating binding for any language that speaks FFI. 
- -I prefer the path that the author of the {cbindgen}[cbindgen] crate -{cbindgen-next}[proposes]: emitting a data representation of the Rust C API -(the author calls it an `ffi.json` file), and then building transformers from the -data representation to the target language. This way you could generate a C API -_and_ the node-ffi bindings for JavaScript automatically from the Rust code. - -So the first thing to be done before moving on is an FFI exporter that doesn't -mix C and C++, and generates said `ffi.json`, and then to build a few transformers -that take this `ffi.json` and generate the language bindings, be it C, C++, -JavaScript, TypeScript, Kotlin, Swift, Dart, -_etc_footnote:ffi-langs[ - Those are, specifically, the languages I'm most interested in. My goal is - supporting client applications, and those languages are the most relevant for - doing so: C for GTK, C++ for Qt, JavaScript and TypeScript for Node.js and - browser, Kotlin for Android and Swing, Swift for iOS, and Dart for Flutter. -]. - -I think the best way to get there is by taking the existing code for cbindgen, -which uses the {syn-crate}[syn] crate to parse the Rust -code{empty}footnote:rust-syn[ - The fact that syn is an external crate to the Rust compiler points to a big - warning: procedural macros are not first class in Rust. They are just like - Babel plugins in JavaScript land, with the extra shortcoming that there is no - specification for the Rust syntax, unlike JavaScript. -pass:[</p><p>] - As flawed as this may be, it seems to be generally acceptable and adopted, - which works against building a solid ecosystem for Rust. -pass:[</p><p>] - The alternative that rust-ffi implements relies on internals of the Rust - compiler, which isn't actually worse, just less common and less accepted. -], and adapt it to emit the metadata. - -I've started a fork of cbindgen: -[line-through]#x-bindgen#{empty}footnote:x-bindgen[ - _EDIT_: now archived, the experimentation was fun. 
I've started to move more - towards C, so this effort became deprecated. -]. Right now it is just a copy of cbindgen verbatim, and I plan to remove all C -and C++ emitting code from it, and add a IR emitting code instead. - -When starting working on x-bindgen, I realized I didn't know what to look for in -a header file, as I haven't written any C code in many years. So as I was -writing {libedn}[libedn], I didn't know how to build a good C API to expose. So -I tried porting the code to C, and right now I'm working on building a _good_ C -API for a JSON parser using parser combinators: -[line-through]#ParsecC#{empty}footnote:parsecc[ - _EDIT_: now also archived. -]. - -After "finishing" ParsecC I'll have a good notion of what a good C API is, and -I'll have a better direction towards how to expose code from libedn to other -languages, and work on x-bindgen then. - -What both libedn and ParsecC are missing right now are proper error reporting, -and property-based testing for libedn. - -== Conclusion - -I've learned a lot already, and I feel the journey I'm on is worth going -through. - -If any of those topics interest you, message me to discuss more or contribute! -Patches welcome! diff --git a/src/content/blog/2020/11/14/local-first-review.adoc b/src/content/blog/2020/11/14/local-first-review.adoc deleted file mode 100644 index f9dd4b0..0000000 --- a/src/content/blog/2020/11/14/local-first-review.adoc +++ /dev/null @@ -1,305 +0,0 @@ -= Local-First Software: article review -:categories: presentation article-review - -:empty: -:presentation: link:../../../../slides/2020/11/14/local-first.html FIXME -:reviewed-article: https://martin.kleppmann.com/papers/local-first.pdf - -_This article is derived from a {presentation}[presentation] given at a Papers -We Love meetup on the same subject._ - -This is a review of the article "{reviewed-article}[Local-First Software: You -Own Your Data, in spite of the Cloud]", by M. Kleppmann, A. Wiggins, P. Van -Hardenberg and M. F. 
McGranaghan. - -== Offline-first, local-first - -The "local-first" term they use isn't new, and I have used it myself in the past -to refer to this type of application, where the data lives primarily on the -client, and there are conflict resolution algorithms that reconcile data created -on different instances. - -Sometimes I see confusion with this idea and "client-side", "offline-friendly", -"syncable", etc. I have myself used these terms, also. - -There exists, however, already the "offline-first" term, which conveys almost -all of that meaning. In my view, "local-first" doesn't extend "offline-first" -in any aspect, rather it gives a well-defined meaning to it instead. I could -say that "local-first" is just "offline-first", but with 7 well-defined ideals -instead of community best practices. - -It is a step forward, and given the number of times I've seen the paper shared -around I think there's a chance people will prefer saying "local-first" in -_lieu_ of "offline-first" from now on. - -== Software licenses - -On a footnote of the 7th ideal ("You Retain Ultimate Ownership and Control"), -the authors say: - -____ -In our opinion, maintaining control and ownership of data does not mean that the -software must necessarily be open source. (...) as long as it does not -artificially restrict what users can do with their files. -____ - -They give examples of artificial restrictions, like this artificial restriction -I've come up with: - -[source,sh] ----- -#!/bin/sh - -TODAY=$(date +%s) -LICENSE_EXPIRATION=$(date -d 2020-11-15 +%s) - -if [ $TODAY -ge $LICENSE_EXPIRATION ]; then - echo 'License expired!' - exit 1 -fi - -echo $((2 + 2)) ----- - -Now when using this very useful program: - -[source,sh] ----- -# today -$ ./useful-adder.sh -4 -# tomorrow -$ ./useful-adder.sh -License expired! ----- - -This is obviously an intentional restriction, and it goes against the 5th ideal -("The Long Now"). 
This software would only be useful as long as the embedded -license expiration allowed. Sure you could change the clock on the computer, -but there are many other ways that this type of intentional restriction is in -conflict with that ideal. - -However, what about unintentional restrictions? What if a piece of software had an equal -or similar restriction, and stopped working after some days pass? Or what if the -programmer added a constant to make the development simpler, and this led to -unintentionally restricting the user? - -[source,sh] ----- -# today -$ useful-program -# ...useful output... - -# tomorrow, with more data -$ useful-program -ERROR: Panic! Stack overflow! ----- - -Just as easily as I can come up with ways to intentionally restrict users, I can -do the same for unintentional restrictions. A program can stop working for a -variety of reasons. - -If it stops working due to, say, data growth, what are the options? Reverting -to an earlier backup, and making it read-only? That isn't really a "Long Now", -but rather a "Long Now as long as the software keeps working as expected". - -The point is: if the software isn't free, "The Long Now" isn't achievable -without a lot of wishful thinking. Maybe the authors were trying to be more -friendly towards businesses that don't like free software, but in doing so they've -proposed a contradiction by reconciling "The Long Now" with proprietary -software. - -It isn't the same as saying that any free software achieves that ideal, either. -The license can still be free, but the source code can become unavailable due to -cloud rot. Or maybe the build is undocumented, or the build tools had specific -configuration that one has to guess. A piece of free software can still fail to -achieve "The Long Now". Being free doesn't guarantee it, just makes it -possible. - -A colleague has challenged my view, arguing that the software doesn't really -need to be free, as long as there is a specification of the file format. 
This -way if the software stops working, the format can still be processed by other -programs. But this doesn't apply in practice: if you have a document that you -write to, and software stops working, you still want to write to the document. -An external tool that navigates the content and shows it to you won't allow you -to keep writing, and when it does that tool is now starting to re-implement the -software. - -An open specification could serve as a blueprint to other implementations, -making the data format more friendly to reverse-engineering. But the -re-implementation still has to exist, at which point the original software -failed to achieve "The Long Now". - -It is less bad, but still not quite there yet. - -== Denial of existing solutions - -:distgit: https://drewdevault.com/2018/07/23/Git-is-already-distributed.html - -When describing "Existing Data Storage and Sharing Models", on a -footnote{empty}footnote:devil[ - This is the second aspect that I'm picking on the article from a footnote. I - guess the devil really is on the details. -] the authors say: - -____ -In principle it is possible to collaborate without a repository service, e.g. by -sending patch files by email, but the majority of Git users rely on GitHub. -____ - -The authors go to a great length to talk about usability of cloud apps, and even -point to research they've done on it, but they've missed learning more from -local-first solutions that already exist. - -Say the automerge CRDT proves to be even more useful than what everybody -imagined. Say someone builds a local-first repository service using it. How -will it change anything of the Git/GitHub model? What is different about it -that prevents people in the future writing a paper saying: - -____ -In principle it is possible to collaborate without a repository service, e.g. by -using automerge and platform X, but the majority of Git users rely on GitHub. -____ - -How is this any better? 
- -If it is already {distgit}[possible] to have a local-first development workflow, -why don't people use it? Is it just fashion, or is there a fundamental problem -with it? If so, what is it, and how to avoid it? - -If sending patches by email is perfectly possible but out of fashion, why even -talk about Git/GitHub? Isn't this a problem that people are putting themselves -in? How can CRDTs possibly prevent people from doing that? - -My impression is that the authors envision a better future, where development is -fully decentralized unlike today, and somehow CRDTs will make that happen. If -more people think this way, "CRDT" is next in line for the list of buzzwords that -solve everything, like "containers", "blockchain" or "machine learning". - -Rather than picturing an imaginary service that could be described like -"GitHub+CRDTs" and people would adopt it, I'd rather better understand why -people don't do it already, since Git is built to work like that. - -== Ditching of web applications - -:pouchdb: https://pouchdb.com/ -:instant-apps: https://developer.android.com/topic/google-play-instant - -The authors put web applications in a worse position for building local-first -applications, claiming that: - -____ -(...) the architecture of web apps remains fundamentally server-centric. -Offline support is an afterthought in most web apps, and the result is -accordingly fragile. -____ - -Well, I disagree. - -The problem isn't inherent to the web platform, but instead how people use it. - -I have myself built offline-first applications, leveraging IndexedDB, App Cache, -_etc_. I wanted to build an offline-first application on the web, and so I did. - -In fact, many people choose {pouchdb}[PouchDB] _because_ of that, since it is a -good tool for offline-first web applications. The problem isn't really the -technology, but how much people want their application to be local-first. 
- -Contrast it with Android {instant-apps}[Instant Apps], where applications are -sent to the phone in small parts. Since this requires an internet connection to -move from one part of the app bundle to another, a subset of the app isn't -local-first, despite being an app. - -The point isn't the technology, but how people are using it. Local-first web -applications are perfectly possible, just like non-local-first native -applications are possible. - -== Costs are underrated - -I think the costs of "old-fashioned apps" over "cloud apps" are underrated, -mainly regarding storage, and that these costs can vary a lot by application. - -Say a person writes online articles for their personal website, and puts -everything into Git. Since there isn't supposed to be any collaboration, all of -the relevant ideals of local-first are achieved. - -Now another person creates videos instead of articles. They could try keeping -everything local, but after some time the storage usage fills the entire disk. -This person's local-first setup would be much more complex, and would cost much -more in maintenance, backup and storage. - -Even though both have similar needs, a local-first video repository is much more -demanding. So the local-first thinking here isn't "just keep everything local", -but "how much time and money am I willing to spend to keep everything local". - -The convenience of "cloud apps" becomes so attractive that many don't even have -a local copy of their videos, and rely exclusively on service providers to -maintain, backup and store their content. - -The dial measuring "cloud apps" and "old-fashioned apps" needs to be specific to -use cases. - -== Real-time collaboration is optional - -If I were the one making the list of ideals, I wouldn't focus so much on -real-time collaboration. - -Even though seamless collaboration is desired, it being real-time depends on the -network being available for that. 
But ideal 3 states that "The Network is -Optional", so real-time collaboration is also optional. - -The fundamentals of a local-first system should enable real-time collaboration -when the network is available, but shouldn't focus on it. - -In many discussions about applications being offline, it is common for me -to find people saying that their application works "even on a plane, subway or -elevator". That reflects the situations in which those developers themselves -have to deal with the network being unavailable. - -But this leaves out a big chunk of the world where internet connection is -intermittent, or only works every other day or only once a week, or stops -working when it rains, _etc_. For this audience, living without network -connectivity isn't such a discrete moment in time, but part of everyday life. -I like the fact that the authors acknowledge that. - -When discussing "working offline", I'd rather keep this type of person in mind; -then the subset of people who are offline when on the elevator will naturally be -included. - -== On CRDTs and developer experience - -:archived-article: https://web.archive.org/web/20130116163535/https://labs.oracle.com/techrep/1994/smli_tr-94-29.pdf - -When discussing developer experience, the authors bring up some questions to be -answered further, like: - -____ -For an app developer, how does the use of a CRDT-based data layer compare to -existing storage layers like a SQL database, a filesystem, or CoreData? Is a -distributed system harder to write software for? -____ - -That is an easy one: yes. - -A distributed system _is_ harder to write software for, being a distributed -system. - -Adding a large layer of data structures and algorithms will make it more complex -to write software for, naturally. And trying to make this layer transparent -to the programmer, so they can pretend that layer doesn't exist, is a bad idea, -as RPC frameworks have tried and failed to do. 
- -See "{archived-article}[A Note on Distributed Computing]" for a critique of RPC -frameworks trying to make the network invisible, which I think applies -equally to making the CRDT layer invisible. - -== Conclusion - -I liked the article a lot, as it took the "offline-first" philosophy and ran -with it. - -But I think the authors' view of adding CRDTs and things becoming local-first is -a bit too magical. - -This particular area is one that I have great interest in, and I wish to see -more being done in the "local-first" space. |