Diffstat (limited to 'src/content/en/blog/2020/11/14/local-first-review.adoc')
-rw-r--r-- src/content/en/blog/2020/11/14/local-first-review.adoc 305
1 files changed, 305 insertions, 0 deletions
diff --git a/src/content/en/blog/2020/11/14/local-first-review.adoc b/src/content/en/blog/2020/11/14/local-first-review.adoc
new file mode 100644
index 0000000..f9dd4b0
--- /dev/null
+++ b/src/content/en/blog/2020/11/14/local-first-review.adoc
@@ -0,0 +1,305 @@
+= Local-First Software: article review
+:categories: presentation article-review
+
+:empty:
+:presentation: link:../../../../slides/2020/11/14/local-first.html FIXME
+:reviewed-article: https://martin.kleppmann.com/papers/local-first.pdf
+
+_This article is derived from a {presentation}[presentation] given at a Papers
+We Love meetup on the same subject._
+
+This is a review of the article "{reviewed-article}[Local-First Software: You
+Own Your Data, in spite of the Cloud]", by M. Kleppmann, A. Wiggins, P. Van
+Hardenberg and M. F. McGranaghan.
+
+== Offline-first, local-first
+
+The "local-first" term they use isn't new, and I have used it myself in the past
+to refer to this type of application, where the data lives primarily on the
+client, and there are conflict resolution algorithms that reconcile data created
+on different instances.
+
+Sometimes I see this idea confused with "client-side", "offline-friendly",
+"syncable", etc. I have used these terms myself, too.
+
+However, the "offline-first" term already exists, and it conveys almost all of
+that meaning. In my view, "local-first" doesn't extend "offline-first" in any
+aspect; rather, it gives it a well-defined meaning. I could say that
+"local-first" is just "offline-first", but with 7 well-defined ideals instead of
+community best practices.
+
+It is a step forward, and given the number of times I've seen the paper shared
+around I think there's a chance people will prefer saying "local-first" in
+_lieu_ of "offline-first" from now on.
+
+== Software licenses
+
+In a footnote to the 7th ideal ("You Retain Ultimate Ownership and Control"),
+the authors say:
+
+____
+In our opinion, maintaining control and ownership of data does not mean that the
+software must necessarily be open source. (...) as long as it does not
+artificially restrict what users can do with their files.
+____
+
+They give examples of artificial restrictions. Here is one I've come up with in
+the same spirit:
+
+[source,sh]
+----
+#!/bin/sh
+
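+# Current time and license expiration, both as seconds since the epoch.
+# Note: `date -d` is a GNU extension, so this is not portable to every system.
+TODAY=$(date +%s)
+LICENSE_EXPIRATION=$(date -d 2020-11-15 +%s)
+
+# Refuse to run once the embedded expiration date has passed.
+if [ "$TODAY" -ge "$LICENSE_EXPIRATION" ]; then
+    echo 'License expired!'
+    exit 1
+fi
+
+# The actual "useful" work.
+echo $((2 + 2))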
+----
+
+Now when using this very useful program:
+
+[source,sh]
+----
+# today
+$ ./useful-adder.sh
+4
+# tomorrow
+$ ./useful-adder.sh
+License expired!
+----
+
+This is obviously an intentional restriction, and it goes against the 5th ideal
+("The Long Now"). This software would only be useful as long as the embedded
+license expiration allowed. Sure, you could change the clock on the computer,
+but there are many other ways that this type of intentional restriction is in
+conflict with that ideal.
+
+However, what about unintentional restrictions? What if a piece of software had
+a similar restriction, and stopped working after some days had passed? Or what
+if the programmer added a constant to make development simpler, and this led to
+unintentionally restricting the user?
+
+[source,sh]
+----
+# today
+$ useful-program
+# ...useful output...
+
+# tomorrow, with more data
+$ useful-program
+ERROR: Panic! Stack overflow!
+----
+
+Just as easily as I can come up with ways to intentionally restrict users, I can
+do the same for unintentional restrictions. A program can stop working for a
+variety of reasons.
+
+If it stops working due to, say, data growth, what are the options? Reverting
+to an earlier backup, and making it read-only? That isn't really a "Long Now",
+but rather a "Long Now as long as the software keeps working as expected".
+
+The point is: if the software isn't free, "The Long Now" isn't achievable
+without a lot of wishful thinking. Maybe the authors were trying to be more
+friendly towards businesses that don't like free software, but in doing so they've
+proposed a contradiction by reconciling "The Long Now" with proprietary
+software.
+
+It isn't the same as saying that any free software achieves that ideal, either.
+The license can still be free, but the source code can become unavailable due to
+cloud rot. Or maybe the build is undocumented, or the build tools had specific
+configuration that one has to guess. A piece of free software can still fail to
+achieve "The Long Now". Being free doesn't guarantee it, just makes it
+possible.
+
+A colleague has challenged my view, arguing that the software doesn't really
+need to be free, as long as there is a specification of the file format. This
+way, if the software stops working, the format can still be processed by other
+programs. But this doesn't apply in practice: if you have a document that you
+write to, and the software stops working, you still want to write to the
+document. An external tool that navigates the content and shows it to you won't
+allow you to keep writing, and once it does, that tool is starting to
+re-implement the software.
+
+An open specification could serve as a blueprint to other implementations,
+making the data format more friendly to reverse-engineering. But the
+re-implementation still has to exist, at which point the original software
+failed to achieve "The Long Now".
+
+It is less bad, but still not quite there yet.
+
+== Denial of existing solutions
+
+:distgit: https://drewdevault.com/2018/07/23/Git-is-already-distributed.html
+
+When describing "Existing Data Storage and Sharing Models", in a
+footnote{empty}footnote:devil[
+ This is the second point from the article that I'm picking on from a footnote.
+ I guess the devil really is in the details.
+] the authors say:
+
+____
+In principle it is possible to collaborate without a repository service, e.g. by
+sending patch files by email, but the majority of Git users rely on GitHub.
+____
+
+The authors go to great lengths to talk about the usability of cloud apps, and
+even point to research they've done on it, but they've missed learning more from
+local-first solutions that already exist.
+
+Say the automerge CRDT proves to be even more useful than what everybody
+imagined. Say someone builds a local-first repository service using it. How
+would it change anything about the Git/GitHub model? What is different about it
+that prevents people in the future from writing a paper saying:
+
+____
+In principle it is possible to collaborate without a repository service, e.g. by
+using automerge and platform X, but the majority of Git users rely on GitHub.
+____
+
+How is this any better?
+
+If it is already {distgit}[possible] to have a local-first development workflow,
+why don't people use it? Is it just fashion, or is there a fundamental problem
+with it? If so, what is it, and how can it be avoided?
+
+If sending patches by email is perfectly possible but out of fashion, why even
+talk about Git/GitHub? Isn't this a problem that people are creating for
+themselves? How can CRDTs possibly prevent people from doing that?
+
+My impression is that the authors envision a better future, where development is
+fully decentralized unlike today, and somehow CRDTs will make that happen. If
+more people think this way, "CRDT" is next in line to join the list of buzzwords
+that solve everything, like "containers", "blockchain" or "machine learning".
+
+Rather than picturing an imaginary service that could be described as
+"GitHub+CRDTs" and assuming people would adopt it, I'd rather better understand
+why people don't work this way already, since Git is built to work like that.
+
+== Ditching of web applications
+
+:pouchdb: https://pouchdb.com/
+:instant-apps: https://developer.android.com/topic/google-play-instant
+
+The authors put web applications in a worse position for building local-first
+applications, claiming that:
+
+____
+(...) the architecture of web apps remains fundamentally server-centric.
+Offline support is an afterthought in most web apps, and the result is
+accordingly fragile.
+____
+
+Well, I disagree.
+
+The problem isn't inherent to the web platform, but lies in how people use it.
+
+I have myself built offline-first applications, leveraging IndexedDB, App Cache,
+_etc_. I wanted to build an offline-first application on the web, and so I did.
+
+In fact, many people choose {pouchdb}[PouchDB] _because_ of that, since it is a
+good tool for offline-first web applications. The problem isn't really the
+technology, but how much people want their application to be local-first.
+
+Contrast it with Android {instant-apps}[Instant Apps], where applications are
+sent to the phone in small parts. Since this requires an internet connection to
+move from a part of the app bundle to another, a subset of the app isn't
+local-first, despite being an app.
+
+The point isn't the technology, but how people are using it. Local-first web
+applications are perfectly possible, just like non-local-first native
+applications are possible.
+
+== Costs are underrated
+
+I think the costs of "old-fashioned apps" over "cloud apps" are underrated,
+mainly regarding storage, and that these costs can vary a lot by application.
+
+Say a person writes online articles for their personal website, and puts
+everything into Git. Since there isn't supposed to be any collaboration, all of
+the relevant ideals of local-first are achieved.
+
+Now another person creates videos instead of articles. They could try keeping
+everything local, but after some time the storage usage fills the entire disk.
+This person's local-first setup would be much more complex, and would cost much
+more in maintenance, backup and storage.
+
+Even though both have similar needs, a local-first video repository is much more
+demanding. So the local-first thinking here isn't "just keep everything local",
+but "how much time and money am I willing to spend to keep everything local".
+
+The convenience of "cloud apps" becomes so attractive that many don't even have
+a local copy of their videos, and rely exclusively on service providers to
+maintain, backup and store their content.
+
+The dial measuring "cloud apps" and "old-fashioned apps" needs to be specific to
+use-cases.
+
+== Real-time collaboration is optional
+
+If I were the one making the list of ideals, I wouldn't focus so much on
+real-time collaboration.
+
+Even though seamless collaboration is desired, it being real-time depends on the
+network being available for that. But ideal 3 states that "The Network is
+Optional", so real-time collaboration is also optional.
+
+The fundamentals of a local-first system should enable real-time collaboration
+when the network is available, but shouldn't focus on it.
+
+In many discussions about applications being offline, it is common for me to
+find people saying that their application works "even on a plane, subway or
+elevator". That reflects the situations in which those developers themselves
+have to deal with the network being unavailable.
+
+But this leaves out a big chunk of the world where internet connection is
+intermittent, or only works every other day or only once a week, or stops
+working when it rains, _etc_. For this audience, living without network
+connectivity isn't such a discrete moment in time, but part of everyday life.
+I like the fact that the authors acknowledge that.
+
+When discussing "working offline", I'd rather keep this type of person in mind;
+the subset of people who are offline when on the elevator will then naturally be
+included.
+
+== On CRDTs and developer experience
+
+:archived-article: https://web.archive.org/web/20130116163535/https://labs.oracle.com/techrep/1994/smli_tr-94-29.pdf
+
+When discussing developer experience, the authors bring up some questions to be
+answered further, like:
+
+____
+For an app developer, how does the use of a CRDT-based data layer compare to
+existing storage layers like a SQL database, a filesystem, or CoreData? Is a
+distributed system harder to write software for?
+____
+
+That is an easy one: yes.
+
+A distributed system _is_ harder to write software for, being a distributed
+system.
+
+Adding a large layer of data structures and algorithms will make it more complex
+to write software for, naturally. And trying to make this layer transparent to
+the programmer, so they can pretend that it doesn't exist, is a bad idea: RPC
+frameworks have tried that, and failed.
+
+See "{archived-article}[A Note on Distributed Computing]" for a critique of RPC
+frameworks trying to make the network invisible, which I think applies equally
+to making the CRDT layer invisible.
+
+== Conclusion
+
+I liked the article a lot, as it took the "offline-first" philosophy and ran
+with it.
+
+But I think the authors' view of adding CRDTs and things becoming local-first is
+a bit too magical.
+
+This particular area is one that I have a great interest in, and I wish to see
+more being done in the "local-first" space.
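
A note on the "Software licenses" section of the post above: the unintentional restriction it describes, where a programmer adds a constant to make development simpler, can be sketched in the same style as `useful-adder.sh`. This is only an illustration under assumed names: the `add_note` function, the `MAX_NOTES` constant and the notes file are hypothetical and do not appear in the post or in the reviewed paper.

```shell
#!/bin/sh

# Hypothetical constant: picked to keep development simple, never revisited.
MAX_NOTES=3

# add_note FILE TEXT
# Append TEXT as one line to FILE, unless FILE already holds MAX_NOTES lines.
add_note() {
    file=$1
    text=$2
    count=0
    if [ -f "$file" ]; then
        # Arithmetic expansion strips the padding some `wc` versions emit.
        count=$(( $(wc -l < "$file") ))
    fi
    # Nothing is wrong with the user's data here; the limit is arbitrary.
    if [ "$count" -ge "$MAX_NOTES" ]; then
        echo 'ERROR: too many notes!' >&2
        return 1
    fi
    printf '%s\n' "$text" >> "$file"
}
```

With `MAX_NOTES=3`, the first three calls to `add_note` on the same file succeed and the fourth fails. Nobody intended to restrict the user, yet the program stops being useful as the data grows, and without the source the user cannot raise the limit.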