From da68ac2ae2457196360df43fcc4943db672a193b Mon Sep 17 00:00:00 2001
From: EuAndreh
Date: Fri, 23 Oct 2020 06:40:12 -0300
Subject: assert-content.sh: Enforce filename to be title slug + date

---
 _articles/2019-06-02-stateless-os.md | 144 ------------
 ...6-02-using-nixos-as-an-stateless-workstation.md | 144 ++++++++++++
 ...-08-10-guix-inside-sourcehut-builds-sr-ht-ci.md | 2 +-
 ...ferences-between-backend-frontend-and-mobile.md | 250 +++++++++++++++++++++
 ...ferences-between-backend-frontent-and-mobile.md | 250 ---------------------
 5 files changed, 395 insertions(+), 395 deletions(-)
 delete mode 100644 _articles/2019-06-02-stateless-os.md
 create mode 100644 _articles/2019-06-02-using-nixos-as-an-stateless-workstation.md
 create mode 100644 _articles/2020-10-19-feature-flags-differences-between-backend-frontend-and-mobile.md
 delete mode 100644 _articles/2020-10-19-feature-flags-differences-between-backend-frontent-and-mobile.md

diff --git a/_articles/2019-06-02-stateless-os.md b/_articles/2019-06-02-stateless-os.md
deleted file mode 100644
index 42c543c..0000000
--- a/_articles/2019-06-02-stateless-os.md
+++ /dev/null
@@ -1,144 +0,0 @@
----
-title: Using NixOS as an stateless workstation
-date: 2019-06-02
-layout: post
-lang: en
-ref: using-nixos-as-an-stateless-workstation
----
-Last week[^last-week] I changed back to an old[^old-computer] Samsung laptop, and installed
-[NixOS](https://nixos.org/) on it.
-
-After using NixOS on another laptop for around two years, I wanted
-verify how reproducible was my desktop environment, and how far does
-NixOS actually can go on recreating my whole OS from my configuration
-files and personal data. I gravitated towards NixOS after trying (and
-failing) to create an `install.sh` script that would imperatively
-install and configure my whole OS using apt-get.
When I found a -GNU/Linux distribution that was built on top of the idea of -declaratively specifying the whole OS I was automatically convinced[^convinced-by-declarative-aspect]. - -I was impressed. Even though I've been experiencing the benefits of Nix -isolation daily, I always felt skeptical that something would be -missing, because the devil is always on the details. But the result was -much better than expected! - -There were only 2 missing configurations: - -1. tap-to-click on the touchpad wasn't enabled by default; -2. the default theme from the gnome-terminal is "Black on white" - instead of "White on black". - -That's all. - -I haven't checked if I can configure those in NixOS GNOME module, but I -guess both are scriptable and could be set in a fictional `setup.sh` -run. - -This makes me really happy, actually. More happy than I anticipated. - -Having such a powerful declarative OS makes me feel like my data is the -really important stuff (as it should be), and I can interact with it on -any workstation. All I need is an internet connection and a few hours to -download everything. It feels like my physical workstation and the -installed OS are serving me and my data, instead of me feeling as -hostage to the specific OS configuration at the moment. Having a few -backup copies of everything important extends such peacefulness. - -After this positive experience with recreating my OS from simple Nix -expressions, I started to wonder how far I could go with this, and -started considering other areas of improvements: - -### First run on a fresh NixOS installation - -Right now the initial setup relies on non-declarative manual tasks, like -decrypting some credentials, or manually downloading **this** git -repository with specific configurations before **that** one. - -I wonder what some areas of improvements are on this topic, and if -investing on it is worth it (both time-wise and happiness-wise). 
- -### Emacs - -Right now I'm using the [Spacemacs](http://spacemacs.org/), which is a -community package curation and configuration on top of -[Emacs](https://www.gnu.org/software/emacs/). - -Spacemacs does support the notion of -[layers](http://spacemacs.org/doc/LAYERS.html), which you can -declaratively specify and let Spacemacs do the rest. - -However this solution isn't nearly as robust as Nix: being purely -functional, Nix does describe everything required to build a derivation, -and knows how to do so. Spacemacs it closer to more traditional package -managers: even though the layers list is declarative, the installation -is still very much imperative. I've had trouble with Spacemacs not -behaving the same on different computers, both with identical -configurations, only brought to convergence back again after a -`git clean -fdx` inside `~/.emacs.d/`. - -The ideal solution would be managing Emacs packages with Nix itself. -After a quick search I did found that [there is support for Emacs -packages in -Nix](https://nixos.org/nixos/manual/index.html#module-services-emacs-adding-packages). -So far I was only aware of [Guix support for Emacs packages](https://www.gnu.org/software/guix/manual/en/html_node/Application-Setup.html#Emacs-Packages). - -This isn't a trivial change because Spacemacs does include extra -curation and configuration on top of Emacs packages. I'm not sure the -best way to improve this right now. - -### myrepos - -I'm using [myrepos](https://myrepos.branchable.com/) to manage all my -git repositories, and the general rule I apply is to add any repository -specific configuration in myrepos' `checkout` phase: - -```shell -# sample ~/.mrconfig file snippet -[dev/guix/guix] -checkout = - git clone https://git.savannah.gnu.org/git/guix.git guix - cd guix/ - git config sendemail.to guix-patches@gnu.org -``` - -This way when I clone this repo again the email sending is already -pre-configured. 
- -This works well enough, but the solution is too imperative, and my -`checkout` phases tend to become brittle over time if not enough care is -taken. - -### GNU Stow - -For my home profile and personal configuration I already have a few -dozens of symlinks that I manage manually. This has worked so far, but -the solution is sometimes fragile and [not declarative at -all](https://git.sr.ht/~euandreh/dotfiles/tree/316939aa215181b1d22b69e94241eef757add98d/bash/symlinks.sh#L14-75). -I wonder if something like [GNU -Stow](https://www.gnu.org/software/stow/) can help me simplify this. - -## Conclusion - -I'm really satisfied with NixOS, and I intend to keep using it. If what -I've said interests you, maybe try tinkering with the [Nix package -manager](https://nixos.org/nix/) (not the whole NixOS) on your current -distribution (it can live alongside any other package manager). - -If you have experience with declarative Emacs package managements, GNU -Stow or any similar tool, *etc.*, [I'd like some -tips](mailto:eu@euandre.org). If you don't have any experience at all, -[I'd still love to hear from you](mailto:eu@euandre.org). - -[^last-week]: "Last week" as of the start of this writing, so around the end of - May 2019. - -[^old-computer]: I was using a 32GB RAM, i7 and 250GB SSD Samsung laptop. The - switch was back to a 8GB RAM, i5 and 500GB HDD Dell laptop. The biggest - difference I noticed was on faster memory, both RAM availability and the - disk speed, but I had 250GB less local storage space. - -[^convinced-by-declarative-aspect]: The declarative configuration aspect is - something that I now completely take for granted, and wouldn't consider - using something which isn't declarative. A good metric to show this is me - realising that I can't pinpoint the moment when I decided to switch to - NixOS. It's like I had a distant past when this wasn't true. 
diff --git a/_articles/2019-06-02-using-nixos-as-an-stateless-workstation.md b/_articles/2019-06-02-using-nixos-as-an-stateless-workstation.md new file mode 100644 index 0000000..42c543c --- /dev/null +++ b/_articles/2019-06-02-using-nixos-as-an-stateless-workstation.md @@ -0,0 +1,144 @@ +--- +title: Using NixOS as an stateless workstation +date: 2019-06-02 +layout: post +lang: en +ref: using-nixos-as-an-stateless-workstation +--- +Last week[^last-week] I changed back to an old[^old-computer] Samsung laptop, and installed +[NixOS](https://nixos.org/) on it. + +After using NixOS on another laptop for around two years, I wanted +verify how reproducible was my desktop environment, and how far does +NixOS actually can go on recreating my whole OS from my configuration +files and personal data. I gravitated towards NixOS after trying (and +failing) to create an `install.sh` script that would imperatively +install and configure my whole OS using apt-get. When I found a +GNU/Linux distribution that was built on top of the idea of +declaratively specifying the whole OS I was automatically convinced[^convinced-by-declarative-aspect]. + +I was impressed. Even though I've been experiencing the benefits of Nix +isolation daily, I always felt skeptical that something would be +missing, because the devil is always on the details. But the result was +much better than expected! + +There were only 2 missing configurations: + +1. tap-to-click on the touchpad wasn't enabled by default; +2. the default theme from the gnome-terminal is "Black on white" + instead of "White on black". + +That's all. + +I haven't checked if I can configure those in NixOS GNOME module, but I +guess both are scriptable and could be set in a fictional `setup.sh` +run. + +This makes me really happy, actually. More happy than I anticipated. + +Having such a powerful declarative OS makes me feel like my data is the +really important stuff (as it should be), and I can interact with it on +any workstation. 
All I need is an internet connection and a few hours to +download everything. It feels like my physical workstation and the +installed OS are serving me and my data, instead of me feeling as +hostage to the specific OS configuration at the moment. Having a few +backup copies of everything important extends such peacefulness. + +After this positive experience with recreating my OS from simple Nix +expressions, I started to wonder how far I could go with this, and +started considering other areas of improvements: + +### First run on a fresh NixOS installation + +Right now the initial setup relies on non-declarative manual tasks, like +decrypting some credentials, or manually downloading **this** git +repository with specific configurations before **that** one. + +I wonder what some areas of improvements are on this topic, and if +investing on it is worth it (both time-wise and happiness-wise). + +### Emacs + +Right now I'm using the [Spacemacs](http://spacemacs.org/), which is a +community package curation and configuration on top of +[Emacs](https://www.gnu.org/software/emacs/). + +Spacemacs does support the notion of +[layers](http://spacemacs.org/doc/LAYERS.html), which you can +declaratively specify and let Spacemacs do the rest. + +However this solution isn't nearly as robust as Nix: being purely +functional, Nix does describe everything required to build a derivation, +and knows how to do so. Spacemacs it closer to more traditional package +managers: even though the layers list is declarative, the installation +is still very much imperative. I've had trouble with Spacemacs not +behaving the same on different computers, both with identical +configurations, only brought to convergence back again after a +`git clean -fdx` inside `~/.emacs.d/`. + +The ideal solution would be managing Emacs packages with Nix itself. 
+After a quick search I did found that [there is support for Emacs +packages in +Nix](https://nixos.org/nixos/manual/index.html#module-services-emacs-adding-packages). +So far I was only aware of [Guix support for Emacs packages](https://www.gnu.org/software/guix/manual/en/html_node/Application-Setup.html#Emacs-Packages). + +This isn't a trivial change because Spacemacs does include extra +curation and configuration on top of Emacs packages. I'm not sure the +best way to improve this right now. + +### myrepos + +I'm using [myrepos](https://myrepos.branchable.com/) to manage all my +git repositories, and the general rule I apply is to add any repository +specific configuration in myrepos' `checkout` phase: + +```shell +# sample ~/.mrconfig file snippet +[dev/guix/guix] +checkout = + git clone https://git.savannah.gnu.org/git/guix.git guix + cd guix/ + git config sendemail.to guix-patches@gnu.org +``` + +This way when I clone this repo again the email sending is already +pre-configured. + +This works well enough, but the solution is too imperative, and my +`checkout` phases tend to become brittle over time if not enough care is +taken. + +### GNU Stow + +For my home profile and personal configuration I already have a few +dozens of symlinks that I manage manually. This has worked so far, but +the solution is sometimes fragile and [not declarative at +all](https://git.sr.ht/~euandreh/dotfiles/tree/316939aa215181b1d22b69e94241eef757add98d/bash/symlinks.sh#L14-75). +I wonder if something like [GNU +Stow](https://www.gnu.org/software/stow/) can help me simplify this. + +## Conclusion + +I'm really satisfied with NixOS, and I intend to keep using it. If what +I've said interests you, maybe try tinkering with the [Nix package +manager](https://nixos.org/nix/) (not the whole NixOS) on your current +distribution (it can live alongside any other package manager). 
+ +If you have experience with declarative Emacs package managements, GNU +Stow or any similar tool, *etc.*, [I'd like some +tips](mailto:eu@euandre.org). If you don't have any experience at all, +[I'd still love to hear from you](mailto:eu@euandre.org). + +[^last-week]: "Last week" as of the start of this writing, so around the end of + May 2019. + +[^old-computer]: I was using a 32GB RAM, i7 and 250GB SSD Samsung laptop. The + switch was back to a 8GB RAM, i5 and 500GB HDD Dell laptop. The biggest + difference I noticed was on faster memory, both RAM availability and the + disk speed, but I had 250GB less local storage space. + +[^convinced-by-declarative-aspect]: The declarative configuration aspect is + something that I now completely take for granted, and wouldn't consider + using something which isn't declarative. A good metric to show this is me + realising that I can't pinpoint the moment when I decided to switch to + NixOS. It's like I had a distant past when this wasn't true. diff --git a/_articles/2020-08-10-guix-inside-sourcehut-builds-sr-ht-ci.md b/_articles/2020-08-10-guix-inside-sourcehut-builds-sr-ht-ci.md index c699df4..4d7e8d5 100644 --- a/_articles/2020-08-10-guix-inside-sourcehut-builds-sr-ht-ci.md +++ b/_articles/2020-08-10-guix-inside-sourcehut-builds-sr-ht-ci.md @@ -4,7 +4,7 @@ date: 2020-08-10 updated_at: 2020-08-19 layout: post lang: en -ref: guix-inside-sourcehut-buildssrht-ci +ref: guix-inside-sourcehut-builds-sr-ht-ci --- After the release of the [NixOS images in builds.sr.ht][0] and much usage of it, I also started looking at [Guix][1] and diff --git a/_articles/2020-10-19-feature-flags-differences-between-backend-frontend-and-mobile.md b/_articles/2020-10-19-feature-flags-differences-between-backend-frontend-and-mobile.md new file mode 100644 index 0000000..74d4e15 --- /dev/null +++ b/_articles/2020-10-19-feature-flags-differences-between-backend-frontend-and-mobile.md @@ -0,0 +1,250 @@ +--- +title: "Feature flags: differences between 
backend, frontend and mobile" +date: 2020-10-19 +layout: post +lang: en +ref: feature-flags-differences-between-backend-frontend-and-mobile +category: presentation +--- + +*This article is derived from a [presentation][presentation] on the same +subject.* + +When talking about [feature flags][feature-flags-article], I find that their +costs and benefits are often well exposed and addressed. However the weight of those +costs and benefits apply differently on backend, frontend or mobile, and those +differences aren't covered. + +I'll try to make this distinction clear, with some final best practices I've +acquired when using them in production. + +[presentation]: {% link _slides/2020-10-19-rollout-feature-flag-experiment-operational-toggle.slides %} +[feature-flags-article]: https://martinfowler.com/articles/feature-toggles.html + +## Why feature flags + +Feature flags in general tend to be cited on the context of +[continuous deployment][cd]: + +> A: With continuous deployment, you deploy to production automatically + +> B: But how do I handle deployment failures, partial features, *etc.*? + +> A: With techniques like canary, monitoring and alarms, feature flags, *etc.* + +Even though adopting continuous deployment doesn't force you to use feature +flags, it creates a demand for it. The inverse is also true: using feature flags +on the code points you more obviously to continuous deployment. + +But you should consider feature flags solely by taking into account this +distilled trade-off analysis: + +> Am I willing to pay with code complexity to get dynamicity? + +It is true that you can make the management of feature flags as +straightforward as possible, but having no feature flags is simpler than having +any. What you get in return is the ability to parameterize the behaviour of the +application at runtime, without doing any code changes. 
+ +Sometimes this added complexity may tilt the balance towards not using a feature +flag, and sometimes the flexibility of changing behaviour at runtime is +absolutely worth the added complexity. This can vary a lot by code base, feature, but +fundamentally by environment: its much cheaper to deploy a new version of a +service than to release a new version of an app. + +[cd]: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment + +## Control over the environment + +The key differentiator that makes the trade-offs apply differently is how much +control you have over the environment. + +When running a **backend** service, you usually are paying for the servers +themselves, and can tweak them as you wish. This means you have full control do +to code changes as you wish. Not only that, you decide when to do it, and for +how long the transition will last. + +On the **frontend** you have less control: even though you can choose to make a +new version available any time you wish, you can't force[^force] clients to +immediately switch to the new version. That means that a) clients could skip +upgrades at any time and b) you always have to keep backward and forward +compatibility in mind. + +Even though I'm mentioning frontend directly, it applies to other environment +with similar characteristics: desktop applications, command-line programs, +*etc*. + +On **mobile** you have even less control: app stores need to allow your app to +be updated, which could bite you when least desired. Theoretically you could +make you APK available on third party stores like [F-Droid][f-droid], or even +make the APK itself available for direct download, which would give you the same +characteristics of a frontend application, but that happens less often. + +On iOS you can't even do that. You have to get Apple's blessing on every single +update. 
Even though we already know that is a [bad idea][apple] for over a +decade now, there isn't a way around it. This is where you have the least +control. + +In practice, the amount of control you have will change how much you value +dynamicity: the less control you have, the more valuable it is. In other words, +having a dynamic flag on the backend may or may not be worth it since you could +always update the code immediately after, but on iOS it is basically always +worth it. + +[f-droid]: https://f-droid.org/ +[^force]: Technically you could force a reload with JavaScript using + `window.location.reload()`, but that not only is invasive and impolite, but + also gives you the illusion that you have control over the client when you + actually don't: clients with disabled JavaScript would be immune to such + tactics. + +[apple]: http://www.paulgraham.com/apple.html + +## Rollout + +A rollout is used to *roll out* a new version of software. + +They are usually short-lived, being relevant as long as the new code is being +deployed. The most common rule is percentages. + +On the **backend**, it is common to find it on the deployment infrastructure +itself, like canary servers, blue/green deployments, +[a kubernetes deployment rollout][k8s], *etc*. You could do those manually, by +having a dynamic control on the code itself, but rollbacks are cheap enough that +people usually do a normal deployment and just give some extra attention to the +metrics dashboard. + +On the **frontend**, CDN propagation delays and people not refreshing their web +pages are rollouts by themselves. You could do this by geographical region or +something similar, if desired. + +On **mobile**, the Play Store allows you to perform +fine-grained [staged rollouts][staged-rollouts], and the App Store allows you to +perform limited [phased releases][phased-releases]. 
+ +[k8s]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment +[staged-rollouts]: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en +[phased-releases]: https://help.apple.com/app-store-connect/#/dev3d65fcee1 + +## Feature flag + +A feature flag is a *flag* that tells the application on runtime to turn on or +off a given *feature*. That means that the actual production code will have more +than one possible code paths to go through, and that a new version of a feature +coexists with the old version. The feature flag tells which part of the code to +go through. + +They are usually medium-lived, being relevant as long as the new code is being +developed. The most common rules are percentages, allow/deny lists, A/B groups +and client version. + +On the **backend**, those are useful for things that have a long development +cycle, or that needs to done by steps. Consider loading the feature flag rules +in memory when the application starts, so that you avoid querying a database +or an external service for applying a feature flag rule and avoid flakiness on +the result due to intermittent network failures. + +Since on the **frontend** you don't control when to update the client software, +you're left with applying the feature flag rule on the server, and exposing the +value through an API for maximum dynamicity. This could be in the frontend code +itself, and fallback to a "just refresh the page"/"just update to the latest +version" strategy for less dynamic scenarios. + +On **mobile** you can't even rely on a "just update to the latest version" +strategy, since the code for the app could be updated to a new feature and be +blocked on the store. Those cases aren't recurrent, but you should always assume +the store will deny updates on critical moments so you don't find yourself with +no cards to play. 
That means the only control you actually have is via +the backend, by parameterizing the runtime of the application using the API. In +practice, you should always have a feature flag to control any relevant piece of +code. There is no such thing as "too small code change for a feature flag". What +you should ask yourself is: + +> If the code I'm writing breaks and stays broken for around a month, do I care? + +If you're doing an experimental screen, or something that will have a very small +impact you might answer "no" to the above question. For everything else, the +answer will be "yes": bug fixes, layout changes, refactoring, new screen, +filesystem/database changes, *etc*. + +## Experiment + +An experiment is a feature flag where you care about analytical value of the +flag, and how it might impact user's behaviour. A feature flag with analytics. + +They are also usually medium-lived, being relevant as long as the new code is +being developed. The most common rule is A/B test. + +On the **backend**, an experiment rely on an analytical environment that will +pick the A/B test groups and distributions, which means those can't be held in +memory easily. That also means that you'll need a fallback value in case +fetching the group for a given customer fails. + +On the **frontend** and on **mobile** they are no different from feature flags. + +## Operational toggle + +An operational toggle is like a system-level manual circuit breaker, where you +turn on/off a feature, fail over the load to a different server, *etc*. They are +useful switches to have during an incident. + +They are usually long-lived, being relevant as long as the code is in +production. The most common rule is percentages. + +They can be feature flags that are promoted to operational toggles on the +**backend**, or may be purposefully put in place preventively or after a +postmortem analysis. 
+ +On the **frontend** and on **mobile** they are similar to feature flags, where +the "feature" is being turned on and off, and the client interprets this value +to show if the "feature" is available or unavailable. + +## Best practices + +### Prefer dynamic content + +Even though feature flags give you more dynamicity, they're still somewhat +manual: you have to create one for a specific feature and change it by hand. + +If you find yourself manually updating a feature flags every other day, or +tweaking the percentages frequently, consider making it fully dynamic. Try +using a dataset that is generated automatically, or computing the content on the +fly. + +Say you have a configuration screen with a list of options and sub-options, and +you're trying to find how to better structure this list. Instead of using a +feature flag for switching between 3 and 5 options, make it fully dynamic. This +way you'll be able to perform other tests that you didn't plan, and get more +flexibility out of it. + +### Use the client version to negotiate feature flags + +After effectively finishing a feature, the old code that coexisted with the new +one will be deleted, and all traces of the transition will vanish from the code +base. However if you just remove the feature flags from the API, all of the old +versions of clients that relied on that value to show the new feature will go +downgrade to the old feature. + +This means that you should avoid deleting client-facing feature flags, and +retire them instead: use the client version to decide when the feature is +stable, and return `true` for every client with a version greater or equal to +that. This way you can stop thinking about the feature flag, and you don't break +or downgrade clients that didn't upgrade past the transition. + +### Beware of many nested feature flags + +Nested flags combine exponentially. + +Pick strategic entry points or transitions eligible for feature flags, and +beware of their nesting. 
+ +### Include feature flags in the development workflow + +Add feature flags to the list of things to think about during whiteboarding, and +deleting/retiring a feature flags at the end of the development. + +### Always rely on a feature flag on the app + +Again, there is no such thing "too small for a feature flag". Too many feature +flags is a good problem to have, not the opposite. Automate the process of +creating a feature flag to lower its cost. diff --git a/_articles/2020-10-19-feature-flags-differences-between-backend-frontent-and-mobile.md b/_articles/2020-10-19-feature-flags-differences-between-backend-frontent-and-mobile.md deleted file mode 100644 index 74d4e15..0000000 --- a/_articles/2020-10-19-feature-flags-differences-between-backend-frontent-and-mobile.md +++ /dev/null @@ -1,250 +0,0 @@ ---- -title: "Feature flags: differences between backend, frontend and mobile" -date: 2020-10-19 -layout: post -lang: en -ref: feature-flags-differences-between-backend-frontend-and-mobile -category: presentation ---- - -*This article is derived from a [presentation][presentation] on the same -subject.* - -When talking about [feature flags][feature-flags-article], I find that their -costs and benefits are often well exposed and addressed. However the weight of those -costs and benefits apply differently on backend, frontend or mobile, and those -differences aren't covered. - -I'll try to make this distinction clear, with some final best practices I've -acquired when using them in production. - -[presentation]: {% link _slides/2020-10-19-rollout-feature-flag-experiment-operational-toggle.slides %} -[feature-flags-article]: https://martinfowler.com/articles/feature-toggles.html - -## Why feature flags - -Feature flags in general tend to be cited on the context of -[continuous deployment][cd]: - -> A: With continuous deployment, you deploy to production automatically - -> B: But how do I handle deployment failures, partial features, *etc.*? 
- -> A: With techniques like canary, monitoring and alarms, feature flags, *etc.* - -Even though adopting continuous deployment doesn't force you to use feature -flags, it creates a demand for it. The inverse is also true: using feature flags -on the code points you more obviously to continuous deployment. - -But you should consider feature flags solely by taking into account this -distilled trade-off analysis: - -> Am I willing to pay with code complexity to get dynamicity? - -It is true that you can make the management of feature flags as -straightforward as possible, but having no feature flags is simpler than having -any. What you get in return is the ability to parameterize the behaviour of the -application at runtime, without doing any code changes. - -Sometimes this added complexity may tilt the balance towards not using a feature -flag, and sometimes the flexibility of changing behaviour at runtime is -absolutely worth the added complexity. This can vary a lot by code base, feature, but -fundamentally by environment: its much cheaper to deploy a new version of a -service than to release a new version of an app. - -[cd]: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment - -## Control over the environment - -The key differentiator that makes the trade-offs apply differently is how much -control you have over the environment. - -When running a **backend** service, you usually are paying for the servers -themselves, and can tweak them as you wish. This means you have full control do -to code changes as you wish. Not only that, you decide when to do it, and for -how long the transition will last. - -On the **frontend** you have less control: even though you can choose to make a -new version available any time you wish, you can't force[^force] clients to -immediately switch to the new version. 
That means that a) clients could skip -upgrades at any time and b) you always have to keep backward and forward -compatibility in mind. - -Even though I'm mentioning frontend directly, it applies to other environment -with similar characteristics: desktop applications, command-line programs, -*etc*. - -On **mobile** you have even less control: app stores need to allow your app to -be updated, which could bite you when least desired. Theoretically you could -make you APK available on third party stores like [F-Droid][f-droid], or even -make the APK itself available for direct download, which would give you the same -characteristics of a frontend application, but that happens less often. - -On iOS you can't even do that. You have to get Apple's blessing on every single -update. Even though we already know that is a [bad idea][apple] for over a -decade now, there isn't a way around it. This is where you have the least -control. - -In practice, the amount of control you have will change how much you value -dynamicity: the less control you have, the more valuable it is. In other words, -having a dynamic flag on the backend may or may not be worth it since you could -always update the code immediately after, but on iOS it is basically always -worth it. - -[f-droid]: https://f-droid.org/ -[^force]: Technically you could force a reload with JavaScript using - `window.location.reload()`, but that not only is invasive and impolite, but - also gives you the illusion that you have control over the client when you - actually don't: clients with disabled JavaScript would be immune to such - tactics. - -[apple]: http://www.paulgraham.com/apple.html - -## Rollout - -A rollout is used to *roll out* a new version of software. - -They are usually short-lived, being relevant as long as the new code is being -deployed. The most common rule is percentages. 
- -On the **backend**, it is common to find rollouts in the deployment infrastructure -itself, like canary servers, blue/green deployments, -[a kubernetes deployment rollout][k8s], *etc*. You could do those manually, by -having a dynamic control on the code itself, but rollbacks are cheap enough that -people usually do a normal deployment and just give some extra attention to the -metrics dashboard. - -On the **frontend**, CDN propagation delays and people not refreshing their web -pages are rollouts by themselves. You could do this by geographical region or -something similar, if desired. - -On **mobile**, the Play Store allows you to perform -fine-grained [staged rollouts][staged-rollouts], and the App Store allows you to -perform limited [phased releases][phased-releases]. - -[k8s]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment -[staged-rollouts]: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en -[phased-releases]: https://help.apple.com/app-store-connect/#/dev3d65fcee1 - -## Feature flag - -A feature flag is a *flag* that tells the application at runtime to turn on or -off a given *feature*. That means that the actual production code will have more -than one possible code path to go through, and that a new version of a feature -coexists with the old version. The feature flag tells which part of the code to -go through. - -Feature flags are usually medium-lived, being relevant as long as the new code is being -developed. The most common rules are percentages, allow/deny lists, A/B groups -and client version. - -On the **backend**, those are useful for things that have a long development -cycle, or that need to be done in steps. Consider loading the feature flag rules -in memory when the application starts, so that you avoid querying a database -or an external service for applying a feature flag rule and avoid flakiness in -the result due to intermittent network failures.
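
As a minimal sketch of loading the rules in memory, the code below parses a rule set once and then answers every lookup without touching the network. Everything here is hypothetical: the `FlagStore` and `FlagRule` names, the JSON shape with `enabled`, `allow` and `deny` keys, and the precedence of deny over allow are assumptions for illustration only.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FlagRule:
    """One in-memory rule: a default value plus allow/deny lists."""
    enabled: bool = False
    allow: frozenset = frozenset()
    deny: frozenset = frozenset()


class FlagStore:
    """Holds every rule in memory; lookups never perform I/O."""

    def __init__(self, rules):
        self._rules = dict(rules)

    @classmethod
    def from_json(cls, text: str) -> "FlagStore":
        # Called once at application start, e.g. with the contents of a
        # configuration file shipped alongside the service.
        raw = json.loads(text)
        return cls({
            name: FlagRule(
                enabled=bool(rule.get("enabled", False)),
                allow=frozenset(rule.get("allow", ())),
                deny=frozenset(rule.get("deny", ())),
            )
            for name, rule in raw.items()
        })

    def is_enabled(self, flag: str, customer_id: str) -> bool:
        rule = self._rules.get(flag)
        if rule is None:
            return False  # unknown flags default to the old code path
        if customer_id in rule.deny:
            return False  # deny list wins over everything else
        if customer_id in rule.allow:
            return True
        return rule.enabled
```

A possible usage: `FlagStore.from_json('{"new-checkout": {"allow": ["alice"]}}').is_enabled("new-checkout", "alice")` evaluates the rule purely from memory. The trade-off is that changing a rule requires reloading the store, for instance on a timer or on restart.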
- -Since on the **frontend** you don't control when to update the client software, -you're left with applying the feature flag rule on the server, and exposing the -value through an API for maximum dynamicity. For less dynamic scenarios, the rule could live in the frontend code -itself, falling back to a "just refresh the page"/"just update to the latest -version" strategy. - -On **mobile** you can't even rely on a "just update to the latest version" -strategy, since the code for the app could be updated to a new feature and be -blocked on the store. Those cases aren't recurrent, but you should always assume -the store will deny updates at critical moments so you don't find yourself with -no cards to play. That means the only control you actually have is via -the backend, by parameterizing the runtime of the application using the API. In -practice, you should always have a feature flag to control any relevant piece of -code. There is no such thing as "too small code change for a feature flag". What -you should ask yourself is: - -> If the code I'm writing breaks and stays broken for around a month, do I care? - -If you're doing an experimental screen, or something that will have a very small -impact you might answer "no" to the above question. For everything else, the -answer will be "yes": bug fixes, layout changes, refactoring, new screen, -filesystem/database changes, *etc*. - -## Experiment - -An experiment is a feature flag where you care about the analytical value of the -flag, and how it might impact users' behaviour. A feature flag with analytics. - -Experiments are also usually medium-lived, being relevant as long as the new code is -being developed. The most common rule is an A/B test. - -On the **backend**, an experiment relies on an analytical environment that will -pick the A/B test groups and distributions, which means those can't be held in -memory easily. That also means that you'll need a fallback value in case -fetching the group for a given customer fails.
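
The fallback described above can be sketched as a thin wrapper around whatever client talks to the analytical environment. The `experiment_group` name, the `fetch_group` callable and the `"control"` default are hypothetical; a real service would likely catch only the errors its own client raises instead of every `Exception`.

```python
from typing import Callable


def experiment_group(
    customer_id: str,
    fetch_group: Callable[[str], str],
    fallback: str = "control",
) -> str:
    """Return the customer's A/B group, or `fallback` if the lookup fails.

    `fetch_group` stands in for the call to the analytical environment;
    it is an assumption of this sketch, not a real library function.
    """
    try:
        return fetch_group(customer_id)
    except Exception:
        # Any failure (timeout, connection error, bad response) sends the
        # customer down the known-good code path instead of failing the request.
        return fallback
```

Defaulting to the existing behaviour keeps the request working during an outage, at the cost of those customers being left out of the experiment's data while the outage lasts.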
- -On the **frontend** and on **mobile** they are no different from feature flags. - -## Operational toggle - -An operational toggle is like a system-level manual circuit breaker, where you -turn on/off a feature, fail over the load to a different server, *etc*. They are -useful switches to have during an incident. - -They are usually long-lived, being relevant as long as the code is in -production. The most common rule is percentages. - -They can be feature flags that are promoted to operational toggles on the -**backend**, or may be purposefully put in place preventively or after a -postmortem analysis. - -On the **frontend** and on **mobile** they are similar to feature flags, where -the "feature" is being turned on and off, and the client interprets this value -to show if the "feature" is available or unavailable. - -## Best practices - -### Prefer dynamic content - -Even though feature flags give you more dynamicity, they're still somewhat -manual: you have to create one for a specific feature and change it by hand. - -If you find yourself manually updating a feature flag every other day, or -tweaking the percentages frequently, consider making it fully dynamic. Try -using a dataset that is generated automatically, or computing the content on the -fly. - -Say you have a configuration screen with a list of options and sub-options, and -you're trying to find how to better structure this list. Instead of using a -feature flag for switching between 3 and 5 options, make it fully dynamic. This -way you'll be able to perform other tests that you didn't plan, and get more -flexibility out of it. - -### Use the client version to negotiate feature flags - -After effectively finishing a feature, the old code that coexisted with the new -one will be deleted, and all traces of the transition will vanish from the code -base.
However, if you just remove the feature flags from the API, all of the old -versions of clients that relied on that value to show the new feature will -downgrade to the old feature. - -This means that you should avoid deleting client-facing feature flags, and -retire them instead: use the client version to decide when the feature is -stable, and return `true` for every client with a version greater than or equal to -that. This way you can stop thinking about the feature flag, and you don't break -or downgrade clients that didn't upgrade past the transition. - -### Beware of many nested feature flags - -Nested flags combine exponentially. - -Pick strategic entry points or transitions eligible for feature flags, and -beware of their nesting. - -### Include feature flags in the development workflow - -Add feature flags to the list of things to think about during whiteboarding, and -add deleting/retiring feature flags to the end of the development. - -### Always rely on a feature flag on the app - -Again, there is no such thing as "too small for a feature flag". Too many feature -flags is a good problem to have, not the opposite. Automate the process of -creating a feature flag to lower its cost. -- cgit v1.2.3