Diffstat (limited to '_articles')
-rw-r--r-- _articles/2020-10-12-feature-flags-differences-between-backend-frontent-and-mobile.md | 168
1 files changed, 160 insertions, 8 deletions
diff --git a/_articles/2020-10-12-feature-flags-differences-between-backend-frontent-and-mobile.md b/_articles/2020-10-12-feature-flags-differences-between-backend-frontent-and-mobile.md
index d4bc066..1376315 100644
--- a/_articles/2020-10-12-feature-flags-differences-between-backend-frontent-and-mobile.md
+++ b/_articles/2020-10-12-feature-flags-differences-between-backend-frontent-and-mobile.md
@@ -55,11 +55,150 @@ service than to release a new version of an app.
[cd]: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment
+## Control over the environment
+
+The key differentiator that makes the trade-offs apply differently is how much
+control you have over the environment.
+
+When running a **backend** service, you are usually paying for the servers
+themselves, and can tweak them as you wish. This means you have full control to
+make code changes as you wish. Not only that, you decide when to do it, and
+how long the transition will last.
+
+On the **frontend** you have less control: even though you can choose to make a
+new version available any time you wish, you can't force[^force] clients to
+immediately switch to the new version. That means that a) clients could skip
+upgrades at any time and b) you always have to keep backward and forward
+compatibility in mind.
+
+Even though I'm mentioning frontend directly, it applies to other environments
+with similar characteristics: desktop applications, command-line programs,
+*etc*.
+
+On **mobile** you have even less control: app stores need to allow your app to
+be updated, which could bite you when least desired. Theoretically you could
+make your APK available on third-party stores like [F-Droid][f-droid], or even
+make the APK itself available for direct download, which would give you the
+same characteristics of a frontend application, but that happens less often.
+
+On iOS you can't even do that. You have to get Apple's blessing on every single
+update. Even though we have known for over a decade that this is a
+[bad idea][apple], there isn't a way around it. This is where you have the
+least control.
+
+In practice, the amount of control you have will change how much you value
+dynamicity: the less control you have, the more valuable it is. In other words,
+having a dynamic flag on the backend may or may not be worth it since you could
+always update the code immediately after, but on iOS it is basically always
+worth it.
+
+[^force]: Technically you could force a reload with JavaScript using
+ `window.location.reload()`, but that is not only invasive and impolite, it
+ also gives you the illusion that you have control over the client when you
+ actually don't: clients with disabled JavaScript would be immune to such
+ tactics.
+[apple]: http://www.paulgraham.com/apple.html
+
## Rollout
+
+A rollout is used to *roll out* a new version of software.
+
+Rollouts are usually short-lived, being relevant as long as the new code is
+being deployed. The most common rule is percentages.
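
A percentage rule can be sketched as a stable hash of the client identifier, so that the same client always lands on the same side of the rollout. This is only an illustration: the function names, the flag name and the choice of SHA-256 are assumptions, not something the article prescribes.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99.

    Hashing the flag name together with the user id means that different
    rollouts don't always pick the same users first.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_rollout(flag_name: str, user_id: str, percentage: int) -> bool:
    """Return True when the user falls inside the given rollout percentage."""
    return rollout_bucket(flag_name, user_id) < percentage


if __name__ == "__main__":
    # Hypothetical flag and user names, for illustration only.
    for user in ("alice", "bob", "carol"):
        print(user, in_rollout("new-checkout", user, 25))
```

Because the bucket is stable, raising the percentage only adds clients to the rollout and never removes the ones already in it.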
+
+On the **backend**, it is common to find rollouts in the deployment
+infrastructure itself, like canary servers, blue/green deployments,
+[a Kubernetes deployment rollout][k8s], *etc*. You could do those manually, by
+having a dynamic control in the code itself, but rollbacks are cheap enough that
+people usually do a normal deployment and just give some extra attention to the
+metrics dashboard.
+
+On the **frontend**, CDN propagation delays and people not refreshing their web
+pages are rollouts by themselves. You could also roll out by geographical
+region or something similar, if desired.
+
+On **mobile**, the Play Store allows you to perform
+fine-grained [staged rollouts][staged-rollouts], and the App Store allows you to
+perform limited [phased releases][phased-releases].
+
+[k8s]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment
+[staged-rollouts]: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en
+[phased-releases]: https://help.apple.com/app-store-connect/#/dev3d65fcee1
+
## Feature flag
+
+A feature flag is a *flag* that tells the application at runtime to turn on or
+off a given *feature*. That means that the actual production code will have more
+than one possible code path to go through, and that a new version of a feature
+coexists with the old version. The feature flag tells which part of the code to
+go through.
+
+They are usually medium-lived, being relevant as long as the new code is being
+developed. The most common rules are percentages, allow/deny lists, A/B groups
+and client version.
+
+On the **backend**, those are useful for things that have a long development
+cycle, or that need to be done in steps. Consider loading the feature flag rules
+in memory when the application starts, so that applying a rule doesn't require
+querying the database or an external service, and intermittent network failures
+can't affect it.
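
A minimal sketch of that idea, assuming the rules arrive as a JSON document read once at startup. The `FlagStore` class, the rule fields and the flag names are all hypothetical; the point is only that evaluating a flag never touches the network.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlagRule:
    """One rule: a default value plus allow/deny lists of user ids."""
    enabled: bool = False
    allow: frozenset = field(default_factory=frozenset)
    deny: frozenset = field(default_factory=frozenset)


class FlagStore:
    """Holds every rule in memory; built once when the application starts."""

    def __init__(self, raw_rules: str) -> None:
        parsed = json.loads(raw_rules)
        self._rules = {
            name: FlagRule(
                enabled=bool(rule.get("enabled", False)),
                allow=frozenset(rule.get("allow", [])),
                deny=frozenset(rule.get("deny", [])),
            )
            for name, rule in parsed.items()
        }

    def is_enabled(self, name: str, user_id: str) -> bool:
        """Evaluate a flag without any I/O; unknown flags are off."""
        rule = self._rules.get(name)
        if rule is None:
            return False
        if user_id in rule.deny:
            return False
        if user_id in rule.allow:
            return True
        return rule.enabled


if __name__ == "__main__":
    # In a real service this string would come from a file or a database
    # query performed once, during startup.
    raw = '{"new-checkout": {"enabled": false, "allow": ["alice"]}}'
    store = FlagStore(raw)
    print(store.is_enabled("new-checkout", "alice"))
```

The trade-off is that rule changes only take effect on restart or on an explicit reload, which is usually acceptable on the backend since deployments are cheap.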
+
+Since on the **frontend** you don't control when to update the client software,
+you're left with applying the feature flag rule on the server, and exposing the
+value through an API for maximum dynamicity. For less dynamic scenarios, the
+rule could live in the frontend code itself, falling back to a "just refresh
+the page"/"just update to the latest version" strategy.
+
+On **mobile** you can't even rely on a "just update to the latest version"
+strategy, since the app's code could be updated with a new feature and still
+not get through the store. Those cases aren't frequent, but you should always
+assume the store will deny updates at critical moments so you don't find
+yourself with no cards to play. That means the only control you actually have is
+remote, via the backend: you parameterize the runtime of the application using
+the API. In practice, you should always have a feature flag to control any
+relevant piece of code. There is no such thing as "too small a code change for
+a feature flag". What you should ask yourself is:
+
+> If the code I'm writing breaks and stays broken for around a month, do I care?
+
+If you're doing an experimental screen, or something that will have a very small
+impact, you might answer "no" to the above question. For everything else, the
+answer will be "yes": bug fixes, layout changes, refactoring, new screens,
+filesystem/database changes, *etc*.
+
## Experiment
+
+An experiment is a feature flag where you care about the analytical value of the
+flag, and how it might impact users' behaviour: a feature flag with
+analytics.
+
+They are also usually medium-lived, being relevant as long as the new code is
+being developed. The most common rule is the A/B test.
+
+On the **backend**, experiments rely on an analytical environment that will
+pick the A/B test groups and distributions, which means those can't be held in
+memory easily. That also means that you'll need a fallback value in case
+fetching the group for a given customer fails.
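
The fallback can be sketched as a thin wrapper around whatever call fetches the group. Here `fetch_group` stands in for a hypothetical client of the analytical environment, and `"control"` is an assumed name for the default group.

```python
from typing import Callable


def experiment_group(
    fetch_group: Callable[[str], str],
    customer_id: str,
    fallback: str = "control",
) -> str:
    """Return the customer's A/B group, or the fallback if the fetch fails.

    `fetch_group` is any callable that asks the analytical environment for
    the group of a customer and raises an exception on failure.
    """
    try:
        return fetch_group(customer_id)
    except Exception:
        # The analytical environment is unreachable or misbehaving: serve
        # the fallback so the request itself does not fail.
        return fallback


if __name__ == "__main__":
    def unreachable_service(customer_id: str) -> str:
        raise ConnectionError("analytics service is down")

    print(experiment_group(unreachable_service, "customer-42"))
```

Customers served through the fallback should probably be left out of the experiment's analysis, since they were not assigned to a group by the analytical environment.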
+
+On the **frontend** and on **mobile** they are no different from feature flags.
+
## Operational toggle
+An operational toggle is like a system-level manual circuit breaker, where you
+turn on/off a feature, or fail over the load to a different server. They are
+useful switches to have during an incident.
+
+They are usually long-lived, being relevant as long as the code is in
+production. The most common rule is percentages.
+
+They can be feature flags that are promoted to operational toggles on the
+**backend**, or may be purposefully put in place preventively or after a
+postmortem analysis.
+
+On the **frontend** and on **mobile** they are similar to feature flags, where
+the "feature" is being turned on and off, and the client interprets this value
+to show if the "feature" is available or unavailable.
+
## Best practices
### Prefer dynamic content
@@ -78,20 +217,33 @@ feature flag for switching between 3 and 5 options, make it fully dynamic. This
way you'll be able to perform other tests that you didn't plan, and get more
flexibility out of it.
-### Use :include-list for named groups
+### Use the client version to negotiate feature flags
-### Always use :app-version
+After effectively finishing a feature, the old code that coexisted with the new
+one will be deleted, and all traces of the transition will vanish from the code
+base. However, if you just remove the feature flags from the API, all of the old
+versions of clients that relied on that value to show the new feature will
+downgrade to the old feature.
-Don't delete app-facing feature flags
+This means that you should avoid deleting client-facing feature flags, and
+retire them instead: use the client version to decide when the feature is
+stable, and return `true` for every client with a version greater than or equal
+to that. This way you can stop thinking about the feature flag, and you don't
+break or downgrade clients that didn't upgrade past the transition.
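
A retired flag could look like the sketch below on the server side. It assumes clients report a dotted numeric version with three parts such as `2.4.0`; the `STABLE_SINCE` table and the flag name are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical table: the first client version where each retired feature
# is considered stable.
STABLE_SINCE = {
    "new-checkout": "2.4.0",
}


def retired_flag_value(flag_name: str, client_version: str) -> bool:
    """Return True for every client at or past the stable version."""
    stable = STABLE_SINCE.get(flag_name)
    if stable is None:
        # Not a retired flag: this sketch treats it as off.
        return False
    return parse_version(client_version) >= parse_version(stable)


if __name__ == "__main__":
    for version in ("2.3.9", "2.4.0", "2.10.0"):
        print(version, retired_flag_value("new-checkout", version))
```

Comparing parsed tuples rather than raw strings matters: as plain strings, `"2.10.0"` would sort before `"2.4.0"`.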
### Beware of many nested feature flags
-### Include a feature flag on the whiteboarding phase
+Nested flags combine exponentially.
-### Include deleting/retiring the feature flag at the end
+Pick strategic entry points or transitions eligible for feature flags, and
+beware of their nesting.
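
To get a feeling for the growth, here is a small sketch that enumerates every state a set of flags can be in, assuming each flag is an independent boolean. The flag names are made up for the example.

```python
from itertools import product


def flag_combinations(flag_names):
    """List every on/off assignment for the given boolean flags."""
    return [
        dict(zip(flag_names, values))
        for values in product([False, True], repeat=len(flag_names))
    ]


if __name__ == "__main__":
    # Hypothetical flag names.
    flags = ["new-checkout", "dark-mode", "new-search"]
    combinations = flag_combinations(flags)
    # Each added flag doubles the number of states to reason about.
    print(len(combinations))
```

Each of those states is a configuration that could reach production, which is why the number of flags that interact with each other is worth keeping small.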
-### Always rely on a feature flag on the app
+### Include feature flags in the development workflow
-There is no such thing as
+Add feature flags to the list of things to think about during whiteboarding, and
+add deleting/retiring feature flags to the end of the development.
-[apple]: http://www.paulgraham.com/apple.html
+### Always rely on a feature flag on the app
+
+Again, there is no such thing as "too small for a feature flag". Too many
+feature flags is a good problem to have, not the opposite.