Diffstat (limited to 'src/content/en/blog/2020/10/19/feature-flags.adoc')
-rw-r--r-- src/content/en/blog/2020/10/19/feature-flags.adoc | 306
1 files changed, 306 insertions, 0 deletions
diff --git a/src/content/en/blog/2020/10/19/feature-flags.adoc b/src/content/en/blog/2020/10/19/feature-flags.adoc
new file mode 100644
index 0000000..972f693
--- /dev/null
+++ b/src/content/en/blog/2020/10/19/feature-flags.adoc
@@ -0,0 +1,306 @@
+= Feature flags: differences between backend, frontend and mobile
+:categories: presentation
+:updatedat: 2020-11-03
+
+:empty:
+:slides: link:../../../../slides/2020/10/19/feature-flags.html FIXME
+:fowler-article: https://martinfowler.com/articles/feature-toggles.html
+
+_This article is derived from a {slides}[presentation] on the same subject._
+
+When discussing feature flags, I find that their costs and benefits are often
+well exposed and addressed. Online articles like
+"{fowler-article}[Feature Toggles (aka Feature Flags)]" do a great job of
+explaining them in detail, giving great general guidance on how to apply
+techniques to adopt them.
+
+However, the weight of those costs and benefits applies differently on backend,
+frontend or mobile, and those differences aren't covered. In fact, many of them
+stop making sense, or the decision of adopting a feature flag or not may change
+depending on the environment.
+
+In this article I try to make the distinction between environments and how
+feature flags apply to them, with some final best practices I've acquired when
+using them in production.
+
+== Why feature flags
+
+:atlassian-cicd: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment
+
+Feature flags in general tend to be cited in the context of
+{atlassian-cicd}[continuous deployment]:
+
+____
+A: With continuous deployment, you deploy to production automatically
+
+B: But how do I handle deployment failures, partial features, _etc._?
+
+A: With techniques like canary, monitoring and alarms, feature flags, _etc._
+____
+
+Though adopting continuous deployment doesn't force you to use feature flags, it
+creates a demand for them. The inverse is also true: using feature flags in the
+code points you more obviously to continuous deployment. Take the following
+code sample, for example, which we will reference later in the article:
+
+[source,javascript]
+----
+function processTransaction() {
+  validate();
+  persist();
+  // TODO: add call to notifyListeners()
+}
+----
+
+While being developed, being tested for suitability or something similar,
+`notifyListeners()` may not be included in the code at once. So instead of
+keeping it on a separate, long-lived branch, a feature flag can decide when the
+new, partially implemented function will be called:
+
+[source,javascript]
+----
+function processTransaction() {
+  validate();
+  persist();
+  if (featureIsEnabled("activate-notify-listeners")) {
+    notifyListeners();
+  }
+}
+----
+
+This allows your code to include `notifyListeners()`, and decide when to call it
+at runtime. For the price of some extra code around the call, you get more
+dynamicity.
+
+So the fundamental question to ask yourself when considering adding a feature
+flag should be:
+
+____
+Am I willing to pay with code complexity to get dynamicity?
+____
+
+It is true that you can make the management of feature flags as straightforward
+as possible, but having no feature flags is simpler than having any. What you
+get in return is the ability to parameterize the behaviour of the application at
+runtime, without doing any code changes.
+
+Sometimes this added complexity may tilt the balance towards not using a feature
+flag, and sometimes the flexibility of changing behaviour at runtime is
+absolutely worth the added complexity. This can vary a lot by code base and by
+feature, but fundamentally by environment: it's much cheaper to deploy a new
+version of a service than to release a new version of an app.
+
+So the question of which environment is being targeted is key when reasoning
+about costs and benefits of feature flags.
+
+== Control over the environment
+
+:fdroid: https://f-droid.org/
+:bad-apple: https://www.paulgraham.com/apple.html
+
+The key differentiator that makes the trade-offs apply differently is how much
+control you have over the environment.
+
+When running a *backend* service, you usually are paying for the servers
+themselves, and can tweak them as you wish. This means you have full control to
+do code changes as you wish. Not only that, you decide when to do it, and for
+how long the transition will last.
+
+On the *frontend* you have less control: even though you can choose to make a
+new version available any time you wish, you can't
+force{empty}footnote:force[
+ Technically you could force a reload with JavaScript using
+ `window.location.reload()`, but that not only is invasive and impolite, but
+ also gives you the illusion that you have control over the client when you
+ actually don't: clients with disabled JavaScript would be immune to such
+ tactics.
+] clients to immediately switch to the new version. That means that a) clients
+could skip upgrades at any time and b) you always have to keep backward and
+forward compatibility in mind.
+
+Even though I'm mentioning frontend directly, it applies to other environments
+with similar characteristics: desktop applications, command-line programs,
+_etc_.
+
+On *mobile* you have even less control: app stores need to allow your app to be
+updated, which could bite you when least desired. Theoretically you could make
+your APK available on third party stores like {fdroid}[F-Droid], or even make the
+APK itself available for direct download, which would give you the same
+characteristics of a frontend application, but that happens less often.
+
+On iOS you can't even do that. You have to get Apple's blessing on every single
+update. Even though we have known for over a decade now that this is a
+{bad-apple}[bad idea], there isn't a way around it. This is where you have the
+least control.
+
+In practice, the amount of control you have will change how much you value
+dynamicity: the less control you have, the more valuable it is. In other words,
+having a dynamic flag on the backend may or may not be worth it since you could
+always update the code immediately after, but on iOS it is basically always
+worth it.
+
+== Rollout
+
+:kubernetes-deployment: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment
+:play-store-rollout: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en
+:app-store-rollout: https://help.apple.com/app-store-connect/#/dev3d65fcee1
+
+A rollout is used to _roll out_ a new version of software gradually, exposing it
+to a growing share of servers, traffic or users.
+
+Rollouts are usually short-lived, being relevant as long as the new code is being
+deployed. The most common rule is percentages.
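+
+As a minimal sketch of a percentage rule, assuming each user has a stable
+identifier: hash the identifier into a bucket from 0 to 99 and compare it with
+the configured percentage. The names `bucketFor` and `isInRollout` are
+hypothetical, and the hash is only illustrative; any stable hash would do.
+
```javascript
// Hypothetical helper: maps a user id to a stable bucket in [0, 100).
function bucketFor(userId) {
  let hash = 0;
  for (const ch of String(userId)) {
    // Simple 32-bit rolling hash, kept unsigned with `>>> 0`.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash % 100;
}

// A user is in the rollout when their bucket falls below the percentage.
function isInRollout(userId, percentage) {
  return bucketFor(userId) < percentage;
}
```
+
+Because the bucket depends only on the identifier, a user who is included at 30%
+stays included when the percentage grows to 60%.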
+
+On the *backend*, it is common to find it on the deployment infrastructure
+itself, like canary servers, blue/green deployments, {kubernetes-deployment}[a
+Kubernetes deployment rollout], _etc_. You could do those manually, by having a
+dynamic control on the code itself, but rollbacks are cheap enough that people
+usually do a normal deployment and just give some extra attention to the metrics
+dashboard.
+
+Any time you see a blue/green deployment, there is a rollout happening: most
+likely a load balancer is starting to direct traffic to the new server, until
+reaching 100% of the traffic. Effectively, that is a rollout.
+
+On the *frontend*, you can selectively pick which users will be able to
+download the new version of a page. You could use geographical region, IP,
+cookie or something similar to make this decision.
+
+CDN propagation delays and people not refreshing their web pages are also
+rollouts by themselves, since old and new versions of the software will coexist.
+
+On *mobile*, the Play Store allows you to perform fine-grained
+{play-store-rollout}[staged rollouts], and the App Store allows you to perform
+limited {app-store-rollout}[phased releases].
+
+On both Android and iOS, it is up to the user to download the new version.
+
+In summary: since you control the servers on the backend, you can do rollouts at
+will, and those are often found automated away in base infrastructure. On the
+frontend and on mobile, there are ways to make new versions available, but users
+may not download them immediately, and many different versions of the software
+end up coexisting.
+
+== Feature flag
+
+A feature flag is a _flag_ that tells the application at runtime to turn on or
+off a given _feature_. That means that the actual production code will have
+more than one possible code path to go through, and that a new version of a
+feature coexists with the old version. The feature flag tells which part of the
+code to go through.
+
+They are usually medium-lived, being relevant as long as the new code is being
+developed. The most common rules are percentages, allow/deny lists, A/B groups
+and client version.
+
+On the *backend*, those are useful for things that have a long development
+cycle, or that need to be done in steps. Consider loading the feature flag rules
+in memory when the application starts, so that you avoid querying a database or
+an external service for applying a feature flag rule and avoid flakiness on the
+result due to intermittent network failures.
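+
+A minimal sketch of loading the rules once at startup, assuming the rules are
+plain on/off values. `createFlagStore` and `loadRules` are hypothetical names:
+`loadRules` stands in for whatever reads the rules (a file, a database, a
+configuration service).
+
```javascript
// Hypothetical in-memory store: `loadRules` is called exactly once, at startup.
function createFlagStore(loadRules) {
  const rules = new Map(Object.entries(loadRules()));
  return {
    // Unknown flags default to off, so a missing rule never enables new code.
    featureIsEnabled(name) {
      return rules.get(name) === true;
    },
  };
}

// Example startup wiring with a hard-coded loader.
const flags = createFlagStore(() => ({ "activate-notify-listeners": true }));
```
+
+After startup, `flags.featureIsEnabled(...)` is a plain map lookup, so checking a
+flag never touches the network.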
+
+Since on the *frontend* you don't control when to update the client software,
+you're left with applying the feature flag rule on the server, and exposing the
+value through an API for maximum dynamicity. The rule could also live in the
+frontend code itself, falling back to a "just refresh the page"/"just update to
+the latest version" strategy for less dynamic scenarios.
+
+On *mobile* you can't even rely on a "just update to the latest version"
+strategy, since the code for the app could be updated to a new feature and be
+blocked on the store. Those cases aren't frequent, but you should always
+assume the store will deny updates at critical moments so you don't find
+yourself with no cards to play. That means the only control you actually have
+is via the backend, by parameterizing the runtime of the application using the
+API. In practice, you should always have a feature flag to control any relevant
+piece of code. There is no such thing as a code change "too small for a feature
+flag". What you should ask yourself is:
+
+____
+If the code I'm writing breaks and stays broken for around a month, do I care?
+____
+
+If you're doing an experimental screen, or something that will have a very small
+impact you might answer "no" to the above question. For everything else, the
+answer will be "yes": bug fixes, layout changes, refactoring, new screen,
+filesystem/database changes, _etc_.
+
+== Experiment
+
+An experiment is a feature flag where you care about the analytical value of the
+flag, and how it might impact users' behaviour. A feature flag with analytics.
+
+They are also usually medium-lived, being relevant as long as the new code is
+being developed. The most common rule is A/B test.
+
+On the *backend*, an experiment relies on an analytical environment that will pick
+the A/B test groups and distributions, which means those can't be held in memory
+easily. That also means that you'll need a fallback value in case fetching the
+group for a given customer fails.
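+
+The fallback can be sketched as follows. `experimentGroup` and `fetchGroup` are
+hypothetical names, `fetchGroup` stands in for the call to the analytical
+environment, and the `"control"` default is an assumption.
+
```javascript
// Hypothetical: `fetchGroup` asks the analytical environment for the
// customer's A/B group and may reject on network failures.
async function experimentGroup(fetchGroup, customerId, fallback = "control") {
  try {
    return await fetchGroup(customerId);
  } catch (err) {
    // Fetching failed: serve the fallback group instead of failing the request.
    return fallback;
  }
}
```
+
+Choosing the old behaviour as the fallback keeps a failure of the analytical
+environment from exposing customers to the experimental code path.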
+
+On the *frontend* and on *mobile* they are no different from feature flags.
+
+== Operational toggle
+
+An operational toggle is like a system-level manual circuit breaker, where you
+turn on/off a feature, fail over the load to a different server, _etc_. They
+are useful switches to have during an incident.
+
+They are usually long-lived, being relevant as long as the code is in
+production. The most common rule is percentages.
+
+They can be feature flags that are promoted to operational toggles on the
+*backend*, or may be purposefully put in place preventively or after a
+postmortem analysis.
+
+On the *frontend* and on *mobile* they are similar to feature flags, where the
+"feature" is being turned on and off, and the client interprets this value to
+show if the "feature" is available or unavailable.
+
+== Best practices
+
+=== Prefer dynamic content
+
+Even though feature flags give you more dynamicity, they're still somewhat
+manual: you have to create one for a specific feature and change it by hand.
+
+If you find yourself manually updating a feature flag every other day, or
+tweaking the percentages frequently, consider making it fully dynamic. Try
+using a dataset that is generated automatically, or computing the content on the
+fly.
+
+Say you have a configuration screen with a list of options and sub-options, and
+you're trying to find how to better structure this list. Instead of using a
+feature flag for switching between 3 and 5 options, make it fully dynamic. This
+way you'll be able to perform other tests that you didn't plan, and get more
+flexibility out of it.
+
+=== Use the client version to negotiate feature flags
+
+After effectively finishing a feature, the old code that coexisted with the new
+one will be deleted, and all traces of the transition will vanish from the code
+base. However, if you just remove the feature flags from the API, all of the old
+versions of clients that relied on that value to show the new feature will
+downgrade to the old feature.
+
+This means that you should avoid deleting client-facing feature flags, and
+retire them instead: use the client version to decide when the feature is
+stable, and return `true` for every client with a version greater than or equal
+to that. This way you can stop thinking about the feature flag, and you don't
+break or downgrade clients that didn't upgrade past the transition.
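+
+A minimal sketch of retiring a flag by client version, assuming clients report a
+dotted numeric version such as `2.4.0`. The names `compareVersions`,
+`retiredFlags` and `flagValueFor`, and the version in the table, are
+hypothetical.
+
```javascript
// Compares dotted numeric versions such as "2.10.0"; returns <0, 0 or >0.
function compareVersions(a, b) {
  const pa = a.split(".").map(Number);
  const pb = b.split(".").map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical table of retired flags: name -> first client version in which
// the feature is considered stable.
const retiredFlags = { "activate-notify-listeners": "2.4.0" };

// Retired flags are answered from the client version alone; any other flag is
// delegated to `liveRule`, which stands in for the normal flag evaluation.
function flagValueFor(name, clientVersion, liveRule) {
  const stableSince = retiredFlags[name];
  if (stableSince !== undefined) {
    return compareVersions(clientVersion, stableSince) >= 0;
  }
  return liveRule(name, clientVersion);
}
```
+
+Comparing the components as numbers matters here: a plain string comparison would
+rank `2.10.0` below `2.4.0`.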
+
+=== Beware of many nested feature flags
+
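+Nested flags combine exponentially: each extra on/off flag can double the number
+of flag combinations you have to reason about and test.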
+
+Pick strategic entry points or transitions eligible for feature flags, and
+beware of their nesting.
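+
+To make the growth concrete, here is a small sketch that enumerates every
+combination of values for a list of on/off flags. `flagCombinations` is a
+hypothetical helper, and the flag names are made up for illustration.
+
```javascript
// Returns every assignment of true/false to the given flag names.
function flagCombinations(names) {
  return names.reduce(
    (combos, name) =>
      // Each existing combination splits in two: flag off and flag on.
      combos.flatMap((c) => [{ ...c, [name]: false }, { ...c, [name]: true }]),
    [{}]
  );
}

// Three hypothetical flags already produce 2 * 2 * 2 combinations.
const combos = flagCombinations(["new-checkout", "new-cart", "new-receipt"]);
```
+
+A helper like this can also drive a test that exercises every combination, which
+stays practical only while the number of interacting flags is small.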
+
+=== Include feature flags in the development workflow
+
+Add feature flags to the list of things to think about during whiteboarding, and
+delete or retire them at the end of development.
+
+=== Always rely on a feature flag on the app
+
+Again, there is no such thing as "too small for a feature flag". Too many feature
+flags is a good problem to have, not the opposite. Automate the process of
+creating a feature flag to lower its cost.