Diffstat (limited to 'src/content/blog/2020/10/19')
-rw-r--r-- | src/content/blog/2020/10/19/feature-flags.adoc | 305 |
1 files changed, 305 insertions, 0 deletions
diff --git a/src/content/blog/2020/10/19/feature-flags.adoc b/src/content/blog/2020/10/19/feature-flags.adoc
new file mode 100644
index 0000000..c62c2d1
--- /dev/null
+++ b/src/content/blog/2020/10/19/feature-flags.adoc
@@ -0,0 +1,305 @@
+---
+title: "Feature flags: differences between backend, frontend and mobile"
+date: 2020-10-19
+updated_at: 2020-11-03
+layout: post
+lang: en
+ref: feature-flags-differences-between-backend-frontend-and-mobile
+eu_categories: presentation
+---
+
+*This article is derived from a [presentation][presentation] on the same
+subject.*
+
+When discussing feature flags, I find that their costs and benefits are often
+well laid out and addressed. Online articles like
+"[Feature Toggles (aka Feature Flags)][feature-flags-article]" do a great job of
+explaining them in detail, giving good general guidance on how to apply
+techniques to adopt them.
+
+However, the weight of those costs and benefits applies differently on backend,
+frontend or mobile, and those differences aren't covered. In fact, many of them
+stop making sense, or the decision of adopting a feature flag or not may change
+depending on the environment.
+
+In this article I try to make the distinction between environments and how
+feature flags apply to them, with some final best practices I've acquired when
+using them in production.
+
+[presentation]: {% link _slides/2020-10-19-rollout-feature-flag-experiment-operational-toggle.slides %}
+[feature-flags-article]: https://martinfowler.com/articles/feature-toggles.html
+
+## Why feature flags
+
+Feature flags in general tend to be cited in the context of
+[continuous deployment][cd]:
+
+> A: With continuous deployment, you deploy to production automatically
+
+> B: But how do I handle deployment failures, partial features, *etc.*?
+
+> A: With techniques like canary, monitoring and alarms, feature flags, *etc.*
+
+Though adopting continuous deployment doesn't force you to use feature
+flags, it creates a demand for them.
+The inverse is also true: using feature flags in the code points you more
+obviously towards continuous deployment. Take the following code sample, which
+we will reference later in the article:
+
+```javascript
+function processTransaction() {
+  validate();
+  persist();
+  // TODO: add call to notifyListeners()
+}
+```
+
+While it is being developed, tested for suitability or something similar,
+`notifyListeners()` may not be put into the code all at once. So instead of
+keeping it on a separate, long-lived branch, a feature flag can decide when the
+new, partially implemented function will be called:
+
+```javascript
+function processTransaction() {
+  validate();
+  persist();
+  if (featureIsEnabled("activate-notify-listeners")) {
+    notifyListeners();
+  }
+}
+```
+
+This allows your code to include `notifyListeners()`, and decide when to call it
+at runtime. For the price of extra code around it, you get more dynamicity.
+
+So the fundamental question to ask yourself when considering adding a feature
+flag should be:
+
+> Am I willing to pay with code complexity to get dynamicity?
+
+It is true that you can make the management of feature flags as
+straightforward as possible, but having no feature flags is simpler than having
+any. What you get in return is the ability to parameterize the behaviour of the
+application at runtime, without making any code changes.
+
+Sometimes this added complexity may tilt the balance towards not using a feature
+flag, and sometimes the flexibility of changing behaviour at runtime is
+absolutely worth the added complexity. This can vary a lot by code base and by
+feature, but fundamentally by environment: it's much cheaper to deploy a new
+version of a service than to release a new version of an app.
+
+So the question of which environment is being targeted is key when reasoning
+about costs and benefits of feature flags.
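The `featureIsEnabled()` call in the sample above is never defined in the article. As a minimal sketch, assuming a plain in-memory store (the `flags` map, `setFeature()` and the recording stand-ins for `validate`, `persist` and `notifyListeners` are all hypothetical), it could look like this:

```javascript
// Hypothetical in-memory flag store; the flag name comes from the sample above.
const flags = new Map([["activate-notify-listeners", false]]);

// Unknown flags default to "off", so a missing entry never enables new code.
function featureIsEnabled(name) {
  return flags.get(name) === true;
}

// Flip a flag at runtime, without a deploy.
function setFeature(name, enabled) {
  flags.set(name, Boolean(enabled));
}

// Stand-ins for the functions in the sample above; each one records its call.
const calls = [];
const validate = () => calls.push("validate");
const persist = () => calls.push("persist");
const notifyListeners = () => calls.push("notifyListeners");

function processTransaction() {
  validate();
  persist();
  if (featureIsEnabled("activate-notify-listeners")) {
    notifyListeners();
  }
}

processTransaction(); // flag off: the new code path is skipped
setFeature("activate-notify-listeners", true);
processTransaction(); // flag on: the new code path runs
console.log(calls);
```

The extra map, the lookup and the branch are the "code complexity" being paid; the runtime `setFeature()` call is the dynamicity bought with it.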
+
+[cd]: https://www.atlassian.com/continuous-delivery/principles/continuous-integration-vs-delivery-vs-deployment
+
+## Control over the environment
+
+The key differentiator that makes the trade-offs apply differently is how much
+control you have over the environment.
+
+When running a **backend** service, you are usually paying for the servers
+themselves, and can tweak them as you wish. This means you have full control to
+make code changes as you wish. Not only that, you decide when to do it, and for
+how long the transition will last.
+
+On the **frontend** you have less control: even though you can choose to make a
+new version available any time you wish, you can't force[^force] clients to
+immediately switch to the new version. That means that a) clients could skip
+upgrades at any time and b) you always have to keep backward and forward
+compatibility in mind.
+
+Even though I'm mentioning frontend directly, the same applies to other
+environments with similar characteristics: desktop applications, command-line
+programs, *etc*.
+
+On **mobile** you have even less control: app stores need to allow your app to
+be updated, which could bite you when least desired. Theoretically you could
+make your APK available on third-party stores like [F-Droid][f-droid], or even
+make the APK itself available for direct download, which would give you the same
+characteristics as a frontend application, but that happens less often.
+
+On iOS you can't even do that. You have to get Apple's blessing on every single
+update. Even though we have known that to be a [bad idea][apple] for over a
+decade now, there isn't a way around it. This is where you have the least
+control.
+
+In practice, the amount of control you have will change how much you value
+dynamicity: the less control you have, the more valuable it is.
+In other words, having a dynamic flag on the backend may or may not be worth it,
+since you could always update the code immediately after, but on iOS it is
+basically always worth it.
+
+[f-droid]: https://f-droid.org/
+[^force]: Technically you could force a reload with JavaScript using
+  `window.location.reload()`, but that is not only invasive and impolite, it
+  also gives you the illusion that you have control over the client when you
+  actually don't: clients with disabled JavaScript would be immune to such
+  tactics.
+
+[apple]: http://www.paulgraham.com/apple.html
+
+## Rollout
+
+A rollout is used to *roll out* a new version of software.
+
+Rollouts are usually short-lived, being relevant as long as the new code is
+being deployed. The most common rule is percentages.
+
+On the **backend**, it is common to find it in the deployment infrastructure
+itself, like canary servers, blue/green deployments,
+[a Kubernetes deployment rollout][k8s], *etc*. You could do those manually, by
+having a dynamic control in the code itself, but rollbacks are cheap enough that
+people usually do a normal deployment and just give some extra attention to the
+metrics dashboard.
+
+Any time you see a blue/green deployment, there is a rollout happening: most
+likely a load balancer is starting to direct traffic to the new server, until
+reaching 100% of the traffic. Effectively, that is a rollout.
+
+On the **frontend**, you can selectively pick which users will be able to
+download the new version of a page. You could use geographical region, IP,
+cookie or something similar to make this decision.
+
+CDN propagation delays and people not refreshing their web
+pages are also rollouts by themselves, since old and new versions of the
+software will coexist.
+
+On **mobile**, the Play Store allows you to perform
+fine-grained [staged rollouts][staged-rollouts], and the App Store allows you to
+perform limited [phased releases][phased-releases].
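The percentage rule mentioned in this section can be sketched in code for the case where the rollout is controlled by the application itself rather than by the infrastructure. This is only an illustration: the function names are hypothetical, and the FNV-1a-style hash (computed over UTF-16 code units) is an assumption chosen because it needs no dependencies; any stable hash would do.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units: small and deterministic.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit int
  }
  return hash;
}

// Map a (rollout, user) pair to a stable bucket in the range [0, 100).
function bucketFor(rolloutName, userId) {
  return fnv1a(`${rolloutName}:${userId}`) % 100;
}

// A user is in the rollout when their bucket falls below the percentage.
function inRollout(rolloutName, userId, percentage) {
  return bucketFor(rolloutName, userId) < percentage;
}

// The same user always gets the same answer for the same percentage.
console.log(inRollout("new-checkout", "user-42", 25));
```

Because the bucket is derived from the user and not from a random draw, raising the percentage from 10 to 50 only adds users: nobody who already had the new version is moved back to the old one.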
+
+On both Android and iOS, it is the user who ends up downloading the update.
+
+In summary: since you control the servers on the backend, you can do rollouts at
+will, and those are often found automated away in base infrastructure. On the
+frontend and on mobile, there are ways to make new versions available, but users
+may not download them immediately, and many different versions of the software
+end up coexisting.
+
+[k8s]: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#creating-a-deployment
+[staged-rollouts]: https://support.google.com/googleplay/android-developer/answer/6346149?hl=en
+[phased-releases]: https://help.apple.com/app-store-connect/#/dev3d65fcee1
+
+## Feature flag
+
+A feature flag is a *flag* that tells the application at runtime to turn on or
+off a given *feature*. That means that the actual production code will have more
+than one possible code path to go through, and that a new version of a feature
+coexists with the old version. The feature flag tells which part of the code to
+go through.
+
+Feature flags are usually medium-lived, being relevant as long as the new code
+is being developed. The most common rules are percentages, allow/deny lists,
+A/B groups and client version.
+
+On the **backend**, those are useful for things that have a long development
+cycle, or that need to be done in steps. Consider loading the feature flag rules
+in memory when the application starts, so that you avoid querying a database
+or an external service to apply a feature flag rule, and avoid flakiness in
+the result due to intermittent network failures.
+
+Since on the **frontend** you don't control when to update the client software,
+you're left with applying the feature flag rule on the server, and exposing the
+value through an API for maximum dynamicity. For less dynamic scenarios, the
+rule could instead live in the frontend code itself, falling back to a "just
+refresh the page"/"just update to the latest version" strategy.
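The split described above, where the rule is applied on the server and only the resulting value is exposed through an API, can be sketched as follows. Everything here is an assumption made for illustration: the rule shape, the flag names, and the `evaluateFlags()`/`flagValue()` helpers are hypothetical, and the HTTP layer is left out so the example stays self-contained.

```javascript
// Hypothetical rule set, loaded into memory once when the server starts.
const rules = new Map([
  ["new-settings-screen", { enabledFor: "allow-list", users: ["alice"] }],
  ["new-search", { enabledFor: "everyone", users: [] }],
]);

// Server side: evaluate every rule for one user and build the API payload.
// The client only ever sees booleans, never the rules themselves.
function evaluateFlags(userId) {
  const payload = {};
  for (const [name, rule] of rules) {
    payload[name] =
      rule.enabledFor === "everyone" ||
      (rule.enabledFor === "allow-list" && rule.users.includes(userId));
  }
  return payload;
}

// Client side: read a flag from the payload, falling back to a default
// when the flag is missing (older server, failed request, and so on).
function flagValue(payload, name, fallback = false) {
  return typeof payload?.[name] === "boolean" ? payload[name] : fallback;
}

const payload = evaluateFlags("bob");
console.log(flagValue(payload, "new-settings-screen")); // bob is not on the allow list
console.log(flagValue(payload, "new-search"));
```

Keeping the rules on the server means they can change without shipping a new client, and the explicit fallback on the client keeps the old behaviour when the payload cannot be fetched.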
+
+On **mobile** you can't even rely on a "just update to the latest version"
+strategy, since the code for the app could be updated to a new feature and be
+blocked on the store. Those cases aren't frequent, but you should always assume
+the store will deny updates at critical moments so you don't find yourself with
+no cards to play. That means the only control you actually have is via
+the backend, by parameterizing the runtime of the application using the API. In
+practice, you should always have a feature flag to control any relevant piece of
+code. There is no such thing as "a code change too small for a feature flag".
+What you should ask yourself is:
+
+> If the code I'm writing breaks and stays broken for around a month, do I care?
+
+If you're doing an experimental screen, or something that will have a very small
+impact, you might answer "no" to the above question. For everything else, the
+answer will be "yes": bug fixes, layout changes, refactoring, new screen,
+filesystem/database changes, *etc*.
+
+## Experiment
+
+An experiment is a feature flag where you care about the analytical value of the
+flag, and how it might impact users' behaviour. A feature flag with analytics.
+
+Experiments are also usually medium-lived, being relevant as long as the new
+code is being developed. The most common rule is A/B test.
+
+On the **backend**, an experiment relies on an analytical environment that will
+pick the A/B test groups and distributions, which means those can't be held in
+memory easily. That also means that you'll need a fallback value in case
+fetching the group for a given customer fails.
+
+On the **frontend** and on **mobile** they are no different from feature flags.
+
+## Operational toggle
+
+An operational toggle is like a system-level manual circuit breaker, where you
+turn on/off a feature, fail over the load to a different server, *etc*. They are
+useful switches to have during an incident.
+
+They are usually long-lived, being relevant as long as the code is in
+production. The most common rule is percentages.
+
+They can be feature flags that are promoted to operational toggles on the
+**backend**, or may be put in place on purpose, either preventively or after a
+postmortem analysis.
+
+On the **frontend** and on **mobile** they are similar to feature flags, where
+the "feature" is being turned on and off, and the client interprets this value
+to show if the "feature" is available or unavailable.
+
+## Best practices
+
+### Prefer dynamic content
+
+Even though feature flags give you more dynamicity, they're still somewhat
+manual: you have to create one for a specific feature and change it by hand.
+
+If you find yourself manually updating a feature flag every other day, or
+tweaking the percentages frequently, consider making it fully dynamic. Try
+using a dataset that is generated automatically, or computing the content on the
+fly.
+
+Say you have a configuration screen with a list of options and sub-options, and
+you're trying to find how to better structure this list. Instead of using a
+feature flag for switching between 3 and 5 options, make it fully dynamic. This
+way you'll be able to perform other tests that you didn't plan, and get more
+flexibility out of it.
+
+### Use the client version to negotiate feature flags
+
+After effectively finishing a feature, the old code that coexisted with the new
+one will be deleted, and all traces of the transition will vanish from the code
+base. However, if you just remove the feature flags from the API, all of the old
+versions of clients that relied on that value to show the new feature will be
+downgraded to the old feature.
+
+This means that you should avoid deleting client-facing feature flags, and
+retire them instead: use the client version to decide when the feature is
+stable, and return `true` for every client with a version greater than or equal
+to that.
+This way you can stop thinking about the feature flag, and you don't break
+or downgrade clients that didn't upgrade past the transition.
+
+### Beware of many nested feature flags
+
+Nested flags combine exponentially: *n* nested boolean flags can produce up to
+2^n distinct code paths.
+
+Pick strategic entry points or transitions eligible for feature flags, and
+beware of their nesting.
+
+### Include feature flags in the development workflow
+
+Add feature flags to the list of things to think about during whiteboarding, and
+delete or retire them at the end of development.
+
+### Always rely on a feature flag on the app
+
+Again, there is no such thing as "too small for a feature flag". Too many
+feature flags is a good problem to have, not the opposite. Automate the process
+of creating a feature flag to lower its cost.
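Returning to the practice of using the client version to negotiate feature flags: a retired flag can be served from a small table instead of live rules. The sketch below is one hypothetical way to do it. It assumes plain dotted numeric versions such as `2.3.0`, and the `retiredFlags` table, the version numbers and `resolveFlag()` are invented for illustration.

```javascript
// Parse "2.3.0" into [2, 3, 0]; assumes plain dotted numeric versions.
function parseVersion(version) {
  return version.split(".").map((part) => Number.parseInt(part, 10));
}

// Negative, zero or positive, like a sort comparator.
function compareVersions(a, b) {
  const left = parseVersion(a);
  const right = parseVersion(b);
  const length = Math.max(left.length, right.length);
  for (let i = 0; i < length; i++) {
    const diff = (left[i] ?? 0) - (right[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Hypothetical retired flags: name -> first client version where it is stable.
const retiredFlags = new Map([["activate-notify-listeners", "2.3.0"]]);

// Retired flags are answered from the client version alone;
// flags that are still live keep using their current rule value.
function resolveFlag(name, clientVersion, liveValue) {
  const stableSince = retiredFlags.get(name);
  if (stableSince === undefined) return liveValue;
  return compareVersions(clientVersion, stableSince) >= 0;
}

console.log(resolveFlag("activate-notify-listeners", "2.10.0", false));
console.log(resolveFlag("activate-notify-listeners", "2.2.9", false));
```

Versions are compared numerically, part by part, so `2.10.0` correctly counts as newer than `2.9.9`; a plain string comparison would get that wrong.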