Feature flags and Canary releasing might seem to compete with each other, but they can actually live and work together very well! Both have their own use cases and superpowers, and it’s always better to have a superhero team than a single superhero. Let’s investigate how we can build this team of combined canary and flagging superpowers.

So, as a starter, let us refresh our knowledge of what Canary releasing and Feature flags (or toggles, or flippers) entail.

What are Feature Flags?

Feature flags are essentially conditional blocks of code: if a certain variable (the flag) has a certain value (like True, On, 1, or Red), we activate a certain part of the code; otherwise we jump over it and ignore it.
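As a minimal sketch (the flag store and function names here are hypothetical, not tied to any flagging product), a feature flag can be as simple as a dictionary lookup guarding the new code path:

```python
# Minimal feature-flag sketch: FLAGS and the checkout flows are hypothetical.
FLAGS = {"new_checkout": False}  # flip to True to enable the new feature

def legacy_checkout_flow(cart):
    return f"legacy:{sum(cart)}"

def new_checkout_flow(cart):
    return f"new:{sum(cart)}"

def checkout(cart):
    # The conditional block: only run the new code when the flag is on.
    if FLAGS["new_checkout"]:
        return new_checkout_flow(cart)
    return legacy_checkout_flow(cart)
```

Flipping `FLAGS["new_checkout"]` to `True` (at runtime, via config, or via a flagging service) activates the new path without redeploying anything.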

This is very useful if you’re working on certain functionality within a bigger piece of code, and you need to ship your application or service to the outside world, but you don’t want to enable the new functionality to users or other services just yet.

This technique has been used for a long time, but lately a slew of online feature-toggling SaaS services has rekindled its popularity. Not least because with mobile and web-based apps it’s very handy to be able to switch features on and off without having to release and push a new version of your app and have the client update it.

What is Canary Releasing?

Canary releasing is all about sending small segments and percentages of traffic to a new version of your software, while the remaining traffic goes to the current already running software version.

The term “canary” comes from the coal miners who took little birds in cages into the mines to detect dangerous gases early, before they became a serious threat to the miners themselves.


Traditionally, canary releasing was percentage-based only (e.g. 10% goes to the new version, 90% to the current version, and after 24 hours we increase this to 25%), but now we also see it combined with Layer 7 traffic shaping and segmentation. This allows us to do things like “segment iPhone users from Germany with a certain VIP CRM status and a basket value of more than €50, and send 5% of them to our new canary version”.

The idea is to go in small, controlled steps that are continuously and (often) automatically observed for potential issues and required performance. If all looks good, we increase the assigned traffic percentage and/or add a new user segment. This way we avoid “Big Bang production issues” and reduce the blast radius of potential issues in production.
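A percentage-based canary with a staged promotion schedule could be sketched like this (a simplified illustration with hypothetical stages; real routers make this decision at the load balancer or service-mesh level):

```python
import random

def pick_version(canary_weight, rng=random.random):
    """Route roughly canary_weight of requests to the canary, the rest to stable."""
    return "canary" if rng() < canary_weight else "stable"

def next_weight(current, healthy, stages=(0.10, 0.25, 0.50, 1.00)):
    """Promote to the next stage while metrics look good; roll back otherwise."""
    if not healthy:
        return 0.0                    # roll back: all traffic to stable
    for stage in stages:
        if stage > current:
            return stage              # step up to the next stage
    return current                    # already fully promoted
```

An automated canary controller essentially loops over `next_weight`: observe metrics, then either step up the weight or drop it back to zero.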

From a 10,000 foot view let’s recap the above:

  • Feature toggles are conditional blocks of code at the application code level.
  • Canary releases are infrastructure-driven, controlled migrations from one version of running code to another, using percentage-based and Layer 7 traffic segmentation (typically via load balancers, proxies, or service meshes).
  • Flags mostly feel logical to software developers; canary releases typically feel logical to ops and infrastructure engineers.

Dangers and cost of Feature Flags and Toggles

As with all superpowers, with great power comes great responsibility. These are sharp tools, and they can be super-powerful, but also very dangerous. Let’s take a closer look at the potential dangers of both approaches.

One of the most famous examples of feature flags gone wrong can be read here, where a feature flag made a $400M company go bankrupt in 45 minutes. Whoops!

Pete Hodgson from Thoughtworks phrases the potential dangers and challenges of feature flags clearly:

“Feature Flags have a tendency to multiply rapidly, particularly when first introduced. They are useful and cheap to create and so often a lot are created. However toggles do come with a carrying cost. They require you to introduce new abstractions or conditional logic into your code. They also introduce a significant testing burden.”

Having a lot of feature flags in place can also be an indication that feature development takes a long time before it is ready to be delivered to users, much like long-lived feature branches. This is why Martin Fowler says:

“Release toggles are the last thing you should do. Your first choice should be to break the feature down so you can safely introduce parts of the feature into the product. Only if you can't do small releases or UI last should you employ release toggles.”

Things to consider with Canary Releasing

Is Canary releasing then the solution for this potentially dangerous and costly “Flag sprawl”?

Of course the answer is “it depends”, and there is no silver bullet, as Canary releasing also has its pros and cons. Not having to instrument and change your code is a very good thing: no dangling code we no longer use, no negative impact on the performance of our code, and no need to test all the different flag combinations your code can hit. We also avoid “forgotten” flags that nobody remembers the purpose of anymore.

We do the traffic segmenting, shaping and rebalancing on the infrastructure level, nice and cleanly separated from the code-logic. Sounds good eh?

Infrastructure usage

A potential issue with Canary releasing is that you will have two versions of your service or application running during the time of your Canary test and release process. This can cost more cloud or infrastructure resources than having a single application or service with flags in it.

Luckily, the adoption of Docker containers, efficient languages and frameworks like Go and Node.js, the advent of small microservices, and autoscaling mechanisms (e.g. in Kubernetes) make it much easier to dynamically resize resource dimensions (like memory, CPU, or the number of running instances) based on realtime requirements.

We can do this dynamically, based on the resources needed for the traffic assigned to each version, so we can even out and balance the total required resources between the two versions, just like load balancing works.
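As a back-of-the-envelope sketch of that balancing act (a hypothetical helper, not a real autoscaler API), replica counts can simply follow the traffic split, so the total capacity stays roughly constant:

```python
import math

def replicas_per_version(total_replicas, canary_weight):
    """Split a fixed replica budget between stable and canary by traffic share."""
    canary = max(1, math.ceil(total_replicas * canary_weight))  # round up, keep >= 1
    stable = max(1, total_replicas - canary)
    return {"stable": stable, "canary": canary}
```

With a budget of 10 replicas and a 25% canary weight this yields 3 canary and 7 stable replicas, instead of running two full-size deployments side by side.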

Layer 7 user segmentation

When you want to enable a feature for certain types of users of your application, e.g. a logged-in returning VIP user from Germany, it often feels logical to do this at the code level, because this user information is typically readily available in the session or state data.

But typically, this information is (or should be) also available in cookies or headers, and because Canary releasing uses Layer 7 for its traffic shaping, it can easily use this information to redirect traffic and requests.

A certain customer-segment ID in a session cookie, a geo country-code header coming from your CDN, or a device user-agent header: these can all be easily used in canary releasing to detect and shift visitors to a certain new version.

Long live the power of dynamic layer7 load-balancing!

And yes, because of load balancers’ affinity mechanisms, useful things like sticky sessions are also supported, so your visitors keep returning to the same version they originally landed on.
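Sticky routing is often implemented by hashing a stable identifier (a session or user id) into a bucket, so the same visitor deterministically lands on the same version for a given weight; a hypothetical sketch:

```python
import hashlib

def sticky_version(session_id, canary_weight):
    """Deterministically map a session id to a version for a given canary weight."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF   # stable value in [0, 1]
    return "canary" if bucket < canary_weight else "stable"
```

Because the hash only depends on the session id, the answer is the same on every request and on every load-balancer instance, with no shared state needed.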

And by using load-balancing mechanisms like weights, it’s very easy to combine conditional segmentation with percentage-based redirection, to achieve things like sending e.g. 10% of German iPhone users with a certain CRM VIP status to a new version of the software.
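Putting segmentation and weights together, the routing decision might look like this sketch (the header and cookie names are purely illustrative, not from any real product):

```python
import random

def route(headers, cookies, canary_weight, rng=random.random):
    """Only requests matching the segment are eligible for the canary."""
    in_segment = (
        headers.get("x-geo-country") == "DE"            # geo header from the CDN
        and "iPhone" in headers.get("user-agent", "")   # device detection
        and cookies.get("crm-status") == "VIP"          # customer-segment cookie
    )
    if in_segment and rng() < canary_weight:
        return "canary"
    return "stable"
```

Everyone outside the segment stays on stable regardless of the weight; inside the segment, only the configured percentage is shifted.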

Canary releasing requires a mechanism that (re)configures the traffic-shaping components (e.g. Kubernetes Ingress, load balancers, proxies, or a service mesh), but your ops and infra engineers will likely be comfortable with such an automated configuration system, since it uses the technologies and components they are already using anyway.

Last, when running event-based asynchronous services, e.g. microservices that consume and produce from and to a Kafka pipeline, reshaping traffic is not an option, because these services typically do not route through a proxy, ingress, or service mesh. It is possible, though, to run multiple versions of a consumer or producer side by side, as long as you keep your data structures backwards- and forwards-compatible.
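Running two consumer versions side by side works as long as each version parses messages tolerantly; a hypothetical sketch of forwards- and backwards-compatible event handling (the event fields are invented for illustration):

```python
import json

def handle_v1(raw):
    """Old consumer: ignores fields it does not know about (forwards compatible)."""
    event = json.loads(raw)
    return {"order_id": event["order_id"]}

def handle_v2(raw):
    """New consumer: defaults fields old producers do not send (backwards compatible)."""
    event = json.loads(raw)
    return {
        "order_id": event["order_id"],
        "channel": event.get("channel", "web"),  # new field with a safe default
    }
```

With this discipline, v1 and v2 consumers can read from the same topic during the transition, and producers can be upgraded independently of consumers.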

Combined superpowers: Canaries wearing Flags

Now that we know what challenges AND opportunities Feature flagging and Canary releases can provide, how can we combine the two and have the best of both worlds? Because: friends, not enemies!

What combined superpower can we unleash into the world that makes the lives of both developers and ops much nicer? Let’s investigate! Let’s take all of the above into consideration and sum it up as follows.

We can approach feature flagging as a good solution for software developers to quickly add and test a new customer-facing feature for a certain audience segment.

We can approach Canary releasing as a good solution for enabling a controlled, multi-stage “Go Live” release process, where you continuously and automatically validate all relevant technical and business metrics, checking that no regression, degradation of service, or more serious issues like bugs appear in your production environment.

And of course you can also use Canary releasing to do static A/B or MVT tests with visitor segmentation, if you don’t mind the potential increase in temporary infrastructure usage. This way you keep your code-base clean, don’t need any instrumentation, and avoid “flag sprawl” or “forgotten flags”.

The beauty is in having both mechanisms available, understanding their pros and cons, and combining them to get the best of both worlds. This way we can achieve the holy grail of delivering better features to our customers with higher velocity and quality.


If you want to understand how Vamp can help you set up a fully automated, next-gen ML-based canary-releasing pipeline, with or without Feature flagging integrated, book a 30-min meeting with me or take a look at our product page.