Recently, one of our customers said this about working with Vamp: "just knowing you've never been able to test in production, and now knowing you can actually test in production, that's mind-blowing". He was thinking about the benefits of validating releases against real user traffic (you can never fully simulate in pre-production how traffic is going to behave) and of being able to A/B test both tech upgrades and features. For many engineers and architects, testing in production sounds cool, but it also sounds like living on the edge: think base jumping for your application environment. The truth is that testing in production can be completely safe. What’s more, if done in a managed way, it is a key step to reaping the benefits of releasing features faster without breaking things. So how can you rethink testing in production and do it safely? Read on.

Of course, you are already testing in production

Although testing is considered something you do in pre-production, the reality is that everyone is testing in production all the time. Each time you do a “big bang” release to production, you are testing your code for the first time on real user traffic. The problem is that a “big bang” release is not the kind of test you want to be doing, because it is not controlled. In pre-production, by contrast, there’s plenty of room to test for specific conditions and collect feedback, so the test is controlled, the software is validated, and there’s room for improvement.

But here’s the rub, and we all know it: you can put a release through all the pre-production testing you can think of, check the health of your deployment thoroughly, and still run into problems once the code hits the road of a real production environment. That’s because synthetic user testing can never fully predict real user behavior with all its, um, surprises. Or, as our customer put it: “how are you going to know what data to generate to predict all the crazy varieties that exist in real-life scenarios?”

So now think of your release as still a test, but a safe one: enter user segmentation

The solution to this conundrum is to rethink testing in production and, by that same token, your release process. Instead of exposing a new update or version to all users in a “big bang” (or even to a large traffic batch of, say, 20%, which quite a few release solutions allow), make your releases as small and controlled as possible. Release incrementally to a controlled subset of users, defined by, for instance, location, device, cookie, IP address, or basket value. Applying user segmentation allows you to set very specific conditions around the release test, observe the results, and trace back what happened if things go wrong.
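To make the idea of a segment concrete, here is a minimal Python sketch of how such conditions could be expressed and evaluated per request. It is an illustration only, not Vamp's API: the `Request` and `Segment` classes, their field names, and the `choose_version` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import Dict, Optional, Set


@dataclass
class Request:
    """Hypothetical view of an incoming request (field names are assumptions)."""
    country: str
    device: str
    ip: str
    basket_value: float = 0.0
    cookies: Dict[str, str] = field(default_factory=dict)


@dataclass
class Segment:
    """Conditions a request must meet to see the new version.

    Every condition is optional; None means "do not filter on this".
    """
    countries: Optional[Set[str]] = None
    devices: Optional[Set[str]] = None
    ip_range: Optional[str] = None           # CIDR notation, e.g. "10.0.0.0/8"
    min_basket_value: Optional[float] = None
    required_cookie: Optional[str] = None

    def matches(self, req: Request) -> bool:
        if self.countries is not None and req.country not in self.countries:
            return False
        if self.devices is not None and req.device not in self.devices:
            return False
        if self.ip_range is not None and ip_address(req.ip) not in ip_network(self.ip_range):
            return False
        if self.min_basket_value is not None and req.basket_value < self.min_basket_value:
            return False
        if self.required_cookie is not None and self.required_cookie not in req.cookies:
            return False
        return True


def choose_version(req: Request, segment: Segment) -> str:
    """Route matching requests to the new version, everyone else to the stable one."""
    return "candidate" if segment.matches(req) else "stable"


# Example: expose the update only to iOS users in the Netherlands who carry a beta cookie.
beta_segment = Segment(countries={"NL"}, devices={"ios"}, required_cookie="beta")
visitor = Request(country="NL", device="ios", ip="10.1.2.3", cookies={"beta": "1"})
print(choose_version(visitor, beta_segment))
```

Note that in this sketch a segment with no conditions matches every request, so a real setup would want an explicit guard against an empty segment.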

Thank you gkze for this image on Imgur

How to safely test in production in a managed process:

·         Specify which users will be exposed to an update or feature and under which conditions (for instance, location, device, cookie, IP address, basket value). Then specify, in steps, when to increase the number of users (at Vamp, we codify this process in flexible release policies).

·         Also specify when to roll back. Make sure the rollback is automated.

·         Release in a series of automated steps.

·         Make sure you are able to monitor and observe. Specifying the conditions of the release in advance creates health checks: not only technical checks, but also checks for business outcomes. These give you early warning signals if things go wrong.

·         Share knowledge across teams. Learn what’s normal behavior in production and what’s not.
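The steps above can be sketched as a small control loop: increase exposure step by step, check a technical and a business signal at each step, and roll back automatically when a condition is not met. This is a simplified illustration under stated assumptions, not how Vamp implements release policies. The `ReleasePolicy` and `Metrics` classes, the thresholds, and the `set_exposure` and `read_metrics` callbacks are hypothetical stand-ins for your own traffic router and monitoring.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Metrics:
    error_rate: float       # technical signal: fraction of failed requests
    conversion_rate: float  # business signal: fraction of sessions that convert


@dataclass
class ReleasePolicy:
    steps: List[int]            # percentage of users exposed at each step
    max_error_rate: float       # roll back if the error rate goes above this
    min_conversion_rate: float  # roll back if the conversion rate drops below this

    def healthy(self, m: Metrics) -> bool:
        return (m.error_rate <= self.max_error_rate
                and m.conversion_rate >= self.min_conversion_rate)


def run_release(
    policy: ReleasePolicy,
    set_exposure: Callable[[int], None],
    read_metrics: Callable[[int], Metrics],
) -> Tuple[str, List[int]]:
    """Walk through the policy steps; roll back to 0% on the first failed check."""
    history: List[int] = []
    for pct in policy.steps:
        set_exposure(pct)
        history.append(pct)
        if not policy.healthy(read_metrics(pct)):
            set_exposure(0)  # automated rollback, no manual intervention
            history.append(0)
            return "rolled_back", history
    return "released", history


# Example with made-up numbers: errors spike once 25% of users are exposed.
def fake_metrics(pct: int) -> Metrics:
    return Metrics(error_rate=0.01 if pct < 25 else 0.20, conversion_rate=0.03)


policy = ReleasePolicy(steps=[1, 5, 25, 100], max_error_rate=0.05, min_conversion_rate=0.02)
status, history = run_release(policy, set_exposure=lambda pct: None, read_metrics=fake_metrics)
print(status, history)
```

In this example the release stops at the 25% step and exposure returns to 0%, so the remaining users never see the faulty version. A real loop would also wait for enough traffic at each step before reading the metrics.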

The benefits of having a managed process for testing and releasing to production

If this sounds like a canary release to you, that’s because it is. But codifying the canary release into a series of release policies lays the foundation for a managed, robust release process. The benefits of this approach are:

·         You can rely on a controlled, repeatable process: that improves the quality and reliability of your release

·         You don’t have to rely so much on guesstimating synthetic user behavior

·         No more stressful manual rollbacks

·         Better incident response: you can see what condition of the test wasn’t met and fix it faster

·         Having two versions in production allows for technical A/B testing that improves application performance

·         Having two versions in production also allows for business A/B testing of features for better ROI

·         You can have multiple teams releasing code changes to production simultaneously

Add intelligent automation and now you have a platform that will take over release decisions for you

Controlled, incremental releases create a reliable, repeatable process. Add intelligent automation that can roll forward and roll back, like that of a release orchestration solution (spoiler alert: we recommend Vamp), and you have a platform that will take over release decisions for you. That makes not just testing in production, but the entire process of shipping new features to customers faster, much less “scary”. It leaves you free to do more important things than worrying about pager duty, like base jumping, or spending time at that family BBQ on a Friday or Sunday afternoon.

If you are interested in seeing how Vamp Cloud-Native Release Orchestration can help you test in production, book a guided tour, with or without your own containers!