The challenge of how to work with different release cadences eventually hits every microservice-based application. Some microservices will be updated more frequently than others, which leads to uncertainty about the compatibility between different versions. How quickly this happens depends on the size of the development teams.

Initially the team members can keep the knowledge in their heads and there is a hidden manual check to prevent incompatible versions from being released.

However, as teams grow or people simply take vacations, the uncertainty builds until those checks fail and there is an incident that forces a change.

Kubernetes Deployments for an Example E-commerce Application

Depending on the maturity of the development teams, the incident may be a production outage caused by an incompatible version of a microservice being released, or simply one too many failed integration tests and broken builds.

Typically, the next move is to formalise the existing informal processes: making the previously hidden, manual checks explicit and tracking the compatible versions in a shared document, either as a page in a wiki or as a file in a repo.

What then happens is you see chat messages asking if the file or wiki page is up to date. It still only takes one mistake — a hasty reply or a typo or an uncommitted change — to cause another incident.

In older organisations this is when go/no-go meetings start to appear in people’s agendas and the benefits of adopting microservices are lost. Ad hoc daily releases turn into preplanned 8-week release cycles because the engineering teams can’t reliably manage the dependencies between their microservices.

Reassembling Monoliths

Dependency issues are not new, but monolithic package and release practices come with a built-in solution. All the dependent parts are tested together and released to production together, as a single package. If the testing is thorough enough, then there are no compatibility issues in production.

This is another way teams often try to control dependencies: assembling an application out of specific versions of microservices in test or acceptance environments, testing that they work together and then pushing them to production as a set.

This approach puts the brakes on the services with the highest release cadences in order to give the teams time to verify which versions work together. The development teams with the highest release cadences either reduce their velocity or see piles of unreleased versions build up.

Ultimately, we develop new product features or improve existing ones to meet a financial or other business objective. So slowing down productive teams is the last thing we want to happen.

More Testing in Production not Less

The sense of comfort that increased pre-production testing provides is often false. The synthetic user testing that happens in pre-production is no substitute for real user testing. The only way to know if a new feature works is to try it with real users, measure their responses and react to that data. Production is where your real users are, so that’s where you need to be testing, and in order to do that you need a strong, intelligent safety net.

Vamp’s continuous release validation provides that safety net. Vamp lets you model your application dependencies live in Kubernetes via your CI/CD pipeline. Vamp then uses that model to stop incompatible versions of your microservices harming your users’ experience.

At the core of this safety net are two Kubernetes Custom Resource Definitions (CRDs). These CRDs allow you to define the shape and structure of your microservice-based application as Kubernetes config.

The ApplicationDependencies CRD tells Vamp what your development teams believe the compatible versions are.

The AllowedVersions CRD tells Vamp which recent versions are safe for the application in that namespace. The information in this CRD serves two purposes:

  1. It tells Vamp which versions have passed your automated testing; and
  2. If you have legal or other compliance requirements, you can use it to tell Vamp which versions have satisfied those processes.

How Vamp Works

Now is a good time for a quick explanation of how Vamp works.

Vamp is a Cloud-Native AIOps platform that provides continuous release orchestration capabilities for engineering, DevOps and SRE teams. Vamp automates the release process, validates the health of microservices and applies a safety net when releases fail – all while safeguarding the customer experience.


Service: a service in Vamp is a microservice that maps to a Kubernetes workload, which can be either a Kubernetes Deployment or a StatefulSet.

Application — an application in Vamp is a collection of microservices identified by a unique combination of Kubernetes cluster and Kubernetes Namespace. Vamp natively supports releasing across multiple clusters and multiple cloud providers.

Release policy — each service in an application has one or more associated release policies which specify how a new version of the service will be validated. A release policy is a set of rules for how a new version should be tested and the actions to be taken if the release is successful or unsuccessful.

Semantic versioning — Vamp determines which release policy to apply by comparing the semantic version number of the new version with the version that’s live. A semantic version or semver number consists of three numbers separated by dots, for example: 1.2.3.

The first number (from left to right) is called the major number and only changes when you make incompatible API changes. The second number is called the minor number and only changes when you add functionality in a backwards-compatible manner. The third number is called the patch number and only changes when you make backwards-compatible bug fixes.

For example, if the live version is 2.0.1 and the new version is 2.1.0, then the change is minor because the major numbers are the same but the minor numbers are different. So Vamp would use the service’s minor policy to validate the new version.
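
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Vamp's implementation: the function names `parse_semver` and `change_type` are made up for this example, and pre-release or build suffixes are not handled.

```python
def parse_semver(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def change_type(live: str, new: str) -> str:
    """Return the first part that differs, reading from left to right."""
    live_parts, new_parts = parse_semver(live), parse_semver(new)
    labels = ("major", "minor", "patch")
    for label, old_part, new_part in zip(labels, live_parts, new_parts):
        if old_part != new_part:
            return label
    return "none"  # identical versions


print(change_type("2.0.1", "2.1.0"))  # minor
```

The returned label is what a tool would use to pick between a service's major, minor and patch policies.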

Automatic and Continuous Validation

Vamp builds upon Google’s Site Reliability Engineering (SRE) principles. The success of a release is calculated by examining the Service Level Objectives (SLOs) that have been defined for that microservice and for the application as a whole. We call this calculation Vamp Health.

Vamp Health is a rollup of different metrics and provides a single value that represents the customer experience.

When a new version of a service is deployed, Vamp applies a release policy to determine if that version is healthy. If the new version is healthy then it becomes the live version, replacing the previous version. However, if the new version is considered unhealthy then Vamp rolls it back and the previous version remains the live version.

Awareness of Dependency

Kubernetes workloads are independent of each other. When you deploy a new version of a workload, Kubernetes cannot check how the new version will affect any of the other workloads.

Vamp also knows nothing about the dependencies between your microservices until you tell it. If the new version is incompatible with other services that make up the application, then Vamp may detect a drop in the overall health of your application and roll back the new version.

Example E-commerce Application with a Failing Dependency

In this example we have an application with 5 services. The cart-service depends on the product-service for some of its functionality. If a newly deployed version of the product-service breaks part of the cart-service functionality, then Vamp may detect that the overall health of the application has deteriorated and roll back the change.

Vamp doesn’t know the relationship between the services; it is relying on the SLOs to indicate that something is wrong. Rather than relying on just your SLOs to detect broken dependencies, you can provide Vamp with information on the relationships between the services using an instance of the ApplicationDependencies CRD.

The CRDs allow you to express the dependencies as ranges of semantic versions. If you are familiar with Helm charts or Node.js or Python then the way the dependencies are expressed should feel familiar.

The ApplicationDependencies for our example e-commerce application looks like this:

apiVersion: vamp.io/v1
kind: ApplicationDependencies
metadata:
  name: eu-app-deps
spec:
  services:
  - name: cart-service
    versions:
    - version: "2.7.x"
      dependencies:
      - name: product-service
        version: "3.x"
      - name: customer-service
        version: "9.3.x"
  - name: product-service
    versions: 
    - version: "3.x"
      dependencies:
      - name: ratings-service
        version: "0.x"
  - name: ratings-service
    versions: 
    - version: "0.x"
      dependencies: []
  - name: customer-service
    versions: 
    - version: "9.3.x"
      dependencies:
      - name: eu-payment-service
        version: "1.x"
  - name: eu-payment-service
    versions: 
    - version: "1.x"
      dependencies: []

The contents of the CRD allow Vamp to build the following dependency map for the application:

Example E-commerce Application Showing Service Dependencies

Now Vamp knows that the cart-service may be affected by a change to the product-service. When a new version of the product-service is deployed, Vamp will explicitly check the health of the cart-service, not just the health of product-service and the overall application.
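
To make the idea of a dependency map concrete, here is a minimal Python sketch that inverts the declared dependencies to find which services could be affected by a change. The `SPEC` dictionary is a hand-written, simplified stand-in for the ApplicationDependencies example above (service names only, no versions), and following the dependencies transitively is an assumption of this sketch rather than a statement about how Vamp behaves.

```python
from collections import defaultdict

# Simplified, hypothetical view of the example: service -> direct dependencies.
SPEC = {
    "cart-service": ["product-service", "customer-service"],
    "product-service": ["ratings-service"],
    "ratings-service": [],
    "customer-service": ["eu-payment-service"],
    "eu-payment-service": [],
}


def reverse_map(spec: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each service to the services that depend on it directly."""
    dependents: dict[str, set[str]] = defaultdict(set)
    for service, deps in spec.items():
        for dep in deps:
            dependents[dep].add(service)
    return dependents


def affected_by(service: str, spec: dict[str, list[str]]) -> set[str]:
    """Return every service that depends on `service`, directly or indirectly."""
    dependents = reverse_map(spec)
    seen: set[str] = set()
    stack = [service]
    while stack:
        current = stack.pop()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(affected_by("product-service", SPEC)))  # ['cart-service']
```

A change to the eu-payment-service would, under the same assumption, flag both the customer-service and the cart-service for a health check.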

Immediate Rejection of Incompatible Versions

Until now Vamp has relied on monitoring SLOs to decide if a version is healthy. This can take many minutes, which is a long time if your users’ experience is affected.

If you provide dependency information for your application then Vamp will immediately reject any versions that violate the application’s dependencies. Before selecting a policy, Vamp now checks that the new version is acceptable:

  1. That all the new version’s dependencies are satisfied; and
  2. That making the new version live will not break the dependency requirements of other services.

Only then does Vamp select and apply a release policy. This means that an incompatible version will be rejected in less than a minute.
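
The two checks can be illustrated with a short Python sketch. Everything here is an assumption made for the example: the `RULES` and `LIVE` structures, the function names, and the choice to reject versions that are not listed at all. It is meant to show the shape of the logic, not how Vamp implements it.

```python
def matches(version: str, pattern: str) -> bool:
    """True if a concrete version such as '3.0.4' fits an x-range such as '3.x'."""
    fixed = [part for part in pattern.split(".") if part != "x"]
    return version.split(".")[: len(fixed)] == fixed


# Hypothetical in-memory form of the example dependency rules:
# service -> list of (version pattern, {dependency: required pattern}).
RULES = {
    "cart-service": [("2.7.x", {"product-service": "3.x", "customer-service": "9.3.x"})],
    "product-service": [("3.x", {"ratings-service": "0.x"})],
    "ratings-service": [("0.x", {})],
    "customer-service": [("9.3.x", {"eu-payment-service": "1.x"})],
    "eu-payment-service": [("1.x", {})],
}

# Hypothetical set of versions that are currently live.
LIVE = {
    "cart-service": "2.7.25",
    "product-service": "3.0.90",
    "ratings-service": "0.1.6",
    "customer-service": "9.3.4",
    "eu-payment-service": "1.2.0",
}


def requirements(service, version, rules):
    """Dependencies declared for this version, or None if the version is not listed."""
    for pattern, deps in rules.get(service, []):
        if matches(version, pattern):
            return deps
    return None


def is_acceptable(service, new_version, live, rules):
    """Apply both checks against the versions that are currently live."""
    deps = requirements(service, new_version, rules)
    if deps is None:
        return False  # assumption of this sketch: unlisted versions are rejected
    # Check 1: every dependency of the new version is satisfied by a live version.
    for name, pattern in deps.items():
        if name not in live or not matches(live[name], pattern):
            return False
    # Check 2: the new version does not break a requirement of another live service.
    for other, other_version in live.items():
        if other == service:
            continue
        required = (requirements(other, other_version, rules) or {}).get(service)
        if required is not None and not matches(new_version, required):
            return False
    return True


print(is_acceptable("product-service", "3.0.95", LIVE, RULES))  # True
print(is_acceptable("product-service", "4.0.0", LIVE, RULES))   # False
```

Because both checks only compare version strings against declared ranges, they need no traffic or metrics, which is why a check of this kind can finish long before SLO monitoring would.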

Vamp provides two Kubernetes CRDs for managing dependencies so that you can maintain a strong separation of concerns and you are not forced to create separate git repositories to store the dependency information and test results.

The ApplicationDependencies for an application is intended to provide the developers’ view of the dependencies. It tells Vamp which versions should be compatible based on their major and minor version numbers.

For example, versions of the cart-service starting with 2.7.0 and less than 2.8.0 should work with any version of the product-service starting with 3.0.0 and less than 4.0.0.
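
The way an x-range such as "2.7.x" expands into a lower and an upper bound can be sketched as follows. The function names are invented for this illustration, and the expansion rule (increment the last fixed number to get the exclusive upper bound) is the usual convention for x-ranges, assumed here to match the ranges described above.

```python
Version = tuple[int, ...]


def x_range_bounds(pattern: str) -> tuple[Version, Version]:
    """Expand an x-range into an inclusive lower and an exclusive upper bound."""
    fixed = [int(part) for part in pattern.split(".") if part != "x"]
    if not fixed:
        raise ValueError("pattern must fix at least the major number")
    lower = fixed + [0] * (3 - len(fixed))
    upper = fixed[:-1] + [fixed[-1] + 1]
    upper += [0] * (3 - len(upper))
    return tuple(lower), tuple(upper)


def in_x_range(version: str, pattern: str) -> bool:
    """True if lower <= version < upper for the expanded x-range."""
    lower, upper = x_range_bounds(pattern)
    candidate = tuple(int(part) for part in version.split("."))
    return lower <= candidate < upper


print(x_range_bounds("2.7.x"))  # ((2, 7, 0), (2, 8, 0))
print(x_range_bounds("3.x"))    # ((3, 0, 0), (4, 0, 0))
```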

Typically, the team responsible for the product-service will maintain a file in the service’s git repository that contains the snippet of config for that service, for example:

name: product-service
versions: 
- version: "3.x"
  dependencies:
  - name: ratings-service
    version: "0.x"

And the team responsible for the cart-service will maintain a similar file in their git repository, for example:

name: cart-service
versions:
- version: "2.7.x"
  dependencies:
  - name: product-service
    version: "3.x"
  - name: customer-service
    version: "9.3.x"

The content of the CRD can then be assembled from these snippets in your CD pipeline.
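
As a rough idea of what that pipeline step might look like, the sketch below merges per-service snippets into a single manifest. It assumes the snippets have already been parsed into Python dictionaries (for example with a YAML library) and writes the result as JSON, which Kubernetes also accepts for manifests. The `assemble` helper is hypothetical and not part of Vamp.

```python
import json

# Snippets as they might look after being loaded from each team's repository.
PRODUCT = {
    "name": "product-service",
    "versions": [
        {"version": "3.x", "dependencies": [{"name": "ratings-service", "version": "0.x"}]}
    ],
}
CART = {
    "name": "cart-service",
    "versions": [
        {
            "version": "2.7.x",
            "dependencies": [
                {"name": "product-service", "version": "3.x"},
                {"name": "customer-service", "version": "9.3.x"},
            ],
        }
    ],
}


def assemble(name: str, snippets: list[dict]) -> dict:
    """Combine per-service snippets into one ApplicationDependencies manifest."""
    names = [snippet["name"] for snippet in snippets]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError(f"duplicate service snippets: {duplicates}")
    return {
        "apiVersion": "vamp.io/v1",
        "kind": "ApplicationDependencies",
        "metadata": {"name": name},
        # Sort by service name so the generated file is stable between runs.
        "spec": {"services": sorted(snippets, key=lambda s: s["name"])},
    }


manifest = assemble("eu-app-deps", [PRODUCT, CART])
print(json.dumps(manifest, indent=2))
```

Failing the pipeline on duplicate service names is a design choice of this sketch: it surfaces two teams claiming the same service before the manifest reaches the cluster.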

Software development is more a craft than a precise science, so not every version of the product-service >=3.0.0 and <4.0.0 will work. This is why Vamp supports a second source of dependency information, the AllowedVersions CRD. This tells Vamp which recent versions are actually safe to use.

Typically, the AllowedVersions is updated each time a new version passes your automated testing. If you use Vamp to release to both your pre-production and production environments, then when a release fails in your test or acceptance environment Vamp will automatically block that version from being released in your production environment.

You can also use the AllowedVersions CRD to block the release of compatible versions that still need to be signed off for legal or other reasons.

The AllowedVersions for our example e-commerce application looks like this:

apiVersion: vamp.io/v1
kind: AllowedVersions
metadata:
  name: eu-production
  namespace: eu-production
spec:
 services:
 - name: cart-service
   version: "> 2.7.22 <= 2.7.28"
 - name: product-service
   version: "<= 3.0.98"
 - name: ratings-service
   version: "<= 0.1.6"
 - name: customer-service
   version: "<= 9.3.9"
 - name: eu-payment-service
   version: "> 1.1.12 <= 1.2.23"

Using the information in this CRD, Vamp can narrow down the range of versions that can be released for the application:

Example E-commerce Application Showing Precise Service Dependencies

Using this information Vamp will now immediately reject any version of the product-service less than 3.0.0 or greater than 3.0.98.
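
Combining the two sources of information amounts to intersecting the declared x-range with the allowed comparator range. The Python sketch below shows one way to evaluate that; the parsing rules (space-separated comparators, plain MAJOR.MINOR.PATCH versions) and the function names are assumptions made for this example, not a description of Vamp's parser.

```python
import operator
import re

OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}
# One comparator, such as '<= 3.0.98'; two-character operators are tried first.
COMPARATOR = re.compile(r"(<=|>=|<|>|=)\s*(\d+)\.(\d+)\.(\d+)")


def in_x_range(version: str, pattern: str) -> bool:
    """True if a concrete version fits an x-range such as '3.x'."""
    fixed = [part for part in pattern.split(".") if part != "x"]
    return version.split(".")[: len(fixed)] == fixed


def in_allowed(version: str, expression: str) -> bool:
    """True if the version satisfies every comparator in the expression."""
    candidate = tuple(int(part) for part in version.split("."))
    comparators = COMPARATOR.findall(expression)
    if not comparators:
        raise ValueError(f"no comparators found in {expression!r}")
    return all(
        OPS[op](candidate, (int(major), int(minor), int(patch)))
        for op, major, minor, patch in comparators
    )


def releasable(version: str, declared: str, allowed: str) -> bool:
    """A version must sit inside both the declared and the allowed range."""
    return in_x_range(version, declared) and in_allowed(version, allowed)


print(releasable("3.0.98", "3.x", "<= 3.0.98"))  # True
print(releasable("3.0.99", "3.x", "<= 3.0.98"))  # False
print(releasable("2.9.9", "3.x", "<= 3.0.98"))   # False
```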

Takeaways

  1. Being able to release microservices at different frequencies is one of the benefits of moving to microservice-based development. You shouldn’t artificially force your microservices to be released at the same time.
  2. Relying on “best effort” manual processes to prevent incompatible versions of your microservices from reaching production and impacting your users’ experience is error-prone and doesn’t scale.
  3. Verifying your application in production using Vamp safeguards your users’ experience without sacrificing the benefits of moving to a microservice-based architecture.