A Better Way of Deploying a Dockerized Application to Azure Kubernetes Service Using Azure Pipelines
Throughout 2018 I wrote a mini blog post series aimed at providing specific and detailed guidance on how to create a CI/CD pipeline using VSTS/Azure DevOps to deploy a dockerized ASP.NET Core application to Azure Kubernetes Service (AKS):
- Deploy a Dockerized ASP.NET Core Application to Kubernetes on Azure Using a VSTS CI/CD Pipeline: Part 1
- Deploy a Dockerized ASP.NET Core Application to Kubernetes on Azure Using a VSTS CI/CD Pipeline: Part 2
- Deploy a Dockerized ASP.NET Core Application to Azure Kubernetes Service Using a VSTS CI/CD Pipeline: Part 3
- Deploy a Dockerized ASP.NET Core Application to Azure Kubernetes Service Using a VSTS CI/CD Pipeline: Part 4
Whilst the resulting solution works I wasn't entirely happy with several aspects and I've spent a great deal of time thinking and tinkering to come up with something better. In this blog post I explain what I wasn't happy with and how my new solution addresses most of my concerns. You don't necessarily need to read the posts above as I'm going to provide some context, but it will probably make things much clearer if you are planning to implement any of my suggestions.
The sample application I've been using to deploy to Kubernetes consists of the following components:
- ASP.NET Core web application, that sends messages to a
- NATS message queue service, which pushes messages to a
- .NET Core message queue handler application, which saves messages to an
- Azure SQL database
Apart from the database all the components run as docker containers. The container images are built in an Azure Pipelines build pipeline and pushed to an Azure Container Registry (ACR). An Azure Pipelines release pipeline then deploys the necessary services and deployments to AKS which causes the images to be pulled from ACR and instantiated as containers inside pods. My release pipeline consists of two environments: dat (developer automated test where automated acceptance tests might take place) and prd (production). That's just arbitrary of course and in a live scenario the pipeline can have whatever environments are needed.
My sample application is called MegaStore and you can find the code on GitHub here. In the rest of this post I explain my areas of concern and how I addressed them.
Azure Pipelines Tasks
Whilst there is no doubt that Azure Pipelines Tasks are great for quickly building a pipeline and definitely make it easier for those less familiar with the technology behind a task to get started, I now see some tasks as more of a curse than a blessing. I've particularly taken issue with tasks that wrap a command line application (such as docker or kubectl) and expose many of its sub-commands, which results in the task becoming something of a Swiss Army Knife. Why have I taken issue? There are several reasons, some specific to the Swiss Army Knife variety and some that apply to tasks in general:
- There is often a need to set mandatory fields in 'Swiss Army Knife' tasks even though those fields will not be used by the chosen sub-command. Where there are multiple instances of the same task in use this becomes very tedious and is a potential maintenance problem when something changes. (Yes, I know tasks can be cloned but this doesn't make me any happier.)
- Tasks by their nature only allow you to do what they have been coded to do and you can sometimes find yourself in a blind alley. For example, at the time of writing the only way I know of updating an existing Kubernetes ConfigMap without deleting it first and re-creating it is with a piped command, for example:
kubectl create configmap message.queue --from-literal=URL=nats://mq-service:4222 --dry-run -o yaml | kubectl apply -f -
Running a command such as this isn't possible with the current Deploy to Kubernetes Azure DevOps task, which is very limiting.
- Speaking of command lines, my next issue is that tasks abstract you from what is actually going on behind the scenes. For simple tasks such as copying files this might be fine, however I've become frustrated at the way tasks such as Docker or Deploy to Kubernetes 'hide' what they are doing, and the way that makes fine-tuning that little bit harder. Additionally, for me it's also a lost learning opportunity: a missed chance to learn the full syntax of a command because the task is constructing it on your behalf.
- Another big issue is that tasks such as Docker or Deploy to Kubernetes offer nothing in the way of code reusability, and break the DRY principle in multiple dimensions (i.e. there is scope for repetition within an environment and also across environments). To illustrate, the release pipeline in my 2018 mini blog series consisted of no fewer than 30 Deploy to Kubernetes tasks across two environments, resulting in a great deal of repetition.
- Finally, the use of tasks in the current version of Azure Pipelines releases means that you don't have your 'code' under proper version control. I know there are changes coming that will help to address this, and whilst they will be welcome I think there is an opportunity to do better.
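The piped ConfigMap update mentioned in the list above lends itself to being wrapped in a small function. What follows is a minimal sketch rather than code from the MegaStore repo: the function name apply_configmap and its argument order are hypothetical, and it assumes a kubectl recent enough to accept --dry-run=client (older clients use the bare --dry-run flag shown earlier).

```shell
#!/usr/bin/env bash

# Create-or-update a ConfigMap without deleting it first.
# Hypothetical helper, not part of the MegaStore scripts.
# Usage: apply_configmap <namespace> <name> <key=value> [<key=value> ...]
apply_configmap() {
  local namespace="$1" name="$2"
  shift 2

  # Turn each key=value pair into a --from-literal option.
  local literals=()
  local pair
  for pair in "$@"; do
    literals+=("--from-literal=${pair}")
  done

  # Render the manifest locally, then let 'apply' create or patch the object.
  kubectl create configmap "$name" \
    --namespace "$namespace" \
    "${literals[@]}" \
    --dry-run=client -o yaml |
    kubectl apply --namespace "$namespace" -f -
}
```

A call such as apply_configmap dat message.queue URL=nats://mq-service:4222 can then be repeated safely, because the second kubectl either creates the ConfigMap or updates the existing one.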
So what's my solution to all this? Very simply, get rid of multiple Swiss Army Knife tasks and implement Bash scripts running from a single Bash task. I started off by using the Inline script feature of Bash tasks but this didn't help with getting code in to version control and I also quickly realised that there were big code reusability opportunities to be had across environments by using File Path scripts. By using Bash scripts stored in the repo I solved all the issues mentioned above and in the case of the release portion of the pipeline I reduced the number of tasks from 15 in each environment to two! What follows are the techniques I used to achieve this for the Docker builds and Kubernetes deployments.
Converting Docker builds to use a Bash script was reasonably straightforward so I'll start by discussing the first problem I encountered when converting Deploy to Kubernetes tasks to Bash scripts, which was how to authenticate to Kubernetes. Tasks rely on the creation of a Kubernetes service connection (Project Settings > Service connections) and I'd been using the Kubeconfig version, which involves pasting in the contents of the Kubeconfig file that gets created (if you run the appropriate command) when you set up an AKS cluster.
By tracing the logging output of the Deploy to Kubernetes tasks I could see what was happening: a Kubeconfig file was being saved to disk and referenced in a kubectl command using the --kubeconfig parameter that points to the file on disk. I could successfully pass the file in from an Artifact as a proof of concept, but how could I store the Kubeconfig contents securely and create the file dynamically? The obvious choice was a secret variable, but that didn't work because it destroyed the Kubeconfig formatting, which is important in the re-hydrated file on disk. After a lot of fiddling I finally turned to LoECDA, who are super-responsive via Twitter, and very quickly the suggestion came back to try using Secure files (Pipelines > Library > Secure files). This worked perfectly: a file is first uploaded to the Secure files area and is then made available to the pipeline by the Download Secure File task. The file is downloaded in to a temporary folder which can be referenced as the $AGENT_TEMPDIRECTORY variable in a Bash script. Great!
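To make the Secure files approach concrete, here is a minimal sketch of how a script might locate the downloaded Kubeconfig and pass it to kubectl. The helper names kubeconfig_path and kube are hypothetical, as is the secure file name used in the usage note; only $AGENT_TEMPDIRECTORY and the --kubeconfig parameter come from the approach described above.

```shell
#!/usr/bin/env bash

# Resolve the path of a Kubeconfig downloaded by the Download Secure File task.
# Hypothetical helper. Usage: kubeconfig_path <secure-file-name>
kubeconfig_path() {
  local file_name="$1"

  if [ -z "${AGENT_TEMPDIRECTORY:-}" ]; then
    echo "AGENT_TEMPDIRECTORY is not set" >&2
    return 1
  fi

  local config_file="${AGENT_TEMPDIRECTORY}/${file_name}"
  if [ ! -f "$config_file" ]; then
    echo "Kubeconfig not found at ${config_file}" >&2
    return 1
  fi

  printf '%s\n' "$config_file"
}

# Run kubectl against the cluster described by the downloaded file.
# Hypothetical helper. Usage: kube <secure-file-name> <kubectl arguments...>
kube() {
  local config_file
  config_file="$(kubeconfig_path "$1")" || return 1
  shift
  kubectl --kubeconfig "$config_file" "$@"
}
```

With a secure file uploaded as, say, aks-kubeconfig, a script could then run kube aks-kubeconfig get pods and fail early with a readable message if the download step was missed.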
Next up was sorting out the practicalities of using Bash scripts in Bash tasks. I created a deployment (dep) folder in the repo to hold the scripts and then arranged for this folder to be available as an Artifact created directly from the GitHub repo.
I used VS Code to create the Bash files, however in order for a file to be executed as a Bash script it needs its permissions setting to make it executable (chmod +x). This needs to be done from a Linux environment and there are several possibilities for achieving this including Windows Subsystem for Linux if you are on Windows 10. I chose to go with Azure Cloud Shell, which can be configured to run either a Bash or a PowerShell command line in the cloud! Once that was configured it was a case of cloning my repo, navigating to the dep folder and running chmod +x some-filename.sh. There's no GUI in Azure Cloud Shell so it does involve using git commands to push the changes back to GitHub. If this is new to you then git add *, git commit -m "Commit message" and git push origin master are what you need. To authenticate you'll likely need to use a personal access token unless you go to the bother of setting up SSH. It gets to be a bit of a pain having to enter credentials every time you want to push to GitHub, however the git config credential.helper store command will save credentials across Azure Cloud Shell sessions to make life easier.
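The permission step can itself be scripted. The sketch below is an assumption-laden convenience rather than what I actually ran: mark_executable is a hypothetical helper that applies chmod +x and, where the file is already tracked by git, also records the executable bit in the index with git update-index --chmod=+x (a standard git command that stores the mode in the repo itself).

```shell
#!/usr/bin/env bash

# Make a script executable on disk and, if git tracks it, in the index too.
# Hypothetical helper. Usage: mark_executable <path-to-script>
mark_executable() {
  local file="$1"
  local dir base
  dir="$(dirname "$file")"
  base="$(basename "$file")"

  chmod +x "$file" || return 1

  # Only touch the index when the file is tracked; 'update-index --chmod=+x'
  # records mode 100755 so the bit survives a fresh clone.
  if git -C "$dir" ls-files --error-unmatch "$base" >/dev/null 2>&1; then
    git -C "$dir" update-index --chmod=+x "$base"
  fi
}
```

After running mark_executable dep/some-filename.sh the usual git commit and git push carry the permission change back to GitHub.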
Finding out what commands needed to be executed in the Bash scripts required a bit of detective work, and involved a combination of understanding what the task was attempting to accomplish and then looking at the build or release logs to see the actual output. With the basic command figured out this exercise offered the opportunity to do a bit of fine tuning. For example, I'd been tagging my docker images with the latest tag but it turns out that this isn't a great idea for release pipelines. By writing the actual command myself I was able to get exactly what I wanted.
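As an illustration of tagging with a unique build number instead of latest, here is a minimal sketch of a build-and-push helper. It is not the generic script from the repo: the function names, the argument order and the registry name in the usage note are hypothetical, and it assumes docker is already logged in to the registry.

```shell
#!/usr/bin/env bash

# Compose an image reference tagged with the pipeline build number.
# Hypothetical helper. Usage: image_ref <registry> <image-name>
image_ref() {
  local registry="$1" image="$2"

  if [ -z "${BUILD_BUILDNUMBER:-}" ]; then
    echo "BUILD_BUILDNUMBER is not set" >&2
    return 1
  fi

  printf '%s/%s:%s\n' "$registry" "$image" "$BUILD_BUILDNUMBER"
}

# Build an image and push it under its unique tag.
# Hypothetical helper.
# Usage: build_and_push <registry> <image-name> <dockerfile> <build-context>
build_and_push() {
  local registry="$1" image="$2" dockerfile="$3" context="$4"
  local ref
  ref="$(image_ref "$registry" "$image")" || return 1

  # Push only if the build succeeded.
  docker build --file "$dockerfile" --tag "$ref" "$context" &&
    docker push "$ref"
}
```

Calling build_and_push example.azurecr.io megastore.web path/to/Dockerfile . would produce an image whose tag identifies exactly one build, which is what lets the release refer to it unambiguously later.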
I describe how I organised the Bash scripts to move away from a monolithic pipeline below. In this section I want to describe the tips and tricks I used to actually write the Bash scripts. Generally, the scripts make heavy use of variables to make them applicable to all release environments, however there are some essential things to know:
- Variables created as part of Azure DevOps pipelines can be passed in to a script as arguments, however with the exception of secrets they are also created as environment variables which are available directly in scripts. This means that a variable created as MyVariable is available as $MYVARIABLE directly in a Bash script (the name is converted to upper case, which is the convention for environment variables in Bash, and any periods are converted to underscores to ensure valid syntax).
- Variables created as part of Azure DevOps pipelines can have the same name as long as they are scoped to a different environment. So you can have two variables called MyVariable with different values for each environment and simply refer to $MYVARIABLE in the Bash script, ie no need to pass $MYVARIABLE in as a parameter to the script for different environments.
- As mentioned above, secrets are not created as environment variables and must be passed in to a script via the Arguments field, and in the script a variable is declared to accept the incoming parameter. Important: as of the time of writing a secret needs to be passed in to the Arguments field as $(MYSECRET), i.e. with parentheses around the actual parameter name. If you omit the parentheses the secret is not passed in. A non-secret parameter doesn't require parentheses and I have queried whether this is a bug. (A likely explanation is that $(MYSECRET) is Azure Pipelines macro syntax, which is substituted before the script runs, whereas $MYSECRET is expanded by Bash from an environment variable, and no such environment variable exists for a secret.)
- Later in this post I explain how I break up a monolithic pipeline in to multiple pipelines, which results in the same variables being needed in different pipelines. By using Variable Groups I was able to avoid repeated variable declarations and manage many variables from just one location.
- In addition to variables that are created manually, built-in variables are also available as environment variables in the script. The ones I've used are $AGENT_TEMPDIRECTORY to define the download location of the Kubeconfig file from the Secure files area, $RELEASE_ENVIRONMENTNAME to refer to the environment (i.e. dat or prd) and also $BUILD_BUILDNUMBER used to tag docker images with a unique build number in the build process and then to refer to them by their unique name in the release. There are many more built-in variables available to use (see the Azure Pipelines documentation for details), but remember that for use in Bash scripts you should change the text to upper case and must replace periods with underscores.
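The naming rule in the list above (upper case, periods to underscores) is easy to capture in code. This is a minimal sketch with hypothetical helper names, to_env_name and pipeline_var, that shows the mapping and how a non-secret variable can be read from the environment. Secrets are deliberately not covered because, as noted, they have to arrive as script arguments (for example as $1).

```shell
#!/usr/bin/env bash

# Convert a pipeline variable name to the environment variable name that
# scripts see: upper case, with periods replaced by underscores.
# Hypothetical helper. Usage: to_env_name <pipeline-variable-name>
to_env_name() {
  printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.' '_'
  printf '\n'
}

# Read a non-secret pipeline variable by its pipeline name.
# Prints an empty line if the variable is not defined.
# Hypothetical helper. Usage: pipeline_var <pipeline-variable-name>
pipeline_var() {
  local env_name
  env_name="$(to_env_name "$1")"
  # ${!env_name} is Bash indirect expansion: the value of the variable
  # whose name is stored in env_name.
  printf '%s\n' "${!env_name:-}"
}
```

For example, to_env_name Build.BuildNumber yields BUILD_BUILDNUMBER, and pipeline_var Release.EnvironmentName returns dat or prd depending on the environment the release is running in.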
I'm not a Bash scripting expert and I'm sure my scripts would be considered very rudimentary. The great thing though is that you can do whatever you like now the code is a script. Possibilities might include adding error handling or refactoring further using functions. There's potential to really go to town here.
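As a hint of what that error handling might look like, here is a minimal sketch of a guard that checks the variables a script depends on before any work is done. The function name require_vars is hypothetical and nothing like it exists in my scripts yet.

```shell
#!/usr/bin/env bash

# In a real script you would probably also start with: set -euo pipefail

# Check that every named environment variable is set and non-empty.
# Reports all missing names rather than stopping at the first one.
# Hypothetical helper. Usage: require_vars NAME [NAME ...]
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} is Bash indirect expansion: the value of the variable
    # whose name is stored in 'name'.
    if [ -z "${!name:-}" ]; then
      echo "ERROR: required variable ${name} is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A release script could begin with require_vars AGENT_TEMPDIRECTORY RELEASE_ENVIRONMENTNAME BUILD_BUILDNUMBER || exit 1 so that a misconfigured pipeline fails with a clear message instead of part way through a deployment.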
At the time of writing this article in early 2019 there aren't that many blog post examples of implementing a CI/CD pipeline to deploy an application to Kubernetes. Furthermore, the posts that do exist tend, not unreasonably, to use a simplistic application scenario to illustrate the concepts. Typically, this involves deploying the whole application as part of a single pipeline, and indeed this is the route I took with my 2018 blog post mini series. However, it became quickly apparent to me that this is an unsatisfactory arrangement for two main reasons:
- Just one change to one of the application components would cause all the components of the application to be redeployed (or more correctly the parts of the application that have their docker images built by the pipeline).
- A change to the Kubernetes configuration would also trigger a redeployment of all of the application components. Sometimes this is necessary but often it's not.
These issues arise because the trigger for the build component of the pipeline is set as the root of the GitHub repo, so if anything changes in the repo a build is triggered. Clearly not an optimal situation.
My solution to this problem is to divide the monolithic pipeline in to multiple pipelines that correspond to the individual components of the overall application. Then with a bit of refactoring of the codebase it's possible to use a very nifty feature of Azure Pipelines that allows a build to be triggered from one or more specific folders (or files for that matter) in the repo, ie a much more granular solution.
One complication that I had to cater for is that the pipeline isn't just building docker images and marshalling them in to the Kubernetes cluster: additionally, the pipeline is configuring Kubernetes elements such as Namespaces, Secrets and ConfigMaps.
Through the use of Bash scripts as described above the number of tasks needed is drastically reduced: just one Bash task for the builds and two tasks for releases (a Download Secure File task to copy the kubeconfig file to disk and a Bash task to host the bash script). All scripts are Namespace/environment aware.
In terms of Azure Pipelines build and release pipelines my current CI/CD solution is as follows:
megastore.init.release
This is a release that is not associated with a build and its sole purpose is to configure a Kubernetes Namespace in preparation for the deployment of the application. As such, this component is only intended to be run to either initialise a new Kubernetes cluster or (rarely) if one of the configuration items needs to change (in which case elements of the application will likely have to be redeployed for the configuration to be built in to the appropriate pods).
The configuration handled by megastore.init.release is as follows:
- Creation of a Namespace for a corresponding Azure Pipelines environment.
- Creation (or update) of ACR credentials (as a specialised Secret) that allow Deployments to pull docker images from ACR.
- Creation (or update) of the message queue URL as a ConfigMap.
- Creation (or update) of the Application Insights instrumentation key as a ConfigMap.
This configuration is handled by init.sh.
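To give a flavour of the kind of commands involved, here is a minimal sketch of an initialisation function covering the Namespace, the ACR credentials and the message queue ConfigMap. It is not the contents of init.sh: the function names, the argument order and the object name acr-credentials are hypothetical, and it assumes a kubectl that accepts --dry-run=client. The Application Insights ConfigMap would follow the same pattern as the message queue one.

```shell
#!/usr/bin/env bash

# Create or update a namespaced object by rendering it locally and applying it,
# so that re-running the script updates existing objects instead of failing.
# Hypothetical helper. Usage: create_or_update <namespace> <kubectl create arguments...>
create_or_update() {
  local ns="$1"
  shift
  kubectl create "$@" --namespace "$ns" --dry-run=client -o yaml |
    kubectl apply --namespace "$ns" -f -
}

# Hypothetical helper.
# Usage: init_namespace <namespace> <acr-server> <acr-user> <acr-password> <mq-url>
init_namespace() {
  local ns="$1" acr_server="$2" acr_user="$3" acr_password="$4" mq_url="$5"

  # The Namespace itself is cluster-scoped, so no --namespace option here.
  kubectl create namespace "$ns" --dry-run=client -o yaml | kubectl apply -f -

  # Image pull credentials for ACR, stored as a docker-registry Secret.
  create_or_update "$ns" secret docker-registry acr-credentials \
    --docker-server="$acr_server" \
    --docker-username="$acr_user" \
    --docker-password="$acr_password"

  # Message queue URL as a ConfigMap.
  create_or_update "$ns" configmap message.queue \
    --from-literal=URL="$mq_url"
}
```

Because every step goes through apply, the same function serves both the first run against a new cluster and a later run that changes one of the values.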
megastore.message-queue.release
This is another release that is not associated with a build, and in this case the requirement is to deploy the NATS message queue service. The absence of a build is due to the docker image being pulled from Docker Hub. The downside of not having a build associated with the release is that if any of the NATS configuration changes the release needs to be triggered manually. I see this as an infrequent requirement though. The message queue service doesn't have any dependencies on any other part of the application and so is the first component to be deployed following the initial Kubernetes configuration.
The configuration handled by megastore.message-queue.release is as follows:
- Deployment of the Kubernetes Service for the message queue.
- Deployment of the Kubernetes Deployment for the message queue.
This configuration is handled by message-queue.sh.
megastore.savesalehandler.build and megastore.savesalehandler.release
This build and linked release are responsible for deploying a new version of the .NET Core message queue handler application which receives messages from the message queue and saves them to an Azure SQL database. The docker image is built and uploaded to ACR using a generic Bash script. This in turn triggers the megastore.savesalehandler.release which deals with the following configuration:
- Creation (or update) of the database connection string as a Secret.
- Deployment of the Kubernetes Deployment for the message queue handler component.
- Update the image for the Deployment to the latest version using the unique tag for the build that triggered the release.
This configuration is handled by megastore-savesalehandler.sh. The build is triggered through the Azure Pipelines Path filters feature.
Using the Path filters feature ensures that the build will only be triggered for continuous integration if a file in the specified folder is changed.
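The final release step listed above, updating the Deployment to the image from the triggering build, might look something like the following. This is a minimal sketch and not an extract from megastore-savesalehandler.sh: the function name update_image and all the argument values in the usage note are hypothetical, although kubectl set image and kubectl rollout status are standard kubectl commands.

```shell
#!/usr/bin/env bash

# Point a Deployment's container at the image built by the triggering build
# and wait for the rollout to finish.
# Hypothetical helper.
# Usage: update_image <namespace> <deployment> <container> <image-repository>
update_image() {
  local ns="$1" deployment="$2" container="$3" repository="$4"

  if [ -z "${BUILD_BUILDNUMBER:-}" ]; then
    echo "BUILD_BUILDNUMBER is not set" >&2
    return 1
  fi

  # Only wait for the rollout if the image update was accepted.
  kubectl set image "deployment/${deployment}" \
    "${container}=${repository}:${BUILD_BUILDNUMBER}" \
    --namespace "$ns" &&
    kubectl rollout status "deployment/${deployment}" --namespace "$ns"
}
```

For instance, update_image "$RELEASE_ENVIRONMENTNAME" my-deployment my-container example.azurecr.io/my-image would move the named container on to the tag produced by the current build, on the assumption that the Namespace is named after the environment.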
megastore.web.build and megastore.web.release
This build and linked release are responsible for deploying a new version of the ASP.NET Core web application which sends messages to the message queue service. As with the message queue handler, the docker image is built and uploaded to ACR using the same generic Bash script. The build triggers the megastore.web.release which deals with the following configuration:
- Creation (or update) of the ASPNETCORE_ENVIRONMENT environment variable as a ConfigMap.
- Deployment of the Kubernetes Deployment for the web component.
- Deployment of the Kubernetes Service for the web component.
- Update the image for the Deployment to the latest version using the unique tag for the build that triggered the release.
This configuration is handled by megastore-web.sh and once again the build is triggered through the Azure Pipelines Path filters feature.
As before, using the Path filters feature ensures that the build will only be triggered for continuous integration if a file in the specified folder is changed.
In breaking down a monolithic pipeline in to multiple pipelines I exposed the problem of what to do with the shared helper library of functions that is used by both the megastore.web and megastore.savesalehandler components, because if this code changes one or sometimes both components will need redeploying. I think the answer is that helper libraries like these do not belong in the Visual Studio solution and instead should be developed separately and distributed and referenced as NuGet packages.
One of my aspirations is to get as much pipeline configuration in the GitHub repo as possible and you might well ask why I'm not using YAML files. Apart from the fact that I just haven't had time to look at this in detail yet, at the time of writing it's only a partial solution as it's only available for the build portion of the pipeline. This will change hopefully later this year when the release portion of the pipeline is supported, and at that point I'll make the switch.
That's it for now! Whether you are deploying to AKS or somewhere else I hope this post has provided you with ideas to supercharge your Azure DevOps pipelines.
Cheers -- Graham