Deploy a Dockerized Application to Azure Kubernetes Service using Azure YAML Pipelines 5 – Application Deployment Pipelines
This is the fifth post in a series where I'm taking a fresh look at how to deploy a dockerized application to Azure Kubernetes Service (AKS) using Azure Pipelines after having previously blogged about this in 2018. The list of posts in this series is as follows:
- Getting Started
- Terraform Development Experience
- Terraform Deployment Pipeline
- Running a Dockerized Application Locally
- Application Deployment Pipelines (this post)
- Telemetry and Diagnostics
In this post I deploy the MegaStore sample application that was introduced in the previous post to AKS using YAML Azure Pipelines. If you want to follow along you can clone / fork my repo here, and if you haven't already done so please take a look at the first post to understand the background, what this series hopes to cover and the tools mentioned in this post. I'm not covering Azure Pipelines basics here and if this is of interest take a look at this video and or this series of videos. I'm also assuming general familiarity with Azure DevOps and the Azure Portal.
For me this is probably the most exciting post in the series. I've been developing Azure Pipelines using YAML for a little while now and I love working in this way and wouldn't want to go back to classic pipelines ie GUI tasks.
Even though we're dealing with pipelines as code there's still a lot to configure, so let's get started!
Azure SQL qa and prd Databases
First configure the Azure SQL qa and prd databases created in a previous post. Using SQL Server Management Studio (SSMS) login to Azure SQL where Server name will be something like yourservername-asql.database.windows.net and Login and Password are the values supplied to the asql_administrator_login_name and asql_administrator_login_password Terraform variables. Once logged in create the following objects using the files in the repo's sql folder (use Ctrl+Shift+M in SSMS to show the Template Parameters dialog to add the qa and prd suffixes):
- SQL logins called sales_user_qa and sales_user_prd based on create-login-template.sql. Make a note of the passwords.
- In both the qa and prd databases users called sales_user and a table called Sale based on configure-database-template.sql.
Note: if you are having problems logging in to Azure SQL from SSMS make sure you have correctly set a firewall rule to allow your local workstation to connect.
Self-hosted Linux Agent
The MegaStore sample application uses Linux containers so we need a Linux agent running Docker to build them. The Microsoft ubuntu-latest agent will work but as noted in a previous post the Microsoft agents can be slow and you can't directly see what they are doing at the file system level. However, due to the magic of the newer versions of Docker Desktop and WSL 2 we can easily run a self-hosted Linux agent on a Windows 10 machine. The instructions for configuring a self-host agent can be found here and I assume that you have the prerequisites installed and configured as per the first post in this series. The high-level procedure is as follows:
- If you didn't create a new Agent Pool in Azure DevOps as part of a previous post, you'll need to create anew pool called Local at Organization Settings > Pipelines > Agent Pools > Add pool.
- On your Windows machine create a folder such as C:\agents\linux.
- Download the agent which will have a filename like vsts-agent-linux-x64-2.165.2.tar.gz. Move this file to C:\agents\linux (it's okay to do this in Windows Explorer).
- The tar file needs to be unzipped from an Ubuntu Bash prompt (ie Ubuntu running under WSL 2). Make sure you are at /mntc/agents/linux and then run tar zxvf vsts-agent-linux-x64-2.165.2.tar.gz (obviously substitute the correct filename as the version may have moved on by the time you read this). It took a couple of minutes on my machine.
- Now run ./config.sh to start the configuration process.
- You will need to supply your Azure DevOps server URL and previously created PAT.
- Use ubuntu-18.04 as the agent name and for this local instance I recommend not running as a service or at startup.
- The agent can be started by running ./run.sh at an Ubuntu Bash prompt after which you should see something this:
- After the agent has finished running a pipeline job you can examine the files in C:\agents\linux\_work (Windows Explorer works fine) to understand what happened and assist with troubleshooting any issues.
- The ubuntu-18.04 agent name will be used in a few pipelines so it's a good candidate for adding to the megastore variable group as local_linux_agent_name.
- Don't forget that you'll need Docker Desktop running to run any pipeline jobs that use Docker.
Create a Secure File to Authenticate to AKS
One of the techniques I'm demonstrating in this blog series and in this post in particular is how to take full control of the pipeline by working with command line tools rather than Azure Pipeline tasks. Whilst tasks undoubtedly have their place, for some command line tools I don't like the way that tasks abstract away what is going on and, because of the Swiss Army knife nature of some tasks, the way they sometimes force you to supply information that may not actually be used for a task sub-command.
The command line tool predominantly in use in this post is kubectl—used to issue commands to a Kubernetes cluster. When used locally kubectl works in conjunction with a kubeconfig file that specifies connection details to a cluster. On a Windows machine, by default kubectl is going to look in C:\Users\%USERNAME%\.kube for a kubeconfig file called config. That's not going to work in an Azure Pipeline (or any pipeline) so we need a different approach. It turns out that kubectl has a --kubeconfig parameter for specifying the path to a kubeconfig file. We can make use of this in Azure Pipelines by uploading the C:\Users\%USERNAME%\.kube\config file as a Secure files item. In the pipeline we can then call a task to download the file, which by default will be to $(Agent.TempDirectory). The procedure for configuring all this is as follows:
- Whilst logged in to the Azure CLI and with the correct Azure subscription set, run az aks get-credentials --resource-group yourResourceGroup --name yourAksCluster. This will create the config file at C:\Users\%USERNAME%\.kube.
- In Azure DevOps navigate to Pipelines > Library and click + Secure file.
- Use the Upload file dialog to Browse to and upload the config file. The new secure file item is named the same as the file.
- Use the ellipsis to the right of the new secure file item to edit it:
- Edit the secure file item so that Pipeline permissions is set to Authorize for use in all pipelines:
- Note that (at least at the time of writing) for some reason this change doesn't cause the Save link to light up but you can navigate away from the editor without losing changes.
Once you have the kubeconfig file installed on your local machine you can access the cluster's Dashboard by running az aks browse --resource-group yourResourceGroup --name yourAksCluster.
Create Kubernetes Namespaces
Two Kubernetes namespaces are needed that will be the deployment environments. The great thing about using namespaces is that exactly the same configuration can be applied to each namespace without any naming collisions. For example, the message queue URL is nats://message-queue-service:4222 and this same URL works in all environments without any clashes.
With the kubeconfig file installed as above namespaces can be created from the command line using kubectl create namespace qa and kubectl create namespace prd.
Configure a Pipeline Environment
From the docs: An environment is a collection of resources that can be targeted by deployments from a pipeline. At the time of writing only a couple of resource types are supported, one of them being Kubernetes. It's actually a very handy way of being able to see what's going on in the cluster, including the health of pods and being able to look at the logs for each pod. There's also some nice traceability. Configuration is mostly straightforward:
- In Azure DevOps navigate to Pipelines > Environments and click New Environment.
- In the dialog that appears set the Name to megastore, select Kubernetes then Next.
- In the next step select Azure Kubernetes Service as the Provider and follow through with the authentication procedure.
- For Namespace select Existing and select qa in the dropdown:
- Click Validate and create to complete the first part of the process.
- In the next screen that appears click Add resource and repeat the above process but this time for the prd namespace. The final result should be something like this:
- Create a variable called environment_name for the name of the environment in the megastore variable group.
- Note that I've never seen the Latest job column change from Never deployed despite doing many deployments. Something to investigate...
Generic Procedure for Creating a Pipeline from an Existing YAML File
Thee are four separate pipelines that need creating to deploy MegaStore to AKS and this is the generic procedure for creating them from existing YAML files assuming you have cloned / forked the repo on GitHub:
- In Azure DevOps navigate to Pipelines > Pipelines and click New pipeline.
- In the Connect tab choose GitHub as the location for your code.
- In the Select tab choose the appropriate repository, possibly using the dropdown to show All repositories rather than My repositories.
- In the Configure tab choose Existing Azure Pipelines YAML file and then in the window that pops, for Path select the required YAML file and click Continue.
- In the Review tab click the dropdown next to Run and click Save.
- The next screen you are presented with invites you to run the pipeline but before doing that click the vertical ellipsis / slimline hamburger menu next to the rightmost Run pipeline and select Rename / move:
- Overwrite Name with the desired name and click Save.
- The final step is to define any variables that are not defined in the pipeline itself. There are two options here: in the UI of the pipeline and in a variable group. More on this below.
Working With YAML Pipelines
Whilst it's possible to edit pipelines in Azure DevOps I've never bothered, and instead I prefer to use VS Code with the Azure Pipelines extension. By using a yml extension for pipeline file and a yaml extension for Kubernetes files it's possible to tell VS Code to associate just yml files with the pipelines extension using this in settings.json:
If that convention doesn't work for you an alternative could be to add a prefix to your pipelines and use that to identify them to the extension.
For various reasons I spent a very long time refactoring and fine-tuning the pipelines used in this blog series (okay, I went down several rabbit holes) and I've tried to capture what I learned below.
Choose stage names to promote code reusability
I know it's not always possible but if you can match the stage names in the release part of the pipeline to the names of your actual environments then you can make use of predefined variables such as $(System.StageName) to write templates (see below) that can be reused in different stages possibly without any extra work. (If your stage and environment names can't match for whatever reason you can still pass in the environment name as a parameter to a template but it's extra work.) For MegaStore deployment I have two AKS environments (qa and prd) and these match the qa and prd stages of the pipelines.
Talking of stages there is also a first stage to each pipeline I call init as I think this is a better name than build when nothing is actually being built, but that's just a personal preference.
Consider how many jobs a pipeline needs and the type of job
A job in Azure Pipelines is the top level container for the work that actually happens. Jobs do a lot of stuff to get ready for this work which is all potential overhead for a pipeline. As a rule of thumb you probably want to use as few jobs as you can get away with, which at a minimum is one job per stage.
You should also appreciate the difference between standard and deployment jobs. In addition to the differences described in the documentation I've noticed that a deployment job doesn't perform a git checkout unlike a standard job, so it looks like Microsoft have optimised the deployment job for deployment as well as giving it some extra functionality. In the MegaStore pipelines I've used a standard job for the init stage and deployment jobs for the qa and prd stages.
Where to declare variables
Variables in Azure Pipelines is a pretty large and complex topic but these resources go a long way to help understand how they work and the different options:
- Understanding Azure DevOps Variables [Complete Guide]
- Azure Pipeline Variables
- Define variables
In terms of where to declare variables, if they are just needed for that pipeline and are not secrets they should be declared in the pipeline itself. Variables that are needed across multiple pipelines should be declared in a variable group, which also allows for the management of variables that are secrets. The remaining scenario is where to store secrets that are only used in one pipeline. The official documentation advises using the pipeline settings UI, but I'm not certain if storing related variables and secrets in multiple locations might cause confusion and whether it's better to store related items together in a variable group. I will be using the pipeline settings UI in this post to illustrate the technique and will leave it to you to make your own mind up about whether it's a good idea to split related variables.
Giving a pipeline a custom run name
The name keyword at the beginning of each pipeline allows you to provide a custom name for each run of the pipeline. I've specified a Semantic versioning type name but there's lots of configurability.
How and when to clean the workspace
Whilst it may not always be appropriate, my general preference is to start each new run of a pipeline with a completely clean workspace so there is no chance of contamination from a previous run. Looking back in time it seems that in late 2019 the procedure for cleaning the workspace changed from cleaning at the pool level to the job level. Typically you only want to clean the workspace once per run and I've dealt with this by performing a clean in the init job of the init stage of each pipeline.
Versioning files used in the pipeline
The MegaStore pipelines call Kubernetes manifest files from the kubectl command line. (These are the YAML files in the k8s folder.) Since this folder exists on disk after the git checkout these files can be referenced directly from the command line. However, this is probably not a great idea because in theory it's possible to write a pipeline against a frequently changing repo that could end up using one version of a file in one stage of the pipeline and a different version in another.
A much better practice in my view is to package files in to an artifact and then make those packaged files available to the stages of the pipeline. An additional benefit of this approach is that the artifact is associated with the pipeline run and can be examined at a later date if you need to understand what was actually deployed. (Note that in the MegaStore pipelines I'm being a bit lazy in packaging the whole k8s folder but that isn't strictly necessary as not every file is used in each pipeline.)
By default a deployment job will try and download an artifact created in a previous part of the pipeline. In my pipelines I'm explicitly downloading the artifact in the init stage so I suppress this in the qa and prd stages using the download: none keyword.
Refactor the pipeline with templates
You can and should refactor your pipelines with templates. From the docs: Templates let you define reusable content, logic, and parameters. Templates function in two ways. You can insert reusable content with a template or you can use a template to control what is allowed in a pipeline. I'm using the first version here, ie to package reusable content.
Templates work at different levels, and can be used to reuse steps, jobs and stages. I started by creating job templates as it made the main pipeline much cleaner. However, I realised that the job templates in a stage were executing in any order, which definitely was not what I wanted. Other than possibly passing in a parameter to the template to control dependency I couldn't see an obvious way to set the execution order of jobs templates. This, in conjunction with my realising that there is some overhead to each job (see above) meant that I ditched job templates for step templates.
As an aside, one great thing I learned whilst using (the now abandoned) job templates was how to dynamically set the job name, as I wanted the job name to include the stage name. You can't simply append $(System.StageName) to the job name in a template because the job name needs to be evaluated before the pipeline executes. However, you can pass a parameter in to the template that uses the template expression syntax in the template which gets resolved during pipeline initialization. I couldn't stop smiling when I came across this feature.
A final thought about templates is that it's probably a good idea to make sure you don't take refactoring too far, as to me it feels like the single-responsibility principle ought to apply to templates. I fell foul of this by nesting a template in a template. There are valid reasons to do this but in my case the nested template had nothing to do with the parent template and I decided it was probably a bad idea.
Configuring and Running the MegaStore Pipelines
At long last we get to actually create the pipelines. You should follow the generic procedure above to create the following:
- megastore-config, with the following variables
- acr_authentication_secret_name = acrauth: in pipeline settings UI as plain text
- acr_name = ACR name from Azure Portal: in megastore variable group as plain text
- acr_password = ACR password from Azure Portal: in megastore variable group as secret
- appinsights_instrumentationkey_qa = App Insights qa key from Azure Portal: in pipeline settings UI as plain text
- appinsights_instrumentationkey_prd = App Insights prd key from Azure Portal: in pipeline settings UI as plain text
- db_password_qa = password generate above for sales_user_qa login
- db_password_prd = password generate above for sales_user_prd login
- db_server_name = Azure SQL server name without the megastoreprm-asql.database.windows.net element
The first pipeline to run should be megastore-config as this sets up environment variables used by other pipelines. In a stable system (ie not in active development / test cycle) this pipeline wouldn't be needed again unless any of the environment variables change.
The next pipeline to run is megastore-message-queue as it doesn't have dependencies. The pipeline creates a Kubernetes Service to expose pod(s) running the NATS message queue which are deployed using a Kubernetes Deployment. For this demo setup the NATS Docker image is pulled directly from Docker Hub so there is no interaction with Azure Container Registry. Again, once deployed this pipeline would only needed to be deployed infrequently.
The final pipelines can be run in any order. The megastore-savesalehandler pipeline only consists of a deployment because nothing needs to connect to it all it does is monitor the message queue. The megastore-web pipeline requires both a service and a deployment because we want to talk to the pod(s) from the outside world. In both cases the init stage of the pipeline runs a series of commands to build a new image and upload it to Azure Container Registry tagged with the build number. The kubectl set image command ensures that the image with the correct build number is deployed. With a changing application these pipelines would be deployed as required to release new features. These application components can be developed and deployed independently of each other but will reply on testing in Visual Studio to make sure nothing is broken.
That's it Folks!
I'm aware that there is a lot of small moving parts here and lots of scope for things to be missed. If you are following along and getting errors please leave a comment and I'll try to help. Missing or misspelt variables are a common thing that trip me up.
For me, the big takeaway from this post is that I've found writing YAML Azure Pipelines to be a very enjoyable and extremely productive way to develop deployment pipelines. If you haven't tried them I urge you to give it a go. You might be pleasantly surprised.
Next time we change gears completely and look at how Application Insights fits in to all of this.
Cheers -- Graham