Terraform + VSTS + Release Management
What a wordy bloody title; unfortunately dealing with multiple environments can be wordy and tooly (I made that up, but you know what I mean, bit of a workshop job). Coming from the fabled world of a single environment proudly displaying the hashtag #TestInProduction, imagine for a second me rocking up at the new gig only to find no fewer than 8 environments with 2 more in gestation 😱
Remember that scene in Aliens with the little birthing pods with the face huggers ready to jump out and kill you off? That's what the sight of 8 environments looks like to me. There was probably a Prometheus joke in there somewhere, but let's not talk about the observability situation.
Anyway, I've digressed and we've only just hit the 3rd paragraph, so back on track: I know Terraform very well for managing 1 environment, and I know Terraform workspaces well enough; what I didn't know was how to roll changes elegantly through a few environments. Actually, for me, it's a semantic pain point every time I have to do this exercise, let me explain a little more.
Here is Aubrey's worldview, don't worry if you disagree; tech is vast and you can find places and times where you can come heckle me from the stalls. Taking trunk based development as a base, we feature branch, write some cool stuff, push said feature branch, raise PR, merge with master.
Branch + Push
CI will run against our branch and do nothing with the result.
Raise PR
I recommend you put a policy on your master branch to ensure you can't merge unless your PR is rebased. This is because CI will run against the head of your PR, but if master moves on, then there is still a chance something could break. There are a couple of pipelines out there that will rebase and run the CI for you, but as with anything you're always fighting against a point in time, and the next guy or girl is always behind you with their merge.
Merge
Once our PR is merged, we will likely have a conditional line in our CI pipeline to do something with the produced artefact. There is little point pushing a Docker image to the remote registry on any build other than a master branch build, because we're running on the golden principle that master should always be deployable.
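That conditional line might look something like this in a VSTS YAML build (a minimal sketch; the registry name and tag scheme are illustrative assumptions, not my actual setup):

```yaml
# Push the image only when the build is from master and everything
# before it succeeded; on branch builds this step is skipped.
- script: docker push myregistry.azurecr.io/myapp:$(Build.BuildId)
  displayName: Push image (master only)
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/master'))
```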
Back to my point about the semantic pain in the ass: we can release this artefact into an environment, and I could forgive having that step in the Pipeline definition if you only had one environment. But what if you don't? What if you are staring into the precipice and all you see are face-hugger eggs?
In this case, the Pipeline is a dumb place to handle releasing because it's a massive conflation of objectives: building an artefact vs releasing an artefact. There is no sexy way to handle this, and that's a hard fail for me.
Let's talk about the Terraform way of handling multiple environments and build on that; I should make the distinction between Terraform and Terraform Enterprise. In both Enterprise and regular joe Terraform there is a concept of workspaces; they are, however, totally different things.
In Terraform, you start off in the default workspace. New workspaces can be created with the terraform workspace new [workspace-name] command and selected for use with terraform workspace select [workspace-name].
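For a face-hugger-grade estate, that means bootstrapping one workspace per environment, something like this (the environment names are illustrative):

```shell
# Create one workspace per environment; terraform switches into each
# new workspace as it creates it.
for env in dev test staging prod; do
  terraform workspace new "$env"
done

# List them all; the currently selected one is starred.
terraform workspace list
```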
Let me answer a couple of those questions you've already got: workspaces are stored in your backend, and the same applies if you have a remote backend on AWS or Azure or wherever it might live. You can leverage the workspace name by using ${terraform.workspace},
here is a pretty example:
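Something along these lines (a minimal sketch; the resource, naming convention and sizes are illustrative assumptions):

```hcl
# Name resources after the environment by interpolating the workspace name,
# so the same definition works in every environment.
resource "azurerm_resource_group" "main" {
  name     = "rg-myapp-${terraform.workspace}"
  location = "westeurope"
}

# Pivot environment-specific settings off the workspace name too.
variable "vm_size" {
  type = "map"
  default = {
    dev  = "Standard_B1s"
    prod = "Standard_D2s_v3"
  }
}
```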
You can sort of see how this comes together now right, there are some other helpful things like environmental specific .tfvar files that exist when using terraform workspaces too. We write our infrastructure definition so that we're environment agnostic, where we need to identify the environment, we interpolate the terraform.workspace value.
Finally, it's just the regular plan and apply commands, but before you run them you must remember to switch to the correct workspace first using the terraform workspace select [workspace-name] command.
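Rolling a change through a single environment then ends up looking something like this (workspace and plan file names are illustrative):

```shell
terraform workspace select staging   # point at the right environment's state
terraform plan -out=staging.tfplan   # plan against that workspace
terraform apply staging.tfplan       # apply exactly what was planned
```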
Now we have covered regular Terraform, let's look at Terraform Enterprise. When you create a workspace in Terraform Enterprise it's done via the web interface: we choose a source repository and a name for the workspace. Thus if we have, for example, 4 environments, we would need to set up 4 workspaces pointing to the same repo. When the push hook fires, each one of those workspaces in Enterprise will start running a plan and then wait for you to apply.
My personal view is that can be a bit messy. I also don't think it's a great way of representing environments; even if they are technically flat, there is usually an order in which changes are promoted, and I'd prefer that to be constrained.
So what's the solution? Well, VSTS does offer Release Management, which solves our problem perfectly with just a few minor speed bumps. I know I'm a CircleCI kind of girl and I know a few people will consider me a traitor, but hey, I'm working with Azure, so VSTS makes much more sense compared to Circle; we have to take the ecosystem into account.
It comprises a couple of components. First of all, we need to define our artefacts so we can target a build on VSTS; this means modifying our build to push artefacts at the end, and our release will then automagically download those artefacts when it runs on an agent.
You'd be forgiven for thinking the plan would be the artefact, but remember that a plan is generated off the back of two things: the workspace and the definition itself (all the terraform files). So using the same plan at each stage of the release makes little sense, because we need to generate that bad boy for each environment.
So in this situation, our backend will be remote, and we will bundle up our terraform definitions and push those as the artefacts when we build from our master branch.
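In a YAML build that can be as simple as publishing the .tf files as a build artefact (a sketch; the task versions and artefact name are assumptions):

```yaml
# Bundle the terraform definitions and publish them as a build artefact
# so Release Management can download them on the release agent.
- task: CopyFiles@2
  inputs:
    contents: '**/*.tf*'
    targetFolder: '$(Build.ArtifactStagingDirectory)'
- task: PublishBuildArtifacts@1
  inputs:
    pathToPublish: '$(Build.ArtifactStagingDirectory)'
    artifactName: 'terraform'
```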
Essentially we're using release management to provide the required context needed to run a terraform plan. In order to do a plan, manual intervention and then an apply in the release definition, we're going to have to do something a little bit clever, due to the limitation that we can't pass files between agents in any way; we also can't do an artefact push, and serialising the plan to an envvar would be a bit risky.
What do we do Aubrey??? Well, option one would be to just duplicate the definition and run a plan and apply in the second stage after we approve. The only problem is drift: if anyone else has made a change between the first step and the approval, there is nothing to lock those changes into place.
Our second option is to push the plan after the first step to S3 or Blob Storage, name it after the build ID + environment (which we have as envvars), and in step two, suck said plan in from our cloud storage and apply that. Not bad, eh? Not perfect, but not bad.
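A hedged sketch of those two stages on Azure, using the Azure CLI and VSTS's predefined variables (the container name tfplans is an assumption, and I'm glossing over storage auth):

```shell
# Name the serialised plan after the build and target environment;
# Build.BuildId and Release.EnvironmentName surface as these env vars
# on the agent.
PLAN="$BUILD_BUILDID-$RELEASE_ENVIRONMENTNAME.tfplan"

# Stage one: plan, then park the plan file in blob storage.
terraform plan -out="$PLAN"
az storage blob upload --container-name tfplans --name "$PLAN" --file "$PLAN"

# Stage two (after approval): pull that exact plan back down and apply it,
# locking out any drift between plan and approval.
az storage blob download --container-name tfplans --name "$PLAN" --file "$PLAN"
terraform apply "$PLAN"
```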
We can also use Task Groups to abstract all this junk into just one task group we can import for future environments, though your workspace name will probably need to be the same as the value of the variable you're using with that group to pivot behaviour.
Clearly, if you want to just plan & apply in a single step, a lot of stuff can be avoided and automated, which makes things very slick. Personally I like to check what I'm applying, but if you're constraining access to an environment so that nothing else could possibly commit changes to it, maybe you could use release management to auto-flow your definition through all environments.
One final point, you always have the option of mixing automated with manual when it comes to rolling through your defined environments.
The release management view: we can redeploy or deploy anything set to manual from here.
This is a nice way to look at previous releases, their history, and the environments they made their way through.
I hope you enjoyed this write up. A good way to think of mono-environments is like this: you could waste loads of time building and maintaining loads of environments and then fixing problems, or just have one environment and invest all that saved time into great testing & tools like Kubernetes, Istio & Kayenta.
Aubrey xoxo