Inside the Terraform Environments Feature

Terraform v0.9.0 shipped with a new core feature called environments. It formalizes a pattern that has been circulating in the Terraform community for a number of years now, and one that is crucial to using Terraform effectively at scale.

In this post I’m going to dive into this feature and discuss what to expect, especially if you currently implement a custom environment pattern of your own. I’m also going to briefly discuss a pattern for managing naming with a module that computes names from your environment, and finish with a short note on the future plans for the feature.

Effective State Management

First off, it’s probably prudent to discuss the background of why first-class support for this feature was added.

A modern organization will have at least a production environment and a staging environment. More complex organizations may have multiple production environments in datacenters spread across the world, staging environments to match, and perhaps even sandbox environments for their individual teams.

Terraform is great because it allows you to codify your infrastructure in a repeatable way. To take full advantage of this repeatability, it’s best to spread your Terraform configuration across multiple projects (especially when those projects don’t share infrastructure), and then deploy those projects in a unit of granularity that makes sense (such as on a per-datacenter or per-region basis), managing each state separately.

This is what I will be referring to in this article as the environment pattern.

State Namespacing

The environment pattern is the simple partitioning of your Terraform state namespace into different categories, hopefully starting with the most unique entity (your project), down to the most common (your datacenters or regions).

This allows an effective logical separation of state, maintaining sanity, and reducing single points of failure.

As an example, say you have two projects. The first is a purely frontend web application whose content is hosted on S3 and served via CloudFront in AWS. The bucket is homed in us-west-2, which is worth noting even though S3 and CloudFront are both thought of as “global” services (CloudFront is, S3 not necessarily, as you will know if you remember the recent S3 outage in us-east-1). The second is a backend API service for the web application, hosted on EC2 in both us-east-1 and us-west-2. The two applications live in separate projects and share no code, and possibly no developers. Both projects are deployed to staging for QA, and then to production once they pass QA and are ready for deploy.

Given this simple example, using the environment pattern, we would have two projects and six different Terraform state namespaces. Most, if not all, of our state path information is naturally delimited by slashes. The namespaces are below:

FE/app-frontend/production/us-west-2/terraform.tfstate
FE/app-frontend/staging/us-west-2/terraform.tfstate
BE/api-backend/production/us-east-1/terraform.tfstate
BE/api-backend/production/us-west-2/terraform.tfstate
BE/api-backend/staging/us-east-1/terraform.tfstate
BE/api-backend/staging/us-west-2/terraform.tfstate
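These keys follow a regular pattern, so they are easy to generate from a script. The following is only a sketch: the state_key function and its argument order are made up for illustration and are not part of Terraform.

```shell
# Build a state key from the four namespace levels used in the list above.
# state_key is a hypothetical helper, not a Terraform command.
state_key() {
  # $1 = group (e.g. BE), $2 = project, $3 = environment, $4 = region
  printf '%s/%s/%s/%s/terraform.tfstate\n' "$1" "$2" "$3" "$4"
}

# Print every key for the backend API project.
for env in production staging; do
  for region in us-east-1 us-west-2; do
    state_key BE api-backend "$env" "$region"
  done
done
```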

This separation of configuration, and more importantly, Terraform state, allows for a couple of things. First off, the infrastructure for the frontend site and the backend API do not interfere with each other. They can be set up and maintained independently, without risk to the other when doing so. The second is the flexibility in where the infrastructure can be set up. Thanks to the fact that we are not explicitly stating in the Terraform code where the infrastructure is being set up, we can deploy first to a local region that makes sense, then to a satellite region for local availability or further site resilience.

This does not necessarily need to apply only to “physical” infrastructure either. Other kinds of configuration can be controlled in this way. Consider the scenario of organizing AWS IAM access for a complex application infrastructure. Leveraging modules and a namespacing scheme, you can use Terraform to manage IAM in a pluggable fashion where projects get only the access they need for the environments they need it in, allowing you to add and remove this access at will, simply by adding or removing configuration.

Control of this pattern in your Terraform configuration used to be entirely up to you, and as such could be adapted to whatever the individual project needed. Ultimately, this yields separate state organized in a hierarchical fashion as discussed above, either checked into source control or stored remotely in a central state store. When state is pathed the way we show above, referencing it is semantic, and there are very few chances for collisions.

It can still be up to you, but now you have first-class support in that decision.

The Old Way

A very basic illustration of how environment namespacing ultimately translated to Terraform state configuration follows. We use the example of remote configuration.

Remote Configuration

Pre-v0.9.0, this was entirely controlled via the terraform remote config command. This did essentially what dropping the backend configuration into your TF directory and running terraform init does now. The command went something like this:
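A minimal sketch of such an invocation, assuming an S3 backend. The bucket name, the function names, the key layout, and the environment and region variable names are all placeholders chosen for illustration.

```shell
# Placeholder state store; substitute your own bucket and its region.
STATE_BUCKET="example-terraform-state"
STATE_BUCKET_REGION="us-west-2"
# Allows a dry run with TERRAFORM=echo; defaults to the real binary.
TERRAFORM="${TERRAFORM:-terraform}"

# Point the working directory at the state for one environment and region
# (pre-v0.9.0 CLI).
configure_remote_state() {
  # $1 = environment, $2 = region
  "$TERRAFORM" remote config \
    -backend=s3 \
    -backend-config="bucket=${STATE_BUCKET}" \
    -backend-config="key=BE/api-backend/$1/$2/terraform.tfstate" \
    -backend-config="region=${STATE_BUCKET_REGION}"
}

# Apply, handing the same two values to the configuration as variables
# (assumes the configuration declares "environment" and "region").
apply_environment() {
  "$TERRAFORM" apply -var "environment=$1" -var "region=$2"
}

# Example: configure_remote_state staging us-west-2 && apply_environment staging us-west-2
```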

You could then pass in this data via variables to your Terraform run.

The lack of first-class support meant that the structure of the environment and the variables that you pass into your project were completely up to you.

The New Way

Terraform v0.9.x changes this in a few ways.

First off, since environment is a single key now, we have to manage this key-space a different way if we want to incorporate region, or any other namespace, into the mix.

Backend Configuration

To start, we need to drop the backend config into the Terraform directory:
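As a rough sketch, a backend block for S3 might look like the one below, written here through a shell heredoc so that it can sit alongside the other script fragments. The file name, bucket, key, and region are placeholders.

```shell
# Directory holding the Terraform configuration (defaults to the current one).
TF_DIR="${TF_DIR:-.}"

# Write a minimal S3 backend block. The quoted 'EOF' keeps the shell from
# touching the file contents.
cat > "${TF_DIR}/backend.tf" <<'EOF'
terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    key    = "BE/api-backend/terraform.tfstate"
    region = "us-west-2"
  }
}
EOF
```

Note that the key no longer carries the environment or region, since the environment is now tracked separately from the key.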

After this is done, terraform init needs to be run to ensure that the .terraform directory contains all of the information that is needed to connect to the remote backend. This also needs to be incorporated into your scripts, as the .terraform directory should not be checked into source control – hence it should be the first command that is run, before checking for, and switching to, environments.

Switching to the Environment

After backend configuration is done, we need to either switch to the new environment, or create it if it doesn’t exist. See the gist below:
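A minimal sketch of that step, using the terraform env subcommands from v0.9.x. The function names and the default values for ENV and REGION are placeholders.

```shell
ENV="${ENV:-staging}"
REGION="${REGION:-us-west-2}"
# Allows a dry run with TERRAFORM=echo; defaults to the real binary.
TERRAFORM="${TERRAFORM:-terraform}"

# Environment names are a single key, so join the two parts.
tf_environment() {
  printf '%s_%s\n' "$ENV" "$REGION"
}

switch_environment() {
  name="$(tf_environment)"
  # Initialise the backend first, since .terraform is not in source control.
  "$TERRAFORM" init || return 1
  # Select the environment, creating it if the select fails.
  "$TERRAFORM" env select "$name" || "$TERRAFORM" env new "$name"
}

# Example: ENV=production REGION=us-east-1 switch_environment
```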

Note how we are getting around the single-keyspace limitation we now have, by simply combining our ENV and REGION with an underscore. Again, this can be customized to your needs, but you may want to keep in mind some other considerations to future-proof yourself (more on this later).

Referencing the Environment in Config

Probably one of the best things about the first-class support for environments is the addition of the terraform.env built-in variable. This can be referenced in config much like the variables that we had to pass in manually before, creating a clear, best-practice pattern and helping to clean up build scripts. Aside from any other variables you now choose to pass in, the only thing that is now necessary is to run terraform apply.
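To illustrate, the sketch below writes a small configuration file that reads terraform.env directly. The file name, resource, and bucket prefix are hypothetical, and the replace call is only there because the environment names in this article contain an underscore.

```shell
TF_DIR="${TF_DIR:-.}"

# The quoted 'EOF' stops the shell from trying to expand ${terraform.env}.
cat > "${TF_DIR}/content.tf" <<'EOF'
# Name a bucket after the current environment, swapping the underscore
# for a hyphen (staging_us-west-2 becomes staging-us-west-2).
resource "aws_s3_bucket" "content" {
  bucket = "example-frontend-${replace(terraform.env, "_", "-")}"
}

output "environment" {
  value = "${terraform.env}"
}
EOF
```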

Under the Hood

So what does the state namespace look like under the hood when using environments?

Using local state (no backend), Terraform creates a terraform.tfstate.d directory for your environments, dropping a terraform.tfstate file in a named directory for your environment within that directory. This is created in the directory that you run Terraform from.

Using remote state, things are a little different. Using S3 as the example, Terraform creates an env:/${ENVIRONMENT_NAME} directory (with colon), prefixing this on to the key that you specify in the backend config. This might take a bit of getting used to. If you rely on the terraform_remote_state data source, and especially if you rely on a format similar to the namespacing discussed at the beginning of this article, you may need to migrate your remote state to the new structure before you can take advantage of this support.
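The two layouts can be summarized with a pair of helper functions. This is a sketch only: the function names are made up, and the handling of the default environment (left at the unprefixed location) reflects my understanding of v0.9.x rather than anything stated above.

```shell
# Where local state lives for a given environment.
local_state_path() {
  # $1 = environment name
  if [ "$1" = "default" ]; then
    printf 'terraform.tfstate\n'
  else
    printf 'terraform.tfstate.d/%s/terraform.tfstate\n' "$1"
  fi
}

# Where the S3 backend stores state for a given environment.
s3_state_key() {
  # $1 = environment name, $2 = key from the backend config
  if [ "$1" = "default" ]; then
    printf '%s\n' "$2"
  else
    printf 'env:/%s/%s\n' "$1" "$2"
  fi
}

local_state_path staging_us-west-2
s3_state_key staging_us-west-2 BE/api-backend/terraform.tfstate
```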

Managing Naming With Environment Data

Managing naming across several repositories with different environments can be a challenge. The environment pattern, both new and old, actually gives you a set of input data that can be used to help you manage these conventions in a conditional fashion. You can go a step further and encapsulate all of these names in a module. The below example shows you how you can do it when delimiting region on an underscore. You can obviously simplify this if you are not including region in your environment.

A very simple module could look like:
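What follows is a sketch rather than a definitive implementation. The module path, the variable and output names, and the env_labels map (which shortens staging to dev) are all assumptions made for illustration, and the sketch assumes terraform.env can be read from inside a module.

```shell
MODULE_DIR="${MODULE_DIR:-modules/names}"
mkdir -p "$MODULE_DIR"

# The quoted 'EOF' stops the shell from expanding the ${...} interpolations.
cat > "${MODULE_DIR}/main.tf" <<'EOF'
variable "name" {}

variable "domain" {
  default = "foobar.local"
}

# Hypothetical mapping from environment to the label used in hostnames.
variable "env_labels" {
  type = "map"

  default = {
    staging    = "dev"
    production = "prod"
  }
}

# terraform.env looks like "staging_us-west-2": element 0 of the split is
# the environment, element 1 is the region.
output "fqdn" {
  value = "${var.name}.${element(split("_", terraform.env), 1)}.${lookup(var.env_labels, element(split("_", terraform.env), 0))}.${var.domain}"
}
EOF
```

The lookup only works for environments named with both parts and listed in the map, so anything else (including the default environment) would need its own handling.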

And could be included like so:
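A sketch of the consuming side, assuming a naming module at the hypothetical path ./modules/names that takes a name and exposes an fqdn output:

```shell
TF_DIR="${TF_DIR:-.}"

# The quoted 'EOF' stops the shell from expanding ${module.names.fqdn}.
cat > "${TF_DIR}/names.tf" <<'EOF'
# No environment or region is passed in. The module is assumed to read
# them from terraform.env itself.
module "names" {
  source = "./modules/names"
  name   = "frontend"
}

output "frontend_fqdn" {
  value = "${module.names.fqdn}"
}
EOF
```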

This would then render something like frontend.us-west-2.dev.foobar.local if your environment was staging_us-west-2. You can encapsulate any variety of names behind this module too, giving you a nicely packaged way to compute any number of standardized endpoint or resource names for your organization across multiple projects and environments.

This module looks pretty much the same in the old pattern, just with the inclusion of an environment and region parameter. So ultimately, the new pattern does save some coding for the module consumer, as now they don’t need to worry about supplying environment and region.

Future-Proofing

HashiCorp has expressed some intentions with regard to plans for their environment support, namely having to do with source control tracking. As such, it might be prudent to start working your environment naming conventions into your Git workflows, so that you can take advantage of this support when it comes out. This also means that your environments should probably be named so that they are compatible with the naming that Git or your VCS of choice affords you, if only to ensure things look sane and readable.

Conclusion

The environments feature is a welcome addition to Terraform. It provides first-class validation to a pattern that has been in use in the community for a while, possibly allowing internal code that has been written to manage this practice to be cleaned up in the process. There might be a little bit of a learning curve in adoption, but the task is not insurmountable, and the presence of the feature alone and the formalization of the pattern will help encourage its use as a general best practice.
