Migrating Terraform State Files to Workspaces in an AWS S3 Bucket

Just as I did with GCP a few weeks ago, I needed to circle back and migrate my state files to a cloud storage bucket. This was done mainly to centralize the storage location and thus lower the chance of a state file being lost or corrupted.

Previously, I’d been separating the state files using the -state parameter, with a different input file and state file for each environment, like this:

terraform apply -var-file=env1.tfvars -state=env1.tfstate
terraform apply -var-file=env2.tfvars -state=env2.tfstate
terraform apply -var-file=env3.tfvars -state=env3.tfstate

To instead store the state files in an AWS S3 bucket, create a backend.tf file with this content:

terraform {
  backend "s3" {
    bucket               = "my-bucket-name"
    workspace_key_prefix = "tf-state"
    key                  = "terraform.tfstate"
    region               = "us-west-1"
  }
}

This will use a bucket named ‘my-bucket-name’ in AWS region us-west-1. Each workspace will store its state file in tf-state/<WORKSPACE_NAME>/terraform.tfstate.

Note: if workspace_key_prefix is not specified, the directory ‘env:‘ will be created and used.
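To make the key layout concrete, here is a small sketch of how the object key is put together. The function name s3_state_key is my own invention for illustration and is not part of Terraform; it only mirrors the layout described above, including the ‘env:‘ default and the fact that the default workspace is stored directly at the configured key.

```shell
#!/bin/sh
# Print the S3 object key the backend is expected to use for a workspace.
# Arguments: workspace_key_prefix, workspace name, key
# (s3_state_key is a hypothetical helper, not a Terraform command.)
s3_state_key() {
  prefix=$1 workspace=$2 key=$3
  if [ "$workspace" = "default" ]; then
    # The default workspace is stored directly at the configured key.
    printf '%s\n' "$key"
  else
    # An empty prefix falls back to the "env:" default.
    printf '%s/%s/%s\n' "${prefix:-env:}" "$workspace" "$key"
  fi
}

s3_state_key tf-state env1 terraform.tfstate     # tf-state/env1/terraform.tfstate
s3_state_key "" env2 terraform.tfstate           # env:/env2/terraform.tfstate
s3_state_key tf-state default terraform.tfstate  # terraform.tfstate
```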

Since the backend has changed, I have to run this:

terraform init -reconfigure

I then have to copy the local state files to the location each workspace will be using. This is most easily done with the AWS CLI tool. S3 has no real directories, so the key prefix comes into existence as soon as the object is uploaded.

aws s3 cp env1.tfstate s3://my-bucket-name/tf-state/env1/terraform.tfstate
aws s3 cp env2.tfstate s3://my-bucket-name/tf-state/env2/terraform.tfstate
aws s3 cp env3.tfstate s3://my-bucket-name/tf-state/env3/terraform.tfstate
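With more than a handful of environments, the copies above could be wrapped in a loop. This is only a sketch: upload_states and the DRY_RUN switch are names I made up, and by default the function just prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Upload each local state file to its workspace key.
# By default the commands are only printed; set DRY_RUN=0 to run them.
# (upload_states and DRY_RUN are hypothetical names for this sketch.)
BUCKET="my-bucket-name"
PREFIX="tf-state"

upload_states() {
  for env in "$@"; do
    src="${env}.tfstate"
    dst="s3://${BUCKET}/${PREFIX}/${env}/terraform.tfstate"
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "aws s3 cp ${src} ${dst}"
    else
      aws s3 cp "$src" "$dst" || return 1
    fi
  done
}

upload_states env1 env2 env3
```

Running it once in dry-run mode and comparing the printed destinations against the backend settings is a cheap way to catch a prefix typo before anything is uploaded.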

I then create a workspace for each state file:

$ terraform workspace new env1
Created and switched to workspace "env1"!

Now I’m ready to run the applies and verify that the state matches the input:

$ terraform apply -var-file=env1.tfvars

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

$ terraform workspace new env2
Created and switched to workspace "env2"!

$ terraform apply -var-file=env2.tfvars

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
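Once the workspaces exist, the same check can be repeated without applying anything by using terraform plan with -detailed-exitcode, which exits 0 when there are no changes and 2 when changes are pending. The sketch below assumes the workspaces were already created as shown above; plan_status and verify_workspaces are helper names of my own.

```shell
#!/bin/sh
# Translate the exit status of "terraform plan -detailed-exitcode".
# (plan_status and verify_workspaces are hypothetical helpers.)
plan_status() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes pending" ;;
    *) echo "error" ;;
  esac
}

# Select each workspace and report whether its state matches the config.
verify_workspaces() {
  for env in "$@"; do
    terraform workspace select "$env" || return 1
    terraform plan -detailed-exitcode -var-file="${env}.tfvars" >/dev/null
    rc=$?
    echo "${env}: $(plan_status "$rc")"
  done
}

# Example: verify_workspaces env1 env2 env3
```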

Doing it in the opposite order

An alternate way to do this migration is to enable workspaces first, then migrate the backend to S3.

$ terraform workspace new env1
Created and switched to workspace "env1"!

$ mv env1.tfstate terraform.tfstate.d/env1/terraform.tfstate

$ terraform apply -var-file=env1.tfvars

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
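For several environments, the create-and-move step could be scripted along these lines. It is a sketch only: local_state_path and adopt_local_states are names I made up, and it assumes the old state files are named <env>.tfstate in the current directory, as in the commands above.

```shell
#!/bin/sh
# Path where the local backend keeps state for a named workspace.
# (local_state_path and adopt_local_states are hypothetical helpers.)
local_state_path() {
  echo "terraform.tfstate.d/$1/terraform.tfstate"
}

# Create each workspace, then move its old state file into place.
adopt_local_states() {
  for env in "$@"; do
    terraform workspace new "$env" || return 1
    mv "${env}.tfstate" "$(local_state_path "$env")" || return 1
  done
}

# Example: adopt_local_states env1 env2 env3
```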

Then create the backend.tf file and run terraform init -reconfigure. You’ll then be prompted to move the state files to S3:

$ terraform init -reconfigure
Initializing modules...

Initializing the backend...
Do you want to migrate all workspaces to "s3"?

Enter a value: yes

$ terraform apply -var-file=env1.tfvars

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Either way, the state files have to be individually migrated to the storage bucket.

Migrating Terraform to Workspaces & Storage Buckets

As I started using Terraform more, I quickly realized it’s beneficial to use separate state files for different groups of resources. It goes without saying that multiple environments should be in different state files, as should MSP scenarios where there are multiple customer deployments running off the same Terraform code. The main benefit is a smaller blast radius if something goes wrong, with the additional benefits of fewer dependencies and better performance.

So when running Terraform, I’d end up doing these steps:

git pull
terraform init
terraform plan -var-file="env1.tfvars" -state="env1.tfstate"
terraform apply -var-file="env1.tfvars" -state="env1.tfstate"
terraform plan -var-file="env2.tfvars" -state="env2.tfstate"
terraform apply -var-file="env2.tfvars" -state="env2.tfstate"
git add *.tfstate *.tfstate.backup
git commit -m "updated state files"
git push

This works OK, but isn’t ideal for a couple of reasons. First, the state file can’t be checked out and updated by two users at the same time: git would try to merge the two files, which would likely result in corruption. Also, state files can contain sensitive information like passwords, and really shouldn’t be stored in the repo at all.

So the better solution is to store the state in a cloud storage bucket, such as AWS S3 or Google Cloud Storage. This is usually configured with a backend.tf file that specifies the bucket name and directory prefix for storing state files, and looks something like this:

terraform {
  backend "gcs" {
    bucket = "my-gcs-bucket-name"
    prefix = "terraform"
  }
}

After creating this file, we must run terraform init to initialize the new backend:

terraform init
Initializing modules...

Initializing the backend...

Successfully configured the backend "gcs"! Terraform will automatically
use this backend unless the backend configuration changes.

But the -state parameter only applies to the local backend. If we now run terraform with it, Terraform still looks for the state in the bucket, doesn’t find it, and determines it needs to re-create everything, which is incorrect.

The solution to this problem is to use a different workspace for each state file.

terraform workspace list
* default

terraform workspace new env1
Created and switched to workspace "env1"!

You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.

Terraform will now look in the bucket for terraform/env1.tfstate, but that file is still local. So we must manually copy it over:

gsutil cp env1.tfstate gs://my-gcs-bucket-name/terraform/env1.tfstate
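Since the object name follows the <prefix>/<workspace>.tfstate pattern, the copy could be scripted for several environments. This is a sketch under my own assumptions: gcs_state_url and copy_states_to_gcs are made-up helper names, and the bucket and prefix are taken from the backend.tf shown earlier.

```shell
#!/bin/sh
# Object URL the gcs backend is expected to use for a workspace:
# gs://<bucket>/<prefix>/<workspace>.tfstate
# (gcs_state_url and copy_states_to_gcs are hypothetical helpers.)
gcs_state_url() {
  bucket=$1 prefix=$2 workspace=$3
  echo "gs://${bucket}/${prefix}/${workspace}.tfstate"
}

# Copy each local state file to its workspace object.
copy_states_to_gcs() {
  for env in "$@"; do
    gsutil cp "${env}.tfstate" \
      "$(gcs_state_url my-gcs-bucket-name terraform "$env")" || return 1
  done
}

# Example: copy_states_to_gcs env1 env2
```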

Repeat this process for all state files. Now, when we run terraform plan/apply, there is no need to specify the state file. It’s automatically known. And assuming we’ve made no changes, terraform should report no changes required.

terraform workspace select env1
terraform apply -var-file="env1.tfvars"

No changes. Your infrastructure matches the configuration.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

terraform workspace select env2
terraform apply -var-file="env2.tfvars"

No changes. Your infrastructure matches the configuration.

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

And it’s all good
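With the state living in the bucket, I’d probably also keep any leftover local copies out of git. A rough sketch of that, where ignore_state_files is a helper name of my own and the two patterns are simply the ones from the git add line earlier:

```shell
#!/bin/sh
# Append Terraform state patterns to a .gitignore file, skipping any
# pattern that is already listed.
# (ignore_state_files is a hypothetical helper for this sketch.)
ignore_state_files() {
  gitignore=${1:-.gitignore}
  touch "$gitignore"
  for pattern in '*.tfstate' '*.tfstate.*'; do
    grep -qxF -- "$pattern" "$gitignore" ||
      printf '%s\n' "$pattern" >> "$gitignore"
  done
}

# Example: ignore_state_files .gitignore
# Files that are already tracked would still need "git rm --cached".
```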

Installing Terraform on FreeBSD

I was pleased to recently discover that Terraform is in the FreeBSD packages. To install, simply do this:

pkg install terraform

As of February 2022, the packaged version is 1.0.11. To run 1.1.6 instead, first remove the package:

pkg remove terraform

Then download the most recent version and copy the binary to /usr/local/bin. Should be good to go:

% terraform version
Terraform v1.1.6
on freebsd_amd64

% terraform init

Initializing the backend...

Initializing provider plugins...
- Finding latest version of hashicorp/aws...
- Installing hashicorp/aws v4.2.0...
- Installed hashicorp/aws v4.2.0 (signed by HashiCorp)

Terraform has been successfully initialized!
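The manual download step could look something like the sketch below. It assumes the usual releases.hashicorp.com archive naming and uses fetch and unzip from the FreeBSD base system; terraform_zip_url and install_terraform are names I made up, and the /tmp paths are arbitrary.

```shell
#!/bin/sh
# Build the download URL for a Terraform release archive.
# (terraform_zip_url and install_terraform are hypothetical helpers.)
terraform_zip_url() {
  version=$1 os=$2 arch=$3
  echo "https://releases.hashicorp.com/terraform/${version}/terraform_${version}_${os}_${arch}.zip"
}

# Download, unpack, and install the binary (run as root).
install_terraform() {
  url=$(terraform_zip_url "$1" freebsd amd64)
  fetch -o /tmp/terraform.zip "$url" || return 1
  unzip -o /tmp/terraform.zip -d /tmp || return 1
  install -m 755 /tmp/terraform /usr/local/bin/terraform
}

# Example: install_terraform 1.1.6
```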

Hopefully they’ll update to 1.1 soon. Today I learned about the nullable option for variables, and it’s a very useful option when working with parent/child modules.

Install Terraform on Debian 10 (Buster) when a proxy is required

# Setup proxy, if required
sudo bash -c 'echo "Acquire::http::Proxy \"http://10.0.0.9:3128\";" > /etc/apt/apt.conf.d/99http-proxy'

# Set environment variables to be used by Curl
export http_proxy=http://10.0.0.9:3128
export https_proxy=http://10.0.0.9:3128

Now install Terraform

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -

sudo apt-get install software-properties-common

sudo apt-add-repository "deb [arch=$(dpkg --print-architecture)] https://apt.releases.hashicorp.com $(lsb_release -cs) main"

sudo apt update
sudo apt upgrade
sudo apt install terraform 
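apt-key is deprecated on newer Debian releases, so on those the key would normally go into its own keyring file that the repository entry points at with signed-by. A sketch of that variant follows; the keyring path and the helper names hashicorp_repo_line and add_hashicorp_repo are my own choices, and I have not tested this behind the proxy described above.

```shell
#!/bin/sh
# Keyring location is an assumption; any root-owned path readable by apt works.
KEYRING=/usr/share/keyrings/hashicorp-archive-keyring.gpg

# Print the apt source line for a given Debian codename (e.g. buster).
# (hashicorp_repo_line and add_hashicorp_repo are hypothetical helpers.)
hashicorp_repo_line() {
  echo "deb [signed-by=${KEYRING}] https://apt.releases.hashicorp.com $1 main"
}

# Store the key in its own keyring and register the repository.
add_hashicorp_repo() {
  curl -fsSL https://apt.releases.hashicorp.com/gpg |
    gpg --dearmor | sudo tee "$KEYRING" >/dev/null || return 1
  hashicorp_repo_line "$(lsb_release -cs)" |
    sudo tee /etc/apt/sources.list.d/hashicorp.list >/dev/null
}

# Example: add_hashicorp_repo && sudo apt update && sudo apt install terraform
```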

Basic Network-Related Terraform w/ GCP

Setting up Terraform for GCP

Start creating .tf files:

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "3.5.0"
    }
  }
}

provider "google" {
  credentials = file("myproject-123456-f72073802721.json")
  project     = "myproject-123456"
  region      = "us-central1"
  zone        = "us-central1-a"
}

Create new VPC Network with subnets in Oregon and London

# Create new network called 'my-network'
resource "google_compute_network" "TF_NETWORK" {
  name = "my-network"
  auto_create_subnetworks = false
}

# Create subnet 172.16.1.0/24 in us-west1 (Oregon);
# Enable private API access & 1 minute 100% flow logging
resource "google_compute_subnetwork" "TF_SUBNET_1" {
  name          = "my-network-subnet-oregon"
  ip_cidr_range = "172.16.1.0/24"
  region        = "us-west1"
  network       = google_compute_network.TF_NETWORK.id
  private_ip_google_access = true
  log_config {
    aggregation_interval = "INTERVAL_1_MIN"
    flow_sampling        = 1.0
    metadata             = "INCLUDE_ALL_METADATA"
  }
}

# Create subnet 172.16.2.0/24 in europe-west2 (London)
# Add secondary IP range 192.168.200.0/26
resource "google_compute_subnetwork" "TF_SUBNET_2" {
  name          = "my-network-subnet-london"
  ip_cidr_range = "172.16.2.0/24"
  region        = "europe-west2"
  network       = google_compute_network.TF_NETWORK.id
  secondary_ip_range {
    range_name    = "tf-subnet-london-secondary-range"
    ip_cidr_range = "192.168.200.0/26"
  }
}

Create (ingress) firewall rules

# Allow ICMP, SSH, and DNS from RFC-1918 Private Address Space
resource "google_compute_firewall" "TF_FWRULE_1" {
  name    = "allow-ssh-and-dns-from-rfc-1918"
  network = google_compute_network.TF_NETWORK.name
  allow {
    protocol = "icmp"
  }
  allow {
    protocol = "tcp"
    ports = ["22"]
  }
  allow {
    protocol = "udp"
    ports = ["53"]
  }
  source_ranges = ["10.0.0.0/8","172.16.0.0/12","192.168.0.0/16"]
}
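As a quick sanity check that the source ranges in this rule cover the subnets created earlier, here is a small sketch that tests whether an address falls inside RFC 1918 space. is_rfc1918 is a helper name of my own, the sample addresses are made up, and the check only looks at the first two octets, which is enough for these three ranges.

```shell
#!/bin/sh
# Return success if the dotted-quad address falls inside one of the
# RFC 1918 ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16.
# (is_rfc1918 is a hypothetical helper for this sketch.)
is_rfc1918() {
  old_ifs=$IFS
  IFS=.
  set -- $1        # split the address into its four octets
  IFS=$old_ifs
  [ "$1" -eq 10 ] && return 0
  [ "$1" -eq 172 ] && [ "$2" -ge 16 ] && [ "$2" -le 31 ] && return 0
  [ "$1" -eq 192 ] && [ "$2" -eq 168 ] && return 0
  return 1
}

for ip in 172.16.1.10 192.168.200.5 8.8.8.8; do
  if is_rfc1918 "$ip"; then
    echo "$ip: allowed"
  else
    echo "$ip: not matched"
  fi
done
```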

# Allow HTTP & HTTPS from Internet w/ logging enabled;
# applied to instances with network tag 'nginx' or 'apache'
resource "google_compute_firewall" "TF_FWRULE_2" {
  name    = "allow-http-and-https-from-internet"
  network = google_compute_network.TF_NETWORK.name
  enable_logging = true
  allow {
    protocol = "tcp"
    ports    = ["80", "443"]
  }
  # State the Internet source range explicitly rather than relying on the default
  source_ranges = ["0.0.0.0/0"]
  target_tags = ["nginx", "apache"]
}

Create an External L7 Load balancer

# Create basic port 80 healthcheck
resource "google_compute_health_check" "TF_HEALTHCHECK" {
  name               = "check-website-backend"
  check_interval_sec = 15
  timeout_sec        = 3
  tcp_health_check {
    port = "80"
  }
}

# Create Backend service with backend timeout of 15 seconds and client IP session affinity
resource "google_compute_backend_service" "TF_BACKEND_SERVICE" {
  name                  = "website-backend-service"
  health_checks         = [google_compute_health_check.TF_HEALTHCHECK.id]
  timeout_sec           = 15
  session_affinity      = "CLIENT_IP"
}

# Create URL map (Load balancer)
resource "google_compute_url_map" "TF_URL_MAP" {
  name                  = "my-load-balancer"
  default_service       = google_compute_backend_service.TF_BACKEND_SERVICE.id
}

# Create HTTP target proxy
resource "google_compute_target_http_proxy" "TF_TPROXY_HTTP" {
  name                  = "my-http-target-proxy"
  url_map               = google_compute_url_map.TF_URL_MAP.id
}

# Create SSL certificate (cert/key pair) and HTTPS target proxy
resource "google_compute_ssl_certificate" "TF_SSL_CERT" {
  name        = "my-ssl-certificate"
  private_key = file("mykey.key")
  certificate = file("mycert.crt")
}
resource "google_compute_target_https_proxy" "TF_TPROXY_HTTPS" {
  name                  = "my-https-target-proxy"
  url_map               = google_compute_url_map.TF_URL_MAP.id
  ssl_certificates      = [google_compute_ssl_certificate.TF_SSL_CERT.id]
}

# Allocate External Global IP Address
resource "google_compute_global_address" "TF_GLOBAL_IP_ADDRESS" {
  name                  = "gcp-l7-externalip-global"
}

# Create HTTP frontend
resource "google_compute_global_forwarding_rule" "TF_FWD_RULE_1" {
  name                  = "my-frontend-http"
  ip_address            = google_compute_global_address.TF_GLOBAL_IP_ADDRESS.address
  port_range            = "80"
  target                = google_compute_target_http_proxy.TF_TPROXY_HTTP.id
}

# Create HTTPS frontend
resource "google_compute_global_forwarding_rule" "TF_FWD_RULE_2" {
  name                  = "my-frontend-https"
  ip_address            = google_compute_global_address.TF_GLOBAL_IP_ADDRESS.address
  port_range            = "443"
  target                = google_compute_target_https_proxy.TF_TPROXY_HTTPS.id
}