Terraform Infrastructure as Code

Overview

Infrastructure as Code is the practice of defining and managing cloud infrastructure (servers, databases, networking, load balancers, DNS records, IAM policies) through code rather than through manual configuration in cloud provider consoles. Terraform is one of the most widely adopted Infrastructure as Code tools, providing a declarative configuration language (HCL) and a provider ecosystem that covers AWS, Google Cloud, Azure, and hundreds of other cloud and SaaS providers through a consistent workflow.

The operational case for Infrastructure as Code is straightforward. Infrastructure defined in code can be version-controlled — every change is tracked, every historical configuration is recoverable, and the reason for each change can be documented in commit messages. Infrastructure defined in code is reproducible — applying the same configuration to a new environment produces an identical result, without the manual steps that may be executed incorrectly or forgotten. Infrastructure defined in code is auditable — what is running in production matches what is in the repository, rather than having drifted from the original configuration through manual changes that were never documented.

The alternative — infrastructure configured manually through cloud consoles — accumulates drift over time. Resources are created that are not documented. Configuration changes are made that are not tracked. The person who configured the original setup leaves, and what they built is partially understood at best. Reproducing the environment after a failure requires reconstructing from memory and observation rather than applying a known configuration. The production environment and the disaster recovery environment diverge because the changes made to one are not systematically applied to the other.

Terraform addresses this class of operational problems for the infrastructure it manages. We use Terraform to provision the infrastructure for the cloud-deployed applications we build, and we provide Terraform development and migration services for existing infrastructure that needs to be brought under Infrastructure as Code management.

What Terraform Infrastructure as Code Covers

Provider configuration and state management. Terraform's interaction with cloud providers through provider plugins, and the state file that tracks what Terraform has created.

Provider configuration: the Terraform provider block that authenticates Terraform to the cloud provider — AWS credentials through IAM roles, GCP credentials through service account keys, Azure credentials through service principals. Provider version pinning that prevents unintended provider upgrades from changing resource behaviour. The provider configuration that allows multiple AWS accounts or multiple cloud providers to be managed in the same Terraform configuration.
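As a rough illustration of version pinning and multi-account configuration, a provider setup might look like the sketch below. The region, account ID, role name, and bucket name are placeholders, not values from any real environment.

```hcl
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # accept 5.x releases, block an unreviewed jump to 6.x
    }
  }
}

# Default provider: credentials come from the environment (for example an IAM role).
provider "aws" {
  region = "eu-west-1" # placeholder region
}

# Aliased provider for a second AWS account, reached by assuming a role there.
provider "aws" {
  alias  = "shared_services"
  region = "eu-west-1"

  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/terraform-deploy" # placeholder ARN
  }
}

# A resource opts in to the aliased provider explicitly.
resource "aws_s3_bucket" "shared_artifacts" {
  provider = aws.shared_services
  bucket   = "example-shared-artifacts" # placeholder name
}
```

Resources without a provider argument use the default provider, so only cross-account resources need the alias.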

State backends: the Terraform state file that records the current state of managed infrastructure — the mapping between Terraform resource definitions and the actual cloud resources they manage. Remote state backends — AWS S3 with DynamoDB locking, Terraform Cloud, GCS — that store state outside the local filesystem for team collaboration and for CI/CD pipeline access. State locking that prevents concurrent Terraform operations from producing conflicting state.
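A minimal sketch of the S3 backend with DynamoDB locking follows. The bucket, key, and table names are hypothetical, and both the bucket and the table are assumed to exist before terraform init runs.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"        # hypothetical, pre-existing bucket
    key     = "prod/network/terraform.tfstate" # one key per root module
    region  = "eu-west-1"
    encrypt = true

    # Lock table; it needs a string partition key named "LockID".
    dynamodb_table = "example-terraform-locks"
  }
}
```

Backend blocks cannot reference variables, which is one reason teams generate them or pass values with -backend-config at init time. Recent Terraform releases also offer an S3-native lock file option, so it is worth checking the backend documentation for the version in use.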

State management operations: terraform import for importing existing cloud resources into Terraform management without recreation. terraform state mv for restructuring state without destroying and recreating resources. terraform state rm for removing resources from Terraform management without deleting the underlying infrastructure.
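Newer Terraform versions offer declarative counterparts to these three commands, which lets state changes go through the same pull request review as any other change. The sketch below assumes Terraform 1.7 or later; all resource names and IDs are made up for illustration.

```hcl
variable "vpc_id" {
  type = string
}

# Counterpart to `terraform import` (Terraform 1.5+): adopt an existing bucket.
import {
  to = aws_s3_bucket.assets
  id = "example-assets-bucket" # hypothetical name of the existing bucket
}

resource "aws_s3_bucket" "assets" {
  bucket = "example-assets-bucket"
}

# Counterpart to `terraform state mv` (Terraform 1.1+): rename a resource
# address without destroying and recreating the security group.
moved {
  from = aws_security_group.web
  to   = aws_security_group.app
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = var.vpc_id
}

# Counterpart to `terraform state rm` (Terraform 1.7+): stop managing an
# instance while leaving it running. Its resource block must already be deleted.
removed {
  from = aws_instance.legacy

  lifecycle {
    destroy = false
  }
}
```

The next terraform plan then reports the import, the move, and the forget as planned actions before anything touches state.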

AWS infrastructure. The AWS resource types that production web application infrastructure commonly requires.

Compute: EC2 instances with the correct instance type, AMI, key pair, security group, and IAM instance profile. Auto Scaling Groups with launch templates that define instance configuration, scaling policies that adjust capacity based on CloudWatch metrics, and lifecycle hooks for custom initialisation logic. Elastic Load Balancers — Application Load Balancers for HTTP/HTTPS with target groups, path-based routing rules, SSL certificates from ACM, and access logging.

Networking: VPC with the CIDR block that provides sufficient address space. Public and private subnets across multiple availability zones — public subnets for load balancers and NAT gateways, private subnets for application servers and databases. Internet gateway for public subnet internet access. NAT gateways in each availability zone for private subnet outbound internet access. Route tables with the routing rules that direct traffic correctly. Security groups with the ingress and egress rules that implement the principle of least privilege — allowing only the traffic that each resource needs to send and receive.
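The layout described above could be sketched roughly as follows. The CIDR range and availability zone names are assumptions, and the `domain` argument on `aws_eip` assumes AWS provider 5.x.

```hcl
variable "vpc_cidr" {
  type    = string
  default = "10.0.0.0/16" # assumed address space
}

variable "azs" {
  type    = list(string)
  default = ["eu-west-1a", "eu-west-1b"] # assumed zones
}

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

# Public subnets: 10.0.0.0/24 and 10.0.1.0/24 with the defaults above.
resource "aws_subnet" "public" {
  count                   = length(var.azs)
  vpc_id                  = aws_vpc.main.id
  availability_zone       = var.azs[count.index]
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  map_public_ip_on_launch = true
}

# Private subnets: 10.0.10.0/24 and 10.0.11.0/24 with the defaults above.
resource "aws_subnet" "private" {
  count             = length(var.azs)
  vpc_id            = aws_vpc.main.id
  availability_zone = var.azs[count.index]
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + 10)
}

# One NAT gateway per availability zone, each in that zone's public subnet.
resource "aws_eip" "nat" {
  count  = length(var.azs)
  domain = "vpc"
}

resource "aws_nat_gateway" "main" {
  count         = length(var.azs)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  depends_on    = [aws_internet_gateway.main]
}

# Public route table: default route to the internet gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Private route tables: default route to the NAT gateway in the same zone.
resource "aws_route_table" "private" {
  count  = length(var.azs)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }
}

resource "aws_route_table_association" "private" {
  count          = length(var.azs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
```

Deriving subnet ranges with cidrsubnet keeps the address plan in one variable, so adding a zone to `azs` extends every tier consistently.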

Databases: RDS instances with the correct engine (PostgreSQL, MySQL), instance class, storage configuration, backup retention, multi-AZ deployment for high availability, parameter groups, and subnet groups that place the RDS instance in private subnets. RDS Aurora clusters for applications requiring greater scalability than single-instance RDS provides.
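For illustration, a PostgreSQL instance with the settings listed above might be declared as in this sketch. The sizes, names, engine version, and the tuned parameter are assumptions; subnet and security group IDs are passed in from elsewhere.

```hcl
variable "private_subnet_ids" {
  type = list(string)
}

variable "db_security_group_id" {
  type = string
}

# Subnet group that keeps the instance in private subnets.
resource "aws_db_subnet_group" "main" {
  name       = "app-db"
  subnet_ids = var.private_subnet_ids
}

resource "aws_db_parameter_group" "postgres" {
  name   = "app-postgres16"
  family = "postgres16"

  # Example tuning: log statements slower than 500 ms.
  parameter {
    name  = "log_min_duration_statement"
    value = "500"
  }
}

resource "aws_db_instance" "main" {
  identifier     = "app-db"
  engine         = "postgres"
  engine_version = "16"
  instance_class = "db.t4g.medium"

  allocated_storage     = 50
  max_allocated_storage = 200 # enables storage autoscaling up to 200 GiB
  storage_type          = "gp3"
  storage_encrypted     = true

  db_name  = "app"
  username = "app_admin"
  # Let RDS generate the master password and keep it in Secrets Manager,
  # so the password itself never appears in configuration.
  manage_master_user_password = true

  multi_az                = true
  backup_retention_period = 7
  db_subnet_group_name    = aws_db_subnet_group.main.name
  parameter_group_name    = aws_db_parameter_group.postgres.name
  vpc_security_group_ids  = [var.db_security_group_id]
  publicly_accessible     = false

  deletion_protection       = true
  skip_final_snapshot       = false
  final_snapshot_identifier = "app-db-final"
}
```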

Container infrastructure: ECS clusters with task definitions that specify container image, CPU and memory allocation, environment variables from Secrets Manager and Parameter Store, and logging configuration to CloudWatch. ECS services with the desired task count, load balancer integration, and the service discovery that allows tasks to find each other. EKS clusters for applications using Kubernetes — the managed control plane, the node groups with the instance types and scaling configuration the workload requires.
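A task definition along these lines is sketched below. It assumes the Fargate launch type, which the text above does not specify, and every name, port, and size is a placeholder.

```hcl
variable "region" {
  type = string
}

variable "image" {
  type = string # container image URI, for example from ECR
}

variable "execution_role_arn" {
  type = string # role ECS uses to pull the image and read the secret
}

variable "db_password_secret_arn" {
  type = string # Secrets Manager ARN injected as an environment variable
}

resource "aws_cloudwatch_log_group" "app" {
  name              = "/ecs/app"
  retention_in_days = 30
}

resource "aws_ecs_task_definition" "app" {
  family                   = "app"
  requires_compatibilities = ["FARGATE"] # assumption: Fargate, not EC2
  network_mode             = "awsvpc"
  cpu                      = 512
  memory                   = 1024
  execution_role_arn       = var.execution_role_arn

  container_definitions = jsonencode([
    {
      name         = "app"
      image        = var.image
      essential    = true
      portMappings = [{ containerPort = 8080, protocol = "tcp" }]

      # Resolved by ECS at task start; the value never appears in the task definition.
      secrets = [
        { name = "DB_PASSWORD", valueFrom = var.db_password_secret_arn }
      ]

      logConfiguration = {
        logDriver = "awslogs"
        options = {
          "awslogs-group"         = aws_cloudwatch_log_group.app.name
          "awslogs-region"        = var.region
          "awslogs-stream-prefix" = "app"
        }
      }
    }
  ])
}
```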

Caching and messaging: ElastiCache Redis clusters for session storage and application caching. SQS queues for decoupled asynchronous processing. SNS topics for fan-out notification. EventBridge rules for scheduled tasks and event routing.

Storage: S3 buckets with the access policies, lifecycle rules, versioning configuration, and encryption settings that each bucket's content requires. CloudFront distributions that serve S3 content and application responses through AWS's CDN with the cache behaviours, origin configurations, and SSL certificates that each distribution needs.
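With AWS provider 4.x and later, each bucket setting is its own resource. A sketch for a private, versioned, encrypted bucket follows; the bucket name and the 90-day retention are assumptions.

```hcl
resource "aws_s3_bucket" "uploads" {
  bucket = "example-app-uploads" # placeholder; bucket names are globally unique
}

resource "aws_s3_bucket_versioning" "uploads" {
  bucket = aws_s3_bucket.uploads.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "uploads" {
  bucket = aws_s3_bucket.uploads.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Block every form of public access at the bucket level.
resource "aws_s3_bucket_public_access_block" "uploads" {
  bucket                  = aws_s3_bucket.uploads.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Expire superseded object versions after 90 days.
resource "aws_s3_bucket_lifecycle_configuration" "uploads" {
  bucket     = aws_s3_bucket.uploads.id
  depends_on = [aws_s3_bucket_versioning.uploads]

  rule {
    id     = "expire-old-versions"
    status = "Enabled"

    filter {} # empty filter: the rule applies to every object

    noncurrent_version_expiration {
      noncurrent_days = 90
    }
  }
}
```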

IAM: IAM roles for EC2 instances, ECS tasks, and Lambda functions — the least-privilege policies that grant each compute resource exactly the permissions it needs. IAM roles for CI/CD pipeline access — the GitHub Actions OIDC provider configuration that allows GitHub Actions workflows to assume IAM roles without storing long-lived AWS credentials as secrets.
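The GitHub Actions OIDC pattern can be sketched as below. The organisation, repository, and role names are hypothetical, and the thumbprint variable is an assumption: older AWS provider versions require thumbprints, so check the provider documentation for the version in use.

```hcl
variable "github_oidc_thumbprints" {
  type        = list(string)
  description = "Certificate thumbprints for the GitHub OIDC endpoint."
}

resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = var.github_oidc_thumbprints
}

# Trust policy: only workflows on the main branch of one repository may assume the role.
data "aws_iam_policy_document" "github_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-infra:ref:refs/heads/main"] # hypothetical repo
    }
  }
}

resource "aws_iam_role" "github_terraform" {
  name               = "github-actions-terraform"
  assume_role_policy = data.aws_iam_policy_document.github_assume.json
}
```

The permissions policy attached to the role is omitted here; it would grant only what the pipeline's Terraform configuration needs to manage.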

Networking and security. The network architecture that provides isolation, security, and connectivity for production infrastructure.

Multi-tier networking: the three-tier architecture with public, application, and data tiers in separate subnets — load balancers in the public tier accessible from the internet, application servers in the application tier accessible only from the load balancers, databases in the data tier accessible only from the application servers. The security group rules that enforce this isolation — the application tier security group that allows ingress only from the load balancer security group, the data tier security group that allows ingress only from the application tier security group.
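The tier isolation described above comes from rules that reference security groups rather than IP ranges. A sketch, assuming the application listens on port 8080 and the database is PostgreSQL on 5432:

```hcl
variable "vpc_id" {
  type = string
}

resource "aws_security_group" "alb" {
  name   = "public-alb"
  vpc_id = var.vpc_id
}

resource "aws_security_group" "app" {
  name   = "app-tier"
  vpc_id = var.vpc_id
}

resource "aws_security_group" "db" {
  name   = "data-tier"
  vpc_id = var.vpc_id
}

# Public tier: HTTPS from the internet.
resource "aws_vpc_security_group_ingress_rule" "alb_https" {
  security_group_id = aws_security_group.alb.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "tcp"
  from_port         = 443
  to_port           = 443
}

# Load balancer may only talk to the application tier.
resource "aws_vpc_security_group_egress_rule" "alb_to_app" {
  security_group_id            = aws_security_group.alb.id
  referenced_security_group_id = aws_security_group.app.id
  ip_protocol                  = "tcp"
  from_port                    = 8080
  to_port                      = 8080
}

# Application tier: ingress only from the load balancer security group.
resource "aws_vpc_security_group_ingress_rule" "app_from_alb" {
  security_group_id            = aws_security_group.app.id
  referenced_security_group_id = aws_security_group.alb.id
  ip_protocol                  = "tcp"
  from_port                    = 8080
  to_port                      = 8080
}

# Application tier may reach the database; other outbound rules are omitted here.
resource "aws_vpc_security_group_egress_rule" "app_to_db" {
  security_group_id            = aws_security_group.app.id
  referenced_security_group_id = aws_security_group.db.id
  ip_protocol                  = "tcp"
  from_port                    = 5432
  to_port                      = 5432
}

# Data tier: ingress only from the application tier security group.
resource "aws_vpc_security_group_ingress_rule" "db_from_app" {
  security_group_id            = aws_security_group.db.id
  referenced_security_group_id = aws_security_group.app.id
  ip_protocol                  = "tcp"
  from_port                    = 5432
  to_port                      = 5432
}
```

Terraform removes the default allow-all egress rule from security groups it creates, which is why the egress rules are spelled out.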

VPC peering and Transit Gateway: connecting multiple VPCs — development, staging, production, shared services — with the routing and security group rules that allow appropriate inter-VPC traffic while maintaining isolation between environments.

AWS WAF: Web Application Firewall rules attached to CloudFront and Application Load Balancers — rate limiting rules, geographic restrictions, managed rule groups for OWASP top 10 protection, and custom rules for application-specific threat patterns.

Secrets and configuration management. The Terraform resources that provision and manage the secrets and configuration that applications consume at runtime.

AWS Secrets Manager: secrets provisioned through Terraform — the RDS master password, the API keys for external services, the application secrets — stored in Secrets Manager with rotation configuration and the IAM policies that grant application roles read access to specific secrets.
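A minimal sketch of provisioning one secret and granting a single role read access to it is shown below. The secret name and role variable are hypothetical, the hashicorp/random provider is an added assumption, and rotation configuration is omitted.

```hcl
variable "app_role_name" {
  type = string # name of the IAM role the application runs as
}

# Generate the value so that nobody has to type or commit it.
resource "random_password" "api_key" {
  length  = 32
  special = false
}

resource "aws_secretsmanager_secret" "api_key" {
  name = "app/prod/external-api-key" # hypothetical naming scheme
}

resource "aws_secretsmanager_secret_version" "api_key" {
  secret_id     = aws_secretsmanager_secret.api_key.id
  secret_string = random_password.api_key.result
}

# Read access to this one secret only, not to the whole account's secrets.
data "aws_iam_policy_document" "read_api_key" {
  statement {
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [aws_secretsmanager_secret.api_key.arn]
  }
}

resource "aws_iam_role_policy" "app_read_api_key" {
  name   = "read-external-api-key"
  role   = var.app_role_name
  policy = data.aws_iam_policy_document.read_api_key.json
}
```

Values generated this way are also recorded in the Terraform state, so the state backend needs encryption and tightly restricted access.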

AWS Systems Manager Parameter Store: application configuration values stored in Parameter Store — the parameters that are not sensitive enough for Secrets Manager but that should not be hardcoded in application configuration or stored in version control. Hierarchical parameter names (/application/environment/parameter-name) with the IAM policies that grant access by path prefix.
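The hierarchical naming and path-prefix access could look roughly like this. Variable names and the example parameter are illustrative.

```hcl
variable "region" {
  type = string
}

variable "application" {
  type = string
}

variable "environment" {
  type = string
}

variable "app_role_name" {
  type = string
}

data "aws_caller_identity" "current" {}

# Stored as /application/environment/parameter-name.
resource "aws_ssm_parameter" "log_level" {
  name  = "/${var.application}/${var.environment}/log-level"
  type  = "String"
  value = "info"
}

# Grant read access to everything under this application's environment prefix.
data "aws_iam_policy_document" "read_parameters" {
  statement {
    actions = ["ssm:GetParameter", "ssm:GetParameters"]
    resources = [
      "arn:aws:ssm:${var.region}:${data.aws_caller_identity.current.account_id}:parameter/${var.application}/${var.environment}/*"
    ]
  }
}

resource "aws_iam_role_policy" "app_read_parameters" {
  name   = "read-app-parameters"
  role   = var.app_role_name
  policy = data.aws_iam_policy_document.read_parameters.json
}
```

Because the policy is scoped by path, a staging role cannot read production parameters even when both live in the same account.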

Module development. Terraform modules for reusable infrastructure components — the module that encapsulates the complete pattern for deploying a web application tier (ALB, ECS service, security groups, IAM role, CloudWatch logs) into a single, parameterised, reusable unit.

Module structure: the main.tf with resource definitions, the variables.tf with input variable declarations and validation rules, the outputs.tf with the values the module exposes to its callers, and the versions.tf with the Terraform and provider version requirements. The module design that is parameterised for what varies between uses (instance count, instance size, application name) while encapsulating the patterns that should be consistent (security group rules, IAM permission boundaries, logging configuration).
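To make the file layout concrete, here is a deliberately tiny module sketch, shown as one listing with file names in comments. The variables, limits, and the single resource are invented for illustration; a real web-tier module would contain many more resources.

```hcl
# --- versions.tf ---
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# --- variables.tf: what varies between uses ---
variable "application_name" {
  type        = string
  description = "Short name used to label resources."

  validation {
    condition     = can(regex("^[a-z][a-z0-9-]{2,30}$", var.application_name))
    error_message = "application_name must be 3 to 31 characters: lowercase letters, digits, and hyphens, starting with a letter."
  }
}

variable "instance_count" {
  type        = number
  default     = 2
  description = "Number of application instances (used by resources omitted from this sketch)."

  validation {
    condition     = var.instance_count >= 1 && var.instance_count <= 20
    error_message = "instance_count must be between 1 and 20."
  }
}

# --- main.tf: what stays consistent ---
resource "aws_cloudwatch_log_group" "app" {
  name              = "/app/${var.application_name}"
  retention_in_days = 30 # fixed by the module, not exposed to callers
}

# --- outputs.tf ---
output "log_group_name" {
  description = "Log group that callers can attach alarms or subscriptions to."
  value       = aws_cloudwatch_log_group.app.name
}
```

Validation rules fail at plan time, so a bad input is rejected before any resource is touched.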

Module registry: private module registries in Terraform Cloud or in a Git repository with versioned tags that allow modules to be pinned to specific versions. The module versioning that allows modules to evolve without forcing all consumers to adopt changes immediately.
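Version pinning looks slightly different for Git sources and registry sources. Both forms are sketched below; the repository URL, organisation, module name, and inputs are hypothetical.

```hcl
# Git source: pin to a tag with ?ref=. The // separates the repository from the subdirectory.
module "web_from_git" {
  source = "git::https://github.com/example-org/terraform-modules.git//web-tier?ref=v1.4.0"

  application_name = "billing"
  instance_count   = 3
}

# Private registry source: pin with a version constraint.
module "web_from_registry" {
  source  = "app.terraform.io/example-org/web-tier/aws"
  version = "~> 1.4" # any 1.x release from 1.4 upward, never 2.0

  application_name = "billing"
  instance_count   = 3
}
```

The version argument is only valid for registry sources; Git sources express the pin inside the source string.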

Environment management. The Terraform configuration structure that manages multiple environments — development, staging, production — with the separation that prevents a mistake in the development environment from affecting production.

Workspace-based environment management: Terraform workspaces that use the same configuration with different variable values and separate state for each environment. The workspace approach that works well for environments with identical structure but different sizes or configurations.
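One common way to vary sizing by workspace is a lookup keyed on terraform.workspace. The workspace names and sizes below are assumptions.

```hcl
locals {
  # Name of the currently selected workspace, e.g. after `terraform workspace select prod`.
  environment = terraform.workspace

  sizes = {
    dev     = { instance_type = "t3.small", min_size = 1, max_size = 1 }
    staging = { instance_type = "t3.medium", min_size = 1, max_size = 2 }
    prod    = { instance_type = "m6i.large", min_size = 2, max_size = 6 }
  }

  # Fails with an invalid-index error in any workspace not listed above,
  # including "default", which guards against applying to the wrong workspace.
  size = local.sizes[local.environment]
}

output "selected_size" {
  value = local.size
}
```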

Directory-based environment management: separate Terraform root modules for each environment — environments/dev/, environments/staging/, environments/prod/ — that reference shared modules but maintain separate state. The directory approach that provides stronger isolation and allows environment configurations to diverge where needed.
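In the directory layout, each environment is a small root module with its own backend key. A sketch of what environments/prod/main.tf might contain, with hypothetical bucket, module path, and inputs:

```hcl
# environments/prod/main.tf
# Sibling directories environments/dev/ and environments/staging/ hold the
# same structure with their own state key and input values.

terraform {
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "prod/terraform.tfstate" # separate state per environment
    region         = "eu-west-1"
    dynamodb_table = "example-terraform-locks"
  }
}

provider "aws" {
  region = "eu-west-1"
}

# Shared module, production-sized inputs.
module "web" {
  source = "../../modules/web-tier"

  application_name = "billing"
  instance_count   = 4
}
```

Running Terraform from environments/prod/ can only read and write the production state, which is where the stronger isolation comes from.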

Terragrunt for DRY configuration. Terragrunt as a thin wrapper around Terraform that reduces the configuration repetition in multi-environment, multi-account infrastructures. The terragrunt.hcl configuration that defines the remote state configuration and provider settings once and inherits them across all child configurations. The dependency blocks that manage the ordering of Terraform applies across interdependent modules.
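A sketch of the Terragrunt pattern follows, with two files shown in one listing. The root file is named root.hcl here, following current Terragrunt guidance (older layouts use a root terragrunt.hcl). All paths, names, and the vpc_id output are assumptions.

```hcl
# --- live/root.hcl: defined once, included by every child ---
remote_state {
  backend = "s3"

  # Write a backend.tf into each child so the module itself stays backend-agnostic.
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "example-terraform-state"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-west-1"
    encrypt        = true
    dynamodb_table = "example-terraform-locks"
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "eu-west-1"
}
EOF
}

# --- live/prod/app/terragrunt.hcl: one small file per deployed unit ---
include "root" {
  path = find_in_parent_folders("root.hcl")
}

terraform {
  source = "../../../modules/web-tier"
}

# Applied after ../network; its outputs become inputs here.
dependency "network" {
  config_path = "../network"
}

inputs = {
  vpc_id = dependency.network.outputs.vpc_id
}
```

With path_relative_to_include(), the state key mirrors the directory path, so this unit's state lands at prod/app/terraform.tfstate without being written out by hand.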

CI/CD Integration for Infrastructure

Terraform in a CI/CD pipeline — the workflow that validates, plans, and applies infrastructure changes through automated processes rather than through manual terraform apply runs.

Plan on pull request. A CI job that runs terraform plan on every pull request to infrastructure configuration and posts the proposed infrastructure changes as a pull request comment, making the operational impact of each change visible to reviewers before the change is merged. The plan output shows what Terraform intends to create, modify, or destroy; saving the plan to a file and applying that exact file keeps the apply consistent with what was reviewed.

Apply on merge. Automated terraform apply triggered by merges to the main branch, or to an environment-specific branch for phased environment promotion. The apply executes in the CI environment with the appropriate AWS credentials, relies on the remote backend's state locking to prevent concurrent applies, and reports success or failure back to the repository.

Checkov and tfsec for security scanning. Static analysis tools that scan Terraform configurations for security misconfigurations — open security groups, unencrypted storage, missing logging, public S3 buckets, IAM policies that are too permissive. Security scanning in the CI pipeline that catches infrastructure security issues before they are deployed to production.

Technologies Used

Terraform — primary Infrastructure as Code tool
HCL (HashiCorp Configuration Language) — Terraform's configuration language
Terragrunt — DRY Terraform wrapper for multi-environment configurations
AWS provider — Amazon Web Services resource management
Google Cloud provider — Google Cloud Platform resource management
Azure provider — Microsoft Azure resource management
AWS S3 / DynamoDB — remote state backend and state locking
Terraform Cloud — managed Terraform execution and state storage
Checkov / tfsec — Terraform security scanning
GitHub Actions — CI/CD pipeline for automated plan and apply
AWS IAM / OIDC — keyless authentication for CI/CD Terraform execution
AWS ECS / EKS — container infrastructure provisioned by Terraform
AWS RDS / ElastiCache — managed database and cache infrastructure
AWS VPC / Security Groups — network infrastructure
AWS Secrets Manager / Parameter Store — secrets and configuration management

Infrastructure That Matches Its Definition

The operational promise of Terraform is that the infrastructure running in production matches the infrastructure defined in the repository. This promise is only kept if Terraform is the exclusive mechanism for making infrastructure changes — no manual console changes, no undocumented resource creation, no configuration drift. When infrastructure changes are needed, the change is made in the Terraform configuration, reviewed in a pull request, and applied through the CI/CD pipeline.

This discipline eliminates the category of operational problems that infrastructure drift causes: the incident that cannot be reproduced because the actual configuration differs from the assumed configuration, the security review that cannot trust its findings because the scanned configuration may not reflect reality, the environment rebuild that produces different behaviour because the original configuration was never captured.

Infrastructure Defined, Managed, and Auditable

Terraform infrastructure development that brings cloud infrastructure under version-controlled, reviewable, reproducible management — whether for new infrastructure built from scratch or for existing infrastructure migrated from manual configuration into Terraform management.