Terraform Modules Done Right

January 12, 2026 · 10 min read

Terraform modules are the primary mechanism for code reuse in infrastructure-as-code. But poorly designed modules create more problems than they solve — tightly coupled dependencies, impossible-to-debug plan outputs, and versioning nightmares. After maintaining a library of 40+ internal modules used across 15 teams, here is what I have learned about building modules that actually scale.

Module Structure That Works

Every module in our library follows the same file layout. Consistency matters more than cleverness when 50 engineers are consuming your modules:

modules/
  aws-ecs-service/
    main.tf          # Core resources
    variables.tf     # All input variables
    outputs.tf       # All outputs
    versions.tf      # Provider and terraform version constraints
    locals.tf        # Computed values and transformations
    README.md        # Auto-generated with terraform-docs
    examples/
      basic/
        main.tf
      with-alb/
        main.tf
    tests/
      service_test.go

The key rule: one module, one concern. Our aws-ecs-service module creates an ECS service, task definition, and associated IAM roles. It does not create the ECS cluster, the VPC, or the ALB. Those are separate modules composed together in the calling code. When a module tries to do too much, every consumer pays the complexity tax.

Input and Output Patterns

Good variable design is the difference between a module that is a joy to use and one that generates Slack messages at 4 PM asking "what does this parameter do?" Every variable should have a description, a type constraint, and a sensible default where applicable:

variable "service_name" {
  description = "Name of the ECS service. Used for resource naming and tagging."
  type        = string
  validation {
    condition     = length(var.service_name) <= 28
    error_message = "Service name must be 28 characters or fewer."
  }
}

variable "cpu" {
  description = "CPU units for the task (1 vCPU = 1024 units)."
  type        = number
  default     = 256
}

variable "environment" {
  description = "Map of environment variables to inject into the container."
  type        = map(string)
  default     = {}
  sensitive   = false
}

For outputs, expose everything a consumer might reasonably need, even if you do not need it today. Adding an output later is a non-breaking change. Removing one is a breaking change that forces a major version bump.

output "service_arn" {
  description = "ARN of the ECS service."
  # aws_ecs_service exposes the service ARN through its id attribute.
  value       = aws_ecs_service.this.id
}

output "task_definition_arn" {
  description = "ARN of the active task definition."
  value       = aws_ecs_task_definition.this.arn
}

output "security_group_id" {
  description = "ID of the security group attached to the service."
  value       = aws_security_group.service.id
}

Versioning with Git Tags

We version our modules using Git tags with semantic versioning. The module source reference pins to a specific tag:

module "api_service" {
  source = "git::https://github.com/org/terraform-modules.git//modules/aws-ecs-service?ref=v3.2.1"

  service_name = "api"
  cpu          = 512
  memory       = 1024
  environment  = { LOG_LEVEL = "info" }
}
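A pinned ref only helps if every caller actually uses one. As a rough sketch of how that could be checked, the Go snippet below tests whether a module source string ends in a semantic-version tag. The function name pinnedVersion and the regular expression are illustrative assumptions, not part of our tooling or of Terraform itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// semverRef matches a "?ref=vMAJOR.MINOR.PATCH" suffix on a Git module source.
var semverRef = regexp.MustCompile(`\?ref=(v\d+\.\d+\.\d+)$`)

// pinnedVersion returns the semver tag a module source is pinned to.
// ok is false when the source has no ref, or the ref is a branch or commit.
func pinnedVersion(source string) (version string, ok bool) {
	m := semverRef.FindStringSubmatch(source)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	sources := []string{
		"git::https://github.com/org/terraform-modules.git//modules/aws-ecs-service?ref=v3.2.1",
		"git::https://github.com/org/terraform-modules.git//modules/aws-ecs-service?ref=main",
		"git::https://github.com/org/terraform-modules.git//modules/aws-ecs-service",
	}
	for _, s := range sources {
		if v, ok := pinnedVersion(s); ok {
			fmt.Printf("pinned to %s\n", v)
		} else {
			fmt.Printf("not pinned: %s\n", s)
		}
	}
}
```

Only the first source counts as pinned. A branch name such as main, or a missing ref, is reported as unpinned.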

The versioning contract is strict:

Patch (v3.2.x): Bug fixes, documentation updates. No variable changes.
Minor (v3.x.0): New variables with defaults, new outputs, new optional resources behind feature flags.
Major (vX.0.0): Removed variables, renamed resources (causes destroy/recreate), changed default behavior.

We run a CI job on every PR that checks whether the change is backward-compatible. If any existing variable is removed or any resource address changes, the job flags it as a major version bump and requires an explicit migration guide in the PR description.
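The core of such a check can be sketched in a few lines of Go. This is not our actual CI job: the Surface type and requiredBump function are hypothetical, and the sketch assumes something else has already extracted the variable names, output names, and resource addresses from both versions. It also treats a new variable without a default, or a default that disappears, as breaking, which is an assumption that goes slightly beyond the contract listed above.

```go
package main

import "fmt"

// Surface is a hypothetical snapshot of a module's public interface at one
// version. How it gets populated (for example by parsing the .tf files) is
// out of scope for this sketch.
type Surface struct {
	Variables map[string]bool // variable name -> has a default
	Outputs   map[string]bool // output name -> present
	Resources map[string]bool // resource address -> present
}

// requiredBump compares two snapshots and returns "major", "minor", or "patch".
func requiredBump(prev, next Surface) string {
	// Anything removed (or renamed, which looks like a removal) is breaking.
	for name := range prev.Variables {
		if _, ok := next.Variables[name]; !ok {
			return "major"
		}
	}
	for name := range prev.Outputs {
		if !next.Outputs[name] {
			return "major"
		}
	}
	for addr := range prev.Resources {
		if !next.Resources[addr] {
			return "major"
		}
	}

	bump := "patch"
	for name, hasDefault := range next.Variables {
		hadDefault, existed := prev.Variables[name]
		switch {
		case !existed && !hasDefault:
			return "major" // assumption: a new required input breaks callers
		case existed && hadDefault && !hasDefault:
			return "major" // assumption: dropping a default breaks callers
		case !existed:
			bump = "minor" // new variable with a default
		}
	}
	for name := range next.Outputs {
		if !prev.Outputs[name] {
			bump = "minor" // new output
		}
	}
	for addr := range next.Resources {
		if !prev.Resources[addr] {
			bump = "minor" // new resource address
		}
	}
	return bump
}

func main() {
	prev := Surface{
		Variables: map[string]bool{"service_name": false, "cpu": true},
		Outputs:   map[string]bool{"service_arn": true},
		Resources: map[string]bool{"aws_ecs_service.this": true},
	}
	next := Surface{
		Variables: map[string]bool{"service_name": false, "cpu": true, "memory": true},
		Outputs:   map[string]bool{"service_arn": true, "task_definition_arn": true},
		Resources: map[string]bool{"aws_ecs_service.this": true},
	}
	fmt.Println(requiredBump(prev, next))
}
```

A comparison like this cannot see changed default values or changed behavior inside a resource, so those cases still need a human reviewer.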

Testing with Terratest

Untested Terraform modules are a liability. We use Terratest to write integration tests in Go that actually provision infrastructure, validate it, and tear it down:

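package test

import (
    "fmt"
    "testing"

    "github.com/gruntwork-io/terratest/modules/random"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestEcsServiceBasic(t *testing.T) {
    t.Parallel()

    terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
        TerraformDir: "../examples/basic",
        Vars: map[string]interface{}{
            // A random suffix keeps parallel runs from colliding on names.
            "service_name": fmt.Sprintf("test-%s", random.UniqueId()),
            // Passed as a map to match the module's map(string) input.
            "environment": map[string]string{"LOG_LEVEL": "debug"},
        },
    })

    // Register the teardown before applying so a failed apply is still cleaned up.
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)

    serviceArn := terraform.Output(t, terraformOptions, "service_arn")
    assert.Contains(t, serviceArn, "arn:aws:ecs")

    sgId := terraform.Output(t, terraformOptions, "security_group_id")
    assert.Regexp(t, `^sg-[a-f0-9]+$`, sgId)
}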
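The substring check on service_arn is deliberately loose. If a stricter assertion is wanted, one option is to split the ARN into its fields and check each one. The sketch below uses only the standard library; the ARN type and parseARN function are illustrative names, and the account ID and service path in the example are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// ARN holds the colon-separated fields of an AWS ARN:
// arn:partition:service:region:account-id:resource
type ARN struct {
	Partition string
	Service   string
	Region    string
	AccountID string
	Resource  string
}

// parseARN splits an ARN into its six fields. The resource part may itself
// contain colons, so the split stops after the fifth separator.
func parseARN(s string) (ARN, error) {
	parts := strings.SplitN(s, ":", 6)
	if len(parts) != 6 || parts[0] != "arn" {
		return ARN{}, fmt.Errorf("not a valid ARN: %q", s)
	}
	return ARN{
		Partition: parts[1],
		Service:   parts[2],
		Region:    parts[3],
		AccountID: parts[4],
		Resource:  parts[5],
	}, nil
}

func main() {
	// Example value only; the account ID and names are placeholders.
	a, err := parseARN("arn:aws:ecs:us-east-1:123456789012:service/prod/api")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(a.Service, a.Region, strings.HasPrefix(a.Resource, "service/"))
}
```

In a Terratest test, the parsed fields could then be compared with assert.Equal against the expected service and region instead of relying on a substring match.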

These tests run in a dedicated AWS account on every PR. They take 8-12 minutes per module, so we run them in parallel across modules. The cost is roughly $150/month for the test account, which is trivial compared to the cost of a bad infrastructure change reaching production.

Avoiding Module Spaghetti

The most common anti-pattern I see is modules that call other modules that call other modules, creating a dependency chain four or five levels deep. When something goes wrong, the Terraform plan output is incomprehensible, and narrowing an apply with terraform apply -target becomes impractical because every resource address is buried under several module prefixes.

Our rule: modules should be flat. A module can create resources and use data sources, but it should not call other modules. Composition happens at the root level:

# environments/production/main.tf — flat composition
module "vpc" {
  source = "git::...//modules/aws-vpc?ref=v2.1.0"
  cidr   = "10.0.0.0/16"
}

module "ecs_cluster" {
  source     = "git::...//modules/aws-ecs-cluster?ref=v1.4.0"
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnet_ids
}

module "api_service" {
  source     = "git::...//modules/aws-ecs-service?ref=v3.2.1"
  cluster_id = module.ecs_cluster.cluster_id
  subnet_ids = module.vpc.private_subnet_ids
}

This pattern keeps the dependency graph shallow and explicit. Every module receives its dependencies through variables, never by reaching into another module's internals.
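A rule like "no module calls inside modules" is easy to forget and fairly easy to lint for. The Go sketch below scans Terraform source text for lines that open a module block. It is a line-based heuristic under stated assumptions, not a real HCL parser, and the name nestedModuleCalls is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// moduleBlock matches the opening line of a module block, e.g. `module "vpc" {`.
// Commented-out lines do not match because the line must start with "module".
var moduleBlock = regexp.MustCompile(`^\s*module\s+"([^"]+)"\s*\{`)

// nestedModuleCalls returns the names of all module blocks found in src.
// For a reusable module that follows the flat rule, the result is empty.
func nestedModuleCalls(src string) []string {
	var names []string
	for _, line := range strings.Split(src, "\n") {
		if m := moduleBlock.FindStringSubmatch(line); m != nil {
			names = append(names, m[1])
		}
	}
	return names
}

func main() {
	// Hypothetical contents of a module's main.tf that breaks the rule.
	src := `
resource "aws_ecs_service" "this" {
  name = var.service_name
}

module "vpc" {
  source = "../aws-vpc"
}
`
	if calls := nestedModuleCalls(src); len(calls) > 0 {
		fmt.Printf("module calls other modules: %v\n", calls)
	}
}
```

Such a check would run against files under modules/ only, since root configurations are exactly where module blocks are supposed to live.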

Real-World Impact

Before we standardized our module practices, provisioning a new microservice took 2-3 days of Terraform wrangling. Teams would copy-paste from other services, creating drift and inconsistencies. After building the module library with proper testing and versioning, a new service goes from zero to deployed in under 30 minutes using a single module call with 8-10 variables.

The best Terraform module is one that a new engineer can use correctly after reading a 10-line example, without ever looking at the module's source code.

Infrastructure-as-code is only as good as the abstractions you build on top of it. Invest in your module library the same way you invest in your application frameworks. The returns compound across every team and every deployment.