Terraform Multi-Layer Architecture: Bootstrap, Foundation, Platform, Application

One of the most common mistakes in Terraform projects is putting everything into a single state file. A monolithic state works fine at the start — until a botched terraform apply on the application layer accidentally destroys the VPC, taking down every environment at once. Or a developer who only needs to deploy a Lambda function gets blocked because someone else is applying a database change and the state file is locked.

Multi-layer architecture solves this by splitting infrastructure into independent stacks with clear ownership boundaries. Each layer has its own state file, its own IAM role, and its own deployment cadence. Changes to the application layer never touch the networking layer. A mistake in the platform layer cannot accidentally terminate EC2 instances in the application layer.

This article walks through a four-layer model — Bootstrap, Foundation, Platform, Application — explains the chicken-and-egg bootstrapping problem that trips up almost every new Terraform project, and provides the patterns and code to implement it correctly from the start.


The Chicken and the Egg Problem

Every Terraform best-practice guide says to store state remotely — in an S3 bucket with a DynamoDB table for locking. This is correct. But it creates an immediate contradiction:

You need Terraform to create the S3 bucket. But you need the S3 bucket to store Terraform’s state.

This is the bootstrap problem, and it has exactly one clean solution: start with local state, create the bucket, then migrate.

The wrong approaches:

  • Create the bucket manually in the console — now you have unmanaged infrastructure that Terraform does not know about, and you will eventually drift.
  • Import the manually-created bucket — better, but still fragile if the import step is not documented.
  • Use terraform apply with local backend forever — state lives on a developer’s laptop. One disk wipe and you lose all state.

The right approach:

# bootstrap/backend.tf — start with local state
terraform {
  backend "local" {
    path = "terraform.tfstate"
  }
}

Apply the bootstrap layer once with local state to create the S3 bucket, DynamoDB table, and KMS key. Then replace the local backend in backend.tf with an S3 backend that points to the newly created bucket:

# bootstrap/backend.tf: replaces the local backend after the initial apply
terraform {
  backend "s3" {
    bucket         = "my-project-terraform-state"
    key            = "bootstrap/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID"
    dynamodb_table = "my-project-terraform-locks"
  }
}

With the new backend block in place, re-run init so Terraform copies the existing local state into the bucket:

# Migrate the local state to the S3 backend
terraform init -migrate-state

From this point forward, all four layers store their state in S3. The bootstrap layer is applied rarely — only when the foundational state infrastructure itself needs to change.


The Four-Layer Model

[Figure: Layer dependency and remote state flow]

Each layer builds on the one below it. The dependency is one-directional — a layer can read outputs from layers below it, but never from layers above it. This constraint is what keeps the architecture clean.

Layer 3 — Application   (deploys frequently — every release)
    ↑ reads from
Layer 2 — Platform      (deploys occasionally — shared services)
    ↑ reads from
Layer 1 — Foundation    (deploys rarely — networking)
    ↑ reads from
Layer 0 — Bootstrap     (deployed once — state infrastructure)

Directory Structure

terraform/
├── bootstrap/          # Layer 0
│   ├── main.tf
│   ├── s3.tf
│   ├── dynamodb.tf
│   ├── kms.tf
│   ├── iam.tf
│   ├── outputs.tf
│   └── backend.tf
│
├── foundation/         # Layer 1
│   ├── main.tf
│   ├── vpc.tf
│   ├── dns.tf
│   ├── acm.tf
│   ├── outputs.tf
│   └── backend.tf
│
├── platform/           # Layer 2
│   ├── main.tf
│   ├── rds.tf
│   ├── sqs.tf
│   ├── ecr.tf
│   ├── secrets.tf
│   ├── outputs.tf
│   └── backend.tf
│
└── application/        # Layer 3
    ├── main.tf
    ├── ecs.tf
    ├── lambda.tf
    ├── api_gateway.tf
    ├── cloudfront.tf
    ├── outputs.tf
    └── backend.tf

Each layer is a completely independent Terraform root module. You cd into it and run terraform init, terraform plan, terraform apply independently.
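As a minimal sketch of what that independence looks like, each layer's backend.tf could point at the same bucket and lock table but use its own key. The values below reuse the placeholders from the bootstrap backend; only the key differs.

```hcl
# foundation/backend.tf (sketch; bucket, table, and key ARN are placeholders)
terraform {
  backend "s3" {
    bucket         = "my-project-terraform-state"
    key            = "foundation/terraform.tfstate" # unique per layer
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:ACCOUNT_ID:key/KEY_ID"
    dynamodb_table = "my-project-terraform-locks"
  }
}
```

Because the key is different in every layer, each layer gets its own state object and its own lock entry, so an apply in one layer does not block the others.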


Layer 0 — Bootstrap

The bootstrap layer creates the infrastructure that all other layers depend on for their own state management. It is applied once and rarely touched.

bootstrap/s3.tf — State Bucket

# bootstrap/s3.tf
resource "aws_s3_bucket" "state" {
  bucket = "${var.project_name}-terraform-state"
}

resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.state.arn
    }
    bucket_key_enabled = true
  }
}

resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_policy" "state" {
  bucket = aws_s3_bucket.state.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "DenyHTTP"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource  = ["${aws_s3_bucket.state.arn}/*", aws_s3_bucket.state.arn]
        Condition = {
          Bool = { "aws:SecureTransport" = "false" }
        }
      }
    ]
  })
}

bootstrap/dynamodb.tf — Lock Table

# bootstrap/dynamodb.tf
resource "aws_dynamodb_table" "locks" {
  name         = "${var.project_name}-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }

  server_side_encryption {
    enabled     = true
    kms_key_arn = aws_kms_key.state.arn
  }

  point_in_time_recovery {
    enabled = true
  }
}

bootstrap/kms.tf — KMS Encryption Key

# bootstrap/kms.tf
resource "aws_kms_key" "state" {
  description             = "KMS key for Terraform state encryption"
  deletion_window_in_days = 30
  enable_key_rotation     = true

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "Enable IAM User Permissions"
        Effect    = "Allow"
        Principal = { AWS = "arn:aws:iam::${var.account_id}:root" }
        Action    = "kms:*"
        Resource  = "*"
      }
    ]
  })
}

resource "aws_kms_alias" "state" {
  name          = "alias/${var.project_name}-terraform-state"
  target_key_id = aws_kms_key.state.key_id
}
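One possible simplification, not part of the configuration above: the account ID can be looked up with the aws_caller_identity data source instead of being passed in as var.account_id. The local name account_root_arn is hypothetical.

```hcl
# bootstrap/kms.tf (sketch): derive the account ID from the active credentials
data "aws_caller_identity" "current" {}

locals {
  # Hypothetical local; could stand in for the hand-built ARN in the key policy
  account_root_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"
}
```

This removes one variable that could otherwise be set to the wrong account by mistake.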

bootstrap/iam.tf — OIDC Provider & CI Roles

# bootstrap/iam.tf — OIDC provider for GitHub Actions (no static keys)
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

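resource "aws_iam_role" "foundation_ci" {
  name = "${var.project_name}-foundation-ci"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        # Only accept tokens issued for AWS STS
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
        # Any branch or environment in this repository may assume the role;
        # narrow the pattern to tighten it
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:${var.github_org}/${var.github_repo}:*"
        }
      }
    }]
  })
}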
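Only the foundation role is shown above. A sketch of how the remaining layer roles might be created follows. The names ci_layers, layer_ci, and layer_ci_role_arns are hypothetical, and the sub values assume that jobs run either on the main branch or in a GitHub environment called production.

```hcl
# bootstrap/iam.tf (sketch): CI roles for the platform and application layers
locals {
  ci_layers = toset(["platform", "application"])
}

resource "aws_iam_role" "layer_ci" {
  for_each = local.ci_layers

  name = "${var.project_name}-${each.key}-ci"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.github.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
          # Jobs on main, plus jobs bound to the "production" environment
          "token.actions.githubusercontent.com:sub" = [
            "repo:${var.github_org}/${var.github_repo}:ref:refs/heads/main",
            "repo:${var.github_org}/${var.github_repo}:environment:production",
          ]
        }
      }
    }]
  })
}

# Map of layer name to role ARN, for wiring into CI secrets
output "layer_ci_role_arns" {
  value = { for layer, role in aws_iam_role.layer_ci : layer => role.arn }
}
```

Each role still needs its own permissions policy; the trust policy only controls who may assume it.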

bootstrap/outputs.tf — Layer Outputs

# bootstrap/outputs.tf
output "state_bucket_name" {
  value = aws_s3_bucket.state.id
}

output "state_lock_table" {
  value = aws_dynamodb_table.locks.name
}

output "kms_key_arn" {
  value     = aws_kms_key.state.arn
  sensitive = true
}

output "oidc_provider_arn" {
  value = aws_iam_openid_connect_provider.github.arn
}

Layer 1 — Foundation

The foundation layer creates the network infrastructure that everything else sits inside. It reads the KMS key ARN from Layer 0 via terraform_remote_state.

foundation/main.tf — Remote State Reference

# foundation/main.tf — remote state reference
data "terraform_remote_state" "bootstrap" {
  backend = "s3"
  config = {
    bucket = var.state_bucket
    key    = "bootstrap/terraform.tfstate"
    region = var.aws_region
  }
}

foundation/vpc.tf — VPC & Endpoints

# foundation/vpc.tf
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.project_name}-vpc"
  cidr = var.vpc_cidr

  azs             = ["${var.aws_region}a", "${var.aws_region}b", "${var.aws_region}c"]
  private_subnets = var.private_subnet_cidrs
  public_subnets  = var.public_subnet_cidrs

  enable_nat_gateway     = true
  single_nat_gateway     = false   # HA: one per AZ
  enable_dns_hostnames   = true
  enable_dns_support     = true

  # VPC Flow Logs to S3
  enable_flow_log                      = true
  create_flow_log_cloudwatch_iam_role  = false
  flow_log_destination_type            = "s3"
  flow_log_destination_arn             = aws_s3_bucket.flow_logs.arn

  tags = local.common_tags
}

# VPC Endpoints — keep traffic inside AWS network
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = module.vpc.vpc_id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = module.vpc.private_route_table_ids
}

resource "aws_vpc_endpoint" "ecr_dkr" {
  vpc_id              = module.vpc.vpc_id
  service_name        = "com.amazonaws.${var.aws_region}.ecr.dkr"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = module.vpc.private_subnets
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true
}
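The ecr_dkr endpoint references aws_security_group.vpc_endpoints, which is not shown above. A minimal sketch of that group, assuming HTTPS from anywhere inside the VPC is acceptable, might look like the following. The ecr_api endpoint is an addition rather than part of the original file: private image pulls generally need the ECR API endpoint alongside ecr.dkr and the S3 gateway endpoint.

```hcl
# foundation/vpc.tf (sketch): security group shared by interface endpoints
resource "aws_security_group" "vpc_endpoints" {
  name        = "${var.project_name}-vpc-endpoints"
  description = "HTTPS from inside the VPC to interface endpoints"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
    description = "HTTPS from the VPC CIDR"
  }

  tags = local.common_tags
}

# Assumed companion endpoint for ECR API calls (authentication, image metadata)
resource "aws_vpc_endpoint" "ecr_api" {
  vpc_id              = module.vpc.vpc_id
  service_name        = "com.amazonaws.${var.aws_region}.ecr.api"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = module.vpc.private_subnets
  security_group_ids  = [aws_security_group.vpc_endpoints.id]
  private_dns_enabled = true
}
```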

foundation/outputs.tf — Layer Outputs

# foundation/outputs.tf
output "vpc_id" {
  value = module.vpc.vpc_id
}

output "private_subnet_ids" {
  value = module.vpc.private_subnets
}

output "public_subnet_ids" {
  value = module.vpc.public_subnets
}

output "certificate_arn" {
  value = aws_acm_certificate.main.arn
}

Layer 2 — Platform

The platform layer creates shared services. It reads VPC outputs from Layer 1 and KMS outputs from Layer 0.

platform/main.tf — Remote State References

# platform/main.tf — reads from both lower layers
data "terraform_remote_state" "bootstrap" {
  backend = "s3"
  config = {
    bucket = var.state_bucket
    key    = "bootstrap/terraform.tfstate"
    region = var.aws_region
  }
}

data "terraform_remote_state" "foundation" {
  backend = "s3"
  config = {
    bucket = var.state_bucket
    key    = "foundation/terraform.tfstate"
    region = var.aws_region
  }
}

locals {
  vpc_id             = data.terraform_remote_state.foundation.outputs.vpc_id
  private_subnet_ids = data.terraform_remote_state.foundation.outputs.private_subnet_ids
  kms_key_arn        = data.terraform_remote_state.bootstrap.outputs.kms_key_arn
}

platform/rds.tf — RDS Database

# platform/rds.tf
resource "aws_db_instance" "main" {
  identifier        = "${var.project_name}-db"
  engine            = "postgres"
  engine_version    = "16.2"
  instance_class    = var.db_instance_class
  allocated_storage = 20
  storage_type      = "gp3"

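  db_name  = var.db_name
  username = var.db_username
  # A master password is still required; let RDS generate it and keep it in Secrets Manager
  manage_master_user_password   = true
  master_user_secret_kms_key_id = local.kms_key_arn
  # Application connections use IAM authentication rather than the master password
  iam_database_authentication_enabled = true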

  # Security
  storage_encrypted   = true
  kms_key_id          = local.kms_key_arn
  deletion_protection = true
  skip_final_snapshot = false
  final_snapshot_identifier = "${var.project_name}-db-final"

  # Network — private only
  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  publicly_accessible    = false

  # Monitoring
  monitoring_interval             = 60
  monitoring_role_arn             = aws_iam_role.rds_monitoring.arn
  performance_insights_enabled    = true
  performance_insights_kms_key_id = local.kms_key_arn
  enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]

  backup_retention_period = 7
  backup_window           = "03:00-04:00"
  maintenance_window      = "sun:04:00-sun:05:00"

  tags = local.common_tags
}

resource "aws_security_group" "rds" {
  name   = "${var.project_name}-rds"
  vpc_id = local.vpc_id

  # Only allow access from app security group — no 0.0.0.0/0
  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
    description     = "PostgreSQL from application layer"
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
    description = "Allow all outbound"
  }
}
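The database references aws_db_subnet_group.main, which is not shown above. A sketch, assuming the database should live in the private subnets exported by the foundation layer:

```hcl
# platform/rds.tf (sketch): subnet group built from foundation outputs
resource "aws_db_subnet_group" "main" {
  name       = "${var.project_name}-db"
  subnet_ids = local.private_subnet_ids

  tags = local.common_tags
}
```

The subnet IDs come from local.private_subnet_ids, so the platform layer never needs to know how the foundation layer carved up the VPC.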

platform/sqs.tf — SQS Queue & DLQ

# platform/sqs.tf
resource "aws_sqs_queue" "main" {
  name                       = "${var.project_name}-queue"
  message_retention_seconds  = 86400
  visibility_timeout_seconds = 300
  kms_master_key_id          = local.kms_key_arn

  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.dlq.arn
    maxReceiveCount     = 3
  })
}

resource "aws_sqs_queue" "dlq" {
  name              = "${var.project_name}-dlq"
  kms_master_key_id = local.kms_key_arn
}

# Alert as soon as any message lands in the DLQ
resource "aws_cloudwatch_metric_alarm" "dlq_messages" {
  alarm_name          = "${var.project_name}-dlq-not-empty"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 1
  metric_name         = "ApproximateNumberOfMessagesVisible"
  namespace           = "AWS/SQS"
  period              = 60
  statistic           = "Sum"
  threshold           = 0
  alarm_actions       = [var.ops_sns_topic_arn]

  dimensions = {
    QueueName = aws_sqs_queue.dlq.name
  }
}
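The application layer later reads db_endpoint and queue_url from this layer's state, so the platform layer has to export them. The original does not show platform/outputs.tf; a sketch of what it might contain, with queue_arn added on the assumption that downstream IAM policies need it:

```hcl
# platform/outputs.tf (sketch)
output "db_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true
}

output "queue_url" {
  value = aws_sqs_queue.main.url
}

output "queue_arn" {
  value = aws_sqs_queue.main.arn
}
```

Only values declared as outputs are visible through terraform_remote_state, so this file is effectively the layer's public interface.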

Layer 3 — Application

The application layer is deployed most frequently — on every release. It reads from all three layers below it.

application/main.tf — Remote State References

# application/main.tf
data "terraform_remote_state" "bootstrap" {
  backend = "s3"
  config  = { bucket = var.state_bucket, key = "bootstrap/terraform.tfstate", region = var.aws_region }
}

data "terraform_remote_state" "foundation" {
  backend = "s3"
  config  = { bucket = var.state_bucket, key = "foundation/terraform.tfstate", region = var.aws_region }
}

data "terraform_remote_state" "platform" {
  backend = "s3"
  config  = { bucket = var.state_bucket, key = "platform/terraform.tfstate", region = var.aws_region }
}

locals {
  vpc_id             = data.terraform_remote_state.foundation.outputs.vpc_id
  private_subnet_ids = data.terraform_remote_state.foundation.outputs.private_subnet_ids
  certificate_arn    = data.terraform_remote_state.foundation.outputs.certificate_arn
  db_endpoint        = data.terraform_remote_state.platform.outputs.db_endpoint
  queue_url          = data.terraform_remote_state.platform.outputs.queue_url
  # Used by the Lambda IAM policy below; assumes the platform layer exports queue_arn
  queue_arn          = data.terraform_remote_state.platform.outputs.queue_arn
  kms_key_arn        = data.terraform_remote_state.bootstrap.outputs.kms_key_arn
}

application/lambda.tf — Lambda Function & IAM

# application/lambda.tf
resource "aws_lambda_function" "api" {
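  function_name    = "${var.project_name}-api"
  role             = aws_iam_role.lambda_exec.arn
  handler          = "index.handler"
  runtime          = "nodejs22.x"
  filename         = data.archive_file.lambda.output_path
  source_code_hash = data.archive_file.lambda.output_base64sha256 # redeploy when the archive changes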

  # VPC configuration — Lambda in private subnets
  vpc_config {
    subnet_ids         = local.private_subnet_ids
    security_group_ids = [aws_security_group.lambda.id]
  }

  # No secrets in environment variables — use Secrets Manager
  environment {
    variables = {
      DB_SECRET_ARN = var.db_secret_arn   # ARN only, not the secret value
      QUEUE_URL     = local.queue_url
      REGION        = var.aws_region
    }
  }

  # Encrypt environment variables
  kms_key_arn = local.kms_key_arn

  tracing_config {
    mode = "Active"
  }

  tags = local.common_tags
}

resource "aws_iam_role" "lambda_exec" {
  name = "${var.project_name}-lambda-exec"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "lambda_exec" {
  name = "${var.project_name}-lambda-exec"
  role = aws_iam_role.lambda_exec.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      # Least privilege — only what this Lambda actually needs
      {
        Effect   = "Allow"
        Action   = ["secretsmanager:GetSecretValue"]
        Resource = [var.db_secret_arn]
      },
      {
        Effect   = "Allow"
        Action   = ["sqs:SendMessage", "sqs:ReceiveMessage", "sqs:DeleteMessage"]
        Resource = [local.queue_arn]
      },
      {
        Effect   = "Allow"
        Action   = ["kms:Decrypt", "kms:GenerateDataKey"]
        Resource = [local.kms_key_arn]
      },
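      # VPC networking (ENI management cannot be scoped to a single resource ARN)
      {
        Effect   = "Allow"
        Action   = ["ec2:CreateNetworkInterface", "ec2:DescribeNetworkInterfaces", "ec2:DeleteNetworkInterface",
                    "ec2:AssignPrivateIpAddresses", "ec2:UnassignPrivateIpAddresses"]
        Resource = ["*"]
      },
      # CloudWatch Logs and X-Ray (tracing_config is set to Active above)
      {
        Effect   = "Allow"
        Action   = ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents",
                    "xray:PutTraceSegments", "xray:PutTelemetryRecords"]
        Resource = ["*"]
      }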
    ]
  })
}

application/cloudfront.tf — CloudFront Distribution

# application/cloudfront.tf
resource "aws_cloudfront_distribution" "main" {
  origin {
    domain_name              = aws_lb.main.dns_name
    origin_id                = "alb"
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }

  enabled         = true
  is_ipv6_enabled = true
  aliases         = [var.domain_name]

  default_cache_behavior {
    allowed_methods        = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    cached_methods         = ["GET", "HEAD"]
    target_origin_id       = "alb"
    viewer_protocol_policy = "redirect-to-https"

    forwarded_values {
      query_string = true
      cookies { forward = "none" }
    }
  }

  restrictions {
    geo_restriction { restriction_type = "none" }
  }

  viewer_certificate {
    acm_certificate_arn      = local.certificate_arn
    ssl_support_method       = "sni-only"
    minimum_protocol_version = "TLSv1.2_2021"
  }

  web_acl_id = aws_wafv2_web_acl.main.arn

  logging_config {
    bucket          = "${var.logs_bucket}.s3.amazonaws.com"
    prefix          = "cloudfront/"
    include_cookies = false
  }
}

Connecting Layers: The Remote State Pattern

The terraform_remote_state data source is the glue between layers. It reads the outputs of a lower layer’s state file from S3.

Rule: never hardcode values between layers. If Layer 2 needs the VPC ID, it reads it from Layer 1’s state — it does not have the VPC ID string pasted into a .tfvars file.

# Always reference layer outputs, never hardcode
# WRONG:
vpc_id = "vpc-0abc123def456"   # hardcoded — will drift

# RIGHT:
vpc_id = data.terraform_remote_state.foundation.outputs.vpc_id

Mark sensitive outputs explicitly:

# Any output that contains credentials, keys, or ARNs that reveal account structure
output "db_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true   # redacted in plan and apply output; the value is still stored in the state file
}

output "kms_key_arn" {
  value     = aws_kms_key.state.arn
  sensitive = true
}

Security Best Practices

[Figure: Security controls per layer]

Enforce Least-Privilege IAM Per Layer

Each layer gets its own CI/CD IAM role scoped to only what that layer manages. The application layer role cannot touch VPC resources. The foundation layer role cannot touch databases.

# A foundation CI role that can ONLY manage VPC, Route53, and ACM
resource "aws_iam_role_policy" "foundation_ci" {
  name = "foundation-ci-policy"
  role = aws_iam_role.foundation_ci.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = ["ec2:*Vpc*", "ec2:*Subnet*", "ec2:*RouteTable*",
                  "ec2:*SecurityGroup*", "ec2:*NetworkAcl*",
                  "ec2:*InternetGateway*", "ec2:*NatGateway*"]
        Resource = "*"
        Condition = {
          StringEquals = { "aws:RequestedRegion" = var.aws_region }
        }
      },
      {
        Effect   = "Allow"
        Action   = ["route53:*", "acm:*"]
        Resource = "*"
      }
      # Notably absent: rds:*, ecs:*, lambda:*, s3:* (general)
    ]
  })
}
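The policy above covers the resources the layer manages, but the role also needs access to the state backend before Terraform can plan anything. A sketch of a companion policy follows, assuming it lives in the bootstrap layer next to the bucket, lock table, and key, and that state keys follow the {layer}/terraform.tfstate pattern used earlier. The resource name foundation_ci_state is hypothetical.

```hcl
# bootstrap/iam.tf (sketch): state backend access for the foundation CI role
resource "aws_iam_role_policy" "foundation_ci_state" {
  name = "foundation-ci-state"
  role = aws_iam_role.foundation_ci.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid      = "ListStateBucket"
        Effect   = "Allow"
        Action   = ["s3:ListBucket"]
        Resource = [aws_s3_bucket.state.arn]
      },
      {
        Sid      = "ReadWriteOwnState"
        Effect   = "Allow"
        Action   = ["s3:GetObject", "s3:PutObject"]
        Resource = ["${aws_s3_bucket.state.arn}/foundation/*"]
      },
      {
        Sid      = "ReadLowerLayerState"
        Effect   = "Allow"
        Action   = ["s3:GetObject"]
        Resource = ["${aws_s3_bucket.state.arn}/bootstrap/*"]
      },
      {
        Sid      = "StateLocking"
        Effect   = "Allow"
        Action   = ["dynamodb:DescribeTable", "dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem"]
        Resource = [aws_dynamodb_table.locks.arn]
      },
      {
        Sid      = "StateEncryption"
        Effect   = "Allow"
        Action   = ["kms:Decrypt", "kms:GenerateDataKey"]
        Resource = [aws_kms_key.state.arn]
      }
    ]
  })
}
```

Scoping write access to the layer's own key prefix means a compromised foundation pipeline can read the bootstrap state it depends on but cannot overwrite another layer's state.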

Lock Provider and Module Versions

# versions.tf — pin everything
terraform {
  required_version = ">= 1.9.0, < 2.0.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.50"
    }
  }
}

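Version constraints alone still allow drift within the permitted range, so commit .terraform.lock.hcl to the repository. Run terraform init -upgrade in a controlled environment and review the resulting diff in .terraform.lock.hcl before merging provider upgrades. The lock file covers providers only; module versions are pinned by the version argument on each module block.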

Tag Everything

# common/locals.tf — shared across all layers
locals {
  common_tags = {
    Project     = var.project_name
    Environment = var.environment
    Layer       = var.layer          # "bootstrap" | "foundation" | "platform" | "application"
    ManagedBy   = "terraform"
    Repository  = var.github_repo
  }
}

Tags enable cost allocation per layer, security policy enforcement via SCPs, and automated compliance checks.
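One way to make sure no resource is missed, sketched here as an option rather than something the original configuration does, is the AWS provider's default_tags block. It applies the same tag map to every taggable resource the provider creates, including those inside modules.

```hcl
# providers.tf (sketch): apply common tags at the provider level
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = local.common_tags
  }
}
```

With this in place, the per-resource tags = local.common_tags arguments become optional and are only needed where a resource adds tags of its own.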

Use Sensitive Variables for Secrets — Never .tfvars in Git

variable "db_password" {
  type      = string
  sensitive = true   # never printed in plan output
}

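In CI/CD, pass sensitive values via environment variables (TF_VAR_db_password) sourced from a secrets manager, never from files committed to the repository. The sensitive flag only redacts CLI output; the value is still written to the state file, which is one reason the state bucket is KMS-encrypted and locked down.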

Run tfsec Before Every Apply

# .github/workflows/terraform.yml
jobs:
  security:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        layer: [bootstrap, foundation, platform, application]
    steps:
      - uses: actions/checkout@v4
      - uses: aquasecurity/tfsec-action@v1.0.0
        with:
          working_directory: ./terraform/${{ matrix.layer }}
          additional_args: --minimum-severity HIGH

  apply:
    needs: security
    # ... rest of apply job

CI/CD Pipeline Per Layer

Each layer is deployed by its own GitHub Actions workflow, triggered when files in its directory change.

# .github/workflows/deploy-foundation.yml
name: Foundation

on:
  push:
    branches: [main]
    paths:
      - 'terraform/foundation/**'

jobs:
  security:
    uses: ./.github/workflows/tfsec.yml
    with:
      directory: terraform/foundation

  plan:
    needs: security
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.FOUNDATION_CI_ROLE_ARN }}
          aws-region: us-east-1
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "~1.9"
      - run: terraform init
        working-directory: terraform/foundation
      - run: terraform plan -out=tfplan
        working-directory: terraform/foundation
      - uses: actions/upload-artifact@v4
        with:
          name: tfplan
          path: terraform/foundation/tfplan

  apply:
    needs: plan
    runs-on: ubuntu-latest
    environment: production   # requires manual approval in GitHub
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.FOUNDATION_CI_ROLE_ARN }}
          aws-region: us-east-1
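      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "~1.9"
      # The saved plan does not include the .terraform directory, so init again
      - run: terraform init
        working-directory: terraform/foundation
      - uses: actions/download-artifact@v4
        with:
          name: tfplan
          path: terraform/foundation
      - run: terraform apply tfplan
        working-directory: terraform/foundation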

The environment: production setting makes the apply job wait for manual approval, provided the production environment has required reviewers configured in the repository settings. This is the human gate that prevents an unattended terraform apply on infrastructure layers. Use it on Layers 0, 1, and 2. Layer 3 (application) can auto-apply on merge if your test coverage is high.


Common Pitfalls

Circular dependencies. If Layer 2 reads from Layer 3 and Layer 3 reads from Layer 2, neither can be applied first. Enforce the one-directional rule strictly — no layer reads from a layer above it.

Stale remote state. The terraform_remote_state data source is read on every plan and apply, so a downstream layer does see new Foundation outputs (e.g., a newly added subnet), but only the next time that layer runs. Nothing triggers the Platform or Application pipelines when Foundation changes, so their resources keep the old values until then. After a significant foundation change, run terraform plan in each downstream layer, in order, and apply whatever diff it reports.

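State file proliferation. With four layers per environment and three environments (dev/staging/prod), you have 12 state files. The earlier examples use the single-environment form, {layer}/terraform.tfstate; once there is more than one environment, use a consistent key convention: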

{environment}/{layer}/terraform.tfstate
# e.g.:
prod/bootstrap/terraform.tfstate
prod/foundation/terraform.tfstate
staging/platform/terraform.tfstate
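A sketch of how a downstream layer might follow this convention when reading a lower layer's state, assuming an environment variable is declared in each layer (var.environment already appears in the common tags):

```hcl
# platform/main.tf (sketch): environment-prefixed remote state key
variable "environment" {
  type        = string
  description = "Deployment environment, e.g. dev, staging, or prod"
}

data "terraform_remote_state" "foundation" {
  backend = "s3"
  config = {
    bucket = var.state_bucket
    key    = "${var.environment}/foundation/terraform.tfstate"
    region = var.aws_region
  }
}
```

Backend blocks cannot reference variables, so the layer's own key has to be supplied another way, for example with terraform init -backend-config="key=prod/platform/terraform.tfstate".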

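Destroying in the wrong order. When tearing down an environment, destroy in reverse order: Application → Platform → Foundation → Bootstrap. Destroying Foundation first while Application still exists will leave orphaned resources with broken references. Bootstrap goes last, and because its bucket holds its own state, migrate that state back to a local backend before destroying it. The versioned bucket must also be emptied, including old object versions, before S3 will delete it.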


Key Takeaways

Multi-layer Terraform architecture is not about adding complexity — it is about containing it. Each layer is small enough to reason about, blast radius is bounded, and teams can work on different layers in parallel without stepping on each other.

The bootstrap problem is real and has one correct solution: start with local state, create the backend infrastructure, migrate, and never look back. Every other approach creates unmanaged drift or fragile manual steps.

Apply these three principles and the architecture stays manageable as the project grows:

  1. One state file per layer — isolated blast radius, independent lock
  2. One IAM role per layer — least privilege enforced by design
  3. Remote state for cross-layer references — no hardcoded values, no drift