Infrastructure as Code

Devops core v1.0.0

Overview

Infrastructure as Code (IaC) defines cloud resources (compute, networking, storage, databases) as version-controlled, declarative configuration files. IaC makes environments reproducible, reduces configuration drift by routing every change through reviewed code, and lets infrastructure changes flow through the same CI/CD pipeline as application code. Terraform is the default multi-cloud IaC tool; AWS CDK is used for AWS-native projects.


Key Concepts

IaC Tool Comparison

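| Feature       | Terraform            | AWS CDK                   | CloudFormation       | Pulumi                     |
|---------------|----------------------|---------------------------|----------------------|----------------------------|
| Language      | HCL                  | TypeScript/Python/Java    | YAML/JSON            | TypeScript/Python/Go       |
| Cloud Support | Multi-cloud          | AWS only                  | AWS only             | Multi-cloud                |
| State         | Remote (S3)          | CloudFormation stack      | CloudFormation stack | Pulumi Cloud               |
| Maturity      | Very high            | High                      | High                 | Medium                     |
| Best For      | Multi-cloud, modular | AWS-native, complex logic | AWS simple setups    | Developers who prefer code |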

Terraform Project Structure

infrastructure/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── production/
├── modules/
│   ├── networking/         # VPC, subnets, security groups
│   ├── compute/            # ECS, EC2, Lambda
│   ├── database/           # RDS, ElastiCache
│   ├── messaging/          # SQS, SNS, MSK
│   └── monitoring/         # CloudWatch, alarms
├── backend.tf              # Remote state configuration
└── versions.tf             # Provider version constraints
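
As a rough sketch of how this layout fits together, an environment directory could compose the shared modules as shown below. The module inputs (environment, vpc_cidr, vpc_id, private_subnet_ids) and the networking outputs are assumptions made for illustration; the real module interfaces are not shown in this document.

```hcl
# environments/dev/main.tf (illustrative sketch; module inputs and outputs are assumed)
module "networking" {
  source = "../../modules/networking"

  environment = var.environment
  vpc_cidr    = var.vpc_cidr
}

module "database" {
  source = "../../modules/database"

  environment        = var.environment
  vpc_id             = module.networking.vpc_id             # assumed module output
  private_subnet_ids = module.networking.private_subnet_ids # assumed module output
}

# environments/dev/variables.tf
variable "environment" {
  type        = string
  description = "Environment name, e.g. dev, staging, production"
}

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for the environment's VPC"
}

# environments/dev/terraform.tfvars (example values only)
# environment = "dev"
# vpc_cidr    = "10.0.0.0/16"
```

Staging and production would then reuse the same modules with their own terraform.tfvars, so the environments differ only in values and not in structure.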

Best Practices

  1. Use remote state: store Terraform state in S3 and use a DynamoDB table for state locking
  2. Modularize resources: build reusable modules for networking, compute, and database
  3. Use workspaces or directories per environment: keep dev, staging, and production state separate
  4. Pin provider versions: required_providers { aws = { version = "~> 5.0" } }
  5. Use variables and outputs: avoid hardcoded values; parameterize anything that differs between environments
  6. Plan before apply: always review terraform plan output before applying
  7. Tag all resources: environment, team, cost-center, and managed-by tags
  8. Enable drift detection: run terraform plan on a schedule to detect manual changes
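
The remote state, version pinning, and tagging practices above might look roughly like the following. This is a minimal sketch: the bucket name, lock table name, region, required_version, and the team and cost_center variables are placeholders, not values taken from this document.

```hcl
# versions.tf
terraform {
  required_version = ">= 1.5.0" # assumed minimum; adjust to your toolchain

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # allows 5.x updates, blocks 6.0
    }
  }
}

# backend.tf
# Backend blocks cannot reference variables, so values are literals here.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state" # hypothetical bucket
    key            = "dev/terraform.tfstate"
    region         = "eu-central-1"            # placeholder region
    dynamodb_table = "example-terraform-locks" # hypothetical lock table
    encrypt        = true
  }
}

# providers.tf
# default_tags applies these tags to every taggable resource the provider creates.
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Team        = var.team        # assumed variable
      CostCenter  = var.cost_center # assumed variable
      ManagedBy   = "terraform"
    }
  }
}
```

Recent Terraform releases also support lock files stored in S3 itself (the use_lockfile argument), so check the S3 backend documentation for the version in use. For scheduled drift detection, terraform plan -detailed-exitcode is convenient in CI: it exits with 0 when there are no changes, 1 on error, and 2 when the plan contains changes.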

Code Examples

✅ Good: Terraform Module for ECS Service

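# modules/compute/ecs-service/main.tf
# Excerpt: aws_security_group.service and aws_lb_target_group.this are defined elsewhere in this module.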
resource "aws_ecs_service" "this" {
  name            = var.service_name
  cluster         = var.cluster_id
  task_definition = aws_ecs_task_definition.this.arn
  desired_count   = var.desired_count
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = var.private_subnet_ids
    security_groups = [aws_security_group.service.id]
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.this.arn
    container_name   = var.service_name
    container_port   = var.container_port
  }

  deployment_circuit_breaker {
    enable   = true
    rollback = true
  }

  tags = merge(var.tags, {
    Service   = var.service_name
    ManagedBy = "terraform"
  })
}

resource "aws_ecs_task_definition" "this" {
  family                   = var.service_name
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = var.cpu
  memory                   = var.memory
  execution_role_arn       = var.execution_role_arn
  task_role_arn            = var.task_role_arn

  container_definitions = jsonencode([{
    name      = var.service_name
    image     = var.container_image
    essential = true
    portMappings = [{
      containerPort = var.container_port
      protocol      = "tcp"
    }]
    logConfiguration = {
      logDriver = "awslogs"
      options = {
        "awslogs-group"         = var.log_group_name
        "awslogs-region"        = var.aws_region
        "awslogs-stream-prefix" = var.service_name
      }
    }
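    # Assumes curl is installed in the image and the app exposes Spring Boot's /actuator/health endpoint
    healthCheck = {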
      command     = ["CMD-SHELL", "curl -f http://localhost:${var.container_port}/actuator/health || exit 1"]
      interval    = 30
      timeout     = 5
      retries     = 3
      startPeriod = 60
    }
    environment = var.environment_variables
    secrets     = var.secrets
  }])
}

# variables.tf
variable "service_name" {
  type        = string
  description = "Name of the ECS service"
}

variable "container_image" {
  type        = string
  description = "Docker image URI"
}

variable "desired_count" {
  type        = number
  default     = 2
  description = "Number of tasks to run"
}

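variable "tags" {
  type        = map(string)
  default     = {}
  description = "Resource tags"
}

# The remaining inputs referenced in main.tf (cluster_id, private_subnet_ids,
# container_port, cpu, memory, execution_role_arn, task_role_arn, log_group_name,
# aws_region, environment_variables, secrets) are declared the same way.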
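
To show how such a module is typically consumed, here is a hedged sketch of a call from an environment directory, followed by an outputs file the module could expose. All concrete values (service name, image URI, ports, log group, region) and the var.* references in the caller are placeholders, and outputs.tf is an assumed file that does not appear in the original module.

```hcl
# environments/dev/main.tf (sketch; every value below is a placeholder)
module "orders_service" {
  source = "../../modules/compute/ecs-service"

  service_name    = "orders"
  container_image = "123456789012.dkr.ecr.eu-central-1.amazonaws.com/orders:1.4.2"
  container_port  = 8080
  desired_count   = 2
  cpu             = 512  # Fargate CPU units
  memory          = 1024 # MiB; must be a valid pairing with cpu on Fargate

  cluster_id         = var.ecs_cluster_id
  private_subnet_ids = var.private_subnet_ids
  execution_role_arn = var.execution_role_arn
  task_role_arn      = var.task_role_arn
  log_group_name     = "/ecs/dev/orders"
  aws_region         = "eu-central-1"

  # Shapes follow the ECS container definition: name/value and name/valueFrom pairs.
  environment_variables = [
    { name = "SPRING_PROFILES_ACTIVE", value = "dev" }
  ]
  secrets = [
    { name = "DB_PASSWORD", valueFrom = var.db_password_secret_arn }
  ]

  tags = { Environment = "dev", Team = "platform" }
}

# modules/compute/ecs-service/outputs.tf (assumed file)
output "service_name" {
  description = "Name of the created ECS service"
  value       = aws_ecs_service.this.name
}

output "task_definition_arn" {
  description = "ARN of the task definition revision in use"
  value       = aws_ecs_task_definition.this.arn
}
```

Passing secrets as valueFrom references keeps credentials out of the Terraform code and state inputs; only the ARN of the secret is parameterized.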

✅ Good: AWS CDK Stack

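// lib/app-stack.ts
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as rds from 'aws-cdk-lib/aws-rds';
import { Construct } from 'constructs';

// Stack properties; isProduction toggles Multi-AZ and deletion protection.
export interface AppStackProps extends cdk.StackProps {
  isProduction: boolean;
}

export class AppStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props: AppStackProps) {
    super(scope, id, props);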

    const vpc = new ec2.Vpc(this, 'AppVpc', {
      maxAzs: 2,
      natGateways: 1,
    });

    const cluster = new ecs.Cluster(this, 'AppCluster', { vpc });

    const database = new rds.DatabaseInstance(this, 'AppDb', {
      engine: rds.DatabaseInstanceEngine.postgres({
        version: rds.PostgresEngineVersion.VER_16,
      }),
      instanceType: ec2.InstanceType.of(
        ec2.InstanceClass.T4G, ec2.InstanceSize.MEDIUM
      ),
      vpc,
      multiAz: props.isProduction,
      deletionProtection: props.isProduction,
    });
  }
}

❌ Bad: IaC Anti-Patterns

# Hardcoded values everywhere
resource "aws_instance" "web" {
  ami           = "ami-abc123"        # Hardcoded AMI
  instance_type = "t2.micro"         # No variable
  # No tags, no security group, no state management
}
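
For contrast, one possible repair of the snippet above is sketched here. The AMI name filter, the default instance type, and the variable names are assumptions chosen for illustration; the point is that the AMI is looked up, the sizing and networking are parameterized, and the instance is tagged.

```hcl
# Look up the AMI instead of hardcoding an ID.
# The name pattern is an assumed filter for Amazon Linux 2023 on x86_64.
data "aws_ami" "al2023" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023.*-x86_64"]
  }
}

variable "instance_type" {
  type        = string
  default     = "t3.micro"
  description = "EC2 instance type for the web server"
}

variable "subnet_id" {
  type        = string
  description = "Subnet to launch the instance in"
}

variable "security_group_ids" {
  type        = list(string)
  description = "Security groups attached to the instance"
}

variable "tags" {
  type        = map(string)
  default     = {}
  description = "Common resource tags"
}

resource "aws_instance" "web" {
  ami                    = data.aws_ami.al2023.id
  instance_type          = var.instance_type
  subnet_id              = var.subnet_id
  vpc_security_group_ids = var.security_group_ids

  tags = merge(var.tags, {
    Name      = "web"
    ManagedBy = "terraform"
  })
}
```

Because most_recent = true resolves at plan time, a newly published AMI will show up as a planned instance replacement; teams that need stable rollouts may prefer passing a reviewed AMI ID in as a variable.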

Anti-Patterns

  1. Manual resource creation: ClickOps in the console creates untracked resources
  2. Local state: terraform.tfstate on a local filesystem; use S3 + DynamoDB instead
  3. Hardcoded values: AMI IDs, subnets, credentials in code
  4. No modules: copy-pasted, duplicated blocks instead of reusable modules
  5. Missing tags: untagged resources are impossible to track or cost-allocate
  6. Applying without a plan: terraform apply without reviewing the plan output

Testing Strategies

  • terraform validate: syntax and internal-consistency checks that do not contact remote state or provider APIs
  • terraform plan: preview changes before apply
  • tflint: linting for Terraform best practices
  • checkov: static security and policy-as-code scanning
  • terratest: Go-based integration tests for Terraform modules
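
Of the tools above, tflint is itself configured in HCL. A minimal .tflint.hcl might look like the sketch below; the choice of rules and the AWS ruleset version are assumptions, so pin whichever release has been verified locally.

```hcl
# .tflint.hcl (illustrative sketch)

# Bundled Terraform-language ruleset with its recommended preset.
plugin "terraform" {
  enabled = true
  preset  = "recommended"
}

# AWS-specific ruleset; the version shown is an example pin, not a recommendation.
plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

# Opt-in rules that back the practices in this document.
rule "terraform_naming_convention" {
  enabled = true
}

rule "terraform_documented_variables" {
  enabled = true
}
```

Running tflint --init downloads the configured plugins, after which tflint can run in CI alongside terraform validate and checkov.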
