AWS for Backend Engineers
March 31, 2026 · 9 min read
Lesson 12 / 15

12. Serverless vs Containers on AWS

“Should we use Lambda or containers?” is one of the most common architectural questions on AWS. The answer is almost never a clean either/or — it depends on your traffic patterns, cost constraints, team skills, and operational maturity.

This lesson gives you a decision framework backed by real numbers, not vibes.

The Options on AWS

AWS offers six main compute options for backend workloads:

| Service | What It Is | You Manage |
|---|---|---|
| Lambda | Function-as-a-Service | Code only |
| Fargate | Serverless containers | Container images, task definitions |
| ECS on EC2 | Container orchestration on your instances | Container images, EC2 instances, capacity |
| EKS on Fargate | Managed Kubernetes, serverless nodes | Kubernetes manifests, pod specs |
| EKS on EC2 | Managed Kubernetes on your instances | Kubernetes manifests, EC2 instances, node groups |
| App Runner | Fully managed container platform | Source code or container image |

The spectrum runs from least operational burden (Lambda) to most control (EKS on EC2).

The Comparison Matrix

[Diagram: Serverless vs Containers Architecture]

Cold Starts vs Warm Performance

Cold starts are Lambda’s most discussed tradeoff. Here are real-world numbers:

| Runtime | Cold Start (p99) | Warm Invocation (p50) |
|---|---|---|
| Python 3.12 | 200-400 ms | 5-15 ms |
| Node.js 20 | 150-350 ms | 3-10 ms |
| Java 21 (SnapStart) | 200-500 ms | 5-20 ms |
| Java 21 (no SnapStart) | 3-8 seconds | 5-20 ms |
| .NET 8 (Native AOT) | 200-400 ms | 3-10 ms |
| Rust | 10-30 ms | 1-3 ms |

Container cold starts are different — they are about pulling the image and starting the process:

| Scenario | Cold Start |
|---|---|
| Fargate (new task) | 30-60 seconds |
| ECS on EC2 (image cached) | 5-15 seconds |
| ECS on EC2 (image pull) | 15-45 seconds |
| App Runner (from cold) | 15-30 seconds |

Key insight: Lambda cold starts are per-request, but they are short. Container cold starts are rare (only during scaling or deployment), but they are long. For APIs that need consistent latency, use Lambda Provisioned Concurrency or containers.

Lambda Provisioned Concurrency

If cold starts are unacceptable, Provisioned Concurrency keeps functions initialized:

# SAM template with provisioned concurrency
MyApiFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.12
    Handler: app.handler
    MemorySize: 512
    Architectures: [arm64]
    AutoPublishAlias: live
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 50
    DeploymentPreference:
      Type: Linear10PercentEvery1Minute

But provisioned concurrency is expensive — you pay for it whether invocations happen or not. At that point, you are paying container-like prices for Lambda, which brings us to cost.
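
How expensive? A rough sketch for the 50 warm instances of the 512 MB arm64 function above, assuming a Provisioned Concurrency rate of about $0.0000033334 per GB-second for ARM in us-east-1 (an assumption — verify against current AWS pricing):

```python
# Monthly cost of keeping 50 provisioned-concurrency instances warm
# for a 512 MB arm64 function. The per-GB-second price is an assumed
# us-east-1 ARM rate; check current AWS pricing for your region.
PC_PRICE_PER_GB_S = 0.0000033334
SECONDS_PER_MONTH = 30 * 24 * 3600  # 2,592,000

warm_cost = 50 * (512 / 1024) * SECONDS_PER_MONTH * PC_PRICE_PER_GB_S
print(f"${warm_cost:.2f}/month before a single invocation")
```

That works out to roughly $216/month just to keep the functions initialized, before paying for any actual invocations.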

Cost Crossover Analysis

This is where most comparisons get hand-wavy. Let us use real numbers.

Scenario: REST API Handling 10 Million Requests/Month

Lambda (512 MB, 100ms average duration, ARM):

Requests:     10,000,000 × $0.20/million = $2.00
Duration:     10,000,000 × 0.1s × 0.5 GB × $0.0000133/GB-s = $6.65
API Gateway:  10,000,000 × $1.00/million = $10.00
Total:        $18.65/month

Fargate (1 vCPU, 2 GB RAM, 2 tasks for HA):

Compute:  2 tasks × 730 hours × ($0.04048/vCPU-hr + $0.004445/GB-hr × 2 GB)
        = 2 × 730 × $0.04937 = $72.08/month
ALB:      $16.20 base + ~$5 LCU = $21.20/month
Total:    $93.28/month

At 10M requests/month, Lambda is 5x cheaper.

Scenario: REST API Handling 500 Million Requests/Month

Lambda (512 MB, 100ms average, ARM):

Requests:     500,000,000 × $0.20/million = $100.00
Duration:     500,000,000 × 0.1s × 0.5 GB × $0.0000133/GB-s = $332.50
API Gateway:  500,000,000 × $1.00/million = $500.00
Total:        $932.50/month

Fargate (2 vCPU, 4 GB, 10 tasks with auto-scaling):

Compute:  10 tasks × 730 hours × ($0.04048 × 2 + $0.004445 × 4)
        = 10 × 730 × $0.09874 = $720.80/month
ALB:      $16.20 + ~$50 LCU = $66.20/month
Total:    $787.00/month

At 500M requests/month, Fargate is cheaper — primarily because API Gateway costs scale linearly. With an ALB, the load balancer cost barely changes.

The Crossover Point

For a typical API workload:

Lambda + API Gateway becomes more expensive than
Fargate + ALB at roughly 100-200 million requests/month.

But this shifts dramatically if you:

  • Use Lambda Function URLs instead of API Gateway (free, but fewer features)
  • Run longer-duration Lambda functions (>1 second average)
  • Need more memory per function (>1 GB)

# Quick cost estimator (prices from the scenarios above; the Fargate
# side assumes a fixed 1 vCPU / 2 GB task size)
import math

def estimate_monthly_cost(requests_millions, avg_duration_ms, memory_mb):
    # Lambda costs
    lambda_request = requests_millions * 0.20
    lambda_duration = (requests_millions * 1_000_000 *
                      (avg_duration_ms / 1000) *
                      (memory_mb / 1024) *
                      0.0000133)
    apigw = requests_millions * 1.00
    lambda_total = lambda_request + lambda_duration + apigw

    # Fargate costs (auto-estimate task count)
    rps = (requests_millions * 1_000_000) / (30 * 24 * 3600)
    tasks = max(2, math.ceil(rps / 20))  # ~20 RPS per task: a conservative assumption with headroom
    vcpu, memory_gb = 1, 2
    fargate_compute = tasks * 730 * (0.04048 * vcpu + 0.004445 * memory_gb)
    alb = 16.20 + (tasks * 3)  # rough LCU estimate
    fargate_total = fargate_compute + alb

    return {
        'lambda_total': round(lambda_total, 2),
        'fargate_total': round(fargate_total, 2),
        'cheaper': 'Lambda' if lambda_total < fargate_total else 'Fargate',
        'tasks_needed': tasks
    }

# Examples
print(estimate_monthly_cost(10, 100, 512))
# {'lambda_total': 18.65, 'fargate_total': 94.28, 'cheaper': 'Lambda', 'tasks_needed': 2}

print(estimate_monthly_cost(500, 100, 512))
# {'lambda_total': 932.5, 'fargate_total': 406.6, 'cheaper': 'Fargate', 'tasks_needed': 10}

Scaling Behavior

Lambda Scaling

Lambda scales per-request. Each concurrent invocation gets its own execution environment:

Concurrent executions = Requests per second × Average duration in seconds

Example: 1,000 RPS × 0.2 seconds = 200 concurrent executions
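
That formula is just Little's law, and it is worth a two-line sanity check (a sketch — the function name is mine):

```python
def concurrent_executions(rps: float, avg_duration_s: float) -> float:
    # Little's law: work in flight = arrival rate x service time
    return rps * avg_duration_s

print(concurrent_executions(1000, 0.2))   # 200.0 concurrent executions
print(concurrent_executions(1000, 0.05))  # cut duration to 50 ms -> 50.0
```

Note the lever this exposes: halving your average duration halves your concurrency (and your duration bill) at the same traffic level.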

Since late 2023, Lambda scales each function by up to 1,000 additional concurrent executions every 10 seconds. The account-level default limit is 1,000 concurrent executions across all functions (request an increase for production).

# Reserve concurrency to prevent one function from starving others
import boto3

lambda_client = boto3.client('lambda')

lambda_client.put_function_concurrency(
    FunctionName='critical-payment-processor',
    ReservedConcurrentExecutions=200  # Guaranteed 200, capped at 200
)

Container Scaling

ECS/Fargate scaling is based on CloudWatch metrics and takes 1-3 minutes to add new tasks:

{
  "TargetTrackingScalingPolicyConfiguration": {
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }
}

For traffic spikes, containers need headroom — you run more capacity than needed so you can absorb spikes while auto-scaling catches up. Lambda does not need headroom because it scales per-request.
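
A back-of-the-envelope version of that headroom calculation (a sketch — the 30% headroom and per-task throughput are assumptions you would tune from load tests):

```python
import math

def tasks_with_headroom(peak_rps: float, rps_per_task: float,
                        headroom: float = 0.3) -> int:
    # Provision for peak traffic plus a buffer, so spikes are absorbed
    # while auto-scaling (1-3 minutes) catches up; keep >= 2 tasks for HA.
    return max(2, math.ceil(peak_rps * (1 + headroom) / rps_per_task))

print(tasks_with_headroom(2000, rps_per_task=500))  # 2,600 RPS of capacity -> 6 tasks
```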

Deployment and Debugging

Lambda Deployment

Lambda deployments are atomic — a new version replaces the old one. With aliases and traffic shifting, you can do canary deployments:

DeploymentPreference:
  Type: Canary10Percent5Minutes
  Alarms:
    - !Ref ApiErrorAlarm
    - !Ref ApiLatencyAlarm

Debugging Lambda is harder than containers. You cannot SSH into a Lambda function. Your tools are:

  • CloudWatch Logs (structured JSON logging is essential)
  • X-Ray tracing
  • Lambda Insights (memory, CPU, network metrics)
  • Local testing with SAM CLI (sam local invoke)

Container Deployment

ECS supports rolling updates, blue/green (via CodeDeploy), and circuit breaker:

{
  "deploymentConfiguration": {
    "maximumPercent": 200,
    "minimumHealthyPercent": 100,
    "deploymentCircuitBreaker": {
      "enable": true,
      "rollback": true
    }
  }
}

Debugging containers is significantly easier:

  • aws ecs execute-command — open an interactive shell in a running container (via SSM Session Manager, not SSH)
  • Standard APM tools work (Datadog, New Relic, Prometheus)
  • Familiar Docker tooling for local development
  • Log directly to stdout, ECS sends to CloudWatch

# Shell into a running Fargate container
aws ecs execute-command \
  --cluster production \
  --task abc123def456 \
  --container api \
  --interactive \
  --command "/bin/sh"

ECS Concepts: Task Definitions and Services

If you choose containers, you need to understand ECS’s core abstractions:

ECS Cluster
├── Service A
│   ├── Task 1
│   │   ├── Container: API
│   │   └── Container: Sidecar
│   ├── Task 2
│   └── Task 3
└── Service B
    ├── Task 1
    └── Task 2

Task Definition — A blueprint for your containers (like a docker-compose file). Specifies image, CPU, memory, ports, environment variables:

{
  "family": "order-service",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789:role/order-service-role",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/order-service:v1.2.3",
      "portMappings": [
        {"containerPort": 8080, "protocol": "tcp"}
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/order-service",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "api"
        }
      },
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789:secret:prod/db-password"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}
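
As a sizing sanity check, here is what this task definition (cpu "512" = 0.5 vCPU, memory "1024" = 1 GB) costs to run around the clock, using the Fargate rates from the cost section above:

```python
# Fargate cost of one always-on 0.5 vCPU / 1 GB task
hourly = 0.5 * 0.04048 + 1 * 0.004445  # vCPU-hour rate + GB-hour rate
monthly = hourly * 730
print(f"${monthly:.2f}/month per task")  # about $18/month
```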

Service — Ensures a desired number of tasks are running. Handles placement, rolling updates, and integration with load balancers.

Fargate vs EC2 Launch Type:

| Aspect | Fargate | EC2 |
|---|---|---|
| Management | No instances to manage | You manage the EC2 fleet |
| Cost | ~20% premium over equivalent EC2 | Cheaper (especially with Spot/RI) |
| Scaling | Per-task | Must scale instances + tasks |
| GPU | Not supported | Supported |
| Storage | 20 GB ephemeral (expandable to 200 GB) | Instance storage available |

Start with Fargate. Move to EC2 launch type only if you need GPU, specific instance types, or significant cost savings at scale.
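
The "~20% premium" row is easy to verify against a comparable instance. A sketch assuming an m5.large (2 vCPU, 8 GB) at roughly $0.096/hour on-demand in us-east-1 (an assumed price — check current rates):

```python
# Fargate vs m5.large at the same 2 vCPU / 8 GB shape
fargate_hourly = 2 * 0.04048 + 8 * 0.004445  # = 0.11652/hour
m5_large_hourly = 0.096                       # assumed on-demand price
premium = fargate_hourly / m5_large_hourly - 1
print(f"Fargate premium: {premium:.0%}")      # about 21%
```

With Spot or Reserved Instance discounts on the EC2 side, the gap widens well beyond 21%, which is where "significant cost savings at scale" comes from.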

EKS — When You Need Kubernetes

EKS makes sense when:

  • Your team already knows Kubernetes
  • You need multi-cloud portability
  • You need the Kubernetes ecosystem (service mesh, custom operators, GitOps)
  • You are running 50+ microservices

EKS does not make sense when:

  • Your team does not know Kubernetes (the learning curve is steep)
  • You have fewer than 10 services
  • You are a small team (Kubernetes is operationally expensive)

The EKS control plane costs $0.10/hour ($73/month) before you add any worker nodes. With Fargate pods or managed node groups, total costs climb quickly.
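
A minimal baseline makes the point concrete. A sketch assuming a managed node group of three t3.medium instances at roughly $0.0416/hour on-demand (an assumed us-east-1 price):

```python
# EKS baseline bill before any application pods run
control_plane = 0.10 * 730        # $73/month, fixed per cluster
nodes = 3 * 0.0416 * 730          # three t3.medium worker nodes
print(f"${control_plane + nodes:.2f}/month")  # roughly $164/month
```

Compare that to Lambda, where the same idle workload costs nothing.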

App Runner — The Simple Option

App Runner is the “I just want to run this container” service:

aws apprunner create-service \
  --service-name order-api \
  --source-configuration '{
    "ImageRepository": {
      "ImageIdentifier": "123456789.dkr.ecr.us-east-1.amazonaws.com/order-service:latest",
      "ImageRepositoryType": "ECR",
      "ImageConfiguration": {
        "Port": "8080",
        "RuntimeEnvironmentVariables": {
          "NODE_ENV": "production"
        }
      }
    },
    "AutoDeploymentsEnabled": true
  }' \
  --instance-configuration '{
    "Cpu": "1024",
    "Memory": "2048"
  }' \
  --auto-scaling-configuration-arn "arn:aws:apprunner:us-east-1:123456789:autoscalingconfiguration/default"

App Runner handles TLS, load balancing, auto-scaling, and deployments. When traffic stops, it scales down to idle provisioned instances, billing only memory (not CPU) while paused. The tradeoffs: limited configuration options, only basic VPC integration (via VPC connectors), no sidecar containers, and limited observability.

Use App Runner for: Internal tools, simple APIs, prototypes, and services where operational simplicity matters more than fine-grained control.

Hybrid Patterns — The Real-World Architecture

Most production architectures use both Lambda and containers:

┌─────────────────────────────────────────────────────────────┐
│                        Architecture                         │
├─────────────────────────┬───────────────────────────────────┤
│  Event Processing       │  API Layer                        │
│  ─────────────────      │  ─────────                        │
│  S3 → Lambda            │  ALB → ECS/Fargate               │
│  SQS → Lambda           │  (long-running, consistent       │
│  EventBridge → Lambda   │   latency, WebSocket support)    │
│  (bursty, short-lived,  │                                   │
│   event-driven)         │  Background Workers               │
│                         │  ─────────────────                │
│  Scheduled Tasks        │  ECS/Fargate                      │
│  ───────────────        │  (long-running processes,         │
│  EventBridge → Lambda   │   queue consumers, cron jobs)     │
│  (cron-like, <15 min)   │                                   │
└─────────────────────────┴───────────────────────────────────┘

This pattern uses each service where it excels:

  • Lambda for event-driven, bursty workloads with short execution times
  • ECS/Fargate for APIs that need consistent latency and long-running background workers

# Lambda for S3 event processing
import urllib.parse

import boto3

ecs = boto3.client('ecs')  # create the client once, outside the handler

def handle_upload(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # S3 event keys are URL-encoded (e.g. spaces arrive as '+')
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])

        # Process the uploaded file
        process_image(bucket, key)

        # Trigger downstream ECS task for heavy processing
        ecs.run_task(
            cluster='production',
            taskDefinition='heavy-processing',  # family name resolves to latest ACTIVE revision
            launchType='FARGATE',
            overrides={
                'containerOverrides': [{
                    'name': 'processor',
                    'environment': [
                        {'name': 'S3_BUCKET', 'value': bucket},
                        {'name': 'S3_KEY', 'value': key}
                    ]
                }]
            },
            networkConfiguration={
                'awsvpcConfiguration': {
                    'subnets': ['subnet-abc123'],
                    'securityGroups': ['sg-def456'],
                    'assignPublicIp': 'DISABLED'
                }
            }
        )

The Decision Flowchart

Here is how to decide:

  1. Is execution time under 15 minutes? No → Containers
  2. Is traffic bursty/event-driven? Yes → Lambda
  3. Do you need consistent sub-50ms latency? Yes → Containers (or Lambda with Provisioned Concurrency)
  4. Are you under 100M requests/month? Yes → Lambda is likely cheaper
  5. Does your team know Kubernetes? Yes, and 50+ services → EKS. Otherwise → ECS/Fargate
  6. Do you need WebSockets or gRPC? Yes → Containers
  7. Is this an internal tool or prototype? Yes → App Runner

Start with Lambda for new projects. It forces good architectural patterns (stateless, event-driven, small functions) and costs almost nothing at low scale. Migrate specific services to containers when you hit Lambda’s limitations — cold starts at scale, execution time limits, or cost crossover points.

Migration Path: Lambda to Containers

When you outgrow Lambda, the migration is straightforward if your code is well-structured:

  1. Wrap your Lambda handlers in a lightweight HTTP framework (FastAPI, Express)
  2. Create a Dockerfile that runs your application
  3. Set up an ECS service with Fargate and an ALB
  4. Shift traffic gradually using Route 53 weighted routing

# Your Lambda handler
def handler(event, context):
    order_id = event['pathParameters']['orderId']
    return get_order(order_id)

# Same logic in FastAPI (for containers)
from fastapi import FastAPI
app = FastAPI()

@app.get("/orders/{order_id}")
def get_order_endpoint(order_id: str):
    return get_order(order_id)  # Same business logic function

The business logic does not change. Only the entry point wrapper changes.
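
Step 2's Dockerfile can be minimal. A sketch assuming the FastAPI app above lives in app/main.py and dependencies are pinned in requirements.txt (adjust paths to your project):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8080
# uvicorn serves the FastAPI app object defined in app/main.py
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]
```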

What Matters Most

The compute choice is important, but it is not the most important architectural decision. Your choice of database, your API design, your team’s ability to operate the system — these matter more.

Pick the compute model that lets your team ship fast and sleep well. For most teams starting fresh, that is Lambda. For teams with container experience running at scale, that is ECS on Fargate. For large platform teams with Kubernetes expertise, that is EKS.

There is no wrong answer. There is only the answer that fits your context.