“Should we use Lambda or containers?” is one of the most common architectural questions on AWS. The answer is almost never a clean either/or — it depends on your traffic patterns, cost constraints, team skills, and operational maturity.
This lesson gives you a decision framework backed by real numbers, not vibes.
## The Options on AWS
AWS offers six main compute options for backend workloads:
| Service | What It Is | You Manage |
|---|---|---|
| Lambda | Function-as-a-Service | Code only |
| Fargate | Serverless containers | Container images, task definitions |
| ECS on EC2 | Container orchestration on your instances | Container images, EC2 instances, capacity |
| EKS on Fargate | Managed Kubernetes, serverless nodes | Kubernetes manifests, pod specs |
| EKS on EC2 | Managed Kubernetes on your instances | Kubernetes manifests, EC2 instances, node groups |
| App Runner | Fully managed container platform | Source code or container image |
The spectrum runs from least operational burden (Lambda) to most control (EKS on EC2).
## The Comparison Matrix

### Cold Starts vs Warm Performance
Cold starts are Lambda’s most discussed tradeoff. Here are real-world numbers:
| Runtime | Cold Start (p99) | Warm Invocation (p50) |
|---|---|---|
| Python 3.12 | 200-400 ms | 5-15 ms |
| Node.js 20 | 150-350 ms | 3-10 ms |
| Java 21 (SnapStart) | 200-500 ms | 5-20 ms |
| Java 21 (no SnapStart) | 3-8 seconds | 5-20 ms |
| .NET 8 (Native AOT) | 200-400 ms | 3-10 ms |
| Rust | 10-30 ms | 1-3 ms |
Container cold starts are different — they are about pulling the image and starting the process:
| Scenario | Cold Start |
|---|---|
| Fargate (new task) | 30-60 seconds |
| ECS on EC2 (image cached) | 5-15 seconds |
| ECS on EC2 (image pull) | 15-45 seconds |
| App Runner (from cold) | 15-30 seconds |
Key insight: Lambda cold starts are per-request, but they are short. Container cold starts are rare (only during scaling or deployment), but they are long. For APIs that need consistent latency, use Lambda Provisioned Concurrency or containers.
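To put per-request cold starts in perspective, the blended average matters more than the worst case. A quick sketch (the 1% cold rate is an assumed figure for illustration; real rates depend on traffic shape and concurrency churn):

```python
def blended_latency_ms(cold_ms, warm_ms, cold_rate):
    """Average latency when only a fraction of requests hit a cold start."""
    return cold_rate * cold_ms + (1 - cold_rate) * warm_ms

# Python 3.12 numbers from the table above, assuming 1% of invocations are cold
print(round(blended_latency_ms(300, 10, 0.01), 1))  # 12.9
```

Even a 300 ms cold start barely moves the average at a 1% cold rate; it is the tail (p99) that suffers, which is why the table reports cold starts at p99.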
### Lambda Provisioned Concurrency
If cold starts are unacceptable, Provisioned Concurrency keeps functions initialized:
```yaml
# SAM template with provisioned concurrency
MyApiFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: python3.12
    Handler: app.handler
    MemorySize: 512
    Architectures: [arm64]
    AutoPublishAlias: live
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 50
    DeploymentPreference:
      Type: Linear10PercentEvery1Minute
```

But provisioned concurrency is expensive — you pay for it whether invocations happen or not. At that point, you are paying container-like prices for Lambda, which brings us to cost.
## Cost Crossover Analysis
This is where most comparisons get hand-wavy. Let us use real numbers.
### Scenario: REST API Handling 10 Million Requests/Month

**Lambda (512 MB, 100 ms average duration, ARM):**

```text
Requests:    10,000,000 × $0.20/million = $2.00
Duration:    10,000,000 × 0.1 s × 0.5 GB × $0.0000133/GB-s = $6.65
API Gateway: 10,000,000 × $1.00/million = $10.00
Total:       $18.65/month
```

**Fargate (1 vCPU, 2 GB RAM, 2 tasks for HA):**

```text
Compute: 2 tasks × 730 hours × ($0.04048/vCPU-hr + $0.004445/GB-hr × 2 GB)
       = 2 × 730 × $0.04937 = $72.08/month
ALB:     $16.20 base + ~$5 LCU = $21.20/month
Total:   $93.28/month
```

At 10M requests/month, Lambda is 5x cheaper.
### Scenario: REST API Handling 500 Million Requests/Month

**Lambda (512 MB, 100 ms average, ARM):**

```text
Requests:    500,000,000 × $0.20/million = $100.00
Duration:    500,000,000 × 0.1 s × 0.5 GB × $0.0000133/GB-s = $332.50
API Gateway: 500,000,000 × $1.00/million = $500.00
Total:       $932.50/month
```

**Fargate (2 vCPU, 4 GB, 10 tasks with auto-scaling):**

```text
Compute: 10 tasks × 730 hours × ($0.04048 × 2 + $0.004445 × 4)
       = 10 × 730 × $0.09874 = $720.80/month
ALB:     $16.20 + ~$50 LCU = $66.20/month
Total:   $787.00/month
```

At 500M requests/month, Fargate is cheaper — primarily because API Gateway costs scale linearly with requests, while the ALB's cost barely changes.
### The Crossover Point

For a typical API workload, Lambda + API Gateway becomes more expensive than Fargate + ALB at roughly 100-200 million requests/month. But this shifts dramatically if you:
- Use Lambda Function URLs instead of API Gateway (free, but fewer features)
- Run longer-duration Lambda functions (>1 second average)
- Need more memory per function (>1 GB)
```python
import math

# Quick cost estimator -- rough numbers for comparison, not a price quote
def estimate_monthly_cost(requests_millions, avg_duration_ms, memory_mb,
                          fargate_vcpu=1, fargate_memory_gb=2, rps_per_task=20):
    # Lambda costs (ARM pricing)
    lambda_request = requests_millions * 0.20
    lambda_duration = (requests_millions * 1_000_000 *
                       (avg_duration_ms / 1000) *
                       (memory_mb / 1024) *
                       0.0000133)
    apigw = requests_millions * 1.00
    lambda_total = lambda_request + lambda_duration + apigw

    # Fargate costs: task count from average RPS, with headroom baked into
    # the conservative rps_per_task default
    rps = (requests_millions * 1_000_000) / (30 * 24 * 3600)
    tasks = max(2, math.ceil(rps / rps_per_task))  # 2 minimum for HA
    fargate_compute = tasks * 730 * (0.04048 * fargate_vcpu +
                                     0.004445 * fargate_memory_gb)
    alb = 16.20 + max(5, requests_millions * 0.1)  # rough LCU estimate
    fargate_total = fargate_compute + alb

    return {
        'lambda_total': round(lambda_total, 2),
        'fargate_total': round(fargate_total, 2),
        'cheaper': 'Lambda' if lambda_total < fargate_total else 'Fargate',
        'tasks_needed': tasks,
    }

# Examples, matching the two scenarios above
print(estimate_monthly_cost(10, 100, 512))
# {'lambda_total': 18.65, 'fargate_total': 93.28, 'cheaper': 'Lambda', 'tasks_needed': 2}
print(estimate_monthly_cost(500, 100, 512, fargate_vcpu=2, fargate_memory_gb=4))
# {'lambda_total': 932.5, 'fargate_total': 787.0, 'cheaper': 'Fargate', 'tasks_needed': 10}
```

## Scaling Behavior
### Lambda Scaling
Lambda scales per-request. Each concurrent invocation gets its own execution environment:
```text
Concurrent executions = Requests per second × Average duration in seconds

Example: 1,000 RPS × 0.2 seconds = 200 concurrent executions
```

Lambda can burst to 3,000 concurrent executions instantly in the largest regions (lower burst limits apply elsewhere), then scales at 500 per minute. The account-level default limit is 1,000 concurrent executions (request an increase for production).
```python
import boto3

# Reserve concurrency to prevent one function from starving others
lambda_client = boto3.client('lambda')
lambda_client.put_function_concurrency(
    FunctionName='critical-payment-processor',
    ReservedConcurrentExecutions=200  # guaranteed 200, and capped at 200
)
```

### Container Scaling
ECS/Fargate scaling is based on CloudWatch metrics and typically takes 1-3 minutes to add new tasks:

```json
{
  "TargetTrackingScalingPolicyConfiguration": {
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    },
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }
}
```

For traffic spikes, containers need headroom — you run more capacity than needed so you can absorb spikes while auto-scaling catches up. Lambda does not need headroom because it scales per-request.
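One way to reason about that headroom: pad the steady-state task count by the spike you expect to absorb during the 1-3 minute scale-out window. A rough sketch (the per-task RPS and 30% headroom figures are assumptions for illustration, not AWS guidance):

```python
import math

def tasks_with_headroom(steady_rps, rps_per_task, headroom_pct=30):
    """Tasks to run so a spike is absorbed while auto-scaling catches up."""
    needed = steady_rps / rps_per_task
    return max(2, math.ceil(needed * (1 + headroom_pct / 100)))  # 2 minimum for HA

# Hypothetical service: 400 RPS steady, each task comfortably handles ~100 RPS
print(tasks_with_headroom(400, 100))  # 6
```

The headroom percentage is a business decision: it is the price you pay each month to survive spikes that auto-scaling is too slow to catch.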
## Deployment and Debugging

### Lambda Deployment
Lambda deployments are atomic — a new version replaces the old one. With aliases and traffic shifting, you can do canary deployments:
```yaml
DeploymentPreference:
  Type: Canary10Percent5Minutes
  Alarms:
    - !Ref ApiErrorAlarm
    - !Ref ApiLatencyAlarm
```

Debugging Lambda is harder than debugging containers. You cannot SSH into a Lambda function. Your tools are:

- CloudWatch Logs (structured JSON logging is essential)
- X-Ray tracing
- Lambda Insights (memory, CPU, network metrics)
- Local testing with the SAM CLI (`sam local invoke`)
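Since CloudWatch Logs is the primary window into Lambda, structured JSON logging pays for itself quickly. A minimal stdlib-only sketch (the field names here are illustrative, not a standard):

```python
import json
import time

def log(level, message, **fields):
    """Emit one JSON line to stdout; CloudWatch captures it as a log event."""
    print(json.dumps({
        'level': level,
        'message': message,
        'timestamp_ms': int(time.time() * 1000),
        **fields,
    }))

def handler(event, context):
    log('INFO', 'order received',
        order_id=event.get('orderId'),       # hypothetical event field
        request_id=context.aws_request_id)   # provided by the Lambda runtime
```

With lines like these, a CloudWatch Logs Insights query can filter on `order_id` directly instead of grepping free text.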
### Container Deployment

ECS supports rolling updates, blue/green deployments (via CodeDeploy), and a deployment circuit breaker:
```json
{
  "deploymentConfiguration": {
    "maximumPercent": 200,
    "minimumHealthyPercent": 100,
    "deploymentCircuitBreaker": {
      "enable": true,
      "rollback": true
    }
  }
}
```

Debugging containers is significantly easier:

- `aws ecs execute-command` — open a shell in a running container
- Standard APM tools work (Datadog, New Relic, Prometheus)
- Familiar Docker tooling for local development
- Log directly to stdout; ECS forwards logs to CloudWatch
```shell
# Shell into a running Fargate container
aws ecs execute-command \
  --cluster production \
  --task abc123def456 \
  --container api \
  --interactive \
  --command "/bin/sh"
```

## ECS Concepts: Task Definitions and Services
If you choose containers, you need to understand ECS’s core abstractions:
**Task Definition** — A blueprint for your containers (like a docker-compose file). Specifies image, CPU, memory, ports, environment variables:
```json
{
  "family": "order-service",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "arn:aws:iam::123456789:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789:role/order-service-role",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/order-service:v1.2.3",
      "portMappings": [
        {"containerPort": 8080, "protocol": "tcp"}
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/order-service",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "api"
        }
      },
      "secrets": [
        {
          "name": "DB_PASSWORD",
          "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789:secret:prod/db-password"
        }
      ],
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3
      }
    }
  ]
}
```

**Service** — Ensures a desired number of tasks are running. Handles placement, rolling updates, and integration with load balancers.
**Fargate vs EC2 Launch Type:**
| Aspect | Fargate | EC2 |
|---|---|---|
| Management | No instances to manage | You manage EC2 fleet |
| Cost | ~20% premium over equivalent EC2 | Cheaper (especially with Spot/RI) |
| Scaling | Per-task | Must scale instances + tasks |
| GPU | Not supported | Supported |
| Storage | 20 GB ephemeral (expandable to 200 GB) | Instance storage available |
Start with Fargate. Move to EC2 launch type only if you need GPU, specific instance types, or significant cost savings at scale.
## EKS — When You Need Kubernetes
EKS makes sense when:
- Your team already knows Kubernetes
- You need multi-cloud portability
- You need the Kubernetes ecosystem (service mesh, custom operators, GitOps)
- You are running 50+ microservices
EKS does not make sense when:
- Your team does not know Kubernetes (the learning curve is steep)
- You have fewer than 10 services
- You are a small team (Kubernetes is operationally expensive)
The EKS control plane costs $0.10/hour ($73/month) before you add any worker nodes. With Fargate pods or managed node groups, total costs climb quickly.
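That $73/month is only the floor; worker capacity dominates. A back-of-the-envelope sketch (the m5.large on-demand rate is an assumed us-east-1 figure for illustration):

```python
# EKS baseline: control plane plus a minimal worker pool
control_plane = 0.10 * 730                # $73/month before any workloads run
workers = 3 * 0.096 * 730                 # three m5.large nodes (assumed rate)
print(round(control_plane + workers, 2))  # 283.24
```

And that is before load balancers, monitoring, and the engineering time Kubernetes demands.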
## App Runner — The Simple Option
App Runner is the “I just want to run this container” service:
```shell
aws apprunner create-service \
  --service-name order-api \
  --source-configuration '{
    "ImageRepository": {
      "ImageIdentifier": "123456789.dkr.ecr.us-east-1.amazonaws.com/order-service:latest",
      "ImageRepositoryType": "ECR",
      "ImageConfiguration": {
        "Port": "8080",
        "RuntimeEnvironmentVariables": {
          "NODE_ENV": "production"
        }
      }
    },
    "AutoDeploymentsEnabled": true
  }' \
  --instance-configuration '{
    "Cpu": "1024",
    "Memory": "2048"
  }' \
  --auto-scaling-configuration-arn "arn:aws:apprunner:us-east-1:123456789:autoscalingconfiguration/default"
```

App Runner handles TLS, load balancing, auto-scaling, and deployments. When idle, it scales down to provisioned instances, so you pay only for memory rather than full compute. The tradeoff is limited configuration: VPC integration exists (via VPC connectors) but is basic, sidecar containers are not supported, and observability options are limited.
Use App Runner for: Internal tools, simple APIs, prototypes, and services where operational simplicity matters more than fine-grained control.
## Hybrid Patterns — The Real-World Architecture
Most production architectures use both Lambda and containers:
```text
┌─────────────────────────────────────────────────────────────┐
│                        Architecture                         │
├─────────────────────────┬───────────────────────────────────┤
│  Event Processing       │  API Layer                        │
│  ────────────────       │  ─────────                        │
│  S3 → Lambda            │  ALB → ECS/Fargate                │
│  SQS → Lambda           │  (long-running, consistent        │
│  EventBridge → Lambda   │   latency, WebSocket support)     │
│  (bursty, short-lived,  │                                   │
│   event-driven)         │  Background Workers               │
│                         │  ─────────────────                │
│  Scheduled Tasks        │  ECS/Fargate                      │
│  ───────────────        │  (long-running processes,         │
│  EventBridge → Lambda   │   queue consumers, cron jobs)     │
│  (cron-like, <15 min)   │                                   │
└─────────────────────────┴───────────────────────────────────┘
```

This pattern uses each service where it excels:
- Lambda for event-driven, bursty workloads with short execution times
- ECS/Fargate for APIs that need consistent latency and long-running background workers
```python
import boto3

ecs = boto3.client('ecs')

# Lambda for S3 event processing
def handle_upload(event, context):
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']

        # Process the uploaded file
        process_image(bucket, key)

        # Trigger a downstream ECS task for heavy processing
        ecs.run_task(
            cluster='production',
            taskDefinition='heavy-processing',  # family alone = latest active revision
            launchType='FARGATE',
            overrides={
                'containerOverrides': [{
                    'name': 'processor',
                    'environment': [
                        {'name': 'S3_BUCKET', 'value': bucket},
                        {'name': 'S3_KEY', 'value': key}
                    ]
                }]
            },
            networkConfiguration={
                'awsvpcConfiguration': {
                    'subnets': ['subnet-abc123'],
                    'securityGroups': ['sg-def456'],
                    'assignPublicIp': 'DISABLED'
                }
            }
        )
```

## The Decision Flowchart
Here is how to decide:
- Is execution time under 15 minutes? No → Containers
- Is traffic bursty/event-driven? Yes → Lambda
- Do you need consistent sub-50ms latency? Yes → Containers (or Lambda with Provisioned Concurrency)
- Are you under 100M requests/month? Yes → Lambda is likely cheaper
- Does your team know Kubernetes? Yes, and 50+ services → EKS. Otherwise → ECS/Fargate
- Do you need WebSockets or gRPC? Yes → Containers
- Is this an internal tool or prototype? Yes → App Runner
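The checklist above can be sketched as code. Treat the thresholds as the rough numbers from this lesson, not hard rules (the function name and parameters are invented for illustration):

```python
def choose_compute(max_runtime_min, bursty, needs_low_latency,
                   monthly_requests_m, team_knows_k8s, service_count,
                   needs_websockets=False, internal_tool=False):
    """Rough encoding of the decision checklist; real decisions need more context."""
    if internal_tool:
        return 'App Runner'
    if max_runtime_min > 15 or needs_websockets:
        # Over Lambda's execution limit, or needs WebSockets/gRPC: containers
        return 'EKS' if team_knows_k8s and service_count >= 50 else 'ECS/Fargate'
    if needs_low_latency:
        return 'Lambda + Provisioned Concurrency' if bursty else 'ECS/Fargate'
    if bursty or monthly_requests_m < 100:
        return 'Lambda'
    return 'EKS' if team_knows_k8s and service_count >= 50 else 'ECS/Fargate'

print(choose_compute(1, bursty=True, needs_low_latency=False,
                     monthly_requests_m=10, team_knows_k8s=False,
                     service_count=5))  # Lambda
```

The point is not the function itself but the ordering: hard constraints (runtime, protocols) come before preferences (latency), which come before cost.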
Start with Lambda for new projects. It forces good architectural patterns (stateless, event-driven, small functions) and costs almost nothing at low scale. Migrate specific services to containers when you hit Lambda’s limitations — cold starts at scale, execution time limits, or cost crossover points.
## Migration Path: Lambda to Containers
When you outgrow Lambda, the migration is straightforward if your code is well-structured:
1. Wrap your Lambda handlers in a lightweight HTTP framework (FastAPI, Express)
2. Create a Dockerfile that runs your application
3. Set up an ECS service with Fargate and an ALB
4. Shift traffic gradually using Route 53 weighted routing
```python
# Your Lambda handler
def handler(event, context):
    order_id = event['pathParameters']['orderId']
    return get_order(order_id)
```

```python
# Same logic in FastAPI (for containers)
from fastapi import FastAPI

app = FastAPI()

@app.get("/orders/{order_id}")
def get_order_endpoint(order_id: str):
    return get_order(order_id)  # same business logic function
```

The business logic does not change. Only the entry point wrapper changes.
## What Matters Most
The compute choice is important, but it is not the most important architectural decision. Your choice of database, your API design, your team’s ability to operate the system — these matter more.
Pick the compute model that lets your team ship fast and sleep well. For most teams starting fresh, that is Lambda. For teams with container experience running at scale, that is ECS on Fargate. For large platform teams with Kubernetes expertise, that is EKS.
There is no wrong answer. There is only the answer that fits your context.
