Security on AWS is not a single service or a checkbox. It is a layered strategy — defense in depth — where every layer assumes the layer above it has already been breached. If your security model depends on a single perimeter, you are one misconfiguration away from a headline.
This lesson covers the security services, patterns, and practices that backend engineers need to build production-grade workloads on AWS.
Defense in Depth — The Mental Model
The idea is simple: stack multiple independent security controls so that a failure in one does not compromise the entire system.
On AWS, these layers map to concrete services:
| Layer | AWS Services |
|---|---|
| Perimeter | WAF, Shield, CloudFront, Route 53 |
| Network | VPC, Security Groups, NACLs, PrivateLink |
| Identity | IAM, STS, Organizations SCPs |
| Application | Cognito, API Gateway authorizers, Lambda auth |
| Data | KMS, S3 SSE, EBS encryption, ACM (TLS) |
| Detection | GuardDuty, Security Hub, CloudTrail, Config |
| Response | EventBridge, Lambda, SNS, Systems Manager |
Every production workload should have controls at every layer. Let us walk through each one.
Encryption at Rest
KMS — The Key Management Service
KMS is the backbone of encryption on AWS. Almost every service that encrypts data does so through KMS. There are three types of keys:
- AWS managed keys — AWS creates and rotates them. Free, but you cannot control rotation or key policies.
- Customer managed keys (CMKs) — You create and control them. You set the key policy, enable/disable, and define rotation schedules.
- Customer managed keys with imported key material (BYOK) — You bring your own key material into a key you manage. Maximum control, maximum operational burden.
For most backend workloads, customer managed keys hit the sweet spot.
```python
import boto3

kms = boto3.client('kms')

# Create a customer managed key
response = kms.create_key(
    Description='Order service encryption key',
    KeyUsage='ENCRYPT_DECRYPT',
    KeySpec='SYMMETRIC_DEFAULT',
    Tags=[
        {'TagKey': 'Service', 'TagValue': 'order-service'},
        {'TagKey': 'Environment', 'TagValue': 'production'},
    ]
)
key_id = response['KeyMetadata']['KeyId']

# Create an alias for easy reference
kms.create_alias(
    AliasName='alias/order-service-prod',
    TargetKeyId=key_id
)

# Encrypt sensitive data (direct Encrypt calls are limited to 4 KB of plaintext)
plaintext = b'customer-credit-card-token-abc123'
encrypt_response = kms.encrypt(
    KeyId='alias/order-service-prod',
    Plaintext=plaintext,
    EncryptionContext={
        'service': 'order-service',
        'purpose': 'payment-token'
    }
)
ciphertext = encrypt_response['CiphertextBlob']

# Decrypt it back — the identical EncryptionContext must be supplied
decrypt_response = kms.decrypt(
    CiphertextBlob=ciphertext,
    EncryptionContext={
        'service': 'order-service',
        'purpose': 'payment-token'
    }
)
original = decrypt_response['Plaintext']
```

The EncryptionContext is critical. It acts as additional authenticated data: if someone obtains the ciphertext but does not supply the exact same encryption context, the Decrypt call fails. Always use it.
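KMS's internal AEAD construction is not something we can reproduce here, but the role EncryptionContext plays as additional authenticated data can be illustrated with a stdlib-only sketch. This is a toy illustration, not KMS's actual algorithm: it binds a canonical form of the context into an integrity tag and refuses to open the data unless the caller supplies the identical context.

```python
import hashlib
import hmac
import json

def seal(key: bytes, plaintext: bytes, context: dict):
    """Toy illustration: bind a context dict to data with an HMAC tag.
    Shows only the 'additional authenticated data' idea, not confidentiality."""
    canonical = json.dumps(context, sort_keys=True).encode()
    tag = hmac.new(key, plaintext + canonical, hashlib.sha256).digest()
    return plaintext, tag

def open_sealed(key: bytes, data: bytes, tag: bytes, context: dict) -> bytes:
    """Verify the tag against the supplied context; reject on any mismatch."""
    canonical = json.dumps(context, sort_keys=True).encode()
    expected = hmac.new(key, data + canonical, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError('context mismatch (KMS raises InvalidCiphertextException)')
    return data

key = b'demo-key'
data, tag = seal(key, b'payment-token', {'service': 'order-service'})

# Same context: opens fine
assert open_sealed(key, data, tag, {'service': 'order-service'}) == b'payment-token'

# Different context: refused, even with the right key and ciphertext
try:
    open_sealed(key, data, tag, {'service': 'another-service'})
except ValueError:
    print('decrypt refused: wrong encryption context')
```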
S3 Server-Side Encryption
S3 supports three encryption modes:
- SSE-S3 — Amazon manages the keys. Default since January 2023.
- SSE-KMS — Encrypted with a KMS key you control. Gives you audit trail via CloudTrail.
- SSE-C — You provide the key with every request. AWS never stores it.
For anything with compliance requirements, use SSE-KMS:
```python
s3 = boto3.client('s3')

# Upload with KMS encryption
s3.put_object(
    Bucket='my-secure-bucket',
    Key='reports/q4-financials.pdf',
    Body=report_data,
    ServerSideEncryption='aws:kms',
    SSEKMSKeyId='alias/finance-reports-key',
    BucketKeyEnabled=True  # S3 Bucket Key — reduces KMS API calls by up to 99%
)
```

Always enable BucketKeyEnabled. Without it, every operation on an SSE-KMS object calls KMS, which gets expensive at scale and can exhaust the KMS request quota for symmetric operations (5,500 requests per second per account in many regions, shared across all your keys).
EBS Encryption
Enable EBS encryption by default at the account level:
```bash
aws ec2 enable-ebs-encryption-by-default --region us-east-1
```

This ensures every new EBS volume in that region is encrypted; the setting is per-region, so run it in every region you use. Existing unencrypted volumes require a snapshot-copy-restore dance to encrypt.
Encryption in Transit
TLS Everywhere
Every data path should use TLS. AWS Certificate Manager (ACM) provides free public TLS certificates:
```bash
# Request a certificate
aws acm request-certificate \
    --domain-name api.myapp.com \
    --validation-method DNS \
    --subject-alternative-names "*.myapp.com"
```

ACM auto-renews certificates attached to ALB, CloudFront, or API Gateway. No more expired-certificate outages.
For internal service-to-service communication, use ACM Private CA or enforce TLS on your internal ALBs. Never run plain HTTP between services, even inside a VPC. The VPC fabric does not let instances promiscuously sniff each other's traffic, but a compromised host, a misconfigured peering connection, or Traffic Mirroring can still expose plaintext. Treat the network as untrusted.
Secrets Management
Secrets Manager vs Parameter Store
Both store secrets. The choice is not obvious.
| Feature | Secrets Manager | Parameter Store (SecureString) |
|---|---|---|
| Auto-rotation | Built-in with Lambda | Manual |
| Cost | $0.40/secret/month + $0.05/10K API calls | Standard: free; advanced: $0.05/parameter/month |
| Cross-account sharing | Native via resource policy | Requires custom solution |
| Database credential rotation | First-class RDS/Aurora support | Not built-in |
| Max size | 64 KB | 4 KB (advanced: 8 KB) |
Rule of thumb: Use Secrets Manager for database credentials, API keys, and anything that needs rotation. Use Parameter Store for configuration values and non-sensitive parameters.
```python
import json
import boto3

secrets = boto3.client('secretsmanager')

# Store a database credential
secrets.create_secret(
    Name='prod/order-service/db-credentials',
    Description='RDS PostgreSQL credentials for order service',
    SecretString=json.dumps({
        'username': 'order_svc',
        'password': 'super-secure-generated-password',
        'engine': 'postgres',
        'host': 'orders-db.cluster-abc123.us-east-1.rds.amazonaws.com',
        'port': 5432,
        'dbname': 'orders'
    }),
    Tags=[
        {'Key': 'Service', 'Value': 'order-service'},
    ]
)

# Retrieve it in your application
def get_db_credentials():
    response = secrets.get_secret_value(
        SecretId='prod/order-service/db-credentials'
    )
    return json.loads(response['SecretString'])

# Enable automatic rotation (every 30 days)
secrets.rotate_secret(
    SecretId='prod/order-service/db-credentials',
    RotationLambdaARN='arn:aws:lambda:us-east-1:123456789:function:rotate-db-creds',
    RotationRules={
        'AutomaticallyAfterDays': 30
    }
)
```

Never put secrets in environment variables directly. Fetch them at runtime from Secrets Manager, and cache them in memory with a TTL to avoid excessive API calls.
Network Security
Security Groups vs NACLs
Security Groups and NACLs both filter traffic, but they work differently:
| Aspect | Security Groups | NACLs |
|---|---|---|
| Level | Instance/ENI | Subnet |
| State | Stateful (return traffic auto-allowed) | Stateless (must allow both directions) |
| Rules | Allow only | Allow and Deny |
| Evaluation | All rules evaluated | Rules evaluated in order |
| Default | Deny all inbound, allow all outbound | Allow all |
Best practice: Use Security Groups as your primary control. Use NACLs as a coarse-grained second layer.
```bash
# Create a security group that only allows traffic from the ALB
aws ec2 create-security-group \
    --group-name app-server-sg \
    --description "Application servers - only ALB traffic" \
    --vpc-id vpc-abc123

# Allow inbound only from the ALB's security group
aws ec2 authorize-security-group-ingress \
    --group-id sg-app123 \
    --protocol tcp \
    --port 8080 \
    --source-group sg-alb456
```

Reference security groups by ID, not by CIDR block. This creates dynamic rules that automatically cover new instances added to the source group.
VPC Flow Logs
VPC Flow Logs capture IP traffic metadata for your network interfaces. Essential for security forensics:
```bash
# Enable flow logs for the entire VPC
aws ec2 create-flow-logs \
    --resource-type VPC \
    --resource-ids vpc-abc123 \
    --traffic-type ALL \
    --log-destination-type cloud-watch-logs \
    --log-group-name /vpc/flow-logs/production \
    --deliver-logs-permission-arn arn:aws:iam::123456789:role/flow-logs-role
```

Send flow logs to CloudWatch for real-time alerting and to S3 for long-term retention. Use Athena to query S3 flow logs when investigating incidents.
WAF and Shield
AWS WAF
WAF protects your web applications from common exploits. Attach it to CloudFront, ALB, or API Gateway:
```bash
# Create a web ACL with a rate-limiting rule
# (rate-based statements must be defined directly in a web ACL)
aws wafv2 create-web-acl \
    --name "api-protection" \
    --scope REGIONAL \
    --default-action Allow={} \
    --visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName=api-protection \
    --rules '[{
      "Name": "RateLimit1000",
      "Priority": 1,
      "Action": {"Block": {}},
      "Statement": {
        "RateBasedStatement": {
          "Limit": 1000,
          "AggregateKeyType": "IP"
        }
      },
      "VisibilityConfig": {
        "SampledRequestsEnabled": true,
        "CloudWatchMetricsEnabled": true,
        "MetricName": "RateLimit1000"
      }
    }]'
```

Start with AWS Managed Rule Groups. They cover OWASP Top 10 categories, SQL injection, XSS, and known bad inputs. Then add custom rules for your specific application:
- Rate limiting per IP
- Geo-blocking (if your app is region-specific)
- Request body size limits
- Custom header validation
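Conceptually, a rate-based rule keeps a per-IP request count over a trailing window and blocks IPs that exceed the limit. A simplified sliding-window sketch of that idea (WAF's actual implementation and its default 5-minute window are more sophisticated):

```python
import time
from collections import defaultdict, deque

class IpRateLimiter:
    """Conceptual sliding-window rate limit per source IP, analogous to
    a WAF RateBasedStatement with AggregateKeyType='IP'."""

    def __init__(self, limit, window_seconds=300):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        while q and now - q[0] > self.window:
            q.popleft()  # drop requests that fell outside the window
        if len(q) >= self.limit:
            return False  # block: IP is over the limit
        q.append(now)
        return True

limiter = IpRateLimiter(limit=5, window_seconds=300)
results = [limiter.allow('203.0.113.12', now=t) for t in range(7)]
print(results)  # [True, True, True, True, True, False, False]
```

WAF does this counting for you at the edge, which is exactly why it belongs in front of your application rather than inside it.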
AWS Shield
Shield Standard is free and automatically protects against common DDoS attacks (Layer 3/4). Shield Advanced ($3,000/month) adds Layer 7 DDoS protection, 24/7 DDoS response team access, and cost protection (AWS credits you for scaling costs during an attack).
Most applications are fine with Shield Standard + WAF.
Detection and Audit
CloudTrail
CloudTrail records every API call made in your account. This is your audit log. Enable it in every region, even regions you do not use — attackers will spin up resources in regions you are not watching.
```bash
# Create a multi-region trail
aws cloudtrail create-trail \
    --name production-audit-trail \
    --s3-bucket-name my-cloudtrail-logs \
    --is-multi-region-trail \
    --enable-log-file-validation \
    --kms-key-id alias/cloudtrail-key

aws cloudtrail start-logging --name production-audit-trail
```

Enable log file validation. This creates digest files with a hash chain, so you can prove the logs have not been tampered with.
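CloudTrail's real digest files contain signatures and metadata, but the core hash-chain idea fits in a few lines: each digest commits to the previous one, so tampering with any historical batch invalidates every digest after it. A stdlib-only illustration, not CloudTrail's actual format:

```python
import hashlib

def build_digest_chain(log_batches):
    """Each digest commits to its batch AND to the previous digest."""
    digests = []
    prev = b''
    for batch in log_batches:
        d = hashlib.sha256(prev + batch).hexdigest()
        digests.append(d)
        prev = d.encode()
    return digests

def verify_chain(log_batches, digests):
    """Recompute the chain and compare against the stored digests."""
    return digests == build_digest_chain(log_batches)

logs = [b'batch-1: CreateUser', b'batch-2: DeleteTrail', b'batch-3: StopLogging']
chain = build_digest_chain(logs)
assert verify_chain(logs, chain)

# Tamper with an old batch: every subsequent digest stops matching
logs[1] = b'batch-2: nothing to see here'
print(verify_chain(logs, chain))  # False
```

This is why an attacker who edits your logs after the fact cannot cover their tracks without also forging the entire digest chain, which CloudTrail additionally signs.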
GuardDuty
GuardDuty uses machine learning and threat intelligence to detect:
- Compromised EC2 instances (cryptocurrency mining, C&C communication)
- Compromised credentials (API calls from unusual locations)
- Unauthorized S3 access patterns
- Kubernetes audit log anomalies
Enable it. Pricing is tiered and works out to a few dollars per million CloudTrail events analyzed. Worth every penny.

```bash
aws guardduty create-detector --enable --finding-publishing-frequency FIFTEEN_MINUTES
```

Security Hub
Security Hub aggregates findings from GuardDuty, Inspector, Macie, Config, and third-party tools into a single dashboard. It also runs automated compliance checks against frameworks like CIS AWS Foundations Benchmark.
Advanced IAM Patterns
Resource Policies
Resource policies are attached to the resource itself (S3 bucket, SQS queue, KMS key), not to the IAM principal. They enable cross-account access without sharing credentials:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowCrossAccountRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::987654321:role/data-pipeline-role"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::shared-data-bucket",
        "arn:aws:s3:::shared-data-bucket/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:PrincipalOrgID": "o-myorgid123"
        }
      }
    }
  ]
}
```

The aws:PrincipalOrgID condition ensures only principals from accounts within your AWS Organization can access the bucket, even if they know the bucket name.
Service Control Policies (SCPs)
SCPs are guardrails for your entire AWS Organization. They restrict what member accounts can do, regardless of their IAM policies:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyNonApprovedRegions",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": [
            "us-east-1",
            "us-west-2",
            "eu-west-1"
          ]
        },
        "ArnNotLike": {
          "aws:PrincipalARN": "arn:aws:iam::*:role/OrganizationAdmin"
        }
      }
    },
    {
      "Sid": "DenyDeletingCloudTrail",
      "Effect": "Deny",
      "Action": [
        "cloudtrail:StopLogging",
        "cloudtrail:DeleteTrail"
      ],
      "Resource": "*"
    }
  ]
}
```

This SCP prevents all member accounts from using unapproved regions and prevents anyone from disabling CloudTrail. Even an account admin cannot override an SCP.
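The evaluation rule behind this is worth internalizing: a request succeeds only if the identity's IAM policy allows it and no SCP denies it. The region guardrail can be modeled as a simplified sketch (real policy evaluation considers many more dimensions, and ARN matching uses patterns rather than suffixes):

```python
APPROVED_REGIONS = {'us-east-1', 'us-west-2', 'eu-west-1'}
ADMIN_ROLE_SUFFIX = ':role/OrganizationAdmin'

def scp_denies(requested_region: str, principal_arn: str) -> bool:
    """Model of the DenyNonApprovedRegions statement: deny when the region
    is unapproved, unless the principal matches the exempted admin role."""
    if principal_arn.endswith(ADMIN_ROLE_SUFFIX):
        return False
    return requested_region not in APPROVED_REGIONS

def request_allowed(region: str, principal_arn: str, iam_allows: bool) -> bool:
    """Effective permission = IAM allow AND no SCP deny."""
    return iam_allows and not scp_denies(region, principal_arn)

dev = 'arn:aws:iam::123456789012:role/developer'
admin = 'arn:aws:iam::123456789012:role/OrganizationAdmin'
print(request_allowed('us-east-1', dev, iam_allows=True))    # True
print(request_allowed('ap-south-1', dev, iam_allows=True))   # False (SCP blocks it)
print(request_allowed('ap-south-1', admin, iam_allows=True)) # True (exempted role)
```

Note that an SCP never grants anything on its own: with `iam_allows=False` the request fails in every region.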
Permission Boundaries
Permission boundaries set the maximum permissions that an IAM entity can have. They are essential for delegating IAM administration safely:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:*",
        "dynamodb:*",
        "lambda:*",
        "logs:*",
        "sqs:*"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Deny",
      "Action": [
        "iam:*",
        "organizations:*",
        "account:*"
      ],
      "Resource": "*"
    }
  ]
}
```

Even if a developer attaches AdministratorAccess to their role, the permission boundary caps what they can actually do.
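The semantics are an intersection: the effective permissions are what the identity policy and the boundary both allow, minus any explicit denies. Modeled with literal sets for clarity (real IAM matches action patterns like `s3:*`, not exact strings):

```python
def effective_actions(identity_allows: set, boundary_allows: set,
                      explicit_denies: set) -> set:
    """Permission-boundary semantics: intersection, minus explicit denies."""
    return (identity_allows & boundary_allows) - explicit_denies

identity = {'s3:GetObject', 'dynamodb:PutItem', 'iam:CreateUser'}
boundary = {'s3:GetObject', 'dynamodb:PutItem', 'lambda:InvokeFunction'}

print(sorted(effective_actions(identity, boundary, set())))
# iam:CreateUser is capped by the boundary; lambda:InvokeFunction is
# in the boundary but was never granted by the identity policy.
```

Neither side can exceed the other: the boundary grants nothing by itself, and the identity policy cannot reach past the boundary.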
Security Automation with Config Rules
AWS Config continuously evaluates your resources against rules. Use it to enforce compliance automatically:
```bash
# Check that all S3 buckets have encryption enabled
aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "s3-bucket-encryption",
  "Source": {
    "Owner": "AWS",
    "SourceIdentifier": "S3_BUCKET_SERVER_SIDE_ENCRYPTION_ENABLED"
  }
}'

# Check that all EBS volumes are encrypted
aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "ebs-encryption",
  "Source": {
    "Owner": "AWS",
    "SourceIdentifier": "ENCRYPTED_VOLUMES"
  }
}'
```

Pair Config rules with auto-remediation using SSM Automation documents. When a rule detects a non-compliant resource, it can automatically fix it — for example, enabling encryption on an unencrypted S3 bucket.
Incident Response Basics
When something goes wrong, you need a plan:
1. Isolate — Restrict the compromised resource's security group to deny all traffic. Do not terminate it — you need it for forensics.
2. Preserve evidence — Snapshot EBS volumes, export CloudTrail logs, capture VPC Flow Logs.
3. Investigate — Use Athena to query CloudTrail logs. What API calls did the compromised credentials make? When? From where?
4. Eradicate — Rotate all affected credentials. Revoke active sessions. Patch the vulnerability.
5. Recover — Restore from known-good backups. Redeploy from your CI/CD pipeline.
6. Learn — Write a post-mortem. Update your Config rules and SCPs to prevent recurrence.
```python
from datetime import datetime, timezone

import boto3

# Emergency: isolate a compromised EC2 instance
def isolate_instance(instance_id):
    ec2 = boto3.client('ec2')

    # Find the instance's VPC
    vpc_id = ec2.describe_instances(
        InstanceIds=[instance_id]
    )['Reservations'][0]['Instances'][0]['VpcId']

    # Create an isolation security group (no inbound rules by default)
    sg = ec2.create_security_group(
        GroupName=f'isolation-{instance_id}',
        Description='Incident isolation - no traffic allowed',
        VpcId=vpc_id
    )

    # Revoke the default allow-all outbound rule
    ec2.revoke_security_group_egress(
        GroupId=sg['GroupId'],
        IpPermissions=[{
            'IpProtocol': '-1',
            'IpRanges': [{'CidrIp': '0.0.0.0/0'}]
        }]
    )

    # Replace all of the instance's security groups with the isolation group
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        Groups=[sg['GroupId']]
    )

    # Tag for tracking
    ec2.create_tags(
        Resources=[instance_id],
        Tags=[
            {'Key': 'SecurityIncident', 'Value': 'isolated'},
            {'Key': 'IsolationDate',
             'Value': datetime.now(timezone.utc).isoformat()}
        ]
    )
    print(f"Instance {instance_id} isolated in security group {sg['GroupId']}")
```

The Security Checklist
Before any workload goes to production, verify:
- KMS encryption enabled for all data stores (S3, EBS, RDS, DynamoDB)
- TLS on all endpoints (ALB, API Gateway, CloudFront)
- Secrets in Secrets Manager, not environment variables or code
- CloudTrail enabled in all regions with log validation
- GuardDuty enabled
- VPC Flow Logs enabled
- Security Groups follow least-privilege (no 0.0.0.0/0 inbound)
- IAM roles use permission boundaries
- SCPs prevent dangerous actions (disabling CloudTrail, using unapproved regions)
- Config rules enforcing encryption and compliance
- WAF on all public-facing endpoints
- Incident response runbook documented and tested
What is Next
Security protects your workloads, but it does not protect your wallet. In the next lesson, we cover Cost Optimization — how to stop wasting money on AWS without sacrificing performance or reliability.
