I’ve watched a production database get wiped because someone committed a root password to a public GitHub repo. It took less than twelve minutes from push to compromise. Automated bots scan every single public commit on GitHub for secrets — AWS keys, database credentials, API tokens — and they find them constantly.
This is Part 3 of the Cloud Security Engineering crash course. If secrets management isn’t the first security problem you solve, nothing else matters. Let’s walk through the three tools I reach for in practice, when to use each, and the patterns that keep credentials safe in real production systems.
## The Secrets Problem
A “secret” is any credential your application needs at runtime but should never be visible in source code, logs, or config files. Database passwords, API keys, TLS certificates, OAuth tokens, encryption keys — they all qualify.
The naive approach is environment variables or config files checked into version control. This fails in predictable ways:
- Git history is forever. Even if you delete the secret in a later commit, it’s still in the history.
- Env vars leak. Process listings, crash dumps, debug endpoints, and logging frameworks routinely expose environment variables.
- No rotation. If a secret is baked into a deployment artifact, rotating it means redeploying everything.
- No audit trail. You have no idea who accessed which secret and when.
The right approach is a dedicated secrets manager that handles encryption, access control, rotation, and auditing. The three tools I’ll cover — SSM Parameter Store, AWS Secrets Manager, and HashiCorp Vault — represent different points on the complexity/capability spectrum.
## SSM Parameter Store
AWS Systems Manager Parameter Store is the simplest option. It’s a key-value store baked into AWS with native IAM integration. I use it for configuration values and secrets that don’t need automatic rotation.
### Storing a Secret

```bash
# Store an encrypted parameter
aws ssm put-parameter \
  --name "/prod/myapp/db-password" \
  --value "s3cureP@ssw0rd!" \
  --type SecureString \
  --key-id "alias/myapp-key" \
  --tags Key=Environment,Value=prod Key=Service,Value=myapp

# Retrieve it
aws ssm get-parameter \
  --name "/prod/myapp/db-password" \
  --with-decryption \
  --query "Parameter.Value" \
  --output text
```

The hierarchical naming (`/prod/myapp/db-password`) matters. It maps directly to IAM policies — you can grant access to `/prod/myapp/*` without exposing `/prod/billing/*`.
### Reading from Application Code

```python
import boto3

ssm = boto3.client('ssm', region_name='us-east-1')

def get_secret(name: str) -> str:
    response = ssm.get_parameter(
        Name=name,
        WithDecryption=True
    )
    return response['Parameter']['Value']

# Fetch at startup, cache in memory
DB_PASSWORD = get_secret('/prod/myapp/db-password')
API_KEY = get_secret('/prod/myapp/stripe-api-key')
```

### Terraform Setup
```hcl
resource "aws_ssm_parameter" "db_password" {
  name        = "/prod/myapp/db-password"
  description = "Production database password"
  type        = "SecureString"
  value       = var.db_password # passed via CI/CD, never in .tf files
  key_id      = aws_kms_key.myapp.arn

  tags = {
    Environment = "prod"
    Service     = "myapp"
  }
}

# IAM policy granting read access
resource "aws_iam_policy" "read_secrets" {
  name = "myapp-read-secrets"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ssm:GetParameter",
          "ssm:GetParameters",
          "ssm:GetParametersByPath"
        ]
        Resource = "arn:aws:ssm:us-east-1:123456789:parameter/prod/myapp/*"
      },
      {
        Effect   = "Allow"
        Action   = ["kms:Decrypt"]
        Resource = aws_kms_key.myapp.arn
      }
    ]
  })
}
```

**When to use SSM:** Configuration values, API keys, connection strings — anything that doesn't need automatic rotation. The standard tier gives you up to 10,000 parameters for free. That covers most teams.

**When not to use SSM:** When you need automatic credential rotation, cross-account sharing at scale, or multi-cloud support.
## AWS Secrets Manager
Secrets Manager is SSM’s more capable (and more expensive) sibling. The killer feature is built-in automatic rotation via Lambda functions. It also has first-class support for RDS, Redshift, and DocumentDB credentials.
### Storing and Retrieving

```bash
# Create a secret
aws secretsmanager create-secret \
  --name "prod/myapp/db-credentials" \
  --description "Production RDS credentials" \
  --secret-string '{"username":"app_user","password":"s3cureP@ssw0rd!"}'

# Retrieve it
aws secretsmanager get-secret-value \
  --secret-id "prod/myapp/db-credentials" \
  --query "SecretString" \
  --output text
```

### Application Code with Caching
```python
import json
import time
import boto3

client = boto3.client('secretsmanager', region_name='us-east-1')

class SecretCache:
    def __init__(self, ttl_seconds=300):
        self._cache = {}
        self._ttl = ttl_seconds

    def get(self, secret_id: str) -> dict:
        # Monotonic clock: immune to wall-clock jumps and DST changes
        now = time.monotonic()
        if secret_id in self._cache:
            value, fetched_at = self._cache[secret_id]
            if now - fetched_at < self._ttl:
                return value
        response = client.get_secret_value(SecretId=secret_id)
        secret = json.loads(response['SecretString'])
        self._cache[secret_id] = (secret, now)
        return secret

secrets = SecretCache(ttl_seconds=300)

# Usage
creds = secrets.get('prod/myapp/db-credentials')
connection = create_db_connection(
    host='mydb.cluster-xxx.us-east-1.rds.amazonaws.com',
    user=creds['username'],
    password=creds['password']
)
```

The caching layer is critical. Without it, every database connection makes an API call to Secrets Manager — adding latency and cost. A 5-minute TTL is a good default.
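One refinement worth considering on top of any TTL cache (a hedged sketch of my own, not part of the AWS SDK): add jitter to the expiry, so a fleet of instances that started together doesn't refresh the secret at the same instant after a rotation.

```python
import random
import time

def jittered_expiry(ttl_seconds: float, jitter_fraction: float = 0.1) -> float:
    """Return an absolute (monotonic-clock) expiry time with up to +/-10%
    jitter, so many instances don't all hit Secrets Manager at once."""
    jitter = ttl_seconds * jitter_fraction * (2 * random.random() - 1)
    return time.monotonic() + ttl_seconds + jitter
```

Store this value as `fetched_at`'s counterpart and refresh when `time.monotonic()` passes it; the spread smooths the refresh load across the fleet.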
### Automatic Rotation
This is where Secrets Manager shines. You wire up a Lambda function that rotates credentials on a schedule.
The rotation flow works in four steps: the rotation Lambda creates a new credential, sets it as the pending version, tests it against the target service, and then promotes it to the current version. If any step fails, the current secret remains untouched.
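The four steps above can be sketched as a minimal, self-contained dispatcher. The in-memory dicts below stand in for Secrets Manager and the target database; the step names match the ones the real rotation Lambda receives in its event, but everything else here is illustrative, not the AWS API.

```python
import secrets as token_gen

pending = {}                                # version id -> candidate credential
current = {"password": "old-password"}      # plays the AWSCURRENT version
database = {"password": "old-password"}     # what the target service accepts

def handler(event):
    step, version = event["Step"], event["ClientRequestToken"]
    if step == "createSecret":
        # Generate a candidate and stage it as the pending version
        pending[version] = {"password": token_gen.token_urlsafe(24)}
    elif step == "setSecret":
        # Apply the pending credential to the target service
        database["password"] = pending[version]["password"]
    elif step == "testSecret":
        # Verify the pending credential works before promoting it
        if database["password"] != pending[version]["password"]:
            raise RuntimeError("new credential rejected; current secret untouched")
    elif step == "finishSecret":
        # Promote pending to current
        current.update(pending.pop(version))

for step in ("createSecret", "setSecret", "testSecret", "finishSecret"):
    handler({"Step": step, "ClientRequestToken": "v2"})
```

The point is the ordering: a failure in `testSecret` aborts before `finishSecret` ever runs, so the current version never points at a broken credential.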
```hcl
resource "aws_secretsmanager_secret" "db_creds" {
  name        = "prod/myapp/db-credentials"
  description = "Auto-rotating RDS credentials"
  kms_key_id  = aws_kms_key.myapp.arn
}

resource "aws_secretsmanager_secret_rotation" "db_creds" {
  secret_id           = aws_secretsmanager_secret.db_creds.id
  rotation_lambda_arn = aws_lambda_function.secret_rotator.arn

  rotation_rules {
    automatically_after_days = 30
  }
}
```

**When to use Secrets Manager:** Database credentials you want auto-rotated, any secret that benefits from versioning, cross-account secret sharing. The $0.40/secret/month pricing is worth it for credentials that need rotation.
## HashiCorp Vault
Vault is a different beast. It’s not just a secrets store — it’s a secrets engine. It can generate short-lived, on-demand credentials for databases, cloud providers, PKI certificates, SSH keys, and more. This is the “dynamic secrets” pattern, and it’s genuinely transformative for security posture.
### Basic Operations

```bash
# Start a dev server (never do this in production)
vault server -dev

# Store a static secret
vault kv put secret/myapp/config \
  db_host="mydb.internal" \
  db_name="myapp_prod" \
  api_key="sk-abc123"

# Retrieve it
vault kv get -field=api_key secret/myapp/config
```

### Dynamic Database Credentials
This is Vault’s superpower. Instead of storing a static password, Vault creates a temporary database user on demand with a configurable TTL:
```bash
# Configure the database secrets engine
vault secrets enable database

vault write database/config/myapp-postgres \
  plugin_name=postgresql-database-plugin \
  allowed_roles="myapp-readonly","myapp-readwrite" \
  connection_url="postgresql://{{username}}:{{password}}@mydb.internal:5432/myapp" \
  username="vault_admin" \
  password="vault_admin_password"

# Define a role with a TTL
vault write database/roles/myapp-readonly \
  db_name=myapp-postgres \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; \
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"

# Get a dynamic credential (valid for 1 hour)
vault read database/creds/myapp-readonly
```

Every time you run that last command, Vault creates a brand-new database user with a unique password. When the TTL expires, Vault revokes it automatically. No rotation needed — the credentials are ephemeral by design.
### Application Integration (Node.js)

```javascript
const vault = require('node-vault')({
  apiVersion: 'v1',
  endpoint: process.env.VAULT_ADDR,
  token: process.env.VAULT_TOKEN // Use AppRole in production
});

async function getDatabaseCredentials() {
  const result = await vault.read('database/creds/myapp-readonly');
  return {
    username: result.data.username,
    password: result.data.password,
    lease_id: result.lease_id,
    ttl: result.lease_duration
  };
}

// Renew the lease before it expires
async function renewLease(leaseId) {
  await vault.write('sys/leases/renew', {
    lease_id: leaseId,
    increment: 3600 // request another hour
  });
}
```

**When to use Vault:** Multi-cloud environments, dynamic credentials, PKI certificate management, advanced access policies (Sentinel), encryption-as-a-service. Vault is the right choice when your security needs outgrow what AWS-native tools offer. The tradeoff is operational complexity — you're running a stateful, highly sensitive service that itself needs high availability and backup.
## Comparison Matrix
Here’s how I think about the decision:
| Dimension | SSM Parameter Store | Secrets Manager | HashiCorp Vault |
|---|---|---|---|
| Cost | Free (standard tier) | $0.40/secret/month | Self-hosted or HCP ($$$) |
| Auto-Rotation | Manual only | Built-in (Lambda) | Dynamic secrets (no rotation needed) |
| Multi-Cloud | AWS only | AWS only | Any cloud + on-prem |
| Dynamic Secrets | No | No | Yes (DB, AWS, PKI, SSH) |
| Complexity | Low | Medium | High |
| Audit Logging | CloudTrail | CloudTrail | Built-in audit device |
| Best For | Config + simple secrets | AWS DB credentials | Multi-cloud, dynamic secrets |
My rule of thumb: start with SSM. Graduate to Secrets Manager when you need rotation. Move to Vault when you need multi-cloud or dynamic secrets.
## Secret Rotation Patterns
Rotation is where most teams trip up. There are three patterns I’ve seen work:
### 1. Single-User Rotation
The simplest approach. One user, one secret. During rotation, there’s a brief window where the old password is invalid but some app instances may still cache it.
```python
# Rotation Lambda pseudocode
def rotate_secret(event, context):
    # Step 1: Create new password
    new_password = generate_secure_password()

    # Step 2: Update the database
    update_db_password(current_user, new_password)

    # Step 3: Store in Secrets Manager
    client.put_secret_value(
        SecretId=event['SecretId'],
        SecretString=json.dumps({
            'username': current_user,
            'password': new_password
        }),
        VersionStages=['AWSCURRENT']
    )
```

### 2. Alternating-User Rotation
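The `generate_secure_password()` helper in the pseudocode is left undefined; a reasonable sketch using Python's standard `secrets` module might look like this (the character set and length are my assumptions, not anything AWS mandates):

```python
import secrets
import string

# Excludes quotes and backslashes, which tend to break SQL statements
# and JSON escaping in rotation Lambdas
ALPHABET = string.ascii_letters + string.digits + "!@#$%^&*()-_=+"

def generate_secure_password(length: int = 32) -> str:
    """Generate a random password from a CSPRNG (never use random.choice)."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))
```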
Two database users, alternating. While one is active, the other gets its password rotated. Zero-downtime by design — the old credential remains valid until the next rotation cycle.
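The bookkeeping for alternating rotation is simple. A hedged sketch, assuming an `_a`/`_b` suffix convention for the paired users:

```python
def user_to_rotate(current_username: str) -> str:
    """Two paired users, e.g. app_user_a and app_user_b. The one NOT
    currently serving traffic is the one whose password gets rotated;
    the next rotation cycle swaps which user is active."""
    if current_username.endswith("_a"):
        return current_username[:-2] + "_b"
    if current_username.endswith("_b"):
        return current_username[:-2] + "_a"
    raise ValueError(f"expected a username ending in _a or _b, got {current_username!r}")
```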
### 3. Dynamic / Ephemeral (Vault)
No rotation at all. Each consumer gets a unique, short-lived credential. When it expires, they request a new one. This is the gold standard — there’s no long-lived secret to leak, rotate, or manage.
## Common Pitfalls
I’ve seen (and caused) all of these. Save yourself the incident.
**1. Secrets in environment variables logged to stdout.** Frameworks like Express, Django, and Spring can dump environment variables in error pages and crash reports. Use a secrets SDK, not env vars, for anything sensitive.
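It takes one line to see why. A contrived but runnable illustration:

```python
import os

os.environ["DB_PASSWORD"] = "s3cret"  # the "secure" env var

# A naive crash handler that dumps the environment for debugging
# ships the secret to whoever can read the logs:
crash_report = f"unhandled error, env={dict(os.environ)}"
print("s3cret" in crash_report)  # → True
```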
**2. No caching layer.** Calling Secrets Manager on every request adds 5-15ms of latency and costs money. Cache secrets in memory with a reasonable TTL (5-10 minutes).
**3. Terraform state containing plaintext secrets.** When you create an `aws_secretsmanager_secret_version` in Terraform, the secret value is stored in plaintext in your state file. Encrypt your state backend (S3 + KMS) and restrict access to it.
```hcl
# Always encrypt your Terraform state
terraform {
  backend "s3" {
    bucket         = "myapp-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    kms_key_id     = "arn:aws:kms:us-east-1:123456789:key/xxx"
    dynamodb_table = "terraform-locks"
  }
}
```

**4. Overly broad IAM policies.** Granting `ssm:GetParameter` on `*` means every Lambda in your account can read every secret. Scope to specific paths: `/prod/myapp/*`.
**5. No secret scanning in CI/CD.** Tools like gitleaks, truffleHog, or GitHub's built-in secret scanning should be mandatory in your pipeline. Catch secrets before they hit version control.
```bash
# Run gitleaks as a pre-commit hook
gitleaks detect --source . --verbose
```

**6. Forgetting to revoke after incidents.** If a secret leaks, rotating it isn't enough. You need to revoke the old one immediately, audit access logs for unauthorized usage, and trace the blast radius.
## Key Takeaways
After years of building and breaking secrets management systems, here’s what I keep coming back to:
- **Never hardcode secrets.** Not in code, not in config files, not in Docker images, not in environment variables that get logged.
- **Start simple, graduate up.** SSM Parameter Store handles 80% of use cases. Move to Secrets Manager when you need rotation. Move to Vault when you need dynamic secrets or multi-cloud.
- **Encrypt at rest and in transit.** Use KMS customer-managed keys for SSM and Secrets Manager. TLS everywhere. No exceptions.
- **Cache aggressively.** A 5-minute TTL on cached secrets balances freshness against latency and cost.
- **Audit everything.** CloudTrail for AWS-native tools, Vault's audit device for Vault. If you can't answer "who accessed this secret at 3am last Tuesday," your secrets management is incomplete.
- **Rotate or go ephemeral.** Long-lived, never-rotated secrets are ticking time bombs. Either automate rotation (Secrets Manager) or eliminate long-lived secrets entirely (Vault dynamic secrets).
- **Scan your repos.** Automated secret detection in CI/CD is non-negotiable. The twelve-minute window from push to compromise is real.
Secrets management isn’t glamorous, but it’s foundational. Get it right early, and you eliminate an entire class of security incidents. Get it wrong, and everything else you build on top is compromised.
Next up in the series: container security and how to lock down Docker and Kubernetes workloads without losing your mind.