Cloud · 5 Min Read

Security Ticketing and Incident Response

Gorav Singal

April 04, 2026

TL;DR

Every security team needs: severity classification (P1-P4), pre-written runbooks, a war room protocol, and blameless post-mortems. Automate ticket creation from alerts, and practice your response before you need it.


The worst time to figure out your incident response process is during an incident. I learned this the hard way when a credential leak hit production at 2 AM and we spent the first 45 minutes arguing about who should do what. That 45 minutes could have been containment time.

This article covers building an incident response process that works when everything is on fire.

Why Incident Response Matters

Security incidents are inevitable. The question isn’t if, but when. What separates good teams from bad ones is how fast they detect, contain, and recover.

Key metrics:

  • MTTD (Mean Time to Detect) — industry average: 197 days
  • MTTC (Mean Time to Contain) — industry average: 69 days
  • MTTR (Mean Time to Remediate) — your target: hours, not days
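Tracking these for your own team needs nothing more than a timestamp per phase. A minimal sketch, assuming each incident record carries `occurred_at`, `detected_at`, and `contained_at` fields (the field names are assumptions, not a standard schema):

```python
from datetime import datetime
from statistics import mean

def mean_hours(incidents, start_field, end_field):
    """Average elapsed hours between two timestamps across incident records."""
    return mean(
        (inc[end_field] - inc[start_field]).total_seconds() / 3600
        for inc in incidents
    )

# Hypothetical incident records illustrating the three phase timestamps.
incidents = [
    {
        "occurred_at": datetime(2026, 4, 4, 2, 0),
        "detected_at": datetime(2026, 4, 4, 2, 15),   # feeds MTTD
        "contained_at": datetime(2026, 4, 4, 2, 37),  # feeds MTTC
    },
]

mttd = mean_hours(incidents, "occurred_at", "detected_at")
mttc = mean_hours(incidents, "detected_at", "contained_at")
```

Run this over a quarter's worth of incidents and you get a trend line instead of an anecdote.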

Incident Response Lifecycle

Incident Classification (P1-P4)

Every incident needs a severity level that drives the response urgency.

Incident Severity Matrix

| Severity | Definition | Response Time | Examples |
|----------|------------|---------------|----------|
| P1 — Critical | Active breach, data exfiltration, system compromise | 15 min | Credential leak in production, ransomware, active attacker |
| P2 — High | Exploitable vulnerability, unauthorized access attempt | 1 hour | Open security group on production DB, suspicious IAM activity |
| P3 — Medium | Potential vulnerability, policy violation | 4 hours | Unpatched critical CVE, MFA not enabled on admin account |
| P4 — Low | Minor policy deviation, informational | 24 hours | Expired SSL cert (non-prod), minor config drift |
# incident_classification.yml
severity_matrix:
  P1_critical:
    impact: "Data breach, system compromise, active attacker"
    response_time: "15 minutes"
    war_room: true
    executive_notification: true
    responders: ["security-lead", "on-call-engineer", "engineering-manager"]

  P2_high:
    impact: "Exploitable vulnerability, unauthorized access"
    response_time: "1 hour"
    war_room: false
    executive_notification: false
    responders: ["security-team", "on-call-engineer"]

  P3_medium:
    impact: "Potential vulnerability, policy violation"
    response_time: "4 hours"
    war_room: false
    executive_notification: false
    responders: ["security-team"]

  P4_low:
    impact: "Minor deviation, informational"
    response_time: "24 hours"
    war_room: false
    executive_notification: false
    responders: ["security-team"]
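The same matrix can drive routing in code, so an alert's severity immediately yields a deadline and a responder list. A minimal sketch with the matrix inlined as a dict (mirroring the YAML above):

```python
# Inlined severity matrix — mirrors incident_classification.yml above.
SEVERITY_MATRIX = {
    "P1": {"response_minutes": 15,   "war_room": True,
           "responders": ["security-lead", "on-call-engineer", "engineering-manager"]},
    "P2": {"response_minutes": 60,   "war_room": False,
           "responders": ["security-team", "on-call-engineer"]},
    "P3": {"response_minutes": 240,  "war_room": False,
           "responders": ["security-team"]},
    "P4": {"response_minutes": 1440, "war_room": False,
           "responders": ["security-team"]},
}

def route_incident(severity: str) -> dict:
    """Return deadline and responders; unknown severities default to P3."""
    return SEVERITY_MATRIX.get(severity, SEVERITY_MATRIX["P3"])
```

Defaulting unknowns to P3 rather than P4 is a deliberate choice: misclassified alerts should err toward faster response.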

Building Runbooks

A runbook is a step-by-step guide for responding to a specific type of incident. It removes decision-making from crisis moments.

# runbooks/credential_leak.yml
name: "Credential Leak Response"
severity: P1
trigger: "API key, access key, or password found in public repo/logs"

steps:
  - name: "Immediate (0-5 min)"
    actions:
      - "Revoke the leaked credential immediately"
      - "Check CloudTrail for usage of the credential"
      - "Open war room Slack channel: #incident-YYYY-MM-DD"
      - "Page security lead and on-call engineer"

  - name: "Contain (5-30 min)"
    actions:
      - "Identify all services using the credential"
      - "Rotate the credential on all affected services"
      - "Check for lateral movement (unusual API calls, new resources)"
      - "Block source IP if identified"

  - name: "Investigate (30-120 min)"
    actions:
      - "Run Athena query: all API calls with leaked credential"
      - "Check for data access (S3 GetObject, DynamoDB scans)"
      - "Check for persistence (new IAM users, access keys, roles)"
      - "Document timeline in incident ticket"

  - name: "Recover"
    actions:
      - "Verify all credential rotations are complete"
      - "Remove any resources created by attacker"
      - "Enable additional monitoring for 72 hours"
      - "Schedule post-incident review within 48 hours"

  - name: "Prevention"
    actions:
      - "Add Gitleaks pre-commit hook to affected repo"
      - "Enable GitHub secret scanning"
      - "Review and tighten IAM permissions"
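The Athena query referenced in the "Investigate" step might look like the sketch below. It assumes CloudTrail logs have already been mapped into an Athena table; the `cloudtrail_logs` table and column names are assumptions, so adjust them to your own setup:

```python
def credential_usage_query(access_key_id: str, since: str) -> str:
    """Build an Athena SQL query listing every API call made with a leaked key.

    Assumes a `cloudtrail_logs` table built from CloudTrail's standard schema
    (table and column names are assumptions). This does no SQL escaping, so
    only feed it trusted, internally-generated values.
    """
    return f"""
        SELECT eventtime, eventname, sourceipaddress, requestparameters
        FROM cloudtrail_logs
        WHERE useridentity.accesskeyid = '{access_key_id}'
          AND eventtime > '{since}'
        ORDER BY eventtime
    """
```

Having this query pre-written in the runbook means the investigate phase starts with results, not with someone reconstructing CloudTrail's schema from memory.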

Ticketing Workflow

Every security incident should create a ticket automatically. Here’s a PagerDuty → Jira integration:

# webhook/incident_to_ticket.py
"""PagerDuty webhook → Jira ticket creation"""
import requests
from datetime import datetime, timezone

JIRA_URL = "https://company.atlassian.net"
JIRA_TOKEN = "..."  # base64("email:api_token") — load from Secrets Manager, never hardcode

def create_security_ticket(incident):
    severity = incident['severity']
    title = incident['title']
    description = incident.get('description', '')

    priority_map = {
        'P1': '1',  # Highest
        'P2': '2',  # High
        'P3': '3',  # Medium
        'P4': '4',  # Low
    }

    ticket = {
        "fields": {
            "project": {"key": "SEC"},
            "summary": f"[{severity}] {title}",
            "description": {
                "type": "doc",
                "version": 1,
                "content": [{
                    "type": "paragraph",
                    "content": [{"type": "text", "text": description}]
                }]
            },
            "issuetype": {"name": "Security Incident"},
            "priority": {"id": priority_map.get(severity, '3')},
            "labels": ["security-incident", severity.lower()],
            "customfield_10100": datetime.now(timezone.utc).isoformat(),  # Detection time
        }
    }

    response = requests.post(
        f"{JIRA_URL}/rest/api/3/issue",
        json=ticket,
        headers={
            "Authorization": f"Basic {JIRA_TOKEN}",
            "Content-Type": "application/json"
        },
        timeout=10,
    )
    response.raise_for_status()  # surface failures instead of silently returning garbage
    return response.json()['key']
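One piece the snippet above leaves implicit: extracting the `incident` dict from PagerDuty's webhook body. A hedged sketch follows; the field paths are modeled loosely on PagerDuty's v3 webhook shape, so treat them as assumptions and verify against a real payload:

```python
def parse_pagerduty_event(payload: dict) -> dict:
    """Map a PagerDuty webhook payload to the incident dict used above.

    Field paths are assumptions modeled on PagerDuty's v3 webhooks;
    check them against an actual payload before relying on this.
    """
    data = payload["event"]["data"]
    return {
        "severity": (data.get("priority") or {}).get("summary", "P3"),
        "title": data["title"],
        "description": data.get("description", ""),
    }
```

Missing priority falls back to P3 for the same reason as in the routing sketch: unknown severity should get a faster response, not a slower one.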

War Room Protocol

For P1 incidents, you need a structured war room:

P1 Alert Fires
↓
Auto-create Slack #incident-YYYY-MM-DD
↓
Page: Security Lead + On-Call + EM
↓
Assign Roles:

  • Incident Commander → Drive Timeline + Decisions
  • Technical Lead → Investigation + Remediation
  • Communications Lead → Stakeholder Updates

Roles:

  • Incident Commander — owns the timeline, makes decisions, keeps things moving
  • Technical Lead — hands on keyboard, investigating and remediating
  • Communications Lead — updates stakeholders, manages external comms if needed
  • Scribe — documents everything in the incident timeline

Rules of the war room:

  1. Start a shared document for the timeline — every action gets timestamped
  2. Update stakeholders every 30 minutes (even if the update is “still investigating”)
  3. Don’t fix and investigate simultaneously — contain first, then investigate
  4. Record every command you run — you’ll need this for the post-mortem
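Rule 4 is easier to follow when the scribe has even a trivial helper to paste commands into. A minimal sketch (the entry format is an assumption; any timestamped append-only log works):

```python
from datetime import datetime, timezone

def log_action(timeline: list, action: str) -> str:
    """Append a UTC-timestamped entry to the shared incident timeline."""
    stamp = datetime.now(timezone.utc).strftime("%H:%M:%S UTC")
    entry = f"{stamp} - {action}"
    timeline.append(entry)
    return entry

# Usage during an incident: one call per command or decision.
timeline = []
log_action(timeline, "Deactivated suspected access key")
log_action(timeline, "Opened war room channel")
```

In practice you'd back this with a shared doc or a Slack bot, but the discipline is the point: every action, timestamped, as it happens.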

Communication Templates

Pre-written templates save precious minutes during incidents.

## Internal Update (every 30 min)

**Incident:** [Brief description]
**Severity:** P[X]
**Status:** [Investigating / Containing / Remediated / Resolved]
**Impact:** [What's affected, who's affected]
**Current Actions:** [What we're doing right now]
**Next Update:** [Time of next update]
**War Room:** #incident-YYYY-MM-DD

---

## Executive Summary (for P1/P2)

**What happened:** [1-2 sentences]
**Customer impact:** [Yes/No, scope]
**Current status:** [Contained/Investigating/Resolved]
**Root cause:** [If known, or "Under investigation"]
**ETA to resolution:** [Best estimate or "TBD"]
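These templates can be rendered straight from the incident record, so nobody is hand-formatting Slack messages at 2 AM. A minimal sketch (the field names are assumptions):

```python
# Mirrors the internal-update template above; placeholders are assumed field names.
UPDATE_TEMPLATE = """\
**Incident:** {title}
**Severity:** {severity}
**Status:** {status}
**Impact:** {impact}
**Current Actions:** {actions}
**Next Update:** {next_update}
**War Room:** {war_room}"""

def render_update(incident: dict) -> str:
    """Fill the internal-update template from an incident record dict."""
    return UPDATE_TEMPLATE.format(**incident)
```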

Post-Incident Review

Every P1 and P2 gets a post-incident review within 48 hours. The key: blameless.

# post_incident_template.yml
incident_id: "SEC-2026-042"
date: "2026-04-04"
severity: "P1"
duration: "2 hours 15 minutes"

timeline:
  - time: "02:15 UTC"
    event: "GuardDuty alert: unusual API calls from IAM user 'deploy-bot'"
  - time: "02:20 UTC"
    event: "On-call paged, acknowledged"
  - time: "02:25 UTC"
    event: "War room opened, investigation started"
  - time: "02:35 UTC"
    event: "Identified: access key leaked in public GitHub repo"
  - time: "02:37 UTC"
    event: "Access key deactivated"
  - time: "02:50 UTC"
    event: "CloudTrail audit shows 23 S3 GetObject calls to customer data"
  - time: "03:30 UTC"
    event: "All affected credentials rotated"
  - time: "04:30 UTC"
    event: "Incident resolved, monitoring elevated"

root_cause: "Access key was hardcoded in a config file committed to a public repository"

what_went_well:
  - "GuardDuty detected unusual activity within 10 minutes"
  - "On-call responded in 5 minutes"
  - "Credential was revoked within 20 minutes of detection"

what_went_wrong:
  - "No pre-commit hook to catch secrets"
  - "Access key had broader permissions than needed"
  - "Took 45 minutes to identify all affected services"

action_items:
  - owner: "security-team"
    action: "Deploy Gitleaks pre-commit hooks to all repos"
    due: "2026-04-11"
  - owner: "platform-team"
    action: "Reduce deploy-bot permissions to least privilege"
    due: "2026-04-11"
  - owner: "security-team"
    action: "Add automated credential rotation for all service accounts"
    due: "2026-04-25"
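One field in the template worth deriving rather than typing: `duration` can be computed from the timeline itself. A minimal sketch, assuming all events fall on the same day (as in the timeline above):

```python
from datetime import datetime

def incident_duration(timeline: list) -> str:
    """Elapsed time between first and last timeline events (same-day only)."""
    times = [datetime.strptime(e["time"], "%H:%M UTC") for e in timeline]
    total_minutes = int((max(times) - min(times)).total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} hours {minutes} minutes"

# Abbreviated version of the timeline from the post-incident template above.
timeline = [
    {"time": "02:15 UTC", "event": "GuardDuty alert"},
    {"time": "02:37 UTC", "event": "Access key deactivated"},
    {"time": "04:30 UTC", "event": "Incident resolved"},
]
```

Deriving it keeps the report honest; hand-typed durations have a way of rounding in the team's favor.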

Key Takeaways

  1. Classify by severity — P1-P4 drives response time and escalation
  2. Write runbooks before incidents — remove decision-making from crisis moments
  3. Automate ticket creation — alerts should create tickets without human intervention
  4. War room protocol for P1s — Incident Commander, Tech Lead, Comms Lead, Scribe
  5. Blameless post-mortems — focus on systems and processes, not individuals
  6. Practice your response — tabletop exercises quarterly, game days annually
