
Alert Storms Overview

An alert storm happens when a single root cause (e.g., a database going down) triggers many different alerts — API timeouts, connection refused errors, query failures, and so on. Without storm detection, each unique alert title would dispatch its own agent, leading to multiple agents fixing symptoms instead of the root cause.

Storm detection is purely count-based. No AI is involved in the detection step — it simply counts alerts per relay within a time window.
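The count-based check can be pictured as a sliding window per relay. The sketch below is illustrative only: `StormCounter`, `record`, and the in-memory map are hypothetical names, not the product's implementation. The strict "greater than" comparison is one reading of "exceeds the threshold".

```typescript
// Hypothetical sketch: count alerts per relay inside a sliding time window.
// Names and storage are assumptions for illustration, not the real API.
class StormCounter {
  private arrivals = new Map<string, number[]>(); // relayId -> arrival times (ms)
  private stormThreshold: number;
  private stormWindowSeconds: number;

  constructor(stormThreshold: number, stormWindowSeconds: number) {
    this.stormThreshold = stormThreshold;
    this.stormWindowSeconds = stormWindowSeconds;
  }

  // Records one alert and returns true when the number of alerts seen
  // for this relay within the window exceeds the threshold.
  record(relayId: string, nowMs: number): boolean {
    const windowStart = nowMs - this.stormWindowSeconds * 1000;
    const recent = (this.arrivals.get(relayId) ?? []).filter((t) => t > windowStart);
    recent.push(nowMs);
    this.arrivals.set(relayId, recent);
    return recent.length > this.stormThreshold;
  }
}
```

No model call appears anywhere in this path; the decision depends only on timestamps and a count.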

When alerts arrive at a normal rate:

  1. Alert arrives via webhook
  2. Relay rules execute
  3. Agent rule dispatches a coding agent
  4. Agent investigates and creates a PR

When a burst of alerts arrives:

  1. The first 2 alerts dispatch agents immediately (the limit is configurable via maxImmediateDispatches)
  2. Once the alert count exceeds the threshold within the time window, a storm is detected
  3. Subsequent alerts are held — no additional agents are dispatched
  4. After a debounced delay, an AI triage step analyzes all storm alerts
  5. The triage identifies the root cause alert
  6. A single agent is dispatched for the root cause with full storm context
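
Each storm has one of the following statuses:

Status      Description
----------  -------------------------------------------------
collecting  Storm detected, alerts are being collected
triaging    AI is analyzing the alerts to find the root cause
dispatched  Root cause identified, agent dispatched
resolved    Storm resolved (future enhancement)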
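
The statuses can be modeled as a small state machine. This is a minimal sketch: the status names come from the list above, but the linear ordering and the `advance` helper are assumptions inferred from the order of the burst steps, not documented behavior.

```typescript
// Status names as documented; everything else here is hypothetical.
type StormStatus = "collecting" | "triaging" | "dispatched" | "resolved";

// Assumed linear lifecycle, inferred from the order of the burst steps.
const nextStatus: Record<StormStatus, StormStatus | null> = {
  collecting: "triaging",
  triaging: "dispatched",
  dispatched: "resolved", // "resolved" is described as a future enhancement
  resolved: null,
};

// Moves a storm to its next status, or throws if it is already terminal.
function advance(status: StormStatus): StormStatus {
  const next = nextStatus[status];
  if (next === null) {
    throw new Error(`No transition out of "${status}"`);
  }
  return next;
}
```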

Storm detection is a layer above the existing same-title dedup:

  • Same-title dedup prevents duplicate agents for alerts with identical (normalized) titles
  • Storm detection prevents excessive agents when many different alert titles fire from a shared root cause

Both checks run in sequence. Storm detection runs first — if it holds an alert, the same-title dedup is skipped entirely.
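
The ordering of the two checks can be sketched as follows. The `Checks` interface, the `decide` function, and the normalization rule are all hypothetical; the source does not describe how titles are normalized, so the rule shown is only a placeholder.

```typescript
type Decision = "held_by_storm" | "deduplicated" | "dispatch";

// Hypothetical hooks standing in for the two real checks.
interface Checks {
  stormHolds: (title: string) => boolean;
  isDuplicateTitle: (normalizedTitle: string) => boolean;
}

// Placeholder normalization (assumption): trim, lowercase, collapse whitespace.
function normalizeTitle(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, " ");
}

function decide(title: string, checks: Checks): Decision {
  // Storm detection runs first; if it holds the alert, dedup is never consulted.
  if (checks.stormHolds(title)) return "held_by_storm";
  if (checks.isDuplicateTitle(normalizeTitle(title))) return "deduplicated";
  return "dispatch";
}
```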

Storm behavior is configured per agent rule. See Tuning Storm Detection for configuration options.

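const rule = {
  ruleType: "agent",
  config: {
    agentType: "devin",
    integrationId: "int_abc123",
    stormThreshold: 5,         // a storm is detected once the alert count exceeds this
    stormWindowSeconds: 60,    // length of the counting window, in seconds
    maxImmediateDispatches: 2, // alerts that dispatch agents immediately during a burst
  },
};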

Storm detection is designed to never block alerts:

  • If the storm detection database query fails, the alert proceeds to normal dispatch
  • If the AI triage fails, the earliest high-severity alert is selected as root cause
  • Existing alert routing, notifications, and acknowledgment are unaffected by storm detection
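
A minimal sketch of the two fallbacks, under stated assumptions: the `StormAlert` shape, the severity levels, and both function names are hypothetical. The source does not say what happens when no high-severity alert exists, so falling back to the earliest alert overall is a guess made only for illustration.

```typescript
// Hypothetical alert shape; field names and severity levels are assumptions.
interface StormAlert {
  id: string;
  severity: "low" | "medium" | "high";
  receivedAt: number; // epoch milliseconds
}

// Fail open: if the storm query throws, report "no storm" so the alert
// proceeds to normal dispatch.
async function isStormFailOpen(query: () => Promise<boolean>): Promise<boolean> {
  try {
    return await query();
  } catch {
    return false;
  }
}

// Triage fallback: pick the earliest high-severity alert as the root cause.
// Using the earliest alert overall when none is high severity is an assumption.
function fallbackRootCause(alerts: StormAlert[]): StormAlert | undefined {
  const byTime = [...alerts].sort((a, b) => a.receivedAt - b.receivedAt);
  return byTime.find((a) => a.severity === "high") ?? byTime[0];
}
```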