Alert Storms Overview

An alert storm happens when a single root cause (e.g., a database going down) triggers many different alerts — API timeouts, connection refused errors, query failures, and so on. Without storm detection, each unique alert title would dispatch its own agent, leading to multiple agents fixing symptoms instead of the root cause.

How Storm Detection Works

Storm detection is purely count-based. No AI is involved in the detection step — it simply counts alerts per relay within a time window.
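
The counting step can be pictured with a minimal sketch. The `AlertRecord` shape and the function names below are hypothetical, chosen for illustration only; the real check may run as a database query instead of filtering an in-memory array.

```typescript
// Hypothetical record shape: only the relay and arrival time matter for counting.
interface AlertRecord {
  relayId: string;
  receivedAt: number; // epoch milliseconds
}

// Count the alerts for one relay that arrived within the last `windowSeconds`.
function countRecentAlerts(
  alerts: AlertRecord[],
  relayId: string,
  now: number,
  windowSeconds: number,
): number {
  const windowStart = now - windowSeconds * 1000;
  return alerts.filter(
    (a) => a.relayId === relayId && a.receivedAt > windowStart && a.receivedAt <= now,
  ).length;
}

// A storm is detected once the count exceeds the threshold
// (strictly greater than, following the wording "exceeds").
function isStorm(
  alerts: AlertRecord[],
  relayId: string,
  now: number,
  stormThreshold: number,
  stormWindowSeconds: number,
): boolean {
  return countRecentAlerts(alerts, relayId, now, stormWindowSeconds) > stormThreshold;
}
```

With a threshold of 5 and a 60-second window, the sixth alert on the same relay inside that window is the one that flips `isStorm` to `true` in this sketch.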

Normal Flow (No Storm)

When alerts arrive at a normal rate:

1. An alert arrives via webhook.
2. Relay rules execute.
3. The agent rule dispatches a coding agent.
4. The agent investigates and creates a PR.

Storm Flow

When a burst of alerts arrives:

1. The first 2 alerts dispatch agents immediately (configurable via `maxImmediateDispatches`).
2. Once the alert count exceeds the threshold within the time window, a storm is detected.
3. Subsequent alerts are held, and no additional agents are dispatched.
4. After a debounced delay, an AI triage step analyzes all storm alerts.
5. The triage identifies the root cause alert.
6. A single agent is dispatched for the root cause with full storm context.
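
The debounced delay before triage can be sketched as a deadline that moves back each time another alert joins the storm. This assumes the usual meaning of debounce; the `TriageDebouncer` class and its `debounceMs` setting are hypothetical names and do not come from the product's configuration.

```typescript
// Hypothetical helper: tracks when triage should run for one storm.
class TriageDebouncer {
  private readonly debounceMs: number;
  private lastAlertAt: number | null = null;

  constructor(debounceMs: number) {
    this.debounceMs = debounceMs;
  }

  // Call for every alert held by the storm; each call pushes the deadline back.
  recordAlert(now: number): void {
    this.lastAlertAt = now;
  }

  // Triage is due once no new alert has arrived for `debounceMs`.
  isTriageDue(now: number): boolean {
    return this.lastAlertAt !== null && now - this.lastAlertAt >= this.debounceMs;
  }
}
```

Waiting for a quiet period, instead of triaging at a fixed offset from the first alert, gives the triage step a better chance of seeing the whole burst at once.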

Storm Lifecycle

Status	Description
`collecting`	Storm detected, alerts are being collected
`triaging`	AI is analyzing the alerts to find the root cause
`dispatched`	Root cause identified, agent dispatched
`resolved`	Storm has been resolved (planned as a future enhancement; not yet implemented)
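
One way to read the table is as a small state machine. The sketch below assumes the statuses advance linearly in the order listed; the source does not state which transitions are allowed, and `NEXT_STATUS` and `canTransition` are hypothetical names.

```typescript
type StormStatus = "collecting" | "triaging" | "dispatched" | "resolved";

// Assumed linear progression, in the order the table lists the statuses.
const NEXT_STATUS: Record<StormStatus, StormStatus | null> = {
  collecting: "triaging",
  triaging: "dispatched",
  dispatched: "resolved", // described in the table as a future enhancement
  resolved: null, // terminal in this sketch
};

// Returns true only for the single forward step allowed from `from`.
function canTransition(from: StormStatus, to: StormStatus): boolean {
  return NEXT_STATUS[from] === to;
}
```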

Relationship to Alert Deduplication

Storm detection is a layer above the existing same-title dedup:

Same-title dedup prevents duplicate agents for alerts with identical (normalized) titles
Storm detection prevents excessive agents when many different alert titles fire from a shared root cause

Both checks run in sequence. Storm detection runs first — if it holds an alert, the same-title dedup is skipped entirely.
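
The ordering can be sketched as follows. The `Checks` interface, the `Decision` values, and the normalization rule are all hypothetical; the source does not specify how titles are normalized.

```typescript
type Decision = "held_by_storm" | "deduplicated" | "dispatch";

// Hypothetical hooks standing in for the two real checks.
interface Checks {
  stormHolds: (title: string) => boolean;
  isDuplicateTitle: (normalizedTitle: string) => boolean;
}

// Assumed normalization for illustration: trim, lowercase, collapse whitespace.
function normalizeTitle(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, " ");
}

function decide(title: string, checks: Checks): Decision {
  // Storm detection runs first; if it holds the alert, dedup is never consulted.
  if (checks.stormHolds(title)) return "held_by_storm";
  // Otherwise the same-title dedup decides whether an agent already covers this alert.
  if (checks.isDuplicateTitle(normalizeTitle(title))) return "deduplicated";
  return "dispatch";
}
```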

Configuration

Storm behavior is configured per agent rule. See Tuning Storm Detection for configuration options. The following agent rule shows the storm-related settings in context:

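const rule = {
  ruleType: "agent",
  config: {
    agentType: "devin",
    integrationId: "int_abc123",
    // A storm is detected once more than 5 alerts arrive...
    stormThreshold: 5,
    // ...within a 60-second window.
    stormWindowSeconds: 60,
    // The first 2 alerts still dispatch agents immediately.
    maxImmediateDispatches: 2,
  },
};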

Fail-Open Design

Storm detection is designed to never block alerts:

If the storm detection database query fails, the alert proceeds to normal dispatch
If the AI triage fails, the earliest high-severity alert is selected as root cause
Existing alert routing, notifications, and acknowledgment are unaffected by storm detection
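
The two fallbacks can be sketched as below. The `StormAlert` shape, the severity scale, and both function names are hypothetical. The source also does not say what happens when no high-severity alert exists, so this sketch assumes the earliest alert overall is used in that case.

```typescript
// Hypothetical alert shape and severity scale.
interface StormAlert {
  id: string;
  severity: "low" | "medium" | "high";
  receivedAt: number; // epoch milliseconds
}

// Fail open: if the storm check throws (for example, a failed database query),
// treat the alert as not held so it proceeds to normal dispatch.
function stormHoldsOrFailOpen(check: () => boolean): boolean {
  try {
    return check();
  } catch {
    return false;
  }
}

// Fallback when AI triage fails: pick the earliest high-severity alert.
// Assumption: with no high-severity alert, use the earliest alert overall.
function pickFallbackRootCause(alerts: StormAlert[]): StormAlert | undefined {
  const byTime = [...alerts].sort((a, b) => a.receivedAt - b.receivedAt);
  return byTime.find((a) => a.severity === "high") ?? byTime[0];
}
```

Returning `false` from the wrapper on any error is what makes the path fail open: a broken storm check degrades to the pre-storm behavior and never silently drops an alert.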