Velocity Rules for AI Agents: Why Human Thresholds Break

Velocity rules are one of the oldest tools in the payments fraud playbook: count how many transactions a card or account fires in a rolling window, and alert when the count crosses a threshold. This logic has guarded card networks for decades, and a recent companion piece on transaction velocity fraud detection with SQL windows walks through the canonical implementation in streaming SQL.

The buyer is changing. AI agents are now buyers, schedulers, and rebalancers: they book flights, restock pantries, optimize portfolios, and pay invoices. They do all of this at machine speed, fully authorized, on behalf of a human user.

Human velocity rules break the moment an AI agent touches the rail. A rule like "more than 5 transactions per minute is anomalous" was tuned to a person tapping a phone screen. An AI agent comparing 12 SKUs across 4 marketplaces in 15 seconds is not a fraudster; it is doing exactly what the user asked. The rule engine, blind to that distinction, flags the legitimate flow and blocks the user.

This article explains why human thresholds break for AI agents, and how to redesign velocity rules using three complementary signals: per-agent baselines, peer comparison across agent cohorts, and cadence pattern detection. Every example runs end to end on RisingWave v2.8.0.

Why Human Velocity Thresholds Break for AI Agents

The original assumption behind any per-account velocity rule is that a human sits behind every authorized transaction. A human types, hesitates, navigates a checkout page, taps a confirmation button, then waits. Their cadence is jittered. Their bursts top out at the speed of their fingers. A spike in count is therefore evidence of automation, and automation on a payment account is the historical signature of fraud.

AI agents invert this assumption. The agent is the authorization. The user delegates intent ("book me a flight under $400 to Tokyo next Tuesday") and the agent translates that intent into many small actions, each of which may touch payment surfaces. Three concrete failure modes follow.

Bursty workflows look like card testing. A shopping assistant comparing prices across vendors might issue multiple authorization holds in quick succession to verify availability. The pattern looks identical to a fraudster running stolen card numbers through a merchant API. Block the agent and the legitimate purchase fails.

Per-account thresholds penalize productive agents. A finance optimizer running on behalf of a power user may legitimately fire dozens of transactions per minute during a rebalance. A travel booker holding inventory across seven hotel chains will easily exceed a 5-transactions-per-minute global rule. There is no single threshold that fits both a casual cardholder and an active agent.

Static rules cannot tell good agents from hijacked agents. When an attacker compromises an agent's API key, the malicious traffic now flows through an account whose normal velocity is already high. A simple "transactions per minute" cap will not detect the change because the baseline was already permissive. Worse, raising the cap to admit legitimate agent traffic widens the attacker's window.

The fix is not to throw out velocity rules. Velocity is still one of the cleanest signals available in real time. The fix is to redefine what "anomalous velocity" means for agents.

A New Velocity Model for Agents

A robust agent velocity model rests on three signals.

Per-agent baselines. Every agent carries its own learned distribution. A shopping assistant might have a normal of 1 transaction per minute; a finance optimizer might have a normal of 30. The rule engine compares each agent to its own history, not to a global threshold.

Peer comparison. Agents performing the same job form a cohort. Shopping assistants behave like other shopping assistants; finance optimizers behave like other finance optimizers. When one agent's rate diverges sharply from peers running the same workflow, that is suspicious in a way that an absolute threshold cannot capture. Peer comparison also helps in the hijacked-agent case: an attacker may stay within a per-agent baseline that is already permissive, or that their own traffic has corrupted, but an agent draining funds quickly will still diverge from peers running the same workflow.

Cadence pattern detection. Real workflows have noise. Network jitter, retry backoff, dependency on user inputs, and rate limiters from upstream APIs all introduce variance into the inter-arrival distribution of transactions. A truly machine-driven attack often shows up as evenly spaced events: 1 transaction every 8 seconds, like clockwork. Measuring the standard deviation of inter-arrival gaps gives a separate signal from raw count.

Together these three signals form a velocity risk score that reflects how an agent is behaving relative to itself, relative to its cohort, and relative to natural cadence noise. The rest of this article shows how to compute each signal in streaming SQL on RisingWave, then combine them into a real-time decision.

Computing Per-Agent Baselines with Streaming SQL

We start with a simple table representing agent transactions. In production this would be a Kafka source; the schema is identical.

CREATE TABLE aap07_agent_tx (
    tx_id      VARCHAR PRIMARY KEY,
    agent_id   VARCHAR,
    user_id    VARCHAR,
    agent_type VARCHAR,
    amount     DECIMAL,
    tx_time    TIMESTAMPTZ
);
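
For the Kafka-backed variant mentioned above, a minimal sketch might look like the following. It would replace the plain CREATE TABLE rather than sit alongside it. The topic name and broker address are placeholders, and the connector options should be checked against the RisingWave Kafka source documentation for your version.

```sql
-- Sketch of a Kafka-backed version of the same table (same columns, same name).
-- 'agent-transactions' and 'kafka:9092' are placeholder values, not from the article.
CREATE TABLE aap07_agent_tx (
    tx_id      VARCHAR PRIMARY KEY,
    agent_id   VARCHAR,
    user_id    VARCHAR,
    agent_type VARCHAR,
    amount     DECIMAL,
    tx_time    TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'agent-transactions',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```

Because the downstream views only reference the table name and columns, they should not need to change when the source is swapped.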

A small seed dataset covers six agents in three cohorts: a normal shopping assistant, a hijacked shopping assistant firing 12 evenly spaced low-value transactions, two normal travel bookers, a normal finance optimizer, and a hijacked finance optimizer running 8 evenly spaced transactions. Each agent is tagged with its agent_type, which is the cohort key for peer comparison later.

The first materialized view computes per-agent velocity over a rolling 5-minute window: total count, total amount, and a derived transactions-per-minute rate.

CREATE MATERIALIZED VIEW aap07_per_agent_velocity_mv AS
SELECT
    agent_id,
    agent_type,
    COUNT(*)            AS tx_count_5min,
    SUM(amount)         AS total_amount_5min,
    AVG(amount)         AS avg_amount_5min,
    COUNT(*) / 5.0      AS tx_per_min
FROM aap07_agent_tx
WHERE tx_time > NOW() - INTERVAL '5 minutes'
GROUP BY agent_id, agent_type;

RisingWave maintains this view incrementally. Every new event updates the matching agent_id row in place; no batch job sweeps the table. Querying the live view returns the current state:

 agent_id  |     agent_type     | tx_count_5min | total_amount_5min | tx_per_min
-----------+--------------------+---------------+-------------------+------------
 agent_001 | shopping_assistant |             1 |             41.75 |        0.2
 agent_002 | shopping_assistant |            10 |             99.90 |        2.0
 agent_003 | travel_booker      |             1 |            145.00 |        0.2
 agent_004 | travel_booker      |             2 |            340.75 |        0.4
 agent_005 | finance_optimizer  |             2 |           2950.00 |        0.4
 agent_006 | finance_optimizer  |             8 |            400.00 |        1.6

Two things stand out. agent_002 is firing at 2 transactions per minute, ten times faster than its peer agent_001. agent_006 is at 1.6 transactions per minute, four times faster than agent_005. A flat global threshold of, say, 1 transaction per minute would also flag agent_005 as it ramps up, even though that agent is operating normally for its workflow. We need a smarter baseline.

In production you would back this view with a longer-horizon historical baseline (per-agent rolling 30-day median, 95th percentile, etc.) and store it in a separate slowly-changing dimension. The 5-minute aggregate above is the live signal that is compared against that baseline. For brevity in this walkthrough we lean on peer baselines instead, which require no historical store.
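
Although the walkthrough does not build it, the per-agent comparison described above could be sketched roughly as follows. The aap07_agent_baseline table, its columns, the 2x multiplier, and the flag names are all hypothetical; the job that populates the baseline is not shown.

```sql
-- Hypothetical slowly-changing dimension holding each agent's long-horizon baseline.
-- Assumed to be populated by an offline or nightly job that is not shown here.
CREATE TABLE aap07_agent_baseline (
    agent_id          VARCHAR PRIMARY KEY,
    median_tx_per_min DECIMAL,
    p95_tx_per_min    DECIMAL
);

-- Compare the live 5-minute rate with the agent's own history.
-- The 2x multiplier over the historical p95 is an illustrative choice.
CREATE MATERIALIZED VIEW aap07_self_baseline_mv AS
SELECT
    v.agent_id,
    v.agent_type,
    v.tx_per_min,
    h.p95_tx_per_min AS own_p95,
    CASE
      WHEN h.agent_id IS NULL                  THEN 'NO_BASELINE'
      WHEN v.tx_per_min > 2 * h.p95_tx_per_min THEN 'ABOVE_OWN_BASELINE'
      ELSE 'WITHIN_OWN_BASELINE'
    END AS self_flag
FROM aap07_per_agent_velocity_mv v
LEFT JOIN aap07_agent_baseline h
  ON v.agent_id = h.agent_id;
```

The LEFT JOIN keeps agents that have no stored history yet, so a new agent surfaces as NO_BASELINE instead of silently disappearing from the result.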

Peer Comparison: Detecting Outlier Agents

Peer comparison groups agents by agent_type and computes the median and 95th percentile of tx_per_min across the cohort. RisingWave's APPROX_PERCENTILE aggregate keeps the computation streaming-friendly. (See the RisingWave aggregate functions reference for the full list.)

CREATE MATERIALIZED VIEW aap07_peer_baseline_mv AS
SELECT
    agent_type,
    COUNT(*)                                                AS agents_in_peer_group,
    AVG(tx_per_min)                                         AS mean_tx_per_min,
    APPROX_PERCENTILE(0.5)  WITHIN GROUP (ORDER BY tx_per_min) AS median_tx_per_min,
    APPROX_PERCENTILE(0.95) WITHIN GROUP (ORDER BY tx_per_min) AS p95_tx_per_min
FROM aap07_per_agent_velocity_mv
GROUP BY agent_type;

Querying the peer baseline reveals what "normal" looks like for each cohort:

     agent_type     | agents_in_peer_group | median_tx_per_min | p95_tx_per_min
--------------------+----------------------+-------------------+----------------
 finance_optimizer  |                    2 |              0.40 |           0.40
 shopping_assistant |                    2 |              0.20 |           0.20
 travel_booker      |                    2 |              0.20 |           0.20

Each cohort has its own median, and a finance optimizer's normal rate here is twice that of a shopping assistant. The gap is far wider in a realistic deployment, where a finance optimizer may legitimately run at tens of transactions per minute. A flat global rule set high enough to admit that cohort would also admit a hijacked shopping assistant running at a fraction of that rate.

The peer baseline becomes useful when joined back to per-agent velocity. The next view flags any agent whose rate is at least two times (OUTLIER_2X) or three times (OUTLIER_3X) the peer median.

CREATE MATERIALIZED VIEW aap07_velocity_anomalies_mv AS
SELECT
    v.agent_id,
    v.agent_type,
    v.tx_count_5min,
    v.tx_per_min,
    b.median_tx_per_min                                AS peer_median,
    ROUND((v.tx_per_min / NULLIF(b.median_tx_per_min, 0))::numeric, 2)
                                                       AS ratio_vs_peer,
    CASE
      WHEN v.tx_per_min >= 3 * b.median_tx_per_min THEN 'OUTLIER_3X'
      WHEN v.tx_per_min >= 2 * b.median_tx_per_min THEN 'OUTLIER_2X'
      ELSE 'NORMAL'
    END                                                AS peer_flag
FROM aap07_per_agent_velocity_mv v
JOIN aap07_peer_baseline_mv b
  ON v.agent_type = b.agent_type;

Live output:

 agent_id  |     agent_type     | tx_count_5min | tx_per_min | peer_median | ratio_vs_peer | peer_flag
-----------+--------------------+---------------+------------+-------------+---------------+------------
 agent_002 | shopping_assistant |            10 |        2.0 |        0.20 |         10.01 | OUTLIER_3X
 agent_006 | finance_optimizer  |             8 |        1.6 |        0.40 |          3.98 | OUTLIER_3X
 agent_004 | travel_booker      |             2 |        0.4 |        0.20 |          2.00 | OUTLIER_2X
 agent_001 | shopping_assistant |             1 |        0.2 |        0.20 |          1.00 | NORMAL
 agent_003 | travel_booker      |             1 |        0.2 |        0.20 |          1.00 | NORMAL
 agent_005 | finance_optimizer  |             2 |        0.4 |        0.40 |          0.99 | NORMAL

The hijacked shopping assistant agent_002 is firing at 10x its peer median, and the hijacked finance optimizer agent_006 is at 4x. The legitimate finance optimizer agent_005 sits at the cohort median and stays NORMAL, because its rate is judged against other finance optimizers rather than against a single global cap.

Note that peer ratios are noisy when the cohort is small. In production the cohort key would typically combine agent type with vendor (e.g. vendor:openai-shopping-1.2) and the cohort would be hundreds or thousands of agents, making the median statistically stable. Tuning thresholds (2x, 3x) is also workload-specific and should be calibrated on a held-out historical week.
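
One way to act on the small-cohort caveat is to publish a peer baseline only once the cohort is large enough. The sketch below reuses the aggregate from the peer baseline view and adds a HAVING guard; the view name and the minimum cohort size of 30 are assumptions to be tuned per workload.

```sql
-- Variant of the peer baseline that suppresses cohorts too small to trust.
-- The threshold of 30 agents is an illustrative value, not a recommendation.
CREATE MATERIALIZED VIEW aap07_peer_baseline_guarded_mv AS
SELECT
    agent_type,
    COUNT(*)                                                   AS agents_in_peer_group,
    APPROX_PERCENTILE(0.5) WITHIN GROUP (ORDER BY tx_per_min)  AS median_tx_per_min
FROM aap07_per_agent_velocity_mv
GROUP BY agent_type
HAVING COUNT(*) >= 30;
```

Agents in a suppressed cohort would drop out of an inner join against this view, so a downstream view would need a LEFT JOIN and an explicit "no peer baseline" branch to keep them visible.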

Cadence Pattern Detection

Peer comparison catches obvious volume outliers. It does not always catch the subtle case where an attacker keeps the volume modest but generates events with machine precision. Real workflows have variance: API retries, downstream rate limits, user think time, and dependency latency all jitter the inter-arrival distribution.

A pure scripted attack often produces events spaced too evenly to be human, and too evenly to be a real agent operating against an external API. Measuring the standard deviation of gaps gives a separate signal that catches this pattern even when total volume is modest.

CREATE MATERIALIZED VIEW aap07_cadence_anomaly_mv AS
WITH ordered AS (
    SELECT
        agent_id,
        agent_type,
        tx_time,
        LAG(tx_time) OVER (PARTITION BY agent_id ORDER BY tx_time) AS prev_tx_time
    FROM aap07_agent_tx
),
gaps AS (
    SELECT
        agent_id,
        agent_type,
        EXTRACT(EPOCH FROM (tx_time - prev_tx_time)) AS gap_seconds
    FROM ordered
    WHERE prev_tx_time IS NOT NULL
)
SELECT
    agent_id,
    agent_type,
    COUNT(*)                                       AS gap_count,
    ROUND(AVG(gap_seconds)::numeric, 2)            AS mean_gap_s,
    ROUND(STDDEV_SAMP(gap_seconds)::numeric, 2)    AS stddev_gap_s,
    ROUND(
        (STDDEV_SAMP(gap_seconds) / NULLIF(AVG(gap_seconds), 0))::numeric,
        3
    )                                              AS coeff_of_variation,
    CASE
      WHEN COUNT(*) >= 4
           AND STDDEV_SAMP(gap_seconds) / NULLIF(AVG(gap_seconds), 0) < 0.15
        THEN 'MACHINE_CADENCE'
      ELSE 'HUMAN_LIKE'
    END                                            AS cadence_flag
FROM gaps
GROUP BY agent_id, agent_type;

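The coefficient of variation (standard deviation divided by mean) is the right unitless measure here, because an absolute standard deviation means different things at different scales: a 1-second standard deviation around a 90-second mean gap (CV of about 0.011) is near clockwork, while the same 1-second deviation around a 3-second mean gap (CV of about 0.33) is ordinary jitter. A CV below 0.15 means the standard deviation of the gaps is less than 15% of the mean gap, which is rarer in real workflows than most teams expect.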

Live output:

 agent_id  |     agent_type     | gap_count | mean_gap_s | stddev_gap_s | coeff_of_variation |  cadence_flag
-----------+--------------------+-----------+------------+--------------+--------------------+-----------------
 agent_002 | shopping_assistant |        11 |      10.00 |            0 |                  0 | MACHINE_CADENCE
 agent_006 | finance_optimizer  |         7 |       8.00 |            0 |                  0 | MACHINE_CADENCE
 agent_005 | finance_optimizer  |         2 |      90.00 |            0 |                  0 | HUMAN_LIKE
 agent_003 | travel_booker      |         2 |      35.00 |         7.07 |              0.202 | HUMAN_LIKE
 agent_001 | shopping_assistant |         4 |      86.25 |        18.87 |              0.219 | HUMAN_LIKE
 agent_004 | travel_booker      |         1 |      65.00 |              |                    | HUMAN_LIKE

Both hijacked agents (agent_002 at 10s gaps, agent_006 at 8s gaps) are surfaced as MACHINE_CADENCE. The legitimate shopping assistant agent_001 has enough samples to pass the guard but sits at a coefficient of variation of 0.219, comfortably above the 0.15 threshold. The minimum-sample guard (gap_count >= 4) prevents false positives on agents with too few gaps to estimate variance: agent_005 shows zero variation, but only across two gaps, so it stays HUMAN_LIKE. Note that this view has no time filter, so gap counts reflect every transaction an agent has in the table, not just the 5-minute window behind tx_count_5min.
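
The agent_003 row can be checked by hand. With two gaps, a mean of 35.00, and a sample standard deviation of 7.07, the gaps must be 30 and 40 seconds (inferred from the output, since the seed rows are not listed). A quick ad hoc query, as a sketch:

```sql
-- Recompute agent_003's cadence statistics from its two inferred gaps.
-- stddev_samp of (30, 40) = sqrt(((30-35)^2 + (40-35)^2) / (2 - 1)) = sqrt(50)
SELECT
    ROUND(AVG(gap_seconds)::numeric, 2)         AS mean_gap_s,          -- 35.00
    ROUND(STDDEV_SAMP(gap_seconds)::numeric, 2) AS stddev_gap_s,        -- 7.07
    ROUND(
        (STDDEV_SAMP(gap_seconds) / NULLIF(AVG(gap_seconds), 0))::numeric,
        3
    )                                           AS coeff_of_variation   -- 0.202
FROM (VALUES (30.0), (40.0)) AS t(gap_seconds);
```

The same arithmetic explains the NULL for agent_004: a sample standard deviation is undefined for a single gap.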

In practice you will want to mix this with longer-window cadence statistics and combine with retry telemetry from your gateway: legitimate retries are bursty but not perfectly periodic.

Combining Signals into a Velocity Risk Score

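The signals each catch different attack shapes, and combining them into a weighted score is intended to reduce false positives while preserving recall. The final view joins the peer flag with the cadence flag, adds a smaller contribution from the raw 5-minute count, and assigns a numeric risk score from which a downstream service derives an action. The per-agent historical baseline is left out here, as noted earlier.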

CREATE MATERIALIZED VIEW aap07_velocity_risk_score_mv AS
WITH scored AS (
    SELECT
        a.agent_id,
        a.agent_type,
        a.tx_count_5min,
        a.ratio_vs_peer,
        a.peer_flag,
        c.coeff_of_variation,
        c.cadence_flag,
        -- Peer-outlier points + cadence points + raw-volume points.
        (CASE a.peer_flag
            WHEN 'OUTLIER_3X' THEN 50
            WHEN 'OUTLIER_2X' THEN 30
            ELSE 0
         END
         +
         CASE c.cadence_flag
            WHEN 'MACHINE_CADENCE' THEN 40
            ELSE 0
         END
         +
         CASE
            WHEN a.tx_count_5min >= 8 THEN 20
            WHEN a.tx_count_5min >= 5 THEN 10
            ELSE 0
         END
        ) AS risk_score
    FROM aap07_velocity_anomalies_mv a
    -- LEFT JOIN keeps agents with no gap rows yet; their cadence points fall to 0.
    LEFT JOIN aap07_cadence_anomaly_mv c
      ON a.agent_id = c.agent_id
)
SELECT
    agent_id,
    agent_type,
    tx_count_5min,
    ratio_vs_peer,
    peer_flag,
    coeff_of_variation,
    cadence_flag,
    risk_score,
    -- Derive the action from the score computed once in the CTE.
    CASE
      WHEN risk_score >= 70 THEN 'BLOCK'
      WHEN risk_score >= 40 THEN 'STEP_UP'
      ELSE 'ALLOW'
    END AS action
FROM scored;

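The weights say: a 3x peer outlier on its own (50 points) is enough to step up, and so is machine cadence on its own (40 points). A 2x peer outlier alone (30 points) stays below the step-up line. A 3x outlier combined with the top volume tier (20 points) reaches the BLOCK threshold of 70, whereas machine cadence plus volume tops out at 60 and remains a step-up. Any peer outlier combined with machine cadence reaches the block line (70 for a 2x outlier, 90 for a 3x outlier). In production these weights and thresholds belong in a configuration table so the risk team can tune without redeploying.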
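
The tiers can be checked with a small ad hoc query that applies the same point values and cut-offs as the view to a handful of combinations. The scenario labels are illustrative; the point values and thresholds are the ones in the view definition.

```sql
-- Enumerate a few signal combinations and apply the view's 70 / 40 cut-offs.
SELECT
    scenario,
    peer_pts + cadence_pts + volume_pts AS risk_score,
    CASE
      WHEN peer_pts + cadence_pts + volume_pts >= 70 THEN 'BLOCK'
      WHEN peer_pts + cadence_pts + volume_pts >= 40 THEN 'STEP_UP'
      ELSE 'ALLOW'
    END AS action
FROM (VALUES
    ('2x outlier only',               30,  0,  0),  --  30 -> ALLOW
    ('machine cadence only',           0, 40,  0),  --  40 -> STEP_UP
    ('3x outlier only',               50,  0,  0),  --  50 -> STEP_UP
    ('machine cadence + high volume',  0, 40, 20),  --  60 -> STEP_UP
    ('3x outlier + high volume',      50,  0, 20),  --  70 -> BLOCK
    ('2x outlier + machine cadence',  30, 40,  0),  --  70 -> BLOCK
    ('all three signals',             50, 40, 20)   -- 110 -> BLOCK
) AS t(scenario, peer_pts, cadence_pts, volume_pts);
```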

Live output:

 agent_id  |     agent_type     | peer_flag  |  cadence_flag   | risk_score | action
-----------+--------------------+------------+-----------------+------------+--------
 agent_002 | shopping_assistant | OUTLIER_3X | MACHINE_CADENCE |        110 | BLOCK
 agent_006 | finance_optimizer  | OUTLIER_3X | MACHINE_CADENCE |        110 | BLOCK
 agent_004 | travel_booker      | OUTLIER_2X | HUMAN_LIKE      |         30 | ALLOW
 agent_001 | shopping_assistant | NORMAL     | HUMAN_LIKE      |          0 | ALLOW
 agent_003 | travel_booker      | NORMAL     | HUMAN_LIKE      |          0 | ALLOW
 agent_005 | finance_optimizer  | NORMAL     | HUMAN_LIKE      |          0 | ALLOW

Both hijacked agents are flagged BLOCK with a score of 110. The legitimate finance optimizer agent_005 (2 transactions in the window, or 0.4 per minute) is cleared as ALLOW because its rate matches the cohort median and its cadence flag is HUMAN_LIKE. The travel booker agent_004 shows up as a 2x peer outlier on volume alone but, at 30 points, stays under both the step-up and block lines because no other signal corroborates it. That is the desired behavior: a single weak signal should not be enough to interrupt a user.

The risk score view is the surface a fraud decisioning service polls. Because RisingWave is PostgreSQL-compatible, the service connects with any Postgres driver and issues a SELECT action FROM aap07_velocity_risk_score_mv WHERE agent_id = $1. There is no separate cache to keep in sync, no Kafka topic to consume, no Java job to redeploy when a threshold changes. For the operational pattern behind this approach, the transaction velocity fraud detection article walks through serving fraud scores directly from materialized views.

Production Considerations

A few operational notes before going live with this design.

Cohort definition matters. "Shopping assistant" is too coarse for production. Combine agent type with vendor and version (e.g. openai/shopping-1.2.0, anthropic/commerce-claude-2026-04). Cohort medians are noisier when cohorts are small, so favor wider cohorts where the workflow truly is comparable.

Baselines should drift. A new agent type ramps up over weeks. Lock baselines too tightly and you get false positives during launch. Lock too loosely and you miss subtle drift attacks. A common pattern is an exponentially weighted moving average over the per-agent rate, rebuilt nightly, with a separate fast 5-minute window for short-burst detection.
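
As a rough illustration of the nightly update, one exponentially weighted step is new = alpha * today + (1 - alpha) * previous. The smoothing factor of 0.1, the agent names, and the sample rates below are made-up values for the sketch.

```sql
-- One nightly EWMA step per agent, with alpha = 0.1 (illustrative).
-- A small alpha lets the baseline drift slowly, so a single unusual day moves it only slightly.
SELECT
    agent_id,
    prev_baseline,
    todays_rate,
    0.1 * todays_rate + 0.9 * prev_baseline AS new_baseline
FROM (VALUES
    ('agent_a',  2.0,  2.4),   -- 0.24 + 1.80 = 2.04
    ('agent_b', 30.0, 20.0)    -- 2.00 + 27.00 = 29.00
) AS t(agent_id, prev_baseline, todays_rate);
```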

Combine with non-velocity signals. Velocity alone, even computed well, will misfire. Combine the score above with merchant-risk signals, geography mismatch, agent-key freshness, and amount distribution checks. Each lives in its own materialized view; the final decision view joins them all. The OWASP Application Security Verification Standard covers complementary controls.

Treat hijacked-agent recovery as a first-class flow. When the score crosses BLOCK, the rule engine should not just deny the transaction; it should also revoke the agent key and notify the user. Because RisingWave can sink the high-score rows directly to Kafka or to a webhook, the alerting wire-up is one CREATE SINK away.
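
A sketch of that sink, assuming a Kafka cluster is available. The sink name, topic, and broker address are placeholders, and the exact connector options should be confirmed against the RisingWave sink documentation for your version.

```sql
-- Push BLOCK decisions to a Kafka topic for the key-revocation and notification flow.
-- 'agent-velocity-blocks' and 'kafka:9092' are placeholder values.
CREATE SINK aap07_block_alerts_sink AS
SELECT
    agent_id,
    agent_type,
    risk_score,
    action
FROM aap07_velocity_risk_score_mv
WHERE action = 'BLOCK'
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'kafka:9092',
    topic = 'agent-velocity-blocks',
    primary_key = 'agent_id'
) FORMAT UPSERT ENCODE JSON;
```

With an upsert sink keyed on agent_id, a consumer sees the latest state per agent, including a delete when the agent's score later drops back below the block line.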

Frequently Asked Questions

Why don't human payment velocity rules work for AI agents?

Human velocity rules assume a person clicking buy and flag any account that exceeds a fixed transactions-per-minute threshold. AI agents transact at machine speed by design. A shopping assistant comparing 12 SKUs in 30 seconds, or a finance optimizer rebalancing across 8 accounts, will trip a human-tuned rule even when behaving normally. The result is high false positive rates that block legitimate agent traffic.

What is a per-agent velocity baseline?

A per-agent velocity baseline is the agent's own historical transactions-per-minute distribution under normal conditions. Instead of comparing every agent to a single global threshold, the rule engine compares each agent to its own learned cadence. An agent that normally fires 2 transactions per minute looks suspicious at 20, while an agent that normally fires 30 still looks normal at 35.

What is peer comparison in agent velocity detection?

Peer comparison groups agents by capability or vendor (shopping assistant, travel booker, finance optimizer) and computes velocity statistics across the cohort. An agent is flagged when its rate deviates significantly from peers performing the same job. This catches hijacked agents whose own baseline is also corrupted, since their rate now diverges from peers running the same workflow.

How does RisingWave compute agent velocity in real time?

RisingWave maintains incremental materialized views over the agent transaction stream. Each new event updates per-agent counters, peer baselines, and cadence statistics in place, with no batch recomputation. Because RisingWave is PostgreSQL-compatible, fraud decisioning services query the velocity risk score directly over the standard Postgres protocol without an intermediate cache or sink.

Start Building Agent-Aware Velocity Rules

Human-tuned velocity rules were built for a world where every authorization came from a person tapping a screen. AI agents transact at machine speed by design, and any rule that confuses speed with malice will block legitimate agent traffic and erode user trust. The fix is a velocity model that compares each agent to its own baseline, to its peer cohort, and to natural cadence noise, then combines those signals into a single real-time risk score.

RisingWave makes this practical because every layer of the model is a streaming SQL view. There is no Java to rebuild, no Kafka topic to materialize separately, and no cache to invalidate when a threshold moves. For a deeper look at the underlying window primitives, see transaction velocity fraud detection with SQL windows, and the RisingWave streaming SQL documentation for the full reference.
