AI Agents Are Handling Payments. Here's What Breaks.

The scenario

You've deployed a procurement agent. It monitors your SaaS subscriptions, finds unused seats, processes routine invoices under $5,000 automatically, and escalates larger ones for human review. In its first week it processed 143 transactions. Your finance team found out on Friday when they ran their weekly report.

That gap — between when the agent acts and when humans see it — is a design failure. But it is not the agent's fault. It is your data infrastructure's fault.

This piece is about what actually breaks in payment systems when AI agents start handling money, and what you need to build to fix it.


Agents operate at machine speed. Payment infrastructure was designed for human speed.

A human making a purchase takes seconds to minutes. They read a product page, decide, click buy. The whole flow happens at human cognitive speed.

An autonomous agent operates in milliseconds. It can evaluate 50 products, select the best option, and confirm the purchase in the time it takes a human to read the first sentence of a product description. If you have given it a recurring task such as "reorder office supplies when stock drops below threshold", it might trigger 30 transactions while you are in a meeting.

This speed difference exposes assumptions baked into traditional payment infrastructure:

  • Fraud detection built on velocity thresholds flags legitimate agent behavior
  • Budget systems designed for monthly human reviews cannot track real-time agent spending
  • Audit trails designed for human-readable histories become inadequate when you need to know why the agent made a specific decision
  • Visibility dashboards built for end-of-day batch reports leave human overseers flying blind for hours

None of these were design flaws when payment systems were built. They become critical failures when agents enter the picture.


The two-layer architecture you actually need

Before going further, let's be precise about what "data infrastructure" means for payments, because conflating two distinct layers leads to bad architectural decisions.

Layer 1: The write layer. Handles the actual transaction. Deducting funds, updating ledgers, ensuring a $100 balance cannot be double-spent. This layer requires ACID guarantees and transactional consistency. PostgreSQL, traditional payment processors, and purpose-built ledger systems handle this well. Twenty years of engineering wisdom applies here.

Layer 2: The read layer. Handles observation, analysis, and monitoring. Computing how much an agent has spent today. Detecting anomalous transaction patterns. Alerting when spending approaches a limit. Building the audit trail of what the agent saw when it made its decision. This layer needs to be real-time, but it does not need to be the system that prevents a transaction.

Most payment teams design Layer 1 carefully. They use battle-tested patterns: idempotency keys, reservation patterns to prevent oversell, distributed locking. Layer 1 for agentic payments works fine — your existing transactional database handles it.

Layer 2 is where the problems are.

For human transactions, Layer 2 can be a data warehouse with hourly syncs. Nobody needs real-time fraud signals for a customer who makes three purchases a day. For agent transactions, Layer 2 needs to process every event in real time, maintain continuously updated aggregates, and surface signals in under a second. An agent that is going to exhaust its $10,000 monthly budget in 20 minutes needs to be caught in those 20 minutes, not in the next morning's report.

This is the gap. Here is what it costs you.


What breaks when Layer 2 is not real-time

1. Your fraud system treats your agent like a criminal

Traditional fraud detection is built around behavioral baselines for humans. High velocity is suspicious: ten transactions in five minutes flags a compromised card. Unusual merchant category sequences are suspicious: gas station → jewelry store → electronics store suggests card testing.

Your purchasing agent will trigger every one of these signals on a routine Tuesday.

The naive fix — whitelisting agent traffic entirely — is worse. Agents get compromised too. An agent with stolen credentials should be caught by fraud detection.

What you actually need is a separate behavioral baseline per agent. "This agent normally processes 10 to 30 invoices per hour from these merchant categories. This pattern is normal." Deviations from the agent's own established pattern are suspicious; the absolute velocity is not.

Building this requires computing per-agent behavioral baselines continuously and detecting deviations as they happen. The alternative — batch-computed baselines updated overnight — means your fraud system is always working with yesterday's model of how your agent behaves.

-- Per-agent transaction velocity, aggregated over 15-minute tumbling windows
CREATE MATERIALIZED VIEW agent_velocity AS
SELECT
  agent_id,
  window_start,
  window_end,
  COUNT(*)                              AS tx_count,
  SUM(amount)                           AS total_amount,
  array_agg(DISTINCT merchant_category) AS categories
FROM TUMBLE(transactions, event_time, INTERVAL '15' MINUTE)
GROUP BY agent_id, window_start, window_end;

-- Flag windows where an agent exceeds 3x its historical average.
-- agent_baselines (agent_id, avg_tx_count) is assumed to be maintained separately.
CREATE MATERIALIZED VIEW agent_anomalies AS
SELECT
  v.agent_id,
  v.window_start,
  v.tx_count,
  b.avg_tx_count,
  -- Cast so the ratio is not truncated by integer division
  CAST(v.tx_count AS DOUBLE PRECISION) / NULLIF(b.avg_tx_count, 0) AS velocity_ratio
FROM agent_velocity v
JOIN agent_baselines b ON v.agent_id = b.agent_id
WHERE v.tx_count > b.avg_tx_count * 3;
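
The anomaly view joins against agent_baselines, which this article does not define. As a rough illustration of the idea, here is a minimal Python sketch of a rolling per-agent baseline. The class name, the 96-window history, the minimum of 4 windows, and the choice to keep anomalous windows out of the history are all assumptions made for the example, not part of any particular product.

```python
from collections import defaultdict, deque
from statistics import mean


class AgentBaseline:
    """Rolling per-agent baseline of transactions per window (hypothetical helper)."""

    def __init__(self, history_windows=96, ratio_threshold=3.0, min_history=4):
        self.ratio_threshold = ratio_threshold
        self.min_history = min_history
        # One bounded history of window counts per agent.
        self._history = defaultdict(lambda: deque(maxlen=history_windows))

    def observe(self, agent_id, tx_count):
        """Score one closed window against the agent's own history, then record it.

        Returns (velocity_ratio, is_anomalous). The ratio is None until the
        agent has at least min_history windows to form a baseline.
        """
        history = self._history[agent_id]
        ratio, anomalous = None, False
        if len(history) >= self.min_history:
            baseline = mean(history)
            if baseline > 0:
                ratio = tx_count / baseline
                anomalous = ratio > self.ratio_threshold
        # Assumption: anomalous windows are left out of the history so that
        # a burst cannot raise its own baseline.
        if not anomalous:
            history.append(tx_count)
        return ratio, anomalous
```

An agent that averages 10 transactions per window and then produces 45 gets a ratio of 4.5 and is flagged, while a different agent that routinely runs at 500 per window is judged only against its own history.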

2. Budget exhaustion is not caught until it is too late

You give an agent a $5,000 monthly budget for software procurement. A SaaS vendor is running a 4-hour deal — 40% off annual subscriptions. The agent evaluates 23 tools in your stack that qualify, calculates the ROI, and starts processing annual renewals.

If your Layer 2 is a batch-updated warehouse, the agent does not have an accurate view of its remaining budget. It commits $6,200 before anything alerts. By the time your finance team sees the morning report, the transactions are done.

Note carefully: the right place to enforce a budget limit is still Layer 1 — a transactional check that prevents the deduction when the balance would go negative. What Layer 2 provides is the real-time aggregate that Layer 1 can query, and the early-warning monitoring that catches spending acceleration before the limit is reached.
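
To make the Layer 1 check concrete, here is a minimal sketch of a transactional budget check combined with an idempotency key. It uses Python's built-in sqlite3 with an in-memory database purely so the example is self-contained; the table layout, the function names, and the use of integer cents are assumptions, and a production ledger would run the same pattern on its own transactional database.

```python
import sqlite3


def open_ledger():
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are opened and closed explicitly below.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE agent_budgets (
            agent_id     TEXT PRIMARY KEY,
            budget_limit INTEGER NOT NULL,            -- cents
            spent        INTEGER NOT NULL DEFAULT 0   -- cents
        );
        CREATE TABLE payments (
            idempotency_key TEXT PRIMARY KEY,
            agent_id        TEXT NOT NULL,
            amount          INTEGER NOT NULL          -- cents
        );
    """)
    return conn


def try_spend(conn, agent_id, amount_cents, idempotency_key):
    """Return 'ok', 'duplicate', or 'rejected' (over budget or unknown agent)."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
    try:
        seen = conn.execute(
            "SELECT 1 FROM payments WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        if seen:
            conn.execute("ROLLBACK")
            return "duplicate"
        # The budget check and the deduction are one atomic statement:
        # the row is updated only if the new total stays within the limit.
        cur = conn.execute(
            "UPDATE agent_budgets SET spent = spent + ? "
            "WHERE agent_id = ? AND spent + ? <= budget_limit",
            (amount_cents, agent_id, amount_cents),
        )
        if cur.rowcount != 1:
            conn.execute("ROLLBACK")
            return "rejected"
        conn.execute(
            "INSERT INTO payments (idempotency_key, agent_id, amount) VALUES (?, ?, ?)",
            (idempotency_key, agent_id, amount_cents),
        )
        conn.execute("COMMIT")
        return "ok"
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
```

The point of the sketch is that the limit is enforced inside the transaction that moves the money, so a stale read-layer view can delay an alert but cannot cause an overspend.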

-- Real-time budget consumption per agent
CREATE MATERIALIZED VIEW agent_budget_consumption AS
SELECT
  t.agent_id,
  b.budget_id,
  SUM(t.amount)                                 AS spent,
  MAX(b.budget_limit)                           AS budget_limit,
  MAX(b.budget_limit) - SUM(t.amount)           AS remaining,
  SUM(t.amount) / NULLIF(MAX(b.budget_limit), 0) AS utilization_pct  -- fraction: 0.8 = 80%
FROM transactions t
JOIN agent_budgets b ON t.agent_id = b.agent_id
WHERE t.created_at >= DATE_TRUNC('month', NOW())
GROUP BY t.agent_id, b.budget_id;

-- Alert when an agent reaches 80% of its budget
CREATE MATERIALIZED VIEW budget_alerts AS
SELECT agent_id, budget_id, spent, budget_limit, utilization_pct
FROM agent_budget_consumption
WHERE utilization_pct > 0.8;

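The view updates with every transaction. With a sink to a notification service, the human overseer gets an alert as soon as utilization passes 80%, and further thresholds such as 90% and 95% can be added the same way, so the warning arrives before 100% is exceeded rather than after.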
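
On the receiving side, a notification service usually wants one message per threshold crossed rather than one per transaction. The following is a small sketch of that deduplication in Python. The class name, the in-memory state, and the thresholds tuple are assumptions for illustration, and resetting the state at the start of each month is left out.

```python
THRESHOLDS = (0.80, 0.90, 0.95)


class BudgetAlerter:
    """Emit each utilization threshold once per (agent, budget) (hypothetical helper)."""

    def __init__(self, thresholds=THRESHOLDS):
        self.thresholds = tuple(sorted(thresholds))
        # (agent_id, budget_id) -> highest threshold already alerted on
        self._fired = {}

    def on_update(self, agent_id, budget_id, spent, budget_limit):
        """Return the thresholds newly exceeded by this update, lowest first."""
        if budget_limit <= 0:
            return []
        utilization = spent / budget_limit
        key = (agent_id, budget_id)
        last = self._fired.get(key, 0.0)
        # Strictly greater than, matching the "> 0.8" filter in budget_alerts.
        crossed = [t for t in self.thresholds if last < t and utilization > t]
        if crossed:
            self._fired[key] = crossed[-1]
        return crossed
```

A single large purchase can jump past several thresholds at once, which is why the method returns a list rather than a single value.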

3. You cannot audit why the agent bought what it bought

This one does not break immediately. It breaks six months later when your CFO asks why the company paid $4,200 for a tool nobody uses, or when a compliance audit asks for documentation of procurement decisions.

Traditional payment audit trails record what happened: amount, merchant, timestamp. For an agent, this is insufficient. You need to know why it happened: what prices it saw, what options it evaluated, what budget balance it believed it had when it decided.

The decision context is ephemeral. If you do not capture it at decision time, it is gone. By the time you investigate six months later, the prices the agent saw, the inventory it checked, and the budget balance it queried have all changed.

Building an audit trail with decision context means capturing, alongside every transaction, the state of the data the agent queried immediately before acting. The cost is low — a JSON blob per transaction. But it requires treating decision context as a first-class artifact, not an afterthought.

CREATE TABLE agent_decisions (
  decision_id             TEXT PRIMARY KEY,
  agent_id                TEXT,
  action                  TEXT,
  amount                  DECIMAL,
  vendor                  TEXT,
  -- Context the agent saw at decision time
  budget_remaining        DECIMAL,
  price_quoted            DECIMAL,
  competing_prices        JSONB,
  decision_rationale      TEXT,
  created_at              TIMESTAMPTZ DEFAULT NOW()
);
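
On the agent side, capturing this context can be as simple as building an immutable record at decision time and writing it alongside the transaction. Below is a minimal Python sketch whose fields mirror the agent_decisions columns. The class, the to_row helper, and the choice to serialize decimals as strings are assumptions for illustration, and the actual database insert is omitted.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class AgentDecision:
    """Snapshot of what the agent saw when it decided (hypothetical helper)."""

    agent_id: str
    action: str
    amount: Decimal
    vendor: str
    budget_remaining: Decimal
    price_quoted: Decimal
    competing_prices: dict  # vendor name -> Decimal price
    decision_rationale: str
    decision_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_row(self):
        """Return a dict keyed by the agent_decisions column names."""
        row = asdict(self)
        # Decimals are written as strings so no precision is lost in transit.
        for key in ("amount", "budget_remaining", "price_quoted"):
            row[key] = str(row[key])
        row["competing_prices"] = json.dumps(
            {name: str(price) for name, price in self.competing_prices.items()},
            sort_keys=True,
        )
        return row
```

The record is created before the payment call, from the values the agent actually read, so it still reflects the decision-time state even if prices or balances change a moment later.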

The reference architecture

Putting this together, an agentic payment system that works at agent speed looks like this:

[Figure: Agentic payment system architecture, with RisingWave as the real-time observation layer]

The agent queries RisingWave before each decision to get fresh budget state and anomaly signals. The actual transaction goes through the transactional layer. The transaction event flows back into RisingWave via CDC, updating all materialized views within milliseconds.

RisingWave is not the system that prevents the agent from overspending — your transactional database does that with proper locking. RisingWave is the system that gives you real-time visibility, powers fraud detection, and enables the agent itself to act on fresh state.


What most teams do instead

Most teams building agentic payment systems inherit their data infrastructure from their existing analytics setup:

  • Snowflake or BigQuery: hours of lag. Budget alerts arrive in the morning report.
  • Redis counters: work for simple counts, but break down for complex aggregations, offer no SQL interface, and raise consistency issues at scale.
  • Direct PostgreSQL queries: work for point lookups, but do not scale for aggregations over millions of transactions.

None of these are wrong for the use cases they were built for. They are wrong for agentic payments because they assume a human-speed workflow where latency in the observation layer is acceptable.


The gap is in observation, not in transactions

Payment engineering has invested heavily in making the write path reliable. Idempotency keys, distributed locks, saga patterns — this is well-understood territory.

The observation layer — fraud detection, budget monitoring, real-time audit — was designed with humans in mind. Batch processing was fine because humans could not drain a budget in 20 minutes.

Agents can. As more organizations deploy agents that handle procurement, expense management, subscription billing, and autonomous purchasing, the observation layer has to catch up.

A streaming database does not replace your payment stack. It closes the gap between what your agents are doing and what your team can see — in real time, at agent speed.
