How Prediction Markets Work: The Real-Time Data Architecture Behind the Odds

Prediction markets let users trade contracts on the outcome of future events — elections, sports, economic indicators, AI milestones — where contract prices reflect the crowd's implied probability of each outcome. A $0.70 YES contract means the market estimates a 70% probability. Behind this simple interface lies a complex real-time data architecture: order matching engines processing thousands of trades per second, pricing algorithms continuously recalculating odds, oracle systems for event resolution, and settlement engines that instantly distribute payouts across hundreds of thousands of positions.

This article breaks down the technical architecture of modern prediction markets and shows how stream processing makes it all work in real time.

The Prediction Market Data Flow

Every prediction market follows this core data flow:

User places order → Order matching → Price/odds update → Position tracking → Event resolution → Settlement
      │                    │                 │                    │                  │               │
   REST API          CLOB or AMM      Real-time MV        Streaming join      Oracle feed     Fan-out payout

Each step requires real-time data processing. Let's examine each component.

Order Matching: CLOB vs AMM

Modern prediction markets use one of two models to match buyers and sellers:

Central Limit Order Book (CLOB)

Polymarket and Kalshi use a CLOB model — the same order book structure used by stock exchanges.

How it works:

  • Buyers submit limit orders: "I'll buy YES at $0.65"
  • Sellers submit limit orders: "I'll sell YES at $0.67"
  • The matching engine continuously matches compatible orders
  • The best bid and best ask form the "spread" — tighter spreads mean more liquid markets
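
The matching steps above can be sketched in a few lines of Python. This is a toy illustration of price-time priority, not the matching engine of any real exchange; the `OrderBook` class and its method names are hypothetical, and it handles a single YES contract with no cancellations or fees.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy price-time priority order book for one YES contract."""

    def __init__(self):
        self._bids = []      # max-heap via negated price: (-price, seq, qty)
        self._asks = []      # min-heap: (price, seq, qty)
        self._seq = count()  # arrival counter, breaks ties at the same price

    def submit(self, side, price, qty):
        """Match an incoming limit order; rest any unfilled remainder.

        Returns a list of (fill_price, fill_qty) tuples.
        """
        fills = []
        if side == "buy":
            book, opposite = self._bids, self._asks
        else:
            book, opposite = self._asks, self._bids

        while qty > 0 and opposite:
            key, seq, resting_qty = opposite[0]
            resting_price = key if side == "buy" else -key
            if side == "buy":
                crosses = price >= resting_price
            else:
                crosses = price <= resting_price
            if not crosses:
                break
            traded = min(qty, resting_qty)
            # Trades execute at the resting order's price.
            fills.append((resting_price, traded))
            qty -= traded
            if traded == resting_qty:
                heapq.heappop(opposite)
            else:
                # Same key and seq, so the partially filled order keeps its place.
                heapq.heapreplace(opposite, (key, seq, resting_qty - traded))

        if qty > 0:
            key = -price if side == "buy" else price
            heapq.heappush(book, (key, next(self._seq), qty))
        return fills

    def best_bid(self):
        return -self._bids[0][0] if self._bids else None

    def best_ask(self):
        return self._asks[0][0] if self._asks else None
```

With a resting bid at $0.65 and a resting ask at $0.67, nothing trades until a new order crosses the spread, for example a buy at $0.67 or a sell at $0.65.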

Polymarket's architecture:

  • Hybrid-decentralized CLOB running on Polygon
  • Orders are signed EIP-712 messages submitted off-chain to the matching operator
  • Settlement occurs on-chain via the Conditional Token Framework (CTF), using ERC-1155 tokens
  • Each market has two token types: YES and NO
  • The matching operator batches up to 15 orders per on-chain call for efficiency

Kalshi's architecture:

  • Fully centralized, CFTC-regulated Designated Contract Market
  • Traditional financial exchange matching engine
  • Regulated like a commodity futures exchange

Automated Market Maker (AMM)

Some prediction markets use algorithmic pricing instead of an order book:

Logarithmic Market Scoring Rule (LMSR):

  • Originated by economist Robin Hanson specifically for prediction markets
  • Guarantees continuous liquidity — you can always buy or sell without waiting for a counterparty
  • A liquidity parameter (usually written b) controls price sensitivity: higher values = slower price movement, lower = faster
  • Prices are bounded between 0 and 1 and sum to 1 across outcomes, naturally representing probabilities
  • LMSR is an early automated market maker and a conceptual precursor to the constant function market makers (CFMMs) used by Uniswap and other DeFi protocols
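
To make the pricing rule concrete, here is a minimal sketch of the standard LMSR formulas: the cost function C(q) = b · ln(Σ exp(q_i / b)) and the price of outcome i, exp(q_i / b) / Σ exp(q_j / b). The function names are illustrative, and `quantities` is assumed to hold the number of shares sold so far for each outcome.

```python
import math


def lmsr_cost(quantities, b):
    """LMSR cost function C(q) = b * ln(sum_i exp(q_i / b))."""
    # Factor out the largest exponent so exp() cannot overflow.
    m = max(q / b for q in quantities)
    return b * (m + math.log(sum(math.exp(q / b - m) for q in quantities)))


def lmsr_prices(quantities, b):
    """Instantaneous price of each outcome; the prices sum to 1."""
    m = max(q / b for q in quantities)
    weights = [math.exp(q / b - m) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(quantities, outcome, shares, b):
    """What a trader pays to buy `shares` of `outcome`: C(q_after) - C(q_before)."""
    after = list(quantities)
    after[outcome] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)
```

Starting from an empty two-outcome market, both prices are 0.5. Buying the same 10 YES shares moves the price much further when b = 10 than when b = 100, which is the sensitivity trade-off described above.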

Trade-offs:

Aspect              CLOB                              AMM (LMSR)
Liquidity           Depends on market makers          Always available (algorithmic)
Price discovery     Supply/demand driven              Formula-driven
Spread              Tight when liquid, wide when not  Consistent but wider
Capital efficiency  Higher (no locked liquidity)      Lower (subsidized by operator)
Best for            High-volume markets               Thin, long-tail markets

Real-Time Odds Recalculation

Every trade changes the odds. A prediction market must recalculate and publish updated prices after every order fill — in milliseconds.

What Needs to Be Computed in Real Time

The current market state must update with every trade:

  • Last trade price → implied probability
  • Best bid / best ask → current spread
  • 24-hour volume → market activity signal
  • Open interest → total capital at risk
  • Market depth at each price level → liquidity profile
  • Volume-weighted average price (VWAP) → benchmark for analytics
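
As a rough illustration of a few of these metrics, the sketch below computes them in batch form from an in-memory list of trades. The `market_snapshot` function and its tuple layout are hypothetical; a streaming system would maintain the same values incrementally rather than rescanning the list.

```python
from datetime import datetime, timedelta


def market_snapshot(trades, best_bid, best_ask, now):
    """Compute headline metrics for one market.

    trades: list of (trade_time, price, quantity) tuples, sorted by trade_time.
    best_bid / best_ask: top of the order book, in dollars.
    """
    cutoff = now - timedelta(hours=24)
    recent = [(price, qty) for t, price, qty in trades if t > cutoff]
    volume = sum(qty for _, qty in recent)
    notional = sum(price * qty for price, qty in recent)
    return {
        # Last trade price doubles as the implied probability.
        "implied_probability": trades[-1][1] if trades else None,
        "spread": best_ask - best_bid,
        "volume_24h": volume,
        # VWAP = traded dollars / traded contracts over the window.
        "vwap_24h": notional / volume if volume else None,
    }
```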

Implementing with a Streaming Database

In RisingWave, these computations are materialized views that update automatically:

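-- Real-time market pricing from the trade stream.
-- Assumption: trades.price is the price paid for the traded side's token
-- (as in the settlement view), so NO trades are converted to YES terms.
CREATE MATERIALIZED VIEW market_pricing AS
SELECT
  market_id,
  -- Current price = last trade price, in YES terms
  last_value(CASE WHEN side = 'YES' THEN price ELSE 1.0 - price END
             ORDER BY trade_time) AS current_price,
  -- Implied probability of YES (equals the YES price for a $1.00 contract)
  last_value(CASE WHEN side = 'YES' THEN price ELSE 1.0 - price END
             ORDER BY trade_time) AS implied_probability,
  -- Latest update
  MAX(trade_time) AS last_trade_time
FROM trades
GROUP BY market_id;

-- Rolling 24-hour statistics. In RisingWave streaming queries, NOW() is
-- accepted only in WHERE, HAVING, and ON clauses (temporal filters), so the
-- window is a WHERE filter rather than a per-aggregate FILTER clause.
CREATE MATERIALIZED VIEW market_stats_24h AS
SELECT
  market_id,
  SUM(quantity) AS volume_24h,
  SUM(quantity * price) AS notional_24h,
  COUNT(*) AS trades_24h,
  MIN(CASE WHEN side = 'YES' THEN price ELSE 1.0 - price END) AS low_24h,
  MAX(CASE WHEN side = 'YES' THEN price ELSE 1.0 - price END) AS high_24h
FROM trades
WHERE trade_time > NOW() - INTERVAL '24 hours'
GROUP BY market_id;

Both views update incrementally as trades arrive, typically with sub-second latency: no polling, no batch jobs.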

Position Tracking and Real-Time P&L

Every user has a portfolio of positions across multiple markets. Computing real-time P&L requires joining the user's trade history with current market prices:

-- User position tracking
CREATE MATERIALIZED VIEW user_positions AS
SELECT
  user_id,
  market_id,
  side,
  SUM(quantity) as position_size,
  SUM(quantity * price) / SUM(quantity) as avg_entry_price,
  SUM(quantity * price) as total_cost
FROM trades
GROUP BY user_id, market_id, side;

-- Real-time P&L (join positions with live prices).
-- Assumes avg_entry_price is the price paid per token on the position's side
-- and m.current_price is the YES price, so a NO token is worth 1 - current_price.
CREATE MATERIALIZED VIEW user_pnl AS
SELECT
  p.user_id,
  p.market_id,
  p.side,
  p.position_size,
  p.avg_entry_price,
  m.current_price,
  CASE
    WHEN p.side = 'YES'
    THEN p.position_size * (m.current_price - p.avg_entry_price)
    ELSE p.position_size * ((1.0 - m.current_price) - p.avg_entry_price)
  END AS unrealized_pnl
FROM user_positions p
JOIN market_pricing m ON p.market_id = m.market_id;
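
A small worked example of the mark-to-market arithmetic may help. It assumes, as a modeling choice rather than anything a specific platform mandates, that entry prices are recorded per token on the position's own side and that the live price is quoted for YES, so a NO token is worth 1 minus the YES price. The function name is hypothetical.

```python
def unrealized_pnl(side, size, avg_entry, yes_price):
    """Mark-to-market P&L for a binary position.

    avg_entry: average price paid per token on `side`.
    yes_price: current YES price; a NO token is valued at 1 - yes_price.
    """
    mark = yes_price if side == "YES" else 1.0 - yes_price
    return size * (mark - avg_entry)


# YES bought at $0.60, market now at $0.70: each contract gained $0.10.
yes_pnl = unrealized_pnl("YES", 100, 0.60, 0.70)

# NO bought at $0.40 (when YES was $0.60); NO is now worth $0.30: each lost $0.10.
no_pnl = unrealized_pnl("NO", 100, 0.40, 0.70)
```

The two positions are mirror images: whatever the YES holder gains on a price move, the NO holder on the other side of the trade loses.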

Event Resolution and Settlement

The most critical real-time operation is settlement: when an event resolves (the election is called, the game ends), the platform must instantly compute and distribute payouts to every winning position.

The Fan-Out Problem

A popular market like "Will the 2024 US election go to the Republican candidate?" had hundreds of thousands of positions. When the event resolves:

  1. Oracle submits resolution (YES or NO)
  2. Settlement engine must compute payout for every open position
  3. User balances must update atomically
  4. All of this should happen in seconds, not hours

Streaming Settlement with SQL

-- Settlement engine: join trades with oracle resolution
CREATE MATERIALIZED VIEW settlements AS
SELECT
  t.user_id,
  t.market_id,
  t.side,
  t.quantity,
  t.price as entry_price,
  o.outcome,
  CASE
    WHEN t.side = o.outcome THEN t.quantity * (1.0 - t.price)  -- Winner payout
    ELSE -t.quantity * t.price                                   -- Loser loss
  END as settlement_amount
FROM trades t
JOIN oracle_resolutions o ON t.market_id = o.market_id;

-- Aggregate into user balances
CREATE MATERIALIZED VIEW user_balances AS
SELECT
  user_id,
  SUM(settlement_amount) as total_balance
FROM (
  -- Cash deposits
  SELECT user_id, amount as settlement_amount FROM deposits
  UNION ALL
  -- Settled P&L
  SELECT user_id, settlement_amount FROM settlements
) combined
GROUP BY user_id;

When the oracle publishes a resolution, the settlements view instantly computes payouts for all positions via the streaming join. User balances update in real time.
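
The payout rule in the settlements view can be checked with a small batch equivalent. This is only a sketch of the arithmetic, with a hypothetical `settle` function and made-up positions; it ignores fees and assumes every position was bought at the recorded price.

```python
from collections import defaultdict


def settle(positions, outcome):
    """Net settlement P&L per user.

    positions: iterable of (user_id, side, quantity, entry_price) tuples.
    outcome: "YES" or "NO", as published by the oracle.
    """
    pnl = defaultdict(float)
    for user_id, side, qty, price in positions:
        if side == outcome:
            pnl[user_id] += qty * (1.0 - price)  # winner: $1.00 payout minus cost
        else:
            pnl[user_id] -= qty * price          # loser: forfeits the purchase price
    return dict(pnl)


# Alice bought 100 YES at $0.70; Bob took the other side with 100 NO at $0.30.
result = settle([("alice", "YES", 100, 0.70), ("bob", "NO", 100, 0.30)], "YES")
```

Because a YES price of p pairs with a NO price of 1 - p, the winner's gain and the loser's loss cancel out, which makes a useful sanity check on any settlement run.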

Cross-Market Arbitrage Detection

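Arbitrage opportunities arise when different platforms price the same event differently. For example, if Polymarket prices "Event X" at $0.70 YES and Kalshi prices it at $0.65 YES, a trader can buy YES on Kalshi and sell YES (or buy NO) on Polymarket to lock in roughly $0.05 per contract. The profit is risk-free only on paper: fees, slippage, and differences in how each platform resolves the event can erase it.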

-- Detect cross-platform arbitrage opportunities
CREATE MATERIALIZED VIEW arbitrage_opportunities AS
SELECT
  p.event_name,
  p.platform as platform_a,
  p.current_price as price_a,
  k.platform as platform_b,
  k.current_price as price_b,
  ABS(p.current_price - k.current_price) as spread,
  CASE
    WHEN p.current_price > k.current_price THEN 'Buy on B, Sell on A'
    ELSE 'Buy on A, Sell on B'
  END as action
FROM polymarket_prices p
JOIN kalshi_prices k ON p.event_id = k.event_id
WHERE ABS(p.current_price - k.current_price) > 0.03;  -- minimum spread of $0.03 (3 percentage points)
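
Whether a flagged spread is worth trading depends on costs. The sketch below estimates the locked-in profit per contract from buying YES on the cheaper venue and NO on the dearer one, so that exactly one leg pays $1.00 whatever the outcome. It assumes NO can be bought at 1 minus the YES price and uses a flat, hypothetical per-contract fee; real fee schedules and order book depth will differ.

```python
def locked_in_profit(yes_price_cheap, yes_price_dear, fee_per_contract=0.0):
    """Profit per contract pair from a cross-venue hedge.

    Buy YES at yes_price_cheap on one venue and NO at (1 - yes_price_dear)
    on the other. One of the two tokens pays $1.00 at resolution.
    """
    cost = yes_price_cheap + (1.0 - yes_price_dear) + 2 * fee_per_contract
    return 1.0 - cost


gross = locked_in_profit(0.65, 0.70)                        # about $0.05
net = locked_in_profit(0.65, 0.70, fee_per_contract=0.01)   # about $0.03
```

With a fee of $0.03 per leg the same spread becomes a loss, which is why a minimum-spread threshold in the detection view should sit above the expected round-trip cost.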

Risk Management

Real-time risk management prevents catastrophic losses:

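-- Position exposure monitoring.
-- Positions are marked at the current token value: YES at current_price,
-- NO at 1 - current_price (current_price is assumed to be the YES price).
CREATE MATERIALIZED VIEW risk_dashboard AS
SELECT
  user_id,
  COUNT(DISTINCT market_id) AS active_markets,
  SUM(position_size *
      CASE WHEN side = 'YES' THEN current_price ELSE 1.0 - current_price END) AS total_exposure,
  MAX(position_size *
      CASE WHEN side = 'YES' THEN current_price ELSE 1.0 - current_price END) AS largest_position,
  SUM(ABS(unrealized_pnl)) AS total_unrealized_risk
FROM user_pnl
GROUP BY user_id;

-- Alert on excessive concentration
CREATE MATERIALIZED VIEW risk_alerts AS
SELECT user_id, total_exposure, largest_position
FROM risk_dashboard
WHERE total_exposure > 100000                    -- $100K exposure limit
   OR largest_position > 0.5 * total_exposure;   -- >50% in a single position (no division, so zero exposure is safe)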
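
The same two checks are easy to prototype outside the database when tuning thresholds. The helper below is a hypothetical sketch: `position_values` maps a market id to the current dollar value of the user's position, and the limits mirror the ones in the SQL above.

```python
def risk_flags(position_values, exposure_limit=100_000, max_share=0.5):
    """Flag users who exceed an exposure limit or are too concentrated.

    position_values: dict of market_id -> current dollar value of the position.
    """
    total = sum(position_values.values())
    largest = max(position_values.values(), default=0.0)
    return {
        "total_exposure": total,
        "over_limit": total > exposure_limit,
        # Compare by multiplication so an empty portfolio never divides by zero.
        "concentrated": largest > max_share * total,
    }
```

Note that a user holding a single position is always 100% concentrated, so a production rule would probably combine the concentration test with a minimum exposure.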

Why Streaming Databases Fit Prediction Markets

Traditional prediction market architectures use a fragmented stack:

Trade Events → Kafka → Flink (processing) → Redis (cache) → PostgreSQL (storage)

A streaming database like RisingWave collapses this into:

Trade Events → RisingWave (processing + serving + storage)

Advantages:

  • Single system for ingestion, processing, and serving
  • SQL-only: no Java, no complex stream processing code
  • Sub-second updates: materialized views are maintained incrementally as events arrive
  • PostgreSQL protocol: dashboards, APIs, and admin tools connect directly
  • State on S3: elastic scaling and fast recovery, with no dependence on local disks for durability
  • Exactly-once: no double-counting of trades or settlements

The Market Opportunity

Prediction markets are among the fastest-growing segments in financial technology:

  • 2024 US Election: Polymarket alone processed $3.7 billion in trading volume
  • 2025 Annual volume: $44 billion globally across all platforms
  • 2026 Weekly volume: ~$6 billion on Kalshi + Polymarket combined
  • Growth trajectory: Projections suggest $1.3 trillion in annual volume by end of 2026
  • Sports dominance: 80%+ of 2025-2026 trading volume is sports-related
  • Regulatory tailwind: Both Polymarket and Kalshi now have CFTC approval as Designated Contract Markets

This growth creates massive demand for real-time data infrastructure that can handle increasing trade volumes, market counts, and settlement complexity.

Frequently Asked Questions

How do prediction market odds work?

Prediction market prices represent implied probabilities. A YES contract trading at $0.70 means the market collectively estimates a 70% probability of the event occurring. If you buy YES at $0.70 and the event happens, you receive $1.00 — a profit of $0.30. If it doesn't happen, you lose your $0.70.

What is the difference between CLOB and AMM in prediction markets?

A Central Limit Order Book (CLOB) matches buyers and sellers based on price-time priority, like a stock exchange. An Automated Market Maker (AMM) uses a mathematical formula (like LMSR) to set prices algorithmically. CLOBs offer tighter spreads in liquid markets; AMMs guarantee continuous liquidity even in thin markets.

Why do prediction markets need stream processing?

Prediction markets require real-time updates for odds, user positions, P&L, risk management, and settlement. Batch processing with hourly or daily refreshes is fundamentally incompatible with live trading. A streaming database like RisingWave maintains all these computations as continuously updated materialized views, ensuring every query returns the latest state.

How does settlement work in prediction markets?

When an event resolves, an oracle (UMA Optimistic Oracle for Polymarket, internal for Kalshi) publishes the outcome. The settlement engine computes payouts for every open position: each winning contract pays out $1.00, for a net profit of $1.00 minus the purchase price, and each losing contract expires worthless, forfeiting the purchase price. In a streaming database, settlement is a continuous join between the trades stream and the oracle feed, so payouts are computed as soon as the resolution arrives.

How big is the prediction market industry?

Prediction markets are growing explosively. Polymarket processed $3.7 billion in the 2024 US election alone. Global annual volume reached $44 billion in 2025. Weekly volume in early 2026 is approximately $6 billion. Industry projections suggest annual volume could reach $1.3 trillion by end of 2026, driven by sports betting, regulatory approval (CFTC), and institutional adoption.
