Cross-Exchange Arbitrage Detection with Real-Time SQL

Cross-exchange arbitrage detection identifies moments when the same asset trades at meaningfully different prices across Binance, Coinbase, Kraken, and other venues — giving traders a window to buy low on one exchange and sell high on another before the gap closes. With RisingWave, a PostgreSQL-compatible streaming database, you can detect these price divergences in real time using SQL materialized views that continuously track the cross-exchange delta for hundreds of trading pairs simultaneously.

Why Cross-Exchange Arbitrage Detection Matters

Price discovery in crypto is fragmented. Unlike equity markets with consolidated tape, each crypto exchange operates its own order book. Prices for BTC/USDT on Binance and Coinbase can diverge by 0.1% to 0.5% or more during volatile periods — a gap that represents profit for whoever closes it first.

The arbitrage opportunity is real but time-sensitive:

  • Price divergence windows typically last seconds to tens of seconds before market forces close them
  • Execution latency matters — a system that detects the gap 30 seconds late sees a gap that no longer exists
  • Fee-adjusted profitability requires knowing whether the spread exceeds trading fees plus transfer costs
  • Concurrent monitoring across hundreds of pairs is impossible with manual polling
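
To make the fee point concrete, here is a worked example as a standalone query — the prices and the 0.1%-per-leg taker fee are illustrative assumptions, not real quotes:

```sql
-- Worked example: gross spread vs. net spread after fees.
-- Buying at 50,000 on exchange A and selling at 50,175 on exchange B
-- yields a 0.35% gross spread; two assumed 0.1% taker fees leave 0.15% net.
SELECT
    (50175 - 50000) / 50000.0 * 100          AS gross_spread_pct,  -- 0.35
    (50175 - 50000) / 50000.0 * 100 - 0.20   AS net_spread_pct;    -- 0.15
```

A spread that looks attractive gross can easily be unprofitable net, which is why the fee adjustment is built into the views below.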

Batch monitoring is fundamentally incompatible with arbitrage. A pipeline that runs every minute will report thousands of closed opportunities and miss the ones that are currently open. Real-time streaming is the only viable architecture.

How Streaming SQL Solves This

RisingWave ingests price data from all exchanges through a unified Kafka topic, then maintains materialized views that compute the current cross-exchange delta for each trading pair. A self-join on the latest-price view aligns quotes from different exchanges in the same row, making the comparison arithmetic trivial. Alert views then filter for gaps that exceed the minimum profitable threshold.

Building It Step by Step

Step 1: Connect the Data Source

-- Unified price feed from multiple exchanges
CREATE SOURCE exchange_prices (
    exchange        VARCHAR,
    trading_pair    VARCHAR,
    asset           VARCHAR,
    best_bid        NUMERIC,
    best_ask        NUMERIC,
    last_trade      NUMERIC,
    volume_24h      NUMERIC,
    event_time      TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'crypto.prices.cross-exchange',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

Step 2: Build the Core Materialized View

-- Latest price per exchange per trading pair
CREATE MATERIALIZED VIEW latest_exchange_prices AS
SELECT DISTINCT ON (exchange, trading_pair)
    exchange,
    trading_pair,
    best_bid,
    best_ask,
    last_trade,
    (best_bid + best_ask) / 2   AS mid_price,
    volume_24h,
    event_time
FROM exchange_prices
ORDER BY exchange, trading_pair, event_time DESC;
-- Cross-exchange price comparison: all exchange pairs for each trading pair
CREATE MATERIALIZED VIEW cross_exchange_delta AS
SELECT
    a.trading_pair,
    a.exchange                          AS exchange_low,
    b.exchange                          AS exchange_high,
    a.best_ask                          AS ask_low,       -- buy here
    b.best_bid                          AS bid_high,      -- sell here
    b.best_bid - a.best_ask             AS gross_spread,
    (b.best_bid - a.best_ask)
        / NULLIF(a.best_ask, 0) * 100   AS spread_pct,
    a.mid_price                         AS mid_low,
    b.mid_price                         AS mid_high,
    ABS(a.mid_price - b.mid_price)
        / NULLIF(LEAST(a.mid_price, b.mid_price), 0) * 100
                                        AS mid_divergence_pct,
    a.volume_24h                        AS volume_low,
    b.volume_24h                        AS volume_high,
    GREATEST(a.event_time, b.event_time) AS last_updated
FROM latest_exchange_prices a
JOIN latest_exchange_prices b
    ON  a.trading_pair = b.trading_pair
    AND a.exchange     < b.exchange          -- avoid duplicate pairs
    AND a.best_ask     < b.best_bid;         -- only positive spread
-- Fee-adjusted arbitrage view (assuming a 0.1% taker fee on each leg, 0.2% round trip)
CREATE MATERIALIZED VIEW arb_opportunities AS
SELECT
    trading_pair,
    exchange_low,
    exchange_high,
    ask_low,
    bid_high,
    ROUND(gross_spread, 6)  AS gross_spread,
    ROUND(spread_pct, 4)    AS spread_pct,
    -- 0.2% total fee (0.1% buy + 0.1% sell)
    ROUND(spread_pct - 0.20, 4) AS net_spread_pct,
    ROUND(mid_divergence_pct, 4) AS mid_divergence_pct,
    ROUND(LEAST(volume_low, volume_high), 0) AS liquidity_constraint_usd,  -- 24h volume as a rough liquidity proxy
    last_updated
FROM cross_exchange_delta
WHERE spread_pct > 0.20;  -- profitable after fees
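
If DISTINCT ON is not supported for streaming materialized views in your RisingWave version, the same deduplication can be expressed with the Top-N pattern — a sketch, with the view name chosen here for illustration:

```sql
-- Equivalent latest-price deduplication via ROW_NUMBER (Top-N pattern).
CREATE MATERIALIZED VIEW latest_exchange_prices_topn AS
SELECT exchange, trading_pair, best_bid, best_ask, last_trade,
       (best_bid + best_ask) / 2 AS mid_price, volume_24h, event_time
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY exchange, trading_pair
               ORDER BY event_time DESC
           ) AS rn
    FROM exchange_prices
)
WHERE rn = 1;  -- keep only the most recent row per exchange/pair
```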

Step 3: Add Alerts and Detection Logic

-- Alert on large arbitrage opportunities (> 0.5% net after fees)
CREATE MATERIALIZED VIEW large_arb_alerts AS
SELECT
    trading_pair,
    exchange_low,
    exchange_high,
    ask_low,
    bid_high,
    gross_spread,
    spread_pct,
    net_spread_pct,
    liquidity_constraint_usd,
    last_updated
FROM arb_opportunities
WHERE net_spread_pct > 0.5;
-- Track historical divergence patterns per exchange pair
CREATE MATERIALIZED VIEW exchange_pair_divergence_history AS
SELECT
    exchange_low || '-' || exchange_high    AS exchange_pair,
    trading_pair,
    window_start,
    window_end,
    AVG(spread_pct)                         AS avg_spread_pct,
    MAX(spread_pct)                         AS max_spread_pct,
    COUNT(*) FILTER (WHERE spread_pct > 0.2) AS profitable_windows,
    COUNT(*)                                 AS total_windows
FROM TUMBLE(cross_exchange_delta, last_updated, INTERVAL '5 MINUTES')
GROUP BY exchange_low || '-' || exchange_high, trading_pair, window_start, window_end;
-- Sink arbitrage alerts to Kafka for trading systems
CREATE SINK arb_alert_sink AS
SELECT * FROM large_arb_alerts
WITH (
    connector = 'kafka',
    topic = 'crypto.arbitrage.opportunities',
    properties.bootstrap.server = 'kafka:9092',
    force_append_only = 'true'  -- PLAIN encoding requires an append-only stream
) FORMAT PLAIN ENCODE JSON;

Step 4: Query for Real-Time Insights

-- Current best arbitrage opportunities across all pairs
SELECT
    trading_pair,
    exchange_low,
    exchange_high,
    ROUND(ask_low, 4)           AS buy_price,
    ROUND(bid_high, 4)          AS sell_price,
    ROUND(spread_pct, 4)        AS gross_spread_pct,
    ROUND(net_spread_pct, 4)    AS net_spread_pct,
    last_updated
FROM arb_opportunities
ORDER BY net_spread_pct DESC
LIMIT 10;
-- Which exchange pairs show the most frequent divergence for BTC/USDT?
SELECT
    exchange_pair,
    avg_spread_pct,
    max_spread_pct,
    profitable_windows,
    total_windows,
    ROUND(profitable_windows::NUMERIC / NULLIF(total_windows, 0) * 100, 1)
                    AS profitable_pct
FROM exchange_pair_divergence_history
WHERE trading_pair = 'BTC/USDT'
ORDER BY profitable_windows DESC
LIMIT 10;

Comparison: Batch vs Streaming

Aspect                 | Batch ETL              | Streaming SQL (RisingWave)
-----------------------|------------------------|---------------------------
Latency                | Minutes                | Sub-second
Opportunity Detection  | Post-close analysis    | Real-time open windows
Fee-Adjusted Filtering | Offline computation    | Built into materialized view
Multi-Pair Coverage    | Sequential polling     | Parallel continuous join
Divergence History     | Requires scheduled job | Continuously maintained
Alert Speed            | Next batch             | Milliseconds

FAQ

Q: How do you account for execution latency when acting on detected opportunities?

The last_updated field reflects when the prices that created the spread were last refreshed. Subtract last_updated from current time to compute price staleness before acting. If the spread has been stable for the past few hundred milliseconds, it's more likely still executable. Additionally, incorporate your known execution latency (time to reach each exchange's matching engine) into your minimum spread threshold — if you take 200ms to execute, require a spread that has historically lasted longer than 200ms before closing.
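
A staleness filter can be applied directly in an ad-hoc query — a sketch, where the 500 ms threshold is an assumption to tune against your own execution latency:

```sql
-- Only surface opportunities whose underlying prices are fresh.
SELECT
    trading_pair,
    exchange_low,
    exchange_high,
    net_spread_pct,
    NOW() - last_updated AS staleness
FROM arb_opportunities
WHERE NOW() - last_updated < INTERVAL '500 milliseconds'
ORDER BY net_spread_pct DESC;
```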

Q: Does this handle triangular arbitrage (exploiting price imbalances across three assets)?

The current pattern covers direct cross-exchange arbitrage. Triangular arbitrage (e.g., BTC → ETH → USDT → BTC on the same exchange) requires a different view structure — join three asset pairs in a chain and check whether the circular product exceeds 1.0 after fees. This is a three-way self-join on the latest_exchange_prices view filtered to a single exchange, computationally heavier but expressible in SQL.
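
A hedged sketch of that three-way join — the pair symbols ('BTC/USDT', 'ETH/BTC', 'ETH/USDT') and the fee constant are assumptions to adapt to your feed:

```sql
-- Illustrative triangular arbitrage check on a single exchange.
-- Cycle: USDT -> BTC -> ETH -> USDT; profitable when the product of the
-- three conversion rates exceeds 1.0 plus fees.
CREATE MATERIALIZED VIEW triangular_arb AS
SELECT
    p1.exchange,
    (1 / p1.best_ask)       -- buy BTC with USDT at the BTC/USDT ask
      * (1 / p2.best_ask)   -- buy ETH with BTC at the ETH/BTC ask
      * p3.best_bid         -- sell ETH for USDT at the ETH/USDT bid
      AS cycle_product
FROM latest_exchange_prices p1
JOIN latest_exchange_prices p2
    ON p1.exchange = p2.exchange AND p2.trading_pair = 'ETH/BTC'
JOIN latest_exchange_prices p3
    ON p1.exchange = p3.exchange AND p3.trading_pair = 'ETH/USDT'
WHERE p1.trading_pair = 'BTC/USDT'
  AND (1 / p1.best_ask) * (1 / p2.best_ask) * p3.best_bid
      > 1.0 + 0.003;  -- require the cycle to beat ~3 legs of 0.1% fees
```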

Q: What about transfer time between exchanges? The funds need to move to close the arb.

Cross-exchange arbitrage is most practical when you maintain balances on both sides simultaneously. The "buy on exchange A, sell on exchange B" execution is two simultaneous orders, not a sequential transfer. The actual transfer to rebalance occurs later, at lower urgency. The detection system identifies the price gap; your execution system needs pre-funded accounts on both exchanges to act on it.

Key Takeaways

  • A single Kafka source aggregates price feeds from all exchanges into a unified stream
  • A self-join on latest_exchange_prices computes the cross-exchange delta for every tradable pair in one view
  • Fee-adjusted filtering in the arb_opportunities view eliminates unprofitable gaps before alerts fire
  • Exchange pair divergence history reveals which venue combinations produce the most consistent opportunities
  • The Kafka sink delivers actionable opportunities to trading systems within milliseconds of detection

Ready to try this? Get started with RisingWave. Join our Slack community.
