Building a Real-Time Crypto Arbitrage Monitor with RisingWave

Building a Real-Time Crypto Arbitrage Monitor with RisingWave

Arbitrage is a simple strategy. You buy an asset on one exchange where the price is low and sell it on another where the price is high.

In crypto markets, these price differences, or spreads, appear and vanish in milliseconds. If your data pipeline takes five seconds to process a batch of prices, the opportunity is already gone. To compete, you cannot rely on batch processing or databases that require constant polling. You need to detect spreads the moment they happen.

This post demonstrates how to use RisingWave to detect arbitrage opportunities with sub-second latency using standard SQL.

The Engineering Bottleneck

Arbitrage requires monitoring fragmented liquidity across Binance, Coinbase, OKX, and DEXs simultaneously. This creates three specific engineering hurdles.

  1. Velocity: During volatility, you might ingest over 10,000 price ticks per second.

  2. Synchronization: You cannot compare a Binance price from 10:00:01 with a Coinbase price from 10:00:05. The comparison must be time-aligned.

  3. Latency: The profit potential decays effectively to zero within hundreds of milliseconds.

Traditional ETL pipelines are too slow. We need a system that processes data as it arrives, not in batches.

Why RisingWave Fits

RisingWave is a streaming database designed exactly for this velocity. Unlike a traditional Postgres database where a materialized view is a static snapshot, a RisingWave materialized view updates continuously.

  • It connects natively to event streams like Kafka or WebSockets.

  • It calculates results incrementally, processing only new data points.

  • It handles time-windowing logic natively.

  • It uses SQL, so you don’t need to write complex Java or Flink jobs to define trading strategies.

Here is how to build the monitor.

1. Ingesting Market Data

We assume market data is flowing into a message queue like Kafka. We define two sources: one for Binance and one for Coinbase.

We treat these Kafka topics as tables. RisingWave consumes the JSON events directly.

CREATE SOURCE binance_prices (
    symbol VARCHAR,
    price DECIMAL,
    ts TIMESTAMP
) WITH (
    connector = 'kafka',
    topic = 'binance.ticker',
    properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;

CREATE SOURCE coinbase_prices (
    symbol VARCHAR,
    price DECIMAL,
    ts TIMESTAMP
) WITH (
    connector = 'kafka',
    topic = 'coinbase.ticker',
    properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;

2. The Core Logic: Detecting the Spread

We need to find instances where the price difference for the same symbol crosses a profitability threshold (e.g., 0.5%).

Crucially, we must ensure we compare prices that exist at roughly the same moment. A price mismatch is not a valid opportunity if the data points are seconds apart.

We use an interval join condition: ABS(EXTRACT(EPOCH FROM (b.ts - c.ts))) < 1. This restricts matches to price ticks that occurred within one second of each other.

CREATE MATERIALIZED VIEW arbitrage_opportunities AS
SELECT
    b.symbol,
    b.price AS binance_price,
    c.price AS coinbase_price,
    (c.price - b.price) / b.price * 100 AS spread_percent,
    b.ts
FROM binance_prices b
JOIN coinbase_prices c
  ON b.symbol = c.symbol
AND ABS(EXTRACT(EPOCH FROM (b.ts - c.ts))) < 1  -- align within 1s window
WHERE ABS((c.price - b.price) / b.price * 100) > 0.5;  -- threshold 0.5%

As new ticks arrive, RisingWave updates this view instantly. If a spread appears, it shows up here immediately.

3. Market-Wide Aggregation

You may also want to see the global "Best Bid and Offer" (BBO) across all exchanges to visualize market fragmentation.

We combine streams using UNION ALL and apply a hopping window via HOP to calculate spread statistics over short intervals.

CREATE MATERIALIZED VIEW best_bid_offer AS
WITH t AS (SELECT symbol, price, ts FROM binance_prices
           UNION ALL
           SELECT symbol, price, ts FROM coinbase_prices)
SELECT
    symbol,
    MIN(price) AS best_bid,
    MAX(price) AS best_offer,
    MAX(price) - MIN(price) AS spread
FROM HOP(t, ts, INTERVAL '1 second', INTERVAL '5 seconds') 
GROUP BY symbol, window_start;

4. Actionable Output (Subscriptions and Sinks)

Detecting the spread is half the battle. You need to trigger your trading bot immediately.

Traditionally, you would sink this data into Redis or Kafka and have your bot listen there. RisingWave simplifies this with Subscriptions.

Subscriptions allow your application to listen directly to changes in a materialized view. Your trading bot connects to RisingWave and receives a stream of updates—new arbitrage opportunities—as they are generated. This removes the need for an intermediate message queue, reducing both latency and infrastructure complexity.

CREATE SUBSCRIPTION arb_subscription
FROM arbitrage_opportunities
WITH (
    retention = '1h'
);

However, if you need to archive data for backtesting or integrate with other existing systems, you can still use Sinks. For example, you might sink to S3 for long-term storage or to Redis if multiple external services need to access the state.

CREATE SINK arb_signals
FROM arbitrage_opportunities
WITH (
    connector = 'redis',
    primary_key = 'arb_signals',
    redis.url= 'redis://localhost:6379/'
) FORMAT PLAIN ENCODE JSON (
    force_append_only='true'
);

Expanding the Strategy

Once the pipeline is running, you can use SQL to prototype more complex strategies.

  • Triangular Arbitrage: Join three pairs (BTC/USDT, ETH/BTC, ETH/USDT) within the same exchange.

  • Funding Rate Arbitrage: Monitor divergence between spot prices and perpetual swap funding rates.

  • Latency Monitoring: Compare exchange API timestamps against your ingestion timestamps to detect network lag.

Summary

Speed is the primary competitive advantage in crypto trading. By moving processing logic into RisingWave, you achieve sub-second detection of market anomalies without complex application code. The database handles the windowing and joining state, allowing you to define strategies in pure SQL.

Arbitrage is won or lost in milliseconds. RisingWave ensures you are on the right side of the clock.

Get Started with RisingWave

  • Try RisingWave Today:

  • Talk to Our Experts: Have a complex use case or want to see a personalized demo? Contact us to discuss how RisingWave can address your specific challenges.

  • Join Our Community: Connect with fellow developers, ask questions, and share your experiences in our vibrant Slack community.

If you’d like to see a personalized demo or discuss how this could work for your use case, please contact our sales team.

The Modern Backbone for
Real-Time Data and AI
GitHubXLinkedInSlackYouTube
Sign up for our to stay updated.