A high-frequency trading data pipeline with RisingWave uses Kafka for market data ingestion, materialized views for continuous signal computation, and the PostgreSQL-compatible query interface for low-latency reads—giving HFT firms a streaming SQL layer that computes VWAP, order book imbalance, and momentum signals in real time without custom application code.
HFT Data Infrastructure Challenges
High-frequency trading firms face a data engineering problem that is extreme in every dimension: millions of market data events per second, microsecond-sensitive latency requirements, and derived signal computation that must keep pace with the feed. Traditional approaches use custom C++ pipelines or in-memory databases with hand-coded aggregations.
RisingWave occupies a different tier: it is not a nanosecond-latency execution system, but it is an excellent streaming SQL layer for the analytical and signal-generation workloads that sit above raw execution—computing VWAP over sliding windows, detecting order book imbalance, tracking cross-instrument correlations, and feeding risk checks. These are the workloads that traditionally require weeks of custom code.
Market Data Ingestion
-- Level 2 order book updates from exchange feeds
CREATE SOURCE order_book_updates (
exchange_id VARCHAR,
symbol VARCHAR,
side VARCHAR, -- 'BID' or 'ASK'
price DECIMAL(18,8),
quantity DECIMAL(18,8),
update_type VARCHAR, -- 'ADD', 'UPDATE', 'DELETE'
sequence_num BIGINT,
event_time TIMESTAMPTZ
) WITH (
connector = 'kafka',
topic = 'order-book-l2',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
-- Trade tape: executed trades
CREATE SOURCE trade_tape (
exchange_id VARCHAR,
symbol VARCHAR,
trade_id VARCHAR,
price DECIMAL(18,8),
quantity DECIMAL(18,8),
aggressor_side VARCHAR, -- 'BUY' or 'SELL'
trade_time TIMESTAMPTZ
) WITH (
connector = 'kafka',
topic = 'trade-tape',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
VWAP Computation with Tumbling Windows
Volume-weighted average price is one of the most fundamental HFT signals. With RisingWave's TUMBLE() function, you maintain VWAP across multiple time horizons simultaneously:
-- 1-minute VWAP per symbol per exchange
CREATE MATERIALIZED VIEW vwap_1min AS
SELECT
window_start,
window_end,
symbol,
exchange_id,
SUM(price * quantity) / NULLIF(SUM(quantity), 0) AS vwap,
SUM(quantity) AS total_volume,
COUNT(*) AS trade_count,
MIN(price) AS low,
MAX(price) AS high,
MIN(price) FILTER (WHERE trade_time = window_start) AS open_price
FROM TUMBLE(trade_tape, trade_time, INTERVAL '1 MINUTE')
GROUP BY window_start, window_end, symbol, exchange_id;
-- 5-minute sliding window for momentum signal
CREATE MATERIALIZED VIEW vwap_5min_hop AS
SELECT
window_start,
window_end,
symbol,
SUM(price * quantity) / NULLIF(SUM(quantity), 0) AS vwap_5m,
SUM(quantity) AS volume_5m,
COUNT(DISTINCT exchange_id) AS exchanges_active
FROM HOP(trade_tape, trade_time, INTERVAL '1 MINUTE', INTERVAL '5 MINUTES')
GROUP BY window_start, window_end, symbol;
Order Book Imbalance Signal
Order book imbalance—the ratio of bid quantity to ask quantity at the top of the book—is a classic short-term directional signal. Computing it continuously over a streaming order book requires aggregating the current state of bids and asks:
-- Current best bid/ask aggregation (top-of-book state)
CREATE MATERIALIZED VIEW top_of_book AS
SELECT
symbol,
exchange_id,
MAX(price) FILTER (WHERE side = 'BID' AND update_type != 'DELETE') AS best_bid,
MIN(price) FILTER (WHERE side = 'ASK' AND update_type != 'DELETE') AS best_ask,
SUM(quantity) FILTER (WHERE side = 'BID' AND update_type != 'DELETE'
AND price >= MAX(price) FILTER (WHERE side = 'BID' AND update_type != 'DELETE') * 0.999)
AS bid_depth_near,
SUM(quantity) FILTER (WHERE side = 'ASK' AND update_type != 'DELETE'
AND price <= MIN(price) FILTER (WHERE side = 'ASK' AND update_type != 'DELETE') * 1.001)
AS ask_depth_near,
MAX(event_time) AS book_time
FROM order_book_updates
GROUP BY symbol, exchange_id;
-- Compute bid-ask imbalance ratio
CREATE MATERIALIZED VIEW book_imbalance AS
SELECT
symbol,
exchange_id,
best_bid,
best_ask,
best_ask - best_bid AS spread,
bid_depth_near,
ask_depth_near,
CASE
WHEN bid_depth_near + ask_depth_near = 0 THEN 0
ELSE (bid_depth_near - ask_depth_near)::DECIMAL /
(bid_depth_near + ask_depth_near)
END AS imbalance_ratio,
book_time
FROM top_of_book;
Sinking Signals to Execution Systems
-- Sink computed signals to Kafka for consumption by execution algorithms
CREATE SINK trading_signals_feed AS
SELECT
v.symbol,
v.window_end AS signal_time,
v.vwap AS vwap_1m,
v5.vwap_5m,
b.imbalance_ratio,
b.spread,
v.total_volume
FROM vwap_1min v
JOIN vwap_5min_hop v5
ON v.symbol = v5.symbol
AND v.window_end BETWEEN v5.window_start AND v5.window_end
JOIN book_imbalance b
ON v.symbol = b.symbol
AND v.exchange_id = b.exchange_id
WITH (
connector = 'kafka',
properties.bootstrap.server = 'kafka:9092',
topic = 'trading-signals'
) FORMAT PLAIN ENCODE JSON;
Comparison: Custom Code vs. Streaming SQL for HFT Analytics
| Dimension | Custom C++/Java Pipeline | Streaming SQL with RisingWave |
| Development time | Weeks to months | Days |
| Signal iteration speed | Slow (recompile, redeploy) | Fast (ALTER MATERIALIZED VIEW) |
| Latency tier | Microseconds (execution) | Milliseconds (analytics/signals) |
| Operational complexity | High (custom monitoring) | Low (SQL + Kafka ecosystem) |
| Multi-signal correlation | Complex custom joins | Standard SQL JOINs |
| Risk check integration | Custom integration code | PostgreSQL wire protocol |
| Use case fit | Order routing, execution | Signal generation, risk, analytics |
FAQ
Q: Is RisingWave fast enough for HFT execution? A: RisingWave is not designed for nanosecond-latency order execution—that remains the domain of FPGA-based or custom kernel-bypass systems. RisingWave excels at the analytical layer: computing signals, running risk checks, building real-time surveillance, and feeding decision support systems. Think of it as the real-time analytics brain that informs execution, not the execution engine itself.
Q: How many market data events per second can RisingWave process? A: Throughput depends on cluster size, complexity of materialized views, and hardware. Benchmark results show single-node RisingWave instances processing hundreds of thousands of events per second. For US equity markets (~1M quotes/second at peak), a properly sized cluster with Kafka partitioning handles the load. Colocate RisingWave workers with Kafka brokers for best throughput.
Q: How do we handle out-of-order market data events? A: RisingWave's watermark mechanism handles late-arriving events within configurable bounds. For market data, a watermark of a few hundred milliseconds is sufficient to absorb normal network jitter. Sequence number gaps (missing exchange sequence numbers) should be handled upstream in the feed handler before events reach Kafka.
Q: Can RisingWave join trade tape data with options pricing models?
A: Yes. Store options pricing parameters (implied volatility, Greeks) in CREATE TABLE reference tables, updated via CDC or periodic upserts. Join trade tape events against these tables using temporal joins (FOR SYSTEM_TIME AS OF) to get point-in-time accurate pricing data.
Q: How do we monitor signal freshness in production?
A: Query the MAX(signal_time) from your signal materialized views against current time. Alert when the lag exceeds your threshold. RisingWave also exposes internal metrics via a Prometheus endpoint for monitoring processing lag on each source.
Get Started
- Deploy your first HFT analytics pipeline with the RisingWave quickstart.
- Join the RisingWave Slack to discuss market data pipelines with the engineering team.

