Real-Time MEV Detection with Streaming SQL

MEV — Maximal Extractable Value — represents value captured by block builders and searchers who reorder, insert, or censor transactions within a block. With RisingWave, a PostgreSQL-compatible streaming database, you can detect sandwich attacks, frontrunning patterns, and backrunning opportunities in real time by streaming mempool and confirmed block data through SQL materialized views that update as each new block arrives.

Why MEV Detection Matters

MEV extraction is not a fringe activity. Hundreds of millions of dollars are extracted from ordinary DeFi users each year through tactics they never see coming:

  • Sandwich attacks: A searcher detects a large pending swap in the mempool, places a buy order immediately before it and a sell order immediately after, profiting from the price impact caused by the victim's trade
  • Frontrunning: Copying a profitable transaction (an arbitrage or liquidation) and submitting it with higher gas to execute first
  • Backrunning: Placing a transaction immediately after a known price-moving event to capture residual arbitrage
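Concretely, a sandwich leaves a recognizable three-row fingerprint in the confirmed block. The indices and amounts below are hypothetical:

    index 41   attacker buys TOKEN with 5 ETH      (pushes the price up)
    index 42   victim buys TOKEN with 100 ETH      (fills at the inflated price)
    index 43   attacker sells TOKEN for ~5.4 ETH   (pockets the difference)

It is exactly this within-block adjacency on a shared token pair that the SQL below detects.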

The problem extends beyond individual user harm. MEV creates:

  • Gas price manipulation as searchers bid aggressively for block position
  • Network congestion from failed MEV attempts that still consume block space
  • Unfair markets where ordinary users consistently receive worse execution than optimized bots

Real-time detection enables protocol teams to quantify MEV leakage, researchers to study MEV dynamics, and users to understand why their trades executed at unexpected prices.

How Streaming SQL Solves This

RisingWave ingests mempool pending transactions and confirmed block transactions from an Ethereum node or MEV data provider. Materialized views correlate transactions within the same block by their position (transaction_index) and gas price, identifying the multi-transaction patterns that characterize MEV. Because views update with each new block, detection is continuous rather than retrospective.

Building It Step by Step

Step 1: Connect the Data Source

-- Confirmed block transactions with position data
CREATE SOURCE block_transactions (
    tx_hash             VARCHAR,
    block_number        BIGINT,
    block_timestamp     TIMESTAMPTZ,
    transaction_index   INTEGER,        -- position within block
    from_address        VARCHAR,
    to_address          VARCHAR,        -- typically a DEX router or contract
    value_eth           NUMERIC,
    gas_price_gwei      NUMERIC,
    gas_used            BIGINT,
    input_function_sig  VARCHAR,
    token_in            VARCHAR,
    token_out           VARCHAR,
    amount_in           NUMERIC,
    amount_out          NUMERIC,
    is_swap             BOOLEAN,
    builder_address     VARCHAR,        -- block builder identity
    chain_id            INTEGER
) WITH (
    connector = 'kafka',
    topic = 'blockchain.blocks.transactions',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
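
For reference, each Kafka message is expected to be a flat JSON object whose keys match the source schema. A hypothetical event (all values illustrative; the function selector shown is Uniswap V2's swapExactTokensForTokens):

{
  "tx_hash": "0xaaa111",
  "block_number": 19000000,
  "block_timestamp": "2024-05-01T12:00:00Z",
  "transaction_index": 42,
  "from_address": "0xvictim",
  "to_address": "0xrouter",
  "value_eth": 0,
  "gas_price_gwei": 31.5,
  "gas_used": 180000,
  "input_function_sig": "0x38ed1739",
  "token_in": "WETH",
  "token_out": "USDC",
  "amount_in": 500,
  "amount_out": 1512000,
  "is_swap": true,
  "builder_address": "0xbuilder",
  "chain_id": 1
}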

Step 2: Build the Core Materialized View

-- Identify transactions in the same block on the same token pair
-- ordered by position — prerequisite for sandwich detection
CREATE MATERIALIZED VIEW block_swap_sequences AS
SELECT
    block_number,
    block_timestamp,
    tx_hash,
    transaction_index,
    from_address,
    to_address,
    token_in,
    token_out,
    amount_in,
    amount_out,
    gas_price_gwei,
    amount_out / NULLIF(amount_in, 0)   AS execution_price,
    LAG(tx_hash)        OVER w          AS prev_tx_hash,
    LAG(from_address)   OVER w          AS prev_from,
    LAG(gas_price_gwei) OVER w          AS prev_gas_price,
    LEAD(tx_hash)       OVER w          AS next_tx_hash,
    LEAD(from_address)  OVER w          AS next_from,
    LEAD(gas_price_gwei) OVER w         AS next_gas_price,
    LEAD(token_in)      OVER w          AS next_token_in,
    LEAD(token_out)     OVER w          AS next_token_out
FROM block_transactions
WHERE is_swap = TRUE
WINDOW w AS (
    -- Partition on the unordered token pair (via LEAST/GREATEST) so the
    -- attacker's reversed backrun leg lands in the same partition as the
    -- victim's swap; partitioning on the directional pair would hide it
    PARTITION BY block_number,
                 LEAST(token_in, token_out),
                 GREATEST(token_in, token_out)
    ORDER BY transaction_index
);
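
Before layering detection logic on top, it helps to eyeball the view's output. This ad-hoc query (illustrative, not part of the pipeline) surfaces swaps that have same-pair neighbors on both sides within their block, which is the raw material for sandwich detection:

-- Swaps flanked by same-pair swaps earlier and later in the block
SELECT block_number, transaction_index, tx_hash, prev_from, next_from
FROM block_swap_sequences
WHERE prev_tx_hash IS NOT NULL
  AND next_tx_hash IS NOT NULL
ORDER BY block_number DESC, transaction_index
LIMIT 20;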

Step 3: Add Alerts and Detection Logic

-- Detect sandwich attacks:
-- Pattern: [attacker buy] → [victim swap] → [attacker sell]
-- Same block, same token pair, attacker appears before AND after victim
CREATE MATERIALIZED VIEW sandwich_attacks AS
SELECT
    victim.block_number,
    victim.block_timestamp,
    victim.tx_hash                          AS victim_tx,
    victim.from_address                     AS victim_address,
    victim.prev_tx_hash                     AS frontrun_tx,
    victim.prev_from                        AS attacker_address,
    victim.next_tx_hash                     AS backrun_tx,
    victim.token_in,
    victim.token_out,
    victim.amount_in                        AS victim_amount_in,
    victim.execution_price                  AS victim_execution_price,
    victim.prev_gas_price                   AS frontrun_gas_price,
    victim.gas_price_gwei                   AS victim_gas_price,
    victim.next_gas_price                   AS backrun_gas_price
FROM block_swap_sequences victim
WHERE victim.prev_from = victim.next_from    -- same attacker before and after
  AND victim.prev_from <> victim.from_address  -- attacker != victim
  AND victim.prev_gas_price > victim.gas_price_gwei  -- attacker paid more gas
  AND victim.next_token_in = victim.token_out  -- attacker reverses the trade
  AND victim.next_token_out = victim.token_in;
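
One way to approximate the value extracted is to join each detected sandwich back to the raw transactions for its frontrun and backrun legs. A sketch, assuming both attacker legs are quoted in the victim's input token (the view name sandwich_profit_estimates is illustrative):

-- Rough attacker profit per sandwich: what the backrun returned minus
-- what the frontrun spent, both denominated in the victim's token_in
CREATE MATERIALIZED VIEW sandwich_profit_estimates AS
SELECT
    sa.block_number,
    sa.victim_tx,
    sa.attacker_address,
    back.amount_out - front.amount_in   AS est_profit
FROM sandwich_attacks sa
JOIN block_transactions front ON sa.frontrun_tx = front.tx_hash
JOIN block_transactions back  ON sa.backrun_tx  = back.tx_hash;

This ignores gas costs and multi-hop routes, so treat it as a lower-effort estimate rather than settled accounting.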

-- MEV summary per block: sandwich counts and distinct participants
CREATE MATERIALIZED VIEW mev_block_summary AS
SELECT
    block_number,
    block_timestamp,
    COUNT(*)                AS sandwich_count,
    COUNT(DISTINCT attacker_address) AS unique_attackers,
    COUNT(DISTINCT victim_address)   AS unique_victims,
    MAX(block_timestamp)    AS processed_at
FROM sandwich_attacks
GROUP BY block_number, block_timestamp;

-- Track the most active MEV searchers
CREATE MATERIALIZED VIEW top_mev_searchers AS
SELECT
    attacker_address,
    COUNT(*)                AS sandwich_count,
    COUNT(DISTINCT block_number) AS active_blocks,
    MAX(block_timestamp)    AS last_seen
FROM sandwich_attacks
GROUP BY attacker_address;

-- Sink MEV events to Kafka for downstream consumers
CREATE SINK mev_events_sink AS
SELECT
    block_number,
    block_timestamp,
    victim_tx,
    victim_address,
    frontrun_tx,
    attacker_address,
    token_in,
    token_out,
    'SANDWICH' AS mev_type
FROM sandwich_attacks
WITH (
    connector = 'kafka',
    topic = 'blockchain.mev.events',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;

Step 4: Query for Real-Time Insights

-- Most recently detected sandwich attacks
SELECT
    block_number,
    victim_tx,
    victim_address,
    attacker_address,
    token_in,
    token_out,
    ROUND(victim_execution_price, 6)    AS victim_price,
    ROUND(frontrun_gas_price, 2)        AS attacker_gas_gwei,
    block_timestamp
FROM sandwich_attacks
ORDER BY block_number DESC, block_timestamp DESC
LIMIT 20;

-- Top MEV extractors by sandwich count
SELECT
    attacker_address,
    sandwich_count,
    active_blocks,
    last_seen
FROM top_mev_searchers
ORDER BY sandwich_count DESC
LIMIT 10;

Comparison: Batch vs Streaming

Aspect                 | Batch ETL            | Streaming SQL (RisingWave)
Latency                | Minutes to hours     | Sub-second
Sandwich Detection     | Post-block analysis  | Per-block, real-time
Pattern Coverage       | Manual SQL scripts   | Continuously running views
Attacker Tracking      | Daily aggregates     | Running totals
Alert Speed            | Next batch           | Milliseconds
Block Builder Analysis | Offline              | Live per-block

FAQ

Q: Can this detect frontrunning of non-swap transactions, such as NFT mints or liquidations?

Yes, though the detection pattern changes. For frontrunning, the signature is a transaction with a higher gas price appearing at a lower transaction_index than a very similar transaction in the same block — same function selector, same contract, different sender. Extend the block_swap_sequences view to include non-swap contract calls and apply the same LAG/LEAD window function analysis. Liquidation frontrunning is particularly tractable because the target positions are identifiable from the DeFi protocol's event stream.
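
A minimal sketch of that generalized pattern, written as a self-join on the raw source rather than a window (the view name suspected_frontruns is illustrative, and the similarity test here, same contract plus same function selector, is deliberately coarse):

-- Flag pairs of near-identical calls where a higher-gas copy from a
-- different sender landed earlier in the same block
CREATE MATERIALIZED VIEW suspected_frontruns AS
SELECT
    later.block_number,
    later.tx_hash           AS victim_tx,
    later.from_address      AS victim_address,
    earlier.tx_hash         AS frontrun_tx,
    earlier.from_address    AS frontrunner,
    earlier.gas_price_gwei  AS frontrun_gas_price
FROM block_transactions AS earlier
JOIN block_transactions AS later
  ON earlier.block_number       = later.block_number
 AND earlier.to_address         = later.to_address
 AND earlier.input_function_sig = later.input_function_sig
 AND earlier.from_address      <> later.from_address
 AND earlier.transaction_index  < later.transaction_index
 AND earlier.gas_price_gwei     > later.gas_price_gwei;

A production version would tighten the similarity test (for example, comparing decoded call arguments) to cut false positives.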

Q: How do you handle MEV extracted via Flashbots and other private mempools?

Private mempool transactions don't appear in the public mempool before inclusion, so mempool-based detection won't catch them. However, the transaction ordering pattern in confirmed blocks is still visible on-chain. The block_transactions source captures all confirmed transactions regardless of how they reached the builder, so sandwich and frontrunning patterns remain detectable from the confirmed block data. Adding builder_address to aggregations reveals which builders facilitate the most MEV.
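
As a sketch of that builder-level view, you can join detected sandwiches back to the raw transactions to recover each victim's block builder (the view name mev_by_builder is illustrative):

-- Sandwich activity grouped by the builder of the containing block
CREATE MATERIALIZED VIEW mev_by_builder AS
SELECT
    bt.builder_address,
    COUNT(*)                            AS sandwich_count,
    COUNT(DISTINCT sa.attacker_address) AS unique_attackers
FROM sandwich_attacks sa
JOIN block_transactions bt
  ON sa.victim_tx = bt.tx_hash
GROUP BY bt.builder_address;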

Q: What's the difference between detecting MEV on L1 Ethereum vs. L2 rollups?

On optimistic rollups like Arbitrum and Optimism, the sequencer currently has centralized control over transaction ordering, making traditional MEV less prevalent. However, as rollups move toward decentralized sequencing, MEV dynamics will emerge. The detection approach is identical — stream confirmed transactions with their block position data. On L2s with faster block times, the volume of transactions per time unit increases, making streaming processing even more critical than on L1.

Key Takeaways

  • Sandwich attack detection uses SQL window functions (LAG/LEAD) to identify three-transaction patterns within the same block and token pair
  • block_swap_sequences maintains the ordered swap sequence per block as a continuously updated materialized view
  • MEV block summaries and attacker rankings update with every new block without manual reprocessing
  • The Kafka sink delivers detected MEV events to research dashboards, protocol alert systems, and user protection tools
  • The same windowing pattern extends to frontrunning and backrunning detection with minor modifications to the filter conditions

Ready to try this? Get started with RisingWave. Join our Slack community.
