RisingWave enables real-time bid optimization for RTB by continuously maintaining materialized views over live auction streams. By incrementally computing per-user bid scores, campaign budget pacing, and frequency caps using streaming SQL, DSPs can serve optimal bids in under 10 milliseconds without rebuilding analytical models on every auction.
Why RTB Needs Streaming SQL
Real-time bidding (RTB) moves at machine speed. An ad exchange calls your DSP with an impression opportunity, and you have roughly 100 milliseconds to respond with a bid price — or nothing. Most DSPs today respond with stale data: yesterday's campaign performance, last hour's frequency caps, last week's user segment data.
The gap between data freshness and auction speed compounds. You overbid on users who already converted, underbid on high-value segments that have just warmed up, and blow through daily budgets in the first hour because your pacing signals are an hour old. These inefficiencies translate directly into wasted ad spend.
Streaming SQL changes the calculus. Instead of querying a data warehouse on each bid request, you maintain continuously updated materialized views that the bidder reads like a cache — except the cache is always current.
Architecture: RisingWave in the Bidding Stack
A typical RTB pipeline with RisingWave looks like this:
- The ad exchange sends auction events to a Kafka topic
- User activity (clicks, conversions, page views) streams from your tag system into a second Kafka topic
- Campaign configuration and budget data lives in PostgreSQL
- RisingWave consumes all three, maintaining real-time bid modifier tables
- Your bidder reads from RisingWave's PostgreSQL-compatible interface at bid time
Your bidder therefore always reads precomputed tables at bid time: no query-time joins, no on-the-fly aggregation, just indexed point lookups. RisingWave does all the continuous computation upstream.
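In code, that read path is a single parameterized query against a RisingWave materialized view, executable with any PostgreSQL client such as psycopg2. A minimal sketch, matching the user_campaign_cvr view defined later in this post:

```python
def signal_lookup(user_id: str, campaign_id: str):
    """Build the parameterized point lookup the bidder runs per auction.

    Returns (sql, params) so any PostgreSQL client can execute it
    against RisingWave's Postgres-compatible endpoint."""
    sql = (
        "SELECT conversions, total_value, avg_value "
        "FROM user_campaign_cvr "
        "WHERE user_id = %s AND campaign_id = %s "
        "ORDER BY window_end DESC LIMIT 1"
    )
    return sql, (user_id, campaign_id)

# e.g. with a psycopg2 cursor: cur.execute(*signal_lookup("u123", "c42"))
```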
Setting Up the Auction Stream
First, create a source for incoming auction events from Kafka:
CREATE SOURCE auction_events (
auction_id VARCHAR,
user_id VARCHAR,
publisher_id VARCHAR,
ad_slot_id VARCHAR,
floor_price DOUBLE PRECISION,
event_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'rtb.auction.events',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
CREATE SOURCE conversion_events (
user_id VARCHAR,
campaign_id VARCHAR,
conversion_value DOUBLE PRECISION,
event_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'rtb.conversions',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
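For local testing it helps to see the exact event shape the sources expect. A hedged sketch of a payload generator for the auction_events schema (field values are made up, and the ISO-8601 string encoding for the TIMESTAMPTZ column is an assumption about the JSON decoding):

```python
import json
from datetime import datetime, timezone

def auction_event(auction_id, user_id, publisher_id, ad_slot_id, floor_price):
    """Build one JSON payload matching the auction_events source schema.
    Any Kafka producer can publish the returned string to rtb.auction.events."""
    return json.dumps({
        "auction_id": auction_id,
        "user_id": user_id,
        "publisher_id": publisher_id,
        "ad_slot_id": ad_slot_id,
        "floor_price": floor_price,
        # Assumed encoding: TIMESTAMPTZ as an ISO-8601 / RFC 3339 string.
        "event_time": datetime.now(timezone.utc).isoformat(),
    })

msg = auction_event("a1", "u1", "p1", "slot-9", 1.25)
```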
Next, pull campaign configuration from PostgreSQL via CDC so any budget changes propagate automatically. RisingWave's postgres-cdc connector takes a CREATE TABLE with an explicit schema and primary key mirroring the upstream table (adjust the column list to match your own campaigns table):
CREATE TABLE campaigns (
campaign_id VARCHAR PRIMARY KEY,
daily_budget DOUBLE PRECISION,
max_bid DOUBLE PRECISION,
status VARCHAR
)
WITH (
connector = 'postgres-cdc',
hostname = 'postgres',
port = '5432',
username = 'rw_user',
password = 'secret',
database.name = 'adserver',
schema.name = 'public',
table.name = 'campaigns'
);
Continuous Bid Modifier Computation
Now build the materialized views that power bid decisions. The first view computes rolling 1-hour conversion rates per user-campaign pair using a HOP window — which slides every 5 minutes so the signal stays fresh:
CREATE MATERIALIZED VIEW user_campaign_cvr AS
SELECT
c.user_id,
c.campaign_id,
window_start,
window_end,
COUNT(*) AS conversions,
SUM(c.conversion_value) AS total_value,
AVG(c.conversion_value) AS avg_value
FROM HOP(
conversion_events,
event_time,
INTERVAL '5 minutes',
INTERVAL '1 hour'
) c
GROUP BY c.user_id, c.campaign_id, window_start, window_end;
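The HOP semantics are easy to misread: each event contributes to every window that covers it, i.e. window_size / slide = 12 overlapping windows here. A small Python illustration of that assignment (not RisingWave code, just the window arithmetic):

```python
from datetime import datetime, timedelta

def hop_windows(event_time: datetime, slide_min: int = 5, size_min: int = 60):
    """Return the [window_start, window_end) pairs one event falls into
    under a HOP window with a 5-minute slide and 1-hour size."""
    epoch = datetime(1970, 1, 1)
    slide = timedelta(minutes=slide_min)
    size = timedelta(minutes=size_min)
    # Most recent slide boundary at or before the event.
    latest_start = event_time - (event_time - epoch) % slide
    windows, start = [], latest_start
    # Every earlier start whose window still covers the event.
    while start + size > event_time:
        windows.append((start, start + size))
        start -= slide
    return windows

wins = hop_windows(datetime(2024, 1, 1, 12, 7))
```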
The second view summarizes inventory supply and floor prices per publisher slot over 10-minute tumbling windows; budget pacing logic reads it to decide how aggressively to spend into cheap inventory:
CREATE MATERIALIZED VIEW campaign_budget_pacing AS
SELECT
a.publisher_id,
a.ad_slot_id,
window_start,
window_end,
COUNT(*) AS auctions_seen,
COUNT(*) FILTER (WHERE a.floor_price < 2.50) AS cheap_inventory,
AVG(a.floor_price) AS avg_floor
FROM TUMBLE(
auction_events,
event_time,
INTERVAL '10 minutes'
) a
GROUP BY a.publisher_id, a.ad_slot_id, window_start, window_end;
At bid time, the bidder issues a point lookup against each of these views, a single indexed read apiece, and combines the results into a complete bid signal.
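A hedged sketch of that combination step inside the bidder, in Python; the weighting is hypothetical, and a real DSP would substitute its own model:

```python
def compute_bid(base_cpm, cvr_row, floor_row):
    """Combine the two view lookups into a final bid price.

    cvr_row:   (conversions, avg_value) projected from user_campaign_cvr
    floor_row: (avg_floor,) projected from campaign_budget_pacing
    The 5%-per-conversion weighting is illustrative, not a real model."""
    conversions, avg_value = cvr_row
    (avg_floor,) = floor_row
    # Scale the base bid up for users converting in the last hour,
    # capping the uplift at 10 conversions.
    modifier = 1.0 + min(conversions, 10) * 0.05
    bid = base_cpm * modifier
    # Never bid below the slot's typical floor, never more than 3x base.
    return round(min(max(bid, avg_floor), base_cpm * 3.0), 4)

bid = compute_bid(2.0, (4, 12.5), (1.8,))
```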
Comparison: Batch vs. Streaming Bid Optimization
| Dimension | Batch (Hourly ETL) | Streaming with RisingWave |
| --- | --- | --- |
| Data freshness | 30–60 minutes stale | < 1 second |
| Frequency cap accuracy | Often over-serves | Real-time cap enforcement |
| Budget pacing | Hourly adjustment | Per-minute adjustment |
| Conversion signal latency | Hours | Seconds |
| Infrastructure complexity | High (Spark + warehouses) | Low (single SQL layer) |
| Query latency at bid time | 5–50 ms (cache miss) | < 1 ms (MV lookup) |
Frequency Capping in Real Time
Frequency caps are where stale data costs the most. A user who has seen your ad 10 times today should not see it again — but if your frequency counter is an hour old, you will serve them ad 11, 12, and 13 before the batch updates. RisingWave eliminates this:
CREATE MATERIALIZED VIEW user_ad_frequency AS
SELECT
user_id,
ad_slot_id,
window_start,
window_end,
COUNT(*) AS impressions
FROM TUMBLE(
auction_events,
event_time,
INTERVAL '1 day'
)
GROUP BY user_id, ad_slot_id, window_start, window_end;
Your bidder queries this view on every auction. If impressions >= cap, return no bid. (In production you would typically count served impressions or win notifications rather than raw auction requests; the pattern is identical.) The logic is trivial; the freshness is what matters.
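The gate itself, as the bidder might express it (a sketch; freq_row stands in for the point-lookup result, which is None when the user has no impressions in the current window):

```python
def should_bid(freq_row, cap: int = 10) -> bool:
    """Frequency-cap gate run on every auction.

    freq_row: the (impressions,) tuple from a point lookup on
    user_ad_frequency for the current day's window, or None when
    the user has no recorded impressions yet."""
    impressions = freq_row[0] if freq_row else 0
    return impressions < cap
```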
Sinking Bid Signals Downstream
RisingWave can also push bid-ready signals back into Kafka for consumption by multiple downstream bidders:
CREATE SINK bid_signals_sink
FROM user_campaign_cvr
WITH (
connector = 'kafka',
properties.bootstrap.server = 'kafka:9092',
topic = 'rtb.bid.signals',
primary_key = 'user_id,campaign_id,window_start'
)
FORMAT UPSERT ENCODE JSON;
This fan-out pattern lets multiple bidder instances subscribe to a single pre-computed signal stream, reducing database load on RisingWave during peak auction volume.
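Consumers of that topic need to respect UPSERT semantics: the Kafka message key identifies the row, and a null value is a delete (tombstone). A minimal sketch of folding those messages into a local signal cache (the JSON field layout is an assumption about the encoding):

```python
import json

def apply_upsert(cache: dict, key_json: str, value_json):
    """Fold one message from the bid_signals_sink topic into a cache.

    key_json:   the Kafka message key (JSON object of primary-key fields)
    value_json: the message value, or None for a tombstone (delete)."""
    key = tuple(sorted(json.loads(key_json).items()))
    if value_json is None:
        cache.pop(key, None)  # tombstone: the row was deleted upstream
    else:
        cache[key] = json.loads(value_json)
    return cache

cache = {}
apply_upsert(cache, '{"user_id": "u1", "campaign_id": "c1"}', '{"conversions": 3}')
```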
FAQ
Q: Can RisingWave handle the query volume of a high-throughput DSP? A: Yes. RisingWave maintains materialized views as pre-computed tables. At bid time, your DSP reads from these tables like any PostgreSQL query — there is no on-the-fly aggregation. A well-indexed materialized view can serve thousands of reads per second with sub-millisecond latency.
Q: How does RisingWave handle late-arriving auction data? A: RisingWave uses watermarks to handle late data in windowed aggregations. You configure a watermark delay on the source, and late records within that window are included in the computation. Records arriving after the watermark are discarded or handled by a separate late-data view depending on your policy.
Q: Does RisingWave replace my DSP's bidder logic? A: No. RisingWave replaces the data layer — the signals, scores, and aggregates the bidder relies on. Your bidder logic, business rules, and ML model inference run as before; they just read from fresher, more accurate data.
Q: How do I migrate from a Redis-based frequency cap to RisingWave? A: RisingWave exposes a PostgreSQL interface, so you change the data source in your bidder from a Redis GET to a SQL query on a materialized view. For teams that prefer key-value access, you can sink the frequency cap view to Redis with RisingWave's Redis sink connector.
Q: What happens to bid decisions if RisingWave is unavailable? A: Best practice is to treat RisingWave as the signal store with a fallback to conservative defaults (e.g., bid the floor price, skip users whose frequency status is unknown). RisingWave is designed for high availability, so the fallback path should rarely fire, but it should always exist.
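That fallback policy is a few lines in the bidder (a sketch; fetch_signal stands in for your RisingWave query):

```python
def safe_bid(fetch_signal, floor_price: float) -> float:
    """Degrade gracefully when the signal store is unreachable.

    fetch_signal: any callable returning a (bid,) tuple, raising on
    connection failure. On failure we bid the auction floor rather
    than guess at a modifier."""
    try:
        (bid,) = fetch_signal()
        return max(bid, floor_price)
    except Exception:
        return floor_price  # conservative default: bid the floor

safe_bid(lambda: (2.4,), 1.0)  # healthy path
```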
Get Started
Ready to bring real-time bid optimization to your RTB stack? Follow the RisingWave quickstart to set up your first streaming SQL pipeline in under 10 minutes.
Join the community on RisingWave Slack to discuss RTB architectures, share SQL patterns, and get help from the team.

