Programmatic advertising data pipelines with RisingWave connect bid request streams, win notifications, impression events, and conversions into continuously updated materialized views. This replaces batch ETL jobs with a single streaming SQL layer that keeps bidding systems, pacing controls, and reporting dashboards updated within seconds of every auction.
Why Programmatic Pipelines Break Under Batch Architectures
Programmatic advertising moves fast. A demand-side platform (DSP) participates in billions of auctions daily, making bid price decisions in under 100 milliseconds. Yet most DSPs still rely on batch pipelines that summarize auction performance every 15–60 minutes. This creates a dangerous gap: your bidding algorithm is flying blind between batch refreshes.
The consequences are real:
- Overpacing: You blow through daily budget in the morning because yesterday's batch didn't reveal a publisher volume spike
- Underpacing: Bids drop too low after a false signal suggests overspend
- Stale frequency caps: Users see the same ad 20 times because the frequency counter updates too slowly
- Lost optimization opportunities: A placement that's converting at 3x average CPL stays at the same bid for hours
RisingWave solves this by replacing the batch layer with a streaming SQL engine that keeps all these signals current in real time.
Pipeline Architecture
A complete programmatic pipeline in RisingWave has five stages:
- Bid stream ingestion — auction results from your DSP's bid log Kafka topic
- Win/loss processing — match wins to bids and compute win rates by publisher and placement
- Impression and viewability — track served impressions and viewability signals
- Frequency capping state — maintain real-time per-user impression counts
- Pacing and budget tracking — monitor spend velocity against daily budget caps
Ingesting Bid and Win Events
CREATE SOURCE bid_responses (
bid_id VARCHAR,
auction_id VARCHAR,
campaign_id VARCHAR,
placement_id VARCHAR,
publisher_id VARCHAR,
bid_price DECIMAL,
bid_floor DECIMAL,
user_id VARCHAR,
event_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'dsp-bid-responses',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
CREATE SOURCE win_notifications (
bid_id VARCHAR,
auction_id VARCHAR,
clearing_price DECIMAL,
event_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'dsp-win-notifications',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
Real-Time Win Rate and CPM by Placement
CREATE MATERIALIZED VIEW placement_win_rates AS
SELECT
b.placement_id,
b.publisher_id,
b.campaign_id,
window_start,
window_end,
COUNT(b.bid_id) AS total_bids,
COUNT(w.bid_id) AS wins,
ROUND(COUNT(w.bid_id)::DECIMAL / NULLIF(COUNT(b.bid_id), 0) * 100, 2)
AS win_rate_pct,
AVG(b.bid_price) AS avg_bid_price,
AVG(w.clearing_price) AS avg_clearing_price,
ROUND(AVG(w.clearing_price) * 1000, 2) AS effective_cpm
FROM TUMBLE(bid_responses, event_time, INTERVAL '5 MINUTES') b
LEFT JOIN win_notifications w
ON b.bid_id = w.bid_id
GROUP BY b.placement_id, b.publisher_id, b.campaign_id, window_start, window_end;
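The bid optimizer can poll this view for underperforming placements with an ordinary point query. A minimal sketch; the 5% win-rate threshold is an illustrative tuning value, not a recommendation:

```sql
-- Placements with unusually low win rates, candidates for a bid increase.
-- The 5% threshold is an assumed policy value.
SELECT placement_id, publisher_id, campaign_id, win_rate_pct, avg_bid_price
FROM placement_win_rates
WHERE win_rate_pct < 5
ORDER BY total_bids DESC
LIMIT 100;
```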
Real-Time Frequency Capping State
Track per-user impression counts in a sliding window to enforce frequency caps:
CREATE SOURCE impression_events (
impression_id VARCHAR,
user_id VARCHAR,
campaign_id VARCHAR,
ad_id VARCHAR,
placement_id VARCHAR,
event_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'ad-impressions',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
CREATE MATERIALIZED VIEW user_frequency_caps AS
SELECT
user_id,
campaign_id,
window_start,
window_end,
COUNT(impression_id) AS impressions_in_window
FROM HOP(impression_events, event_time, INTERVAL '1 HOUR', INTERVAL '24 HOURS')
GROUP BY user_id, campaign_id, window_start, window_end;
Query this view from your bidding system before each auction to skip users who have already seen the ad the maximum number of times.
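Because RisingWave speaks the PostgreSQL wire protocol, the bidder can run this check as a prepared statement. A sketch, assuming a cap of 10 impressions per 24-hour window (the cap is an illustrative policy value):

```sql
-- Returns true if any active 24-hour window already holds 10+ impressions.
-- $1 = user_id, $2 = campaign_id; the cap of 10 is an assumed policy value.
SELECT COUNT(*) > 0 AS cap_reached
FROM user_frequency_caps
WHERE user_id = $1
  AND campaign_id = $2
  AND impressions_in_window >= 10
  AND window_end > NOW();
```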
Budget Pacing Monitor
CREATE MATERIALIZED VIEW campaign_spend_pace AS
SELECT
campaign_id,
window_start,
window_end,
SUM(clearing_price) AS spend,
COUNT(*) AS impressions_won
FROM TUMBLE(
(SELECT w.*, b.campaign_id
FROM win_notifications w
JOIN bid_responses b ON w.bid_id = b.bid_id),
event_time,
INTERVAL '1 HOUR'
)
GROUP BY campaign_id, window_start, window_end;
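To turn hourly spend into an actionable pacing signal, compare it against a budget reference. A sketch, assuming a hypothetical `campaign_budgets` table (loaded via CDC or manual INSERT) and an illustrative 90% throttle threshold:

```sql
-- Hypothetical reference table holding each campaign's hourly budget cap.
CREATE TABLE campaign_budgets (
    campaign_id VARCHAR PRIMARY KEY,
    hourly_budget DECIMAL
);

-- Flag campaigns whose hourly spend exceeds 90% of budget, a signal for
-- the bidder to throttle. The 0.9 threshold is an assumed policy value.
CREATE MATERIALIZED VIEW pacing_alerts AS
SELECT
    p.campaign_id,
    p.window_start,
    p.spend,
    b.hourly_budget,
    p.spend > b.hourly_budget * 0.9 AS should_throttle
FROM campaign_spend_pace p
JOIN campaign_budgets b ON p.campaign_id = b.campaign_id;
```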
Comparison: Programmatic Pipeline Architectures
| Capability | Traditional Batch DSP | Real-Time Streaming (RisingWave) |
|---|---|---|
| Win rate visibility | 15–60 min lag | Sub-10 second lag |
| Frequency cap accuracy | Approximate (stale) | Exact (current window) |
| Budget pacing granularity | Hourly | Per-minute |
| Placement performance feedback | Delayed | Immediate |
| Infrastructure pieces | Kafka + Spark/Flink + Warehouse | Kafka + RisingWave |
| SQL familiarity required | High (complex Spark SQL) | Standard SQL |
Sinking to Downstream Systems
Push real-time placement performance to your bid optimization API via Kafka:
Because the aggregates in placement_win_rates are updated as new events arrive, the sink emits changelog records; declaring the view's grouping keys as the primary key lets Kafka consumers treat the topic as an upsert stream:

CREATE SINK placement_performance_sink
FROM placement_win_rates
WITH (
connector = 'kafka',
topic = 'bid-optimization-signals',
properties.bootstrap.server = 'kafka:9092',
primary_key = 'placement_id,publisher_id,campaign_id,window_start,window_end'
)
FORMAT UPSERT ENCODE JSON;
Archive all auction data to Iceberg for long-term analysis and model training:
CREATE SINK auction_archive_sink
FROM placement_win_rates
WITH (
connector = 'iceberg',
type = 'append-only',
force_append_only = 'true',
warehouse.path = 's3://your-bucket/programmatic-data',
s3.region = 'us-east-1',
database.name = 'adtech',
table.name = 'placement_win_rates'
);

Note that placement_win_rates is an updatable view, so `force_append_only = 'true'` is required: the sink writes each change as a new row and discards deletes, which suits an append-style archive.
FAQ
Q: How does RisingWave handle out-of-order bid and win event streams? RisingWave supports event-time processing with configurable watermarks. You define a maximum allowed delay on the event_time column, and RisingWave includes late-arriving events in the correct window up to that threshold. Events arriving after the watermark deadline are handled according to your configured late-data policy.
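As a sketch, the watermark is declared directly on the source definition. This mirrors the bid_responses source above with a 5-second allowed delay (the delay is an assumed tuning value):

```sql
-- Bid source with a watermark that tolerates up to 5 seconds of
-- event-time disorder before windows close. The delay is illustrative.
CREATE SOURCE bid_responses_watermarked (
    bid_id VARCHAR,
    campaign_id VARCHAR,
    bid_price DECIMAL,
    event_time TIMESTAMPTZ,
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
)
WITH (
    connector = 'kafka',
    topic = 'dsp-bid-responses',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
```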
Q: Can RisingWave process billions of bid events per day? Yes. RisingWave is designed for high-throughput streaming workloads. Horizontal scaling across compute nodes, combined with Kafka topic partitioning, allows the pipeline to grow with bid volume. Production deployments process hundreds of millions of events per hour.
Q: How do I connect my existing DSP to RisingWave?
If your DSP publishes bid logs and win notifications to Kafka (most modern DSPs do), connect RisingWave using the Kafka source connector. If your DSP writes to a PostgreSQL or MySQL database, use the postgres-cdc or mysql-cdc connector to stream changes into RisingWave.
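A sketch of the CDC path for a Postgres-backed DSP; every connection parameter below is a placeholder to replace with your own values:

```sql
-- Stream a DSP's campaign table into RisingWave via Postgres CDC.
-- Hostname, credentials, and table names are placeholders.
CREATE TABLE campaigns (
    campaign_id VARCHAR PRIMARY KEY,
    daily_budget DECIMAL,
    status VARCHAR
)
WITH (
    connector = 'postgres-cdc',
    hostname = 'your-dsp-db.example.com',
    port = '5432',
    username = 'rw_user',
    password = 'rw_password',
    database.name = 'dsp',
    schema.name = 'public',
    table.name = 'campaigns'
);
```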
Q: Does RisingWave support cookie-based user ID joins? Yes. You can store cookie-to-user-ID mapping in a RisingWave table (loaded via CDC from your identity system) and join it to impression events using a standard SQL join on the materialized view.
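A sketch of that join, assuming a hypothetical `cookie_user_map` table kept current from your identity system:

```sql
-- Hypothetical identity mapping table, synced via CDC.
CREATE TABLE cookie_user_map (
    cookie_id VARCHAR PRIMARY KEY,
    canonical_user_id VARCHAR
);

-- Resolve impressions to canonical user IDs, e.g. for
-- cross-device frequency logic.
CREATE MATERIALIZED VIEW resolved_impressions AS
SELECT
    i.impression_id,
    m.canonical_user_id,
    i.campaign_id,
    i.event_time
FROM impression_events i
JOIN cookie_user_map m ON i.user_id = m.cookie_id;
```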
Get Started
Replace your batch programmatic pipeline with streaming SQL today.
- Documentation: docs.risingwave.com/get-started
- Community: risingwave.com/slack

