Streaming Analytics for Ad Campaigns with RisingWave

Streaming analytics for ad campaigns means computing impressions, clicks, conversions, and spend metrics continuously as events arrive—not after a batch job runs. With RisingWave, you write standard SQL against live data streams, and materialized views keep campaign dashboards current within seconds of each ad event.

The Problem with Batch Campaign Reporting

Most ad teams still rely on batch pipelines that refresh dashboards every 15 minutes, hourly, or overnight. During a high-spend campaign launch, that delay is catastrophic: a misconfigured bid, a broken landing page, or a creative underperforming by 40% can go unnoticed for hours while budget burns.

Real-time campaign analytics eliminate that blind spot. When every impression, click, and conversion flows through a streaming SQL layer, your dashboards reflect what's happening right now, not what happened before your last batch job ran.

Architecture for Real-Time Campaign Analytics

The pipeline connects your ad serving infrastructure to RisingWave via Kafka. Events flow through three materialized view layers:

  1. Raw event ingestion from Kafka (impressions, clicks, conversions)
  2. Per-campaign aggregations using tumbling windows
  3. Cross-campaign rollups for portfolio-level insights

Reference data—campaign metadata, bid targets, budget caps—lives in RisingWave tables and joins to stream data without any ETL.

Ingesting Ad Events from Kafka

CREATE SOURCE ad_impressions (
    impression_id   VARCHAR,
    campaign_id     VARCHAR,
    ad_id           VARCHAR,
    publisher_id    VARCHAR,
    placement_id    VARCHAR,
    bid_price       DECIMAL,
    event_time      TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'ad-impressions',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
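    -- 'latest' reads only new events; use 'earliest' to backfill from the start of the topic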
)
FORMAT PLAIN ENCODE JSON;

CREATE SOURCE ad_clicks (
    click_id        VARCHAR,
    impression_id   VARCHAR,
    campaign_id     VARCHAR,
    ad_id           VARCHAR,
    event_time      TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'ad-clicks',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;

Store campaign reference data in a regular table for fast joins:

CREATE TABLE campaigns (
    campaign_id     VARCHAR PRIMARY KEY,
    advertiser_id   VARCHAR,
    campaign_name   VARCHAR,
    daily_budget    DECIMAL,
    target_cpc      DECIMAL,
    target_cpm      DECIMAL,
    start_date      DATE,
    end_date        DATE
);

Real-Time Campaign Performance Metrics

Compute live KPIs per campaign using a 1-minute tumbling window:

CREATE MATERIALIZED VIEW campaign_performance_1min AS
SELECT
    i.campaign_id,
    c.campaign_name,
    window_start,
    window_end,
    COUNT(i.impression_id)                                  AS impressions,
    COUNT(cl.click_id)                                      AS clicks,
    SUM(i.bid_price)                                        AS spend,
    ROUND(COUNT(cl.click_id)::DECIMAL / NULLIF(COUNT(i.impression_id), 0) * 100, 4)
                                                            AS ctr_pct,
    ROUND(SUM(i.bid_price) / NULLIF(COUNT(cl.click_id), 0), 4)
                                                            AS cpc,
    ROUND(SUM(i.bid_price) / NULLIF(COUNT(i.impression_id), 0) * 1000, 4)
                                                            AS cpm
FROM TUMBLE(ad_impressions, event_time, INTERVAL '1 MINUTE') i
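-- Assumes at most one click event per impression; duplicate click events would inflate CTR.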
LEFT JOIN ad_clicks cl
    ON i.impression_id = cl.impression_id
LEFT JOIN campaigns c
    ON i.campaign_id = c.campaign_id
GROUP BY i.campaign_id, c.campaign_name, window_start, window_end;

For budget pacing, use a hopping window that tracks rolling hourly spend, refreshed every five minutes, so the pacing curve stays smooth:

CREATE MATERIALIZED VIEW campaign_budget_pacing AS
SELECT
    campaign_id,
    window_start,
    window_end,
    SUM(bid_price)      AS spend_in_window,
    COUNT(*)            AS impressions_in_window
FROM HOP(ad_impressions, event_time, INTERVAL '5 MINUTES', INTERVAL '1 HOUR')
GROUP BY campaign_id, window_start, window_end;
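
To act on these windows, a pacing check can compare recent spend against an even hourly share of the daily budget. The query below is a minimal sketch; the uniform-pacing target and the 5-minute lookback are illustrative assumptions, not part of the pipeline above.

-- Compare each campaign's latest hourly spend to an even hourly
-- share of its daily budget (uniform pacing is an assumption)
SELECT
    p.campaign_id,
    p.spend_in_window                   AS spend_last_hour,
    c.daily_budget / 24                 AS even_hourly_budget,
    ROUND(p.spend_in_window / NULLIF(c.daily_budget / 24, 0), 2)
                                        AS pace_ratio
FROM campaign_budget_pacing p
JOIN campaigns c ON p.campaign_id = c.campaign_id
WHERE p.window_end > NOW() - INTERVAL '5 MINUTES';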

Pushing Results to Downstream Systems

Sink live performance metrics to Kafka for real-time dashboards and bidding system feedback:

CREATE SINK campaign_performance_sink
FROM campaign_performance_1min
WITH (
    connector = 'kafka',
    topic = 'campaign-performance-live',
    properties.bootstrap.server = 'kafka:9092',
    -- the windowed view emits updates as events arrive, so a
    -- plain-encoded sink must be forced to append-only
    force_append_only = 'true'
)
FORMAT PLAIN ENCODE JSON;

For warehousing and historical analysis, sink to Iceberg:

CREATE SINK campaign_performance_iceberg
FROM campaign_performance_1min
WITH (
    connector = 'iceberg',
    type = 'append-only',
    -- required because the upstream view emits updates
    force_append_only = 'true',
    warehouse.path = 's3://your-bucket/campaign-analytics',
    s3.region = 'us-east-1',
    database.name = 'adtech',
    table.name = 'campaign_performance'
);

Comparison: Analytics Approaches for Ad Campaigns

Feature                   | Nightly Batch            | 15-Min Refresh   | Streaming (RisingWave)
--------------------------|--------------------------|------------------|------------------------
Dashboard latency         | 12–24 hours              | 15–30 minutes    | 1–5 seconds
Budget waste detection    | Next day                 | Same hour        | Real time
Bid optimization feedback | Delayed                  | Partially timely | Immediate
Infrastructure needed     | Data warehouse + ETL     | Same + scheduler | Kafka + RisingWave
SQL complexity            | Standard                 | Standard         | Standard (same dialect)
Cost to operate           | High (warehouse compute) | High             | Efficient (incremental)

Operational Best Practices

Set budget alerts as materialized views. Rather than polling a dashboard, create a materialized view that flags campaigns nearing their daily budget cap. Query it directly from your bidding system to pause campaigns automatically.
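
A minimal sketch of such an alert view, assuming a 90% threshold (the cutoff and the view name are illustrative):

-- Flags campaigns whose spend so far today has reached 90% of the
-- daily cap; a bidder can poll this view and pause matching campaigns
CREATE MATERIALIZED VIEW campaign_budget_alerts AS
SELECT
    i.campaign_id,
    window_start        AS day_start,
    c.daily_budget,
    SUM(i.bid_price)    AS spend_today
FROM TUMBLE(ad_impressions, event_time, INTERVAL '1 DAY') i
JOIN campaigns c ON i.campaign_id = c.campaign_id
GROUP BY i.campaign_id, window_start, c.daily_budget
HAVING SUM(i.bid_price) >= c.daily_budget * 0.9;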

Separate creative-level and campaign-level views. Keep granular views (per ad_id) separate from rollup views (per campaign_id). This avoids recomputing the full campaign rollup when you only need creative performance.
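
For example, a per-creative view can stay lightweight by aggregating impressions alone (the view name is illustrative):

-- Granular per-ad_id rollup, independent of the campaign-level views
CREATE MATERIALIZED VIEW creative_performance_1min AS
SELECT
    ad_id,
    campaign_id,
    window_start,
    COUNT(*)            AS impressions,
    SUM(bid_price)      AS spend
FROM TUMBLE(ad_impressions, event_time, INTERVAL '1 MINUTE')
GROUP BY ad_id, campaign_id, window_start;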

Use the PostgreSQL interface for BI tools. RisingWave exposes a PostgreSQL-compatible endpoint, which means Grafana, Metabase, and Looker can connect directly. No special drivers required.
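
For example, once a BI tool is pointed at RisingWave's endpoint (port 4566 by default), a dashboard panel can run ordinary SQL against the views defined above:

-- Last hour of per-campaign performance, newest windows first
SELECT campaign_name, window_start, impressions, clicks, ctr_pct, spend
FROM campaign_performance_1min
WHERE window_start > NOW() - INTERVAL '1 HOUR'
ORDER BY window_start DESC, spend DESC;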

FAQ

Q: How does RisingWave handle late-arriving events? RisingWave supports configurable watermarks for handling late data. You can define a watermark on your event time column to allow events arriving up to N minutes late to be included in the correct window, balancing completeness with latency.
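
For example, the impressions source could be declared with a watermark on its event time column; the five-minute allowance here is illustrative, not a recommendation:

CREATE SOURCE ad_impressions (
    impression_id   VARCHAR,
    campaign_id     VARCHAR,
    ad_id           VARCHAR,
    publisher_id    VARCHAR,
    placement_id    VARCHAR,
    bid_price       DECIMAL,
    event_time      TIMESTAMPTZ,
    -- events arriving up to 5 minutes late are still assigned to
    -- their correct window; anything later may be discarded
    WATERMARK FOR event_time AS event_time - INTERVAL '5 MINUTES'
)
WITH (
    connector = 'kafka',
    topic = 'ad-impressions',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;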

Q: Can I run this alongside my existing data warehouse? Yes. RisingWave serves the real-time layer while your warehouse handles historical analysis. Use the Iceberg sink to write results from RisingWave to your data lake, making them available for warehouse queries without duplicating infrastructure.

Q: How many campaigns can RisingWave track simultaneously? RisingWave scales horizontally, so there is no fixed limit on concurrent campaigns; production deployments regularly track tens of thousands. Partitioning your Kafka topics by campaign prefix improves parallelism.

Q: Do I need to rebuild materialized views when I add new campaigns? No. Campaign metadata lives in a regular table. Add new rows to the campaigns table, and the materialized views automatically include new campaigns in subsequent windows without any redeployment.
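
For example (all values below are illustrative):

-- The new campaign starts appearing in the views at the next window
INSERT INTO campaigns VALUES (
    'cmp_2041', 'adv_17', 'Spring Sale Launch',
    5000.00, 1.25, 8.50,
    DATE '2025-04-01', DATE '2025-04-30'
);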

Get Started

Build a live campaign analytics pipeline today—RisingWave is open source and ready to deploy.
