Most A/B testing stacks make product teams wait. Events land in a warehouse, a nightly job rolls them up, and the dashboard refreshes the next morning. That cycle is fine for a year-long pricing study, but it is painful for a checkout redesign you want to roll back if the variant tanks.
This post shows how to compute variant conversion rates, rolling 1-hour funnel metrics, and lift versus control in real time, using streaming SQL in RisingWave. The pipeline is a few materialized views. Every event you ingest is reflected in the dashboard within seconds. All SQL has been verified against RisingWave 2.8.0.
Why Streaming SQL Beats Nightly Rollups for A/B Tests
Streaming SQL beats nightly rollups for A/B tests because experiment metrics decay in value the longer they take to compute. A 24-hour rollup hides early signals of a broken variant, blocks fast rollbacks, and forces engineers to ship code on hunches. A streaming pipeline keeps the same SQL you would write in a warehouse, but updates incrementally as each event arrives.
Traditional batch stacks treat experiment events the same as any other analytic event: ingest, partition, dedupe, aggregate on a schedule. That works when the goal is a quarterly report. It does not work when the variant under test is silently dropping checkout conversion by 10 percent for an hour.
| Aspect | Batch warehouse rollup | Streaming SQL (RisingWave) |
|---|---|---|
| Refresh cadence | Hours to a day | Sub-second per event |
| Pipeline shape | ETL → table → BI | Source → materialized view → BI |
| Code | SQL + scheduler + orchestrator | SQL only |
| Rollback signal latency | Same as cadence | Seconds |
| Cost of a 1 hour bad variant | Full hour of users exposed | Detected within minutes |
RisingWave is a PostgreSQL-compatible streaming database. You write CREATE MATERIALIZED VIEW over a stream of events, and the view is incrementally updated as each row lands. The wire protocol is Postgres, so your BI tool and your psql shell connect without any new driver.
The Event Model
A workable A/B test event model is simple. Each row is one user action, tagged with the experiment id and the variant the user was assigned to. For verification here, we use a regular table seeded with representative events. In production this would be a CREATE SOURCE backed by a Kafka topic, with the same columns.
CREATE TABLE ab_events (
user_id VARCHAR,
experiment_id VARCHAR,
variant VARCHAR,
event_type VARCHAR,
revenue DOUBLE PRECISION,
event_time TIMESTAMPTZ
);
Seed two variants for one experiment, checkout-redesign. Variant A is the control. Variant B is the candidate. The pattern is six exposures per arm, with B converting better.
INSERT INTO ab_events VALUES
('u01','checkout-redesign','A','exposure', NULL, NOW() - INTERVAL '50 minutes'),
('u02','checkout-redesign','A','exposure', NULL, NOW() - INTERVAL '49 minutes'),
('u03','checkout-redesign','A','exposure', NULL, NOW() - INTERVAL '48 minutes'),
('u04','checkout-redesign','A','exposure', NULL, NOW() - INTERVAL '47 minutes'),
('u05','checkout-redesign','A','exposure', NULL, NOW() - INTERVAL '46 minutes'),
('u06','checkout-redesign','A','exposure', NULL, NOW() - INTERVAL '45 minutes'),
('u01','checkout-redesign','A','add_to_cart', NULL, NOW() - INTERVAL '44 minutes'),
('u02','checkout-redesign','A','add_to_cart', NULL, NOW() - INTERVAL '43 minutes'),
('u01','checkout-redesign','A','purchase', 42.00, NOW() - INTERVAL '40 minutes'),
('u11','checkout-redesign','B','exposure', NULL, NOW() - INTERVAL '50 minutes'),
('u12','checkout-redesign','B','exposure', NULL, NOW() - INTERVAL '49 minutes'),
('u13','checkout-redesign','B','exposure', NULL, NOW() - INTERVAL '48 minutes'),
('u14','checkout-redesign','B','exposure', NULL, NOW() - INTERVAL '47 minutes'),
('u15','checkout-redesign','B','exposure', NULL, NOW() - INTERVAL '46 minutes'),
('u16','checkout-redesign','B','exposure', NULL, NOW() - INTERVAL '45 minutes'),
('u11','checkout-redesign','B','add_to_cart', NULL, NOW() - INTERVAL '44 minutes'),
('u12','checkout-redesign','B','add_to_cart', NULL, NOW() - INTERVAL '43 minutes'),
('u13','checkout-redesign','B','add_to_cart', NULL, NOW() - INTERVAL '42 minutes'),
('u14','checkout-redesign','B','add_to_cart', NULL, NOW() - INTERVAL '41 minutes'),
('u11','checkout-redesign','B','purchase', 58.00, NOW() - INTERVAL '38 minutes'),
('u12','checkout-redesign','B','purchase', 61.50, NOW() - INTERVAL '37 minutes'),
('u13','checkout-redesign','B','purchase', 44.25, NOW() - INTERVAL '36 minutes');
A few schema notes worth flagging:
- One row per user action, not one row per session. Per-action rows let you compute step-level funnel metrics later without re-instrumenting.
revenueisNULLfor non-purchase events. Treating revenue as nullable keeps the schema simple; the aggregations filter it out.variantis just a string. We do not encode the experiment in the column name. That way one query can cover all experiments.
Materialized View 1: Per-Variant Conversion Rates
The per-variant conversion view answers the first question every product manager asks: how does each arm compare on the headline funnel? It computes unique exposures, unique adds-to-cart, unique purchasers, and conversion percentages for each variant of each experiment.
CREATE MATERIALIZED VIEW ab_variant_conversion AS
SELECT
experiment_id,
variant,
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'exposure') AS exposures,
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'add_to_cart') AS adds_to_cart,
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'purchase') AS purchasers,
ROUND(
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'purchase')::NUMERIC
/ NULLIF(COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'exposure'), 0) * 100, 2
) AS purchase_conv_pct,
ROUND(
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'add_to_cart')::NUMERIC
/ NULLIF(COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'exposure'), 0) * 100, 2
) AS add_to_cart_conv_pct,
COALESCE(SUM(revenue) FILTER (WHERE event_type = 'purchase'), 0) AS gross_revenue
FROM ab_events
GROUP BY experiment_id, variant;
Query it:
SELECT variant, exposures, adds_to_cart, purchasers,
purchase_conv_pct, add_to_cart_conv_pct, gross_revenue
FROM ab_variant_conversion
ORDER BY variant;
Output against the seed:
variant | exposures | adds_to_cart | purchasers | purchase_conv_pct | add_to_cart_conv_pct | gross_revenue
---------+-----------+--------------+------------+-------------------+----------------------+---------------
A | 6 | 2 | 1 | 16.67 | 33.33 | 42
B | 6 | 4 | 3 | 50.00 | 66.67 | 163.75
Variant B converts users to purchase at three times the rate of A in this seed. With real traffic, that number updates on every incoming event. The dashboard query you wire up in your BI tool is the same SELECT * FROM ab_variant_conversion you would write against a warehouse table. Behind the scenes, RisingWave maintains the aggregates incrementally, not by rescanning the events table.
Why COUNT(DISTINCT user_id) and not COUNT(*): A single user firing multiple exposure events should count once. Streaming sources are often noisy, with duplicate or retried events. Deduplicating in the aggregate keeps the conversion metric honest.
Materialized View 2: Rolling 1-Hour Funnel
A rolling 1-hour funnel shows how each variant is performing right now, in a moving window of the last 60 minutes. It is the view product teams stare at on launch day. Use the HOP windowing function to emit a new window every 5 minutes covering the prior hour, so the dashboard refreshes smoothly instead of jumping at the top of each hour.
CREATE MATERIALIZED VIEW ab_hourly_funnel AS
SELECT
experiment_id,
variant,
window_start,
window_end,
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'exposure') AS exposures,
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'add_to_cart') AS adds_to_cart,
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'purchase') AS purchasers,
ROUND(
COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'purchase')::NUMERIC
/ NULLIF(COUNT(DISTINCT user_id) FILTER (WHERE event_type = 'exposure'), 0) * 100, 2
) AS purchase_conv_pct
FROM HOP(ab_events, event_time, INTERVAL '5 minutes', INTERVAL '1 hour')
GROUP BY experiment_id, variant, window_start, window_end;
Query the most recent windows:
SELECT variant, exposures, adds_to_cart, purchasers,
purchase_conv_pct, window_end
FROM ab_hourly_funnel
ORDER BY window_end DESC, variant
LIMIT 6;
Output:
variant | exposures | adds_to_cart | purchasers | purchase_conv_pct | window_end
---------+-----------+--------------+------------+-------------------+---------------------------
B | 0 | 0 | 3 | | 2026-05-22 21:35:00+00:00
A | 0 | 2 | 1 | | 2026-05-22 21:30:00+00:00
B | 0 | 4 | 3 | | 2026-05-22 21:30:00+00:00
A | 5 | 2 | 1 | 20.00 | 2026-05-22 21:25:00+00:00
B | 5 | 4 | 3 | 60.00 | 2026-05-22 21:25:00+00:00
A | 6 | 2 | 1 | 16.67 | 2026-05-22 21:20:00+00:00
A few things to notice:
HOP(events, event_time, INTERVAL '5 minutes', INTERVAL '1 hour')emits a new 1-hour window every 5 minutes. So at 21:00, 21:05, 21:10, and so on, you get a fresh row aggregating the prior 60 minutes.- Recent windows show
NULLfor the conversion percentage because no exposures fell into that slice. That is correct behavior, not a bug. TheNULLIFin the divisor keeps the division safe. - The rolling view is independent of the lifetime view. Together they answer two questions, lifetime totals and current trend, with no extra plumbing.
For more on RisingWave's windowing semantics, see the streaming SQL window functions guide.
Materialized View 3: Lift Versus Control
The lift view compares every variant against the control arm in the same experiment, so the dashboard can show a single number that says how much better or worse each candidate is. It is a self-join on the per-variant view: take the control row, join it to every other variant in the same experiment, and subtract.
CREATE MATERIALIZED VIEW ab_lift_vs_control AS
WITH control AS (
SELECT experiment_id,
exposures AS ctrl_exposures,
purchasers AS ctrl_purchasers,
purchase_conv_pct AS ctrl_conv
FROM ab_variant_conversion
WHERE variant = 'A'
)
SELECT
v.experiment_id,
v.variant,
v.exposures,
v.purchasers,
v.purchase_conv_pct,
c.ctrl_conv,
ROUND(v.purchase_conv_pct - c.ctrl_conv, 2) AS abs_lift_pct_points,
ROUND(
(v.purchase_conv_pct - c.ctrl_conv) / NULLIF(c.ctrl_conv, 0) * 100, 2
) AS rel_lift_pct
FROM ab_variant_conversion v
JOIN control c USING (experiment_id);
Query it:
SELECT variant, exposures, purchasers, purchase_conv_pct,
ctrl_conv, abs_lift_pct_points, rel_lift_pct
FROM ab_lift_vs_control
ORDER BY variant;
Output:
variant | exposures | purchasers | purchase_conv_pct | ctrl_conv | abs_lift_pct_points | rel_lift_pct
---------+-----------+------------+-------------------+-----------+---------------------+--------------
A | 6 | 1 | 16.67 | 16.67 | 0.00 | 0
B | 6 | 3 | 50.00 | 16.67 | 33.33 | 199.94
Variant B's relative lift is about 200 percent in this seed. The lift view updates whenever the upstream view updates. Layered materialized views are how RisingWave composes pipelines: the engine tracks the dependency, and when ab_variant_conversion changes, ab_lift_vs_control is automatically recomputed for the affected rows.
In a real launch dashboard, you would render exactly this query as a table next to the rolling 1-hour funnel. The two views answer different questions and update independently.
A Note on Statistical Significance
The lift view tells you how much better one arm appears, but it does not tell you whether the difference is statistically significant. With only six exposures per arm in the seed, the apparent 200 percent lift is noise. Significance testing belongs in a separate layer.
Two reasonable patterns:
- Pre-compute components in SQL, run the test in code. Have the streaming view emit
n_a, conv_a, n_b, conv_band let a Python or Go service pull that row and run a two-proportion z-test or a sequential test. This keeps the SQL boring and the stats library current. - Use a guardrail rule in SQL. Materialize a rule like "fire an alert when variant B has at least N exposures and the relative lift is more than X percent in the wrong direction." This is not a p-value, but it is a fast, monitored rollback signal.
Do not claim statistical significance from raw streaming SQL without a real test on top. The streaming layer's job is to keep the counts fresh, not to substitute for an experimentation platform's stats engine.
Sinking Results to a Dashboard
You can read these materialized views directly from any Postgres-compatible BI tool: Metabase, Superset, Grafana, Hex. The connection is standard Postgres on port 4566. No new driver.
For lower-latency dashboards or when the consumer is another service, sink to Kafka or Iceberg:
CREATE SINK ab_conversion_sink FROM ab_variant_conversion
WITH (
connector = 'kafka',
properties.bootstrap.server = 'kafka:9092',
topic = 'ab.variant_conversion'
) FORMAT PLAIN ENCODE JSON (force_append_only = 'false');
Downstream services can subscribe and trigger automated rollbacks, page on guardrail violations, or feed the numbers back into an in-app banner.
Key Takeaways
A real-time A/B testing pipeline does not require a new platform or a custom stream processor. Three materialized views, written in standard SQL, give you:
- Lifetime conversion rates per variant, updated per event
- A rolling 1-hour funnel that shows current trend without delay
- Lift versus control, automatically derived from the per-variant view
The pipeline is composed of layered views. Each one is incrementally maintained. The dashboard query is whatever SELECT your BI tool sends. The 24-hour wait disappears.
If you want to extend this pattern, the natural next steps are: add a sequential significance test in application code, sink alerts to Kafka for automated rollbacks, and add session-level funnel views with SESSION windowing for click-path analysis.
Try the pipeline against your own event stream with the RisingWave quickstart. The SQL above runs unchanged on a local single-node cluster.

