The Medallion architecture — Bronze for raw ingest, Silver for clean wide tables, Gold for serving-ready aggregates — is the most widely adopted data lakehouse pattern in e-commerce. With RisingWave, a PostgreSQL-compatible streaming database, you can implement all three layers in streaming SQL, keeping every layer continuously up to date as orders and events flow through Kafka.
The Problem: Static Medallion Architectures Miss the Real-Time Window
The classic Medallion architecture was designed for batch and micro-batch processing. Bronze is an S3 data lake. Silver is a dbt transformation run every 30 minutes. Gold is a BI-ready table refreshed nightly. This works for historical reporting but breaks down for operational use cases:
- Merchant dashboards need GMV and order counts that are current to the minute, not the last batch run.
- Operations teams need to detect inventory stockouts within seconds of them occurring, not hours.
- Finance teams need real-time payment method breakdowns during high-volume events like 11/11 or Black Friday.
SHOPLINE — Asia's leading commerce enablement platform — faced exactly this problem. Their original Lambda architecture (dual batch+streaming pipelines) created consistency windows and duplicated business logic. After adopting RisingWave, they achieved a 76.7% reduction in API response times by making all three Medallion layers streaming.
"By adopting RisingWave, we moved to a cost-effective Kappa architecture. It provides the low-latency real-time analytics we need and delivers a much smoother experience for our merchants." — Henry Chi, Manager, Data Team, SHOPLINE
How Streaming SQL Solves This
RisingWave implements the Medallion layers as materialized views that are incrementally maintained. When an order event arrives in Kafka, RisingWave propagates the update through Bronze → Silver → Gold in a single processing pass. The Gold layer is always consistent with the Bronze layer — there is no batch lag, no reconciliation window.
The practical implication: your serving layer (application databases, BI tools, alerting systems) always reads from pre-computed, continuously updated results.
Building the Streaming Medallion Architecture
Step 1: Bronze Layer — Data Source Setup
The Bronze layer connects directly to Kafka. Raw events land here without transformation — the goal is lossless ingest:
```sql
-- Raw order events from all channels
CREATE SOURCE orders (
    order_id VARCHAR,
    merchant_id VARCHAR,
    customer_id VARCHAR,
    channel VARCHAR,         -- 'web', 'mobile', 'pos', 'marketplace'
    payment_method VARCHAR,  -- 'card', 'alipay', 'paypal', 'cod'
    subtotal NUMERIC,
    discount NUMERIC,
    total NUMERIC,
    status VARCHAR,
    created_at TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ecommerce.orders',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

-- Product catalog updates
CREATE SOURCE product_updates (
    sku_id VARCHAR,
    merchant_id VARCHAR,
    product_name VARCHAR,
    category VARCHAR,
    list_price NUMERIC,
    updated_at TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ecommerce.products',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```
For object storage durability, Kafka's own retention policy (or a Kafka-to-S3 connector like Kafka Connect S3 Sink) handles the Bronze archival layer independently.
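The Silver layer below joins orders against a merchants dimension, which is not defined above. A minimal sketch of that relation, assuming merchant records arrive as upserts keyed by merchant_id (the schema is illustrative; the column names mirror those used in the join):

```sql
-- Merchant dimension (hypothetical schema; populate via CDC or application upserts)
CREATE TABLE merchants (
    merchant_id VARCHAR PRIMARY KEY,
    store_name VARCHAR,
    region VARCHAR,
    tier VARCHAR,
    country_code VARCHAR
);
```

In practice this table is often fed from the operational database through a CDC source, so merchant updates flow into the Silver join the same way order events do.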
Step 2: Silver Layer — Core Materialized View
The Silver layer cleans, validates, and enriches raw events. The key transformation here is joining orders to merchant metadata to produce wide, analysis-ready rows (a join to product metadata would follow the same pattern):
```sql
CREATE MATERIALIZED VIEW silver_orders AS
SELECT
    o.order_id,
    o.merchant_id,
    o.customer_id,
    o.channel,
    o.payment_method,
    o.subtotal,
    o.discount,
    o.total,
    o.status,
    o.created_at,
    -- Enrichment from merchant dimension
    m.store_name,
    m.region,
    m.tier,
    m.country_code,
    -- Computed fields
    ROUND(o.discount / NULLIF(o.subtotal, 0) * 100, 2) AS discount_pct,
    CASE
        WHEN o.total >= 500 THEN 'high_value'
        WHEN o.total >= 100 THEN 'mid_value'
        ELSE 'low_value'
    END AS order_tier,
    DATE_TRUNC('hour', o.created_at) AS hour_bucket
FROM orders o
JOIN merchants m ON o.merchant_id = m.merchant_id
WHERE o.status NOT IN ('test', 'system_generated');
```
This is a streaming join — RisingWave maintains the join result incrementally as new orders arrive and as the merchants table updates.
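Because silver_orders is an ordinary relation, you can sanity-check the enrichment with ad hoc SQL. A sketch, using a hypothetical merchant id:

```sql
-- Inspect the most recent enriched rows for one merchant (the id is illustrative)
SELECT order_id, store_name, order_tier, discount_pct
FROM silver_orders
WHERE merchant_id = 'm_1001'
ORDER BY created_at DESC
LIMIT 10;
```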
Step 3: Gold Layer — Aggregations
The Gold layer pre-computes the aggregates that downstream consumers actually need. Each view is a separate serving artifact for a specific use case:
```sql
-- GMV and order metrics by merchant, channel, and hour
CREATE MATERIALIZED VIEW gold_merchant_hourly AS
SELECT
    merchant_id,
    store_name,
    region,
    channel,
    window_start AS hour_start,
    COUNT(*) AS orders,
    SUM(total) AS gmv,
    ROUND(AVG(total), 2) AS aov,
    COUNT(DISTINCT customer_id) AS unique_buyers,
    SUM(discount) AS discount_amount,
    COUNT(CASE WHEN status = 'refunded' THEN 1 END) AS refunds
FROM TUMBLE(silver_orders, created_at, INTERVAL '1 HOUR')
GROUP BY merchant_id, store_name, region, channel, window_start;

-- Payment method breakdown (near-real-time)
CREATE MATERIALIZED VIEW gold_payment_breakdown AS
SELECT
    merchant_id,
    payment_method,
    window_start AS hour_start,
    COUNT(*) AS transaction_count,
    SUM(total) AS payment_volume,
    ROUND(AVG(total), 2) AS avg_transaction_value
FROM TUMBLE(silver_orders, created_at, INTERVAL '1 HOUR')
WHERE status NOT IN ('cancelled', 'failed')
GROUP BY merchant_id, payment_method, window_start;

-- Platform-level daily summary
CREATE MATERIALIZED VIEW gold_platform_daily AS
SELECT
    DATE_TRUNC('day', hour_start) AS day,
    SUM(orders) AS total_orders,
    SUM(gmv) AS total_gmv,
    COUNT(DISTINCT merchant_id) AS active_merchants,
    -- Weight AOV by order volume rather than averaging hourly averages
    ROUND(SUM(gmv) / NULLIF(SUM(orders), 0), 2) AS platform_aov
FROM gold_merchant_hourly
GROUP BY DATE_TRUNC('day', hour_start);
```
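A dashboard backend can read these views directly with plain SQL. A hypothetical query for a "top merchants" widget over the trailing 24 hours:

```sql
-- Top merchants by GMV over the last 24 hours (illustrative dashboard query)
SELECT merchant_id, store_name, SUM(gmv) AS gmv_24h
FROM gold_merchant_hourly
WHERE hour_start >= NOW() - INTERVAL '24 hours'
GROUP BY merchant_id, store_name
ORDER BY gmv_24h DESC
LIMIT 20;
```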
Step 4: Serving Layer — Downstream Sinks
The Gold layer feeds into application databases, the data warehouse, and alerting systems via sinks:
```sql
-- Merchant dashboard application DB
CREATE SINK sink_merchant_dashboard
FROM gold_merchant_hourly
WITH (
    connector = 'jdbc',
    jdbc.url = 'jdbc:postgresql://app-db:5432/dashboard',
    table.name = 'merchant_hourly_metrics',
    type = 'upsert',
    primary_key = 'merchant_id,channel,hour_start'
);

-- Data warehouse for historical analysis
CREATE SINK sink_to_warehouse
FROM gold_platform_daily
WITH (
    connector = 'jdbc',
    jdbc.url = 'jdbc:postgresql://warehouse:5432/reporting',
    table.name = 'platform_daily_summary',
    type = 'upsert',
    primary_key = 'day'
);
```
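For the alerting path, a Kafka sink follows the same pattern. A sketch, assuming a hypothetical ecommerce.alerts.payment topic; changes to the aggregate are emitted as upserts keyed by the grouping columns:

```sql
-- Hypothetical alerting feed: publish Gold-layer changes back to Kafka
CREATE SINK sink_alert_feed
FROM gold_payment_breakdown
WITH (
    connector = 'kafka',
    topic = 'ecommerce.alerts.payment',
    properties.bootstrap.server = 'kafka:9092',
    primary_key = 'merchant_id,payment_method,hour_start'
) FORMAT UPSERT ENCODE JSON;
```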
Architecture Comparison
| Aspect | Batch Medallion (dbt) | Streaming Medallion (RisingWave) |
| --- | --- | --- |
| Bronze | S3 / raw Kafka | Kafka source (same) |
| Silver | dbt model, runs every 30 min | Streaming materialized view, always current |
| Gold | dbt model + BI tool aggregation | Streaming materialized view, pre-aggregated |
| Serving | Query warehouse at query time | Read pre-computed results from app DB |
| Latency | Minutes to hours | Sub-second to seconds |
| Consistency | Stale between runs | Continuously consistent |
| Backfill | Re-run dbt job | Replay Kafka offset |
| Language | SQL (dbt) | SQL (PostgreSQL-compatible) |
FAQ
Q: Can I use RisingWave alongside an existing dbt + data warehouse setup?
Yes — and this is the recommended migration path. Keep dbt + warehouse for historical analysis and ad hoc queries. Add RisingWave's streaming Medallion layers for real-time serving. The Gold-layer sinks feed both your application database (for dashboards) and your warehouse (for history).
Q: How do I handle schema changes in Silver layer views?
Materialized view definitions cannot be modified in place. For additive upstream changes, you can evolve the source schema with ALTER SOURCE ... ADD COLUMN and then recreate the views that should expose the new column. For breaking changes, you drop and recreate the view, and RisingWave reprocesses from the configured Kafka offset. Because Silver and Gold views are defined in SQL, the change is a small SQL edit rather than a pipeline deployment.
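Additive changes typically start at the Bronze source. A sketch, assuming a new currency field begins appearing in the Kafka payload (the column name is hypothetical):

```sql
-- Expose a newly introduced payload field on the Bronze source
ALTER SOURCE orders ADD COLUMN currency VARCHAR;
-- Then recreate any Silver/Gold views that should carry the new column
```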
Q: What is the operational overhead of running three layers of materialized views?
Lower than running two separate pipelines. RisingWave handles checkpointing, fault recovery, and backpressure internally. The Silver and Gold views share the same Kafka consumption — data is not read multiple times from the broker.
Key Takeaways
- The Medallion architecture works best when all three layers are streaming — batch Silver and Gold layers introduce the same lag that dashboards are trying to eliminate.
- RisingWave implements each Medallion layer as a materialized view: Bronze (Kafka source), Silver (streaming join + enrichment), Gold (windowed aggregation).
- SHOPLINE's 76.7% API latency reduction came from making their Gold layer streaming — dashboards read pre-computed aggregates instead of querying the warehouse at request time.
- All three layers are defined in standard PostgreSQL-compatible SQL — no Flink, no Spark, no separate streaming language.
- Sinks at the Gold layer feed multiple downstream consumers (application DB, warehouse, alerting) from a single streaming source.

