Multi-Tenant E-Commerce Analytics with RisingWave

Multi-Tenant E-Commerce Analytics with RisingWave

Multi-tenant e-commerce analytics means serving real-time GMV, order volume, and performance metrics to hundreds of merchants simultaneously — with each merchant seeing only their own data — from a single streaming SQL platform without a separate pipeline per tenant.

Why This Matters for E-Commerce

E-commerce platforms like marketplaces, SaaS storefronts, and white-label commerce solutions serve many merchants from a single infrastructure. Each merchant wants their own analytics dashboard: current GMV, today's orders, top products, payment breakdowns. And critically, no merchant should ever see another merchant's data.

The naive approach — one pipeline per merchant — doesn't scale. With 100 merchants, you have 100 pipelines to maintain, 100 sets of batch jobs to schedule, and 100 data stores to manage. Adding a new metric means updating 100 pipelines.

The right approach is a single streaming layer with tenant isolation built in: all merchant events flow through one pipeline, and tenant isolation is enforced at the query and serving layer. This is the architecture SHOPLINE implemented with RisingWave: a single Kappa-architecture pipeline — Bronze (raw Kafka + object storage) → Silver (clean/wide tables) → Gold (serving aggregates) — that powers real-time dashboards for merchants across their platform. The result was a 76.7% reduction in API response times and a dramatically simpler operational model compared to their previous Lambda architecture.

How Streaming SQL Solves This

RisingWave is a PostgreSQL-compatible streaming database. Its materialized views can aggregate by merchant_id, providing natural tenant isolation within a single view. When a merchant's dashboard queries their metrics, a standard PostgreSQL WHERE merchant_id = ? filter applies at query time.

For stronger isolation — in regulated environments or where merchants have contractual data separation requirements — you can create per-merchant materialized views or partition sinks by merchant ID. But for most e-commerce platforms, aggregation scoping by merchant_id within a shared view is both simpler and more efficient.

Step-by-Step Tutorial

Step 1: Data Source

Ingest all order and event data from Kafka. The merchant_id field is the tenant identifier.

CREATE SOURCE platform_orders (
    order_id      VARCHAR,
    merchant_id   VARCHAR,
    customer_id   VARCHAR,
    order_value   NUMERIC,
    channel       VARCHAR,
    payment_method VARCHAR,
    product_id    VARCHAR,
    category      VARCHAR,
    order_ts      TIMESTAMPTZ,
    status        VARCHAR
)
WITH (
    connector = 'kafka',
    topic = 'platform_orders',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

Step 2: Core Materialized View — Merchant GMV and Order Metrics

The primary merchant dashboard view: real-time GMV, order count, AOV, and channel breakdown, all scoped by merchant_id.

CREATE MATERIALIZED VIEW merchant_dashboard_metrics AS
SELECT
    merchant_id,
    COUNT(order_id)                                                AS total_orders,
    SUM(order_value)                                               AS total_gmv,
    AVG(order_value)                                               AS avg_order_value,
    COUNT(DISTINCT customer_id)                                    AS unique_customers,
    -- Today's metrics
    COUNT(order_id) FILTER (WHERE order_ts >= CURRENT_DATE)        AS orders_today,
    SUM(order_value) FILTER (WHERE order_ts >= CURRENT_DATE)       AS gmv_today,
    -- This week
    COUNT(order_id) FILTER (WHERE order_ts >= DATE_TRUNC('week', NOW())) AS orders_this_week,
    SUM(order_value) FILTER (WHERE order_ts >= DATE_TRUNC('week', NOW())) AS gmv_this_week,
    MAX(order_ts)                                                  AS last_order_ts
FROM platform_orders
WHERE status = 'completed'
GROUP BY merchant_id;

Step 3: Channel and Payment Breakdowns with Aggregation Scoping

Provide merchants with channel attribution and payment method analytics, maintaining tenant isolation through merchant_id grouping.

-- Channel breakdown per merchant (web, mobile, social, marketplace)
CREATE MATERIALIZED VIEW merchant_channel_breakdown AS
SELECT
    merchant_id,
    channel,
    COUNT(order_id)   AS orders,
    SUM(order_value)  AS gmv,
    AVG(order_value)  AS aov,
    COUNT(DISTINCT customer_id) AS unique_customers
FROM platform_orders
WHERE status = 'completed'
GROUP BY merchant_id, channel;

-- Payment method analytics per merchant
CREATE MATERIALIZED VIEW merchant_payment_analytics AS
SELECT
    merchant_id,
    payment_method,
    COUNT(order_id)       AS transaction_count,
    SUM(order_value)      AS payment_volume,
    AVG(order_value)      AS avg_transaction_value
FROM platform_orders
WHERE status = 'completed'
GROUP BY merchant_id, payment_method;

-- Top products per merchant
CREATE MATERIALIZED VIEW merchant_top_products AS
SELECT
    merchant_id,
    product_id,
    category,
    COUNT(order_id)   AS units_sold,
    SUM(order_value)  AS revenue
FROM platform_orders
WHERE status = 'completed'
GROUP BY merchant_id, product_id, category;

Step 4: Serving Layer — Sink to Merchant API Database

Push all merchant metrics to a PostgreSQL database. The application API applies a WHERE merchant_id = ? filter when serving each merchant's dashboard, enforcing tenant isolation at the serving layer.

CREATE SINK merchant_dashboard_sink
FROM merchant_dashboard_metrics
WITH (
    connector   = 'jdbc',
    jdbc.url    = 'jdbc:postgresql://apidb:5432/merchant_analytics?user=rw&password=secret',
    table.name  = 'merchant_dashboard_live',
    type        = 'upsert',
    primary_key = 'merchant_id'
);

CREATE SINK merchant_channel_sink
FROM merchant_channel_breakdown
WITH (
    connector   = 'jdbc',
    jdbc.url    = 'jdbc:postgresql://apidb:5432/merchant_analytics?user=rw&password=secret',
    table.name  = 'merchant_channel_live',
    type        = 'upsert',
    primary_key = 'merchant_id,channel'
);

Comparison Table

ApproachScalabilityIsolationComplexityFreshness
One pipeline per merchantPoor (100 pipelines = 100x ops)StrongVery highVariable
Shared batch DW with tenant columnGoodRow-level filterMediumHours
Streaming SQL with merchant_id groupingExcellentRow-level filterLowSeconds
Separate RisingWave schema per merchantGoodStrongMediumSeconds

FAQ

Q: How do I enforce that a merchant can only query their own data? Tenant isolation at the serving layer is handled by your API. The API authenticates the merchant (via JWT or session token), extracts merchant_id, and appends WHERE merchant_id = ? to every query against the analytics database. RisingWave's PostgreSQL-compatible interface makes this straightforward — standard row-level security or application-layer filtering both work.

Q: What if a merchant has extremely high order volume and needs dedicated resources? For high-volume tenants, create dedicated materialized views scoped to that merchant_id. These views will receive more frequent incremental updates but otherwise share the same RisingWave cluster. If full resource isolation is required, a dedicated RisingWave instance with that merchant's Kafka partition as source is the next step.

Q: How did SHOPLINE handle multi-tenant analytics at scale? SHOPLINE adopted a Kappa architecture with RisingWave handling a single SQL transformation layer for all merchants. The Bronze → Silver → Gold layering provides clean separation between raw ingestion, data cleaning, and serving aggregates. Merchant-scoped aggregates in the Gold layer feed the dashboard API, and the 76.7% reduction in response time reflects how much more efficient the streaming approach is compared to their previous Lambda architecture with batch and streaming paths maintained separately.

Key Takeaways

  • Multi-tenant analytics in a streaming database scales by scoping aggregations with merchant_id — one pipeline serves all merchants
  • Row-level tenant isolation is enforced at the API/serving layer, not the streaming layer — keeping the streaming logic simple
  • RisingWave's Bronze → Silver → Gold layering cleanly separates raw ingestion, data transformation, and per-merchant serving aggregates
  • Real-time GMV, channel breakdowns, and payment analytics are all expressible as standard SQL materialized views grouped by merchant
  • SHOPLINE's production architecture demonstrates this pattern at scale: a single streaming SQL layer powering merchant dashboards with 76.7% lower API latency than the previous batch approach

Best-in-Class Event Streaming
for Agents, Apps, and Analytics
GitHubXLinkedInSlackYouTube
Sign up for our to stay updated.