How to Track Real-Time Return Rates in E-Commerce

How to Track Real-Time Return Rates in E-Commerce

Real-time return rate tracking means knowing, within seconds of a return being initiated, which SKUs are being sent back, what reasons customers are citing, and how much refund liability you're accumulating — without waiting for end-of-day batch reports.

Why This Matters for E-Commerce

Returns are one of the most expensive operational costs in e-commerce, typically running 15-40% of revenue in apparel and electronics. The damage compounds when problems aren't detected quickly.

Consider a product with a manufacturing defect. In a batch analytics environment, the elevated return rate might not surface until the next morning's report — after hundreds more units have shipped. With real-time return rate monitoring, the spike appears within minutes. Your operations team can halt shipments, alert the supplier, and begin triage before the problem scales.

Beyond defect detection, real-time return analytics enable:

  • Fraud detection: unusual return patterns by customer or SKU flagged immediately
  • Inventory management: restocking status updated as returned items are inspected
  • Refund liability: live view of pending refund amounts for finance teams
  • Carrier performance: return shipping time and delivery confirmation in real time

The return flow generates multiple events: return request (RMA creation), item received at warehouse, inspection complete, restocking decision, and refund issued. Each stage is a signal. Streaming SQL lets you track every stage continuously.

How Streaming SQL Solves This

RisingWave, a PostgreSQL-compatible streaming database, ingests return events from Kafka as they occur and maintains materialized views for return rates by SKU, category, return reason, and time window. Unlike batch jobs that reprocess everything periodically, RisingWave updates only the affected aggregates when new events arrive.

This means your return rate dashboard always reflects the current state. When a return is initiated, the SKU's return rate updates immediately. When the item is restocked, the restocking metric updates. The entire return lifecycle is tracked in real time.

Step-by-Step Tutorial

Step 1: Data Source

Create a source for return events from Kafka. Include the RMA ID, original order ID, SKU, return reason, and each stage of the return lifecycle.

CREATE SOURCE return_events (
    rma_id          VARCHAR,
    order_id        VARCHAR,
    customer_id     VARCHAR,
    sku             VARCHAR,
    category        VARCHAR,
    return_reason   VARCHAR,   -- 'defective', 'wrong_item', 'not_as_described', 'changed_mind', 'sizing'
    event_type      VARCHAR,   -- 'initiated', 'received', 'inspected', 'restocked', 'refunded', 'rejected'
    refund_amount   NUMERIC,
    restocking_status VARCHAR,
    event_ts        TIMESTAMPTZ
)
WITH (
    connector = 'kafka',
    topic = 'return_events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

-- Also source the original orders to compute return rates
CREATE SOURCE orders_source (
    order_id    VARCHAR,
    sku         VARCHAR,
    category    VARCHAR,
    quantity    INT,
    order_ts    TIMESTAMPTZ,
    status      VARCHAR
)
WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

Step 2: Core Materialized View — Return Rates by SKU

Compute return counts, refund totals, and return rates per SKU. Join against order counts to calculate the return rate percentage.

CREATE MATERIALIZED VIEW sku_return_metrics AS
SELECT
    r.sku,
    r.category,
    COUNT(DISTINCT r.rma_id)                                        AS return_count,
    COUNT(DISTINCT r.rma_id) FILTER (WHERE r.event_type = 'refunded') AS refunded_count,
    COUNT(DISTINCT r.rma_id) FILTER (WHERE r.event_type = 'restocked') AS restocked_count,
    SUM(r.refund_amount) FILTER (WHERE r.event_type = 'refunded')   AS total_refund_amount,
    o.order_count,
    COUNT(DISTINCT r.rma_id)::NUMERIC
        / NULLIF(o.order_count, 0) * 100                            AS return_rate_pct
FROM return_events r
LEFT JOIN (
    SELECT sku, COUNT(DISTINCT order_id) AS order_count
    FROM orders_source
    WHERE status = 'completed'
    GROUP BY sku
) o USING (sku)
WHERE r.event_type IN ('initiated', 'received', 'inspected', 'restocked', 'refunded')
GROUP BY r.sku, r.category, o.order_count;

Step 3: Return Reason Breakdown and Anomaly Detection

Track return reasons in real time and flag SKUs with anomalously high return rates.

-- Return reasons by SKU and category
CREATE MATERIALIZED VIEW return_reason_breakdown AS
SELECT
    sku,
    category,
    return_reason,
    COUNT(*) AS reason_count,
    SUM(refund_amount) AS reason_refund_total
FROM return_events
WHERE event_type = 'initiated'
GROUP BY sku, category, return_reason;

-- Hourly return rate window for spike detection
CREATE MATERIALIZED VIEW return_rate_hourly AS
SELECT
    window_start,
    window_end,
    sku,
    category,
    COUNT(DISTINCT rma_id) AS returns_in_window,
    SUM(refund_amount)     AS refunds_in_window
FROM TUMBLE(
    return_events,
    event_ts,
    INTERVAL '1 HOUR'
)
WHERE event_type = 'initiated'
GROUP BY window_start, window_end, sku, category;

Step 4: Serving Layer — Sink Return Metrics to App DB

Push return rate data to a PostgreSQL operations database for dashboards, alerting, and supplier communication tools.

CREATE SINK sku_return_metrics_sink
FROM sku_return_metrics
WITH (
    connector   = 'jdbc',
    jdbc.url    = 'jdbc:postgresql://opsdb:5432/ecommerce?user=rw&password=secret',
    table.name  = 'sku_return_metrics_live',
    type        = 'upsert',
    primary_key = 'sku'
);

CREATE SINK return_reason_sink
FROM return_reason_breakdown
WITH (
    connector   = 'jdbc',
    jdbc.url    = 'jdbc:postgresql://opsdb:5432/ecommerce?user=rw&password=secret',
    table.name  = 'return_reason_live',
    type        = 'upsert',
    primary_key = 'sku,return_reason'
);

Comparison Table

MetricBatch ApproachStreaming SQL Approach
Return rate updateNightlyWithin seconds of return event
Defect spike detectionNext business dayWithin minutes of first returns
Refund liability viewEnd of dayContinuously updated
Restocking statusManual checkAuto-updated from warehouse events
Fraud pattern detectionRetroactiveReal-time alerting

FAQ

Q: How do I track RMA lifecycle stages (initiated → received → restocked → refunded)? Model each stage as a separate event type in your Kafka topic. The materialized view filters by event_type to count items at each stage. This gives you a live funnel view of your entire returns operation — how many RMAs are initiated, received, awaiting inspection, restocked, or refunded at any moment.

Q: Can I alert when a SKU's return rate exceeds a threshold? Yes. Create a materialized view that filters sku_return_metrics where return_rate_pct > threshold, then sink that view to a Kafka topic or webhook. Connect the sink to PagerDuty, Slack, or your alerting system for immediate notification.

Q: How does this integrate with warehouse management systems (WMS)? The restocking_status field in return events should be published by your WMS when items complete inspection. If your WMS uses a relational database, use RisingWave's CDC connector to stream change events directly rather than requiring a Kafka intermediary.

Key Takeaways

  • Real-time return rate tracking enables defect detection in minutes rather than the next day's batch report
  • RMA lifecycle events (initiated, received, inspected, restocked, refunded) can each be tracked as a distinct event type in a single Kafka topic
  • Streaming SQL aggregates return rates by SKU, category, and return reason continuously without reprocessing historical data
  • Hourly windowed views enable spike detection and can trigger automated alerts to operations teams
  • SHOPLINE's architecture — streaming all commerce events through a single SQL transformation layer — applies equally well to return analytics, unifying return metrics with GMV and order volume on the same platform

Best-in-Class Event Streaming
for Agents, Apps, and Analytics
GitHubXLinkedInSlackYouTube
Sign up for our to stay updated.