Real-Time Cart Abandonment Detection with Streaming SQL

Real-Time Cart Abandonment Detection with Streaming SQL

Cart abandonment recovery works best when you reach the customer within minutes of them leaving — not hours later through a batch email campaign. With RisingWave, a PostgreSQL-compatible streaming database, you can detect cart abandonment in real time using session windows, trigger downstream recovery workflows instantly, and track funnel conversion rates continuously.

The Problem: Batch Abandonment Recovery Is Too Slow

The average e-commerce cart abandonment rate exceeds 70%. The standard recovery approach — a batch job that identifies abandoned carts every hour and sends a recovery email — suffers from a fundamental timing problem: by the time the email arrives, the customer is deep in competitor checkout.

Research consistently shows that abandonment recovery conversion rates decay rapidly with time. A recovery message sent within 15 minutes of abandonment performs dramatically better than one sent an hour later. Batch pipelines structurally cannot operate in this window.

Beyond timing, batch abandonment detection has other weaknesses:

Funnel blind spots: Batch jobs typically check for "cart created but no order" after a fixed time threshold. They miss nuances like customers who reach checkout, start entering payment information, and then leave — the highest-intent abandonment stage.

No real-time segmentation: A customer abandoning a $400 high-margin electronics item deserves a different recovery strategy than someone abandoning a $12 purchase. Real-time detection enables real-time segmentation and routing.

Session fragmentation: Modern shoppers switch between devices. Batch attribution can't easily link a mobile cart session to a desktop purchase that completes later, leading to both false abandonment flags and missed recoveries.

How Streaming SQL Solves This

RisingWave's session windows group events by inactivity gap rather than fixed time boundaries. A cart session closes when there has been no activity for (say) 30 minutes. RisingWave detects this automatically — you define the inactivity timeout in SQL, and the system handles the per-session timer logic.

When a session closes without a checkout_complete event, it is flagged as abandoned. The detection happens within minutes of the session closing — enabling recovery workflows while the customer is still at their desk.

Building Real-Time Cart Abandonment Detection

Step 1: Data Source Setup

Cart abandonment detection requires session events — page views, add-to-cart actions, checkout steps, and order completion — alongside the order stream for join-based confirmation:

CREATE SOURCE orders (
    order_id        VARCHAR,
    merchant_id     VARCHAR,
    customer_id     VARCHAR,
    channel         VARCHAR,        -- 'web', 'mobile', 'pos', 'marketplace'
    payment_method  VARCHAR,        -- 'card', 'alipay', 'paypal', 'cod'
    subtotal        NUMERIC,
    discount        NUMERIC,
    total           NUMERIC,
    status          VARCHAR,
    created_at      TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ecommerce.orders',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

CREATE SOURCE session_events (
    event_id        VARCHAR,
    session_id      VARCHAR,
    customer_id     VARCHAR,
    merchant_id     VARCHAR,
    channel         VARCHAR,
    event_type      VARCHAR,  -- 'page_view', 'add_to_cart', 'remove_from_cart',
                              --  'checkout_start', 'payment_entry', 'checkout_complete'
    sku_id          VARCHAR,
    cart_value      NUMERIC,  -- current cart value at event time
    occurred_at     TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ecommerce.sessions',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

Step 2: Core Materialized View

Use session windows to group all events within a single continuous browsing session. A session closes after 30 minutes of inactivity:

-- Session summary: furthest funnel stage reached and cart value
CREATE MATERIALIZED VIEW mv_session_summary AS
SELECT
    session_id,
    customer_id,
    merchant_id,
    channel,
    WINDOW_START                                AS session_start,
    WINDOW_END                                  AS session_end,
    MAX(cart_value)                             AS peak_cart_value,
    MAX(CASE event_type
        WHEN 'checkout_complete' THEN 5
        WHEN 'payment_entry'     THEN 4
        WHEN 'checkout_start'    THEN 3
        WHEN 'add_to_cart'       THEN 2
        WHEN 'page_view'         THEN 1
        ELSE 0
    END)                                        AS max_funnel_stage,
    BOOL_OR(event_type = 'checkout_complete')   AS converted,
    COUNT(*)                                    AS event_count
FROM SESSION(session_events, occurred_at, INTERVAL '30 MINUTES')
GROUP BY session_id, customer_id, merchant_id, channel,
         WINDOW_START, WINDOW_END;

Step 3: Aggregations — Abandonment Classification

Classify sessions by abandonment stage. High-intent abandonment (checkout started, payment entry started) warrants immediate recovery; low-intent abandonment (just browsed) may not:

-- Abandoned sessions with funnel stage classification
CREATE MATERIALIZED VIEW mv_cart_abandonments AS
SELECT
    session_id,
    customer_id,
    merchant_id,
    channel,
    session_start,
    session_end,
    peak_cart_value,
    max_funnel_stage,
    CASE max_funnel_stage
        WHEN 4 THEN 'payment_abandoned'   -- highest intent
        WHEN 3 THEN 'checkout_abandoned'
        WHEN 2 THEN 'cart_abandoned'
        WHEN 1 THEN 'browse_abandoned'    -- lowest intent
        ELSE 'unknown'
    END                                         AS abandonment_type,
    -- Recovery priority score
    CASE
        WHEN max_funnel_stage = 4 AND peak_cart_value >= 100 THEN 'high'
        WHEN max_funnel_stage >= 3                            THEN 'medium'
        ELSE 'low'
    END                                         AS recovery_priority
FROM mv_session_summary
WHERE NOT converted
  AND max_funnel_stage >= 2;  -- at least added to cart

-- Hourly abandonment rate by merchant and channel
CREATE MATERIALIZED VIEW mv_abandonment_rate_hourly AS
SELECT
    merchant_id,
    channel,
    WINDOW_START                                AS hour_start,
    COUNT(*)                                    AS total_sessions,
    SUM(CASE WHEN NOT converted
             AND max_funnel_stage >= 2
             THEN 1 ELSE 0 END)               AS abandonments,
    ROUND(
        SUM(CASE WHEN NOT converted
                 AND max_funnel_stage >= 2
                 THEN 1 ELSE 0 END)::NUMERIC /
        NULLIF(COUNT(*), 0) * 100,
        2
    )                                           AS abandonment_rate_pct,
    ROUND(
        SUM(CASE WHEN converted THEN 1 ELSE 0 END)::NUMERIC /
        NULLIF(SUM(CASE WHEN max_funnel_stage >= 2
                        THEN 1 ELSE 0 END), 0) * 100,
        2
    )                                           AS cart_to_order_rate_pct
FROM TUMBLE(mv_session_summary, session_start, INTERVAL '1 HOUR')
GROUP BY merchant_id, channel, WINDOW_START;

Step 4: Downstream / Serving

Trigger recovery workflows immediately and update analytics dashboards:

-- High-priority abandonments to recovery workflow topic
CREATE SINK sink_recovery_triggers
FROM mv_cart_abandonments
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'kafka:9092',
    topic = 'recovery.cart_abandonments'
) FORMAT PLAIN ENCODE JSON;

-- Hourly abandonment rates to application DB
CREATE SINK sink_abandonment_analytics
FROM mv_abandonment_rate_hourly
WITH (
    connector = 'jdbc',
    jdbc.url = 'jdbc:postgresql://app-db:5432/analytics',
    table.name = 'abandonment_rate_hourly',
    type = 'upsert',
    primary_key = 'merchant_id,channel,hour_start'
);

The recovery workflow (email, push notification, retargeting) consumes the recovery.cart_abandonments Kafka topic and acts on each abandonment event within minutes of the session closing.

Cart Abandonment: Batch vs. Real-Time

DimensionHourly Batch RecoveryReal-Time Streaming SQL
Detection lag30-60 minutesWithin minutes of session close
Funnel stage awarenessCart exists vs. no orderpayment_abandoned / checkout_abandoned / cart_abandoned
Recovery timingEmail sent hours laterRecovery triggered minutes after abandonment
SegmentationBasic (cart value threshold)Full (funnel stage, cart value, channel, priority)
Cart-to-order rate trackingHistorical reportContinuously updated per merchant
A/B test visibilityNext dayReal-time conversion rate by variant

FAQ

Q: How does the SESSION window handle customers who close and reopen their browser within 30 minutes? If events resume within the 30-minute inactivity gap, RisingWave extends the existing session window rather than starting a new one. The session only closes (and abandonment is detected) after 30 consecutive minutes with no new events for that session ID.

Q: Can I link a mobile abandonment to a desktop completion? The SESSION window groups by session_id. If your session management assigns a common customer-level session ID across devices (e.g., tied to a logged-in customer ID), cross-device linking works natively. If sessions are device-local, you need a separate resolution step that links sessions to customer IDs before the abandonment classification.

Q: How do I avoid sending recovery emails to customers who complete purchase from a different session? Join the mv_cart_abandonments view to the orders stream. If an order event arrives for the same customer_id within (say) 2 hours of the abandonment, suppress the recovery trigger. Model this as a temporal join or an anti-join in a downstream view.

Key Takeaways

  • Session windows in RisingWave group events by inactivity gap — a session closes automatically after 30 minutes without new events, enabling abandonment detection within minutes.
  • Funnel stage classification (browse → cart → checkout → payment) enables tiered recovery strategies with different urgency levels.
  • The cart-to-order rate, continuously computed via the hourly abandonment view, is a leading indicator of checkout experience quality.
  • Sending abandonment events to a Kafka topic enables downstream recovery systems to act within the high-conversion first-15-minute window.
  • Real-time cart value and funnel stage data enable personalization of recovery messages at the moment they are triggered.

Best-in-Class Event Streaming
for Agents, Apps, and Analytics
GitHubXLinkedInSlackYouTube
Sign up for our to stay updated.