Building a Dynamic ETA Prediction System

Building a Dynamic ETA Prediction System

Static ETA estimates computed at dispatch time become less accurate every minute. A dynamic ETA system continuously revises delivery estimates as vehicles move, traffic changes, and stop completions occur. With RisingWave, a PostgreSQL-compatible streaming database, you can build this using materialized views and streaming SQL — no custom prediction service required for the data layer.

Why Dynamic ETAs Matter

ETA accuracy is a competitive differentiator in modern logistics. Customers expect tight delivery windows and real-time updates. When a retailer promises "delivered by 3 PM" and the package arrives at 4:30 PM with no notification, the customer experience suffers whether or not the carrier technically met a 4-hour window.

The root problem is static ETA computation: a route is planned at 8 AM, ETAs are calculated based on average speed and stop count, and those estimates are published to customers. Three hours later, the driver is stuck in unexpected construction, has completed fewer stops than planned, and is running 45 minutes behind — but customers still see the 8 AM estimate.

Dynamic ETA means the estimate is a live computation that updates with every GPS ping, every stop completion, and every traffic event. The customer-facing ETA reflects what is actually happening, not what was planned to happen.

How Streaming ETA Computation Works

RisingWave maintains materialized views that combine four inputs:

  1. Current vehicle position (from GPS telemetry)
  2. Remaining stops (derived from completed scan events vs. planned stops)
  3. Current speed and traffic conditions (from telematics and traffic feeds)
  4. Historical dwell time at similar stop types (as a service time estimate)

When combined, these inputs produce a continuously updated ETA for each remaining stop on each route. The computation is incremental: only the affected rows update when a new GPS ping arrives, making the system efficient even at fleet scale.

Step-by-Step Tutorial

Step 1: Set Up the Data Source

Define the streams for vehicle telemetry, stop events, and planned routes:

-- Vehicle GPS telemetry
CREATE SOURCE vehicle_telemetry (
    vehicle_id   VARCHAR,
    route_id     VARCHAR,
    event_time   TIMESTAMPTZ,
    latitude     DOUBLE PRECISION,
    longitude    DOUBLE PRECISION,
    speed_kmh    DOUBLE PRECISION,
    heading_deg  DOUBLE PRECISION,
    status       VARCHAR
)
WITH (
    connector    = 'kafka',
    topic        = 'vehicle-telemetry',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;

-- Stop completion events
CREATE SOURCE stop_completion_events (
    stop_id      VARCHAR,
    route_id     VARCHAR,
    vehicle_id   VARCHAR,
    stop_seq     INTEGER,
    event_type   VARCHAR,  -- 'ARRIVED','DEPARTED','DELIVERED','ATTEMPTED'
    event_time   TIMESTAMPTZ,
    dwell_sec    INTEGER   -- seconds spent at this stop
)
WITH (
    connector    = 'kafka',
    topic        = 'stop-events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;

-- Planned routes with stop details
CREATE TABLE planned_routes (
    route_id          VARCHAR,
    stop_id           VARCHAR,
    stop_seq          INTEGER,
    customer_id       VARCHAR,
    address           VARCHAR,
    latitude          DOUBLE PRECISION,
    longitude         DOUBLE PRECISION,
    time_window_start TIMESTAMPTZ,
    time_window_end   TIMESTAMPTZ,  -- committed delivery window
    service_time_sec  INTEGER,      -- planned dwell time in seconds
    PRIMARY KEY (route_id, stop_seq)
);

-- Historical dwell times by stop category (for improved service time estimates)
CREATE TABLE avg_service_times (
    stop_category     VARCHAR PRIMARY KEY,  -- 'RESIDENTIAL','COMMERCIAL','APARTMENT'
    avg_dwell_sec     INTEGER,
    p90_dwell_sec     INTEGER
);

Step 2: Build the Core Materialized View

Create the live vehicle position view and a view of remaining stops per route:

-- Latest vehicle position per route
CREATE MATERIALIZED VIEW live_vehicle_positions AS
SELECT DISTINCT ON (vehicle_id)
    vehicle_id,
    route_id,
    latitude,
    longitude,
    speed_kmh,
    heading_deg,
    status,
    event_time AS last_ping
FROM vehicle_telemetry
ORDER BY vehicle_id, event_time DESC;

-- Next stop status: which stop is each route working on?
CREATE MATERIALIZED VIEW route_completion_status AS
SELECT
    r.route_id,
    r.stop_id,
    r.stop_seq,
    r.customer_id,
    r.latitude          AS stop_lat,
    r.longitude         AS stop_lon,
    r.time_window_end   AS committed_by,
    r.service_time_sec,
    -- Has this stop been completed?
    MAX(CASE WHEN e.event_type IN ('DELIVERED','ATTEMPTED') THEN 1 ELSE 0 END) AS completed
FROM planned_routes r
LEFT JOIN (
    SELECT DISTINCT ON (stop_id)
        stop_id, event_type, event_time
    FROM stop_completion_events
    ORDER BY stop_id, event_time DESC
) e ON r.stop_id = e.stop_id
GROUP BY r.route_id, r.stop_id, r.stop_seq, r.customer_id,
         r.latitude, r.longitude, r.time_window_end, r.service_time_sec;

Build the core ETA computation view — the continuous prediction for each remaining stop:

CREATE MATERIALIZED VIEW dynamic_eta_predictions AS
SELECT
    s.route_id,
    s.stop_id,
    s.stop_seq,
    s.customer_id,
    s.committed_by,
    v.latitude          AS vehicle_lat,
    v.longitude         AS vehicle_lon,
    v.speed_kmh,
    v.last_ping,
    s.stop_lat,
    s.stop_lon,
    -- Distance to stop (Haversine approximation, km)
    111.0 * SQRT(
        POWER(v.latitude  - s.stop_lat,  2) +
        POWER(v.longitude - s.stop_lon,  2) * POWER(COS(RADIANS(v.latitude)), 2)
    ) AS km_to_stop,
    -- Estimated travel time (minutes) at current speed
    CASE
        WHEN v.speed_kmh > 5
        THEN (111.0 * SQRT(
            POWER(v.latitude - s.stop_lat, 2) +
            POWER(v.longitude - s.stop_lon, 2) * POWER(COS(RADIANS(v.latitude)), 2)
        ) / v.speed_kmh) * 60
        ELSE NULL
    END AS est_travel_min,
    -- Predicted arrival time
    CASE
        WHEN v.speed_kmh > 5
        THEN NOW() + (111.0 * SQRT(
            POWER(v.latitude - s.stop_lat, 2) +
            POWER(v.longitude - s.stop_lon, 2) * POWER(COS(RADIANS(v.latitude)), 2)
        ) / v.speed_kmh) * INTERVAL '1 hour'
        ELSE NULL
    END AS predicted_arrival,
    -- ETA vs. commitment
    CASE
        WHEN v.speed_kmh > 5
        THEN EXTRACT(EPOCH FROM (
            NOW() + (111.0 * SQRT(
                POWER(v.latitude - s.stop_lat, 2) +
                POWER(v.longitude - s.stop_lon, 2) * POWER(COS(RADIANS(v.latitude)), 2)
            ) / v.speed_kmh) * INTERVAL '1 hour'
            - s.committed_by
        )) / 60
        ELSE NULL
    END AS eta_vs_commitment_min  -- positive = late
FROM route_completion_status s
JOIN live_vehicle_positions v ON s.route_id = v.route_id
WHERE s.completed = 0;  -- only pending stops

Step 3: Add ETA Alert Aggregations

Surface stops where the current ETA projects a delivery outside the committed window:

-- Stops at risk of missing their delivery window
CREATE MATERIALIZED VIEW eta_breach_alerts AS
SELECT
    route_id,
    stop_id,
    customer_id,
    committed_by,
    predicted_arrival,
    eta_vs_commitment_min,
    speed_kmh,
    km_to_stop,
    last_ping
FROM dynamic_eta_predictions
WHERE eta_vs_commitment_min > 0
ORDER BY eta_vs_commitment_min DESC;

-- ETA accuracy audit: compare predicted vs. actual for completed stops
-- (used for model calibration and reporting)
CREATE MATERIALIZED VIEW eta_accuracy_audit AS
SELECT
    window_start,
    window_end,
    COUNT(*)                                    AS completed_stops,
    AVG(ABS(dwell_sec - service_time_sec)) / 60 AS avg_service_time_error_min,
    PERCENTILE_CONT(0.9) WITHIN GROUP (
        ORDER BY ABS(dwell_sec - service_time_sec)
    ) / 60                                      AS p90_service_time_error_min
FROM TUMBLE(stop_completion_events, event_time, INTERVAL '1 hour')
JOIN planned_routes pr USING (stop_id)
WHERE event_type = 'DEPARTED'
GROUP BY window_start, window_end;

How This Compares to Traditional Approaches

AspectStatic Dispatch ETAPeriodic ETA Refresh (30 min)RisingWave Dynamic ETA
Update frequencyOnce (at dispatch)Every 30 minutesEvery GPS ping
Traffic incorporationNonePartiallyContinuous
Stop completion effect on ETANoneAt next refreshImmediate
At-risk notificationNot possibleSlow (30-min lag)Sub-second alert
Customer communication timingSingle notificationInfrequent updatesContinuous live tracking
Computational costLow (one-time)Medium (periodic)Low (incremental)

FAQ

How do I publish the predicted ETAs to customer-facing systems?

Define a Kafka sink that streams rows from dynamic_eta_predictions to a Kafka topic. A lightweight consumer service reads from that topic and pushes updates to your customer notification system (push notifications, SMS, tracking page). The frequency of customer updates can be governed by the consumer (e.g., only notify if the ETA changes by more than 5 minutes).

How do I improve accuracy for stops with many concurrent vehicles nearby?

The distance-based ETA model in this tutorial is a simplified approach. For higher accuracy, incorporate actual road network distance (from a routing API) rather than straight-line distance, and adjust for traffic by factoring in the active_traffic_disruptions view from your traffic feed. The RisingWave materialized view pattern supports temporal joins against these external data inputs.

What happens to ETA predictions when a vehicle stops for a rest break?

When speed_kmh <= 5, the predicted_arrival and est_travel_min columns return NULL, preventing misleading ETA propagation during stops. The eta_breach_alerts view only surfaces stops with valid speed data, so a vehicle at rest does not generate false alerts. You can add business logic to recompute ETAs using the last known speed when a vehicle has been stationary for more than a configurable threshold.

Key Takeaways

  • RisingWave maintains dynamic_eta_predictions as a continuously updated materialized view — every GPS ping triggers an incremental recomputation for the affected vehicle's pending stops, keeping ETAs current without any scheduled refresh.
  • The eta_breach_alerts view provides an always-current list of stops projected to miss their committed window, giving dispatch teams a prioritized intervention queue that updates in real time.
  • The ETA computation pattern scales naturally: adding more vehicles means more GPS events, which RisingWave processes incrementally without re-scanning the entire fleet.

Ready to try this yourself? Get started with RisingWave in minutes. Join our Slack community.

Best-in-Class Event Streaming
for Agents, Apps, and Analytics
GitHubXLinkedInSlackYouTube
Sign up for our to stay updated.