Real-Time Freight Tracking with Streaming SQL

Real-time freight tracking means knowing exactly where every shipment is, right now — not five minutes ago. With RisingWave, a PostgreSQL-compatible streaming database, you can ingest GPS and IoT data from trucks and containers and query up-to-the-second positions using standard SQL.

Why Freight Visibility Matters

The freight and logistics industry loses billions of dollars annually to inefficiencies caused by poor shipment visibility. When a consignment goes dark for even 30 minutes, dispatchers must make decisions based on stale data — rerouting trucks unnecessarily, missing time-sensitive delivery windows, or failing to meet customer SLAs.

Traditional tracking systems rely on batch ETL pipelines that aggregate GPS pings every few minutes. By the time data reaches a dashboard, it reflects history rather than reality. A truck that left the expected corridor 20 minutes ago shows up as still en route, and corrective action happens too late.

The solution is a streaming-first architecture where position data flows continuously from vehicle telematics into a live query engine. RisingWave processes this stream in real time, maintaining always-fresh materialized views that any dashboard or alert system can query instantly.

How Streaming Freight Tracking Works

RisingWave ingests raw telematics events — GPS coordinates, speed, heading, and vehicle state — from a message bus like Apache Kafka. As each event arrives, RisingWave incrementally updates pre-defined materialized views. Downstream systems query those views using ordinary PostgreSQL-compatible SQL and always receive the latest state without triggering expensive full-table recomputations.

The key concepts at play are:

  • Kafka source: a connector that continuously reads telematics events from a Kafka topic
  • Materialized views: persistent query results that RisingWave keeps current as new data arrives
  • Temporal joins: joining a live stream against a slowly changing reference table (e.g., shipment manifests) as of event time
  • Window functions: aggregating events over tumbling or hopping time windows to compute metrics like average speed per route segment

Step-by-Step Tutorial

Step 1: Set Up the Data Source

Connect RisingWave to the Kafka topic that carries vehicle telemetry. Each message contains a JSON payload with the vehicle ID, timestamp, GPS coordinates, and speed.

CREATE SOURCE vehicle_telemetry (
    vehicle_id     VARCHAR,
    shipment_id    VARCHAR,
    event_time     TIMESTAMPTZ,
    latitude       DOUBLE PRECISION,
    longitude      DOUBLE PRECISION,
    speed_kmh      DOUBLE PRECISION,
    heading_deg    DOUBLE PRECISION,
    status         VARCHAR
)
WITH (
    connector     = 'kafka',
    topic         = 'vehicle-telemetry',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
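For reference, a message on this topic might look like the following (all field values are illustrative, but the keys match the source schema above):

```json
{
  "vehicle_id": "TRK-0142",
  "shipment_id": "SHP-1001",
  "event_time": "2024-06-01T14:32:05Z",
  "latitude": 41.8781,
  "longitude": -87.6298,
  "speed_kmh": 88.4,
  "heading_deg": 172.0,
  "status": "IN_TRANSIT"
}
```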

Also define the reference table for shipments so you can join live events against manifest data:

CREATE TABLE shipments (
    shipment_id      VARCHAR PRIMARY KEY,
    origin_city      VARCHAR,
    destination_city VARCHAR,
    expected_arrival TIMESTAMPTZ,
    customer_id      VARCHAR,
    cargo_type       VARCHAR
);
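A minimal sketch of loading one manifest row into this table (the values are invented for illustration; in production, manifests often arrive via a CDC connector from an operational database instead of manual inserts):

```sql
INSERT INTO shipments VALUES (
    'SHP-1001',                  -- shipment_id, matches telemetry events
    'Chicago',                   -- origin_city
    'Atlanta',                   -- destination_city
    '2024-06-02 18:00:00+00',    -- expected_arrival
    'CUST-42',                   -- customer_id
    'refrigerated'               -- cargo_type
);
```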

Step 2: Build the Core Materialized View

Create a materialized view that maintains the latest known position and status for every active shipment. RisingWave incrementally updates this view with sub-second latency as new telemetry events arrive.

CREATE MATERIALIZED VIEW live_shipment_positions AS
SELECT
    t.shipment_id,
    s.origin_city,
    s.destination_city,
    s.expected_arrival,
    s.customer_id,
    t.latitude,
    t.longitude,
    t.speed_kmh,
    t.status,
    t.event_time                          AS last_seen
FROM (
    SELECT DISTINCT ON (shipment_id)
        shipment_id,
        latitude,
        longitude,
        speed_kmh,
        status,
        event_time
    FROM vehicle_telemetry
    ORDER BY shipment_id, event_time DESC
) t
JOIN shipments s ON t.shipment_id = s.shipment_id;

Dashboards query this view with a simple SELECT * FROM live_shipment_positions and receive results that are accurate to within milliseconds of the most recent telemetry ping.
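For example, a dispatcher dashboard might surface only shipments bound for a given city, computing data freshness at query time (the filter value is illustrative):

```sql
SELECT shipment_id, latitude, longitude, speed_kmh, status,
       NOW() - last_seen AS data_age    -- freshness computed per query
FROM live_shipment_positions
WHERE destination_city = 'Atlanta'
ORDER BY last_seen DESC;
```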

Step 3: Add Alerts and Aggregations

Build a second materialized view that flags shipments that have gone silent for more than 10 minutes — a common indicator of a GPS failure, vehicle breakdown, or route deviation. Note that RisingWave permits NOW() inside a materialized view only as part of a temporal filter in the WHERE clause, so the exact age of each alert is best computed at query time rather than in the view's select list.

CREATE MATERIALIZED VIEW stale_shipment_alerts AS
SELECT
    shipment_id,
    origin_city,
    destination_city,
    last_seen,
    customer_id
FROM live_shipment_positions
WHERE last_seen < NOW() - INTERVAL '10 minutes'
  AND status NOT IN ('DELIVERED', 'CANCELLED');

For route-level throughput reporting, use a tumbling window to compute average speed per shipment per 5-minute bucket:

CREATE MATERIALIZED VIEW shipment_speed_windows AS
SELECT
    shipment_id,
    window_start,
    window_end,
    AVG(speed_kmh)   AS avg_speed_kmh,
    MIN(speed_kmh)   AS min_speed_kmh,
    COUNT(*)         AS ping_count
FROM TUMBLE(vehicle_telemetry, event_time, INTERVAL '5 minutes')
GROUP BY shipment_id, window_start, window_end;

These views can feed directly into a BI tool, a notification webhook, or a downstream Kafka sink for further event processing.
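As one sketch of the Kafka-sink option, the alert view could be forwarded to a downstream topic; the sink name, topic, and broker address below are assumptions, and UPSERT format is used so that alerts are retracted when a shipment resumes reporting:

```sql
CREATE SINK stale_alerts_sink
FROM stale_shipment_alerts
WITH (
    connector   = 'kafka',
    topic       = 'shipment-alerts',
    properties.bootstrap.server = 'kafka:9092',
    primary_key = 'shipment_id'   -- required for UPSERT format
)
FORMAT UPSERT ENCODE JSON;
```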

How This Compares to Traditional Approaches

| Aspect | Batch ETL Pipeline | RisingWave Streaming SQL |
|---|---|---|
| Data freshness | Minutes to hours | Sub-second |
| Query model | Scheduled jobs, full recompute | Incremental, always-current views |
| Alert latency | Next batch run | Milliseconds after event |
| SQL compatibility | Varies by tool | Full PostgreSQL-compatible SQL |
| Operational complexity | High (orchestrators, schedulers) | Low (single streaming database) |
| Scaling GPS pings | Requires separate stream processor | Native stream ingestion |

Traditional architectures typically combine a stream processor (Flink, Spark Streaming) with a separate serving database, requiring teams to maintain two systems with different APIs. RisingWave collapses both roles into a single PostgreSQL-compatible interface.

FAQ

Can RisingWave handle millions of GPS pings per second?

RisingWave is designed for high-throughput streaming workloads and scales horizontally by adding compute nodes. The Kafka source connector supports parallel consumption across multiple partitions, allowing ingestion rates to grow with your fleet size.

How do I connect a BI tool like Grafana or Tableau to these views?

Because RisingWave is PostgreSQL-compatible, any tool with a PostgreSQL driver can connect directly. Point Grafana at the RisingWave frontend (port 4566 by default), use the PostgreSQL data source plugin, and query live_shipment_positions or any other materialized view just like a regular table.
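A Grafana time-series panel, for instance, could chart per-shipment average speed straight from the windowed view built earlier (the one-hour lookback is an illustrative choice):

```sql
SELECT window_start AS time,   -- Grafana expects a time column
       shipment_id,
       avg_speed_kmh
FROM shipment_speed_windows
WHERE window_start > NOW() - INTERVAL '1 hour'
ORDER BY window_start;
```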

What happens if a Kafka partition goes offline temporarily?

RisingWave tracks consumption offsets and resumes from the last committed position when connectivity is restored. Materialized views remain queryable during the gap and update automatically once the backlog is processed, maintaining consistency guarantees.

Key Takeaways

  • RisingWave ingests vehicle telemetry from Kafka and maintains always-fresh materialized views with sub-second latency, eliminating the stale-data problem of batch pipelines.
  • Standard PostgreSQL-compatible SQL is all you need to define sources, build tracking views, and create alerting logic — no separate stream processing framework required.
  • Combining DISTINCT ON for latest-position tracking with TUMBLE window aggregations gives you both a live map layer and historical speed analytics from the same streaming database.

Ready to try this yourself? Get started with RisingWave in minutes. Join our Slack community.
