Building a Smart Water Network Leak Detection System

Building a Smart Water Network Leak Detection System

Water distribution networks lose an estimated 20–30% of treated water to leaks every year. A streaming SQL pipeline built on RisingWave can correlate pressure readings from hundreds of IoT sensors, compute leak scores continuously, and trigger alerts within seconds — turning reactive repairs into proactive prevention.

Why Water Network Leak Detection Matters

Undetected leaks in irrigation and municipal water networks waste billions of litres annually and erode pipe infrastructure over time. Traditional SCADA systems batch sensor data at 15-minute intervals, which means a pipe rupture can go unnoticed for hours. High-resolution pressure telemetry (sampled every few seconds) contains all the signal needed to detect leaks early — the challenge is processing it fast enough to act.

Key indicators of a leak event include:

  • Sustained pressure drop below a baseline threshold at a sensor node (measured in bar or PSI)
  • Differential pressure anomaly between adjacent nodes that exceeds normal hydraulic gradient
  • Night-flow elevation — higher-than-expected flow when consumption should be minimal
  • Acoustic correlation scores from pipe vibration sensors

A streaming database processes this signal continuously, so the gap between "leak starts" and "operator is alerted" collapses from hours to seconds.

How Streaming SQL Solves This

RisingWave is a PostgreSQL-compatible streaming database. You write standard SQL — CREATE MATERIALIZED VIEW, window functions, JOIN — and RisingWave executes it as an always-running streaming job. Results are stored in materialized views that are incrementally updated as new sensor events arrive, with no batch jobs or external stream processors required.

This makes it straightforward to:

  1. Ingest pressure and flow readings from MQTT/Kafka topics
  2. Compute rolling averages and differentials with time-windowed aggregations
  3. Derive a per-segment leak score from multiple signals
  4. Emit alerts to an operations dashboard or PagerDuty sink

Step-by-Step Tutorial

Step 1: Data Source

Create a source that reads pressure telemetry from a Kafka topic. Each message carries a sensor ID, the pipe segment it monitors, a pressure reading in bar, and a flow rate in litres per minute.

CREATE SOURCE pressure_readings (
    sensor_id       VARCHAR,
    segment_id      VARCHAR,
    pressure_bar    DOUBLE PRECISION,
    flow_lpm        DOUBLE PRECISION,
    event_time      TIMESTAMPTZ
)
WITH (
    connector      = 'kafka',
    topic          = 'water.pressure.telemetry',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON
( TIMESTAMP COLUMN = event_time );

Step 2: Core View

Compute a 5-minute rolling baseline for each segment and derive a leak score. The leak score combines a pressure-drop ratio with a flow-excess ratio. A score above 1.0 is a candidate leak event.

CREATE MATERIALIZED VIEW segment_baselines AS
SELECT
    segment_id,
    window_start,
    window_end,
    AVG(pressure_bar)                          AS avg_pressure,
    STDDEV(pressure_bar)                       AS stddev_pressure,
    AVG(flow_lpm)                              AS avg_flow,
    COUNT(*)                                   AS reading_count
FROM TUMBLE(pressure_readings, event_time, INTERVAL '5 MINUTES')
GROUP BY segment_id, window_start, window_end;

CREATE MATERIALIZED VIEW leak_scores AS
SELECT
    r.sensor_id,
    r.segment_id,
    r.event_time,
    r.pressure_bar,
    r.flow_lpm,
    b.avg_pressure,
    b.avg_flow,
    -- Pressure drop ratio: how far below baseline is current reading?
    CASE
        WHEN b.avg_pressure > 0
        THEN GREATEST(0.0, (b.avg_pressure - r.pressure_bar) / b.avg_pressure)
        ELSE 0.0
    END AS pressure_drop_ratio,
    -- Flow excess ratio: how much higher than baseline is current flow?
    CASE
        WHEN b.avg_flow > 0
        THEN GREATEST(0.0, (r.flow_lpm - b.avg_flow) / b.avg_flow)
        ELSE 0.0
    END AS flow_excess_ratio,
    -- Composite leak score
    CASE
        WHEN b.avg_pressure > 0 AND b.avg_flow > 0
        THEN (
            GREATEST(0.0, (b.avg_pressure - r.pressure_bar) / b.avg_pressure) * 0.6 +
            GREATEST(0.0, (r.flow_lpm - b.avg_flow) / b.avg_flow)              * 0.4
        )
        ELSE 0.0
    END AS leak_score
FROM pressure_readings r
LEFT JOIN segment_baselines b
    ON  r.segment_id  = b.segment_id
    AND r.event_time >= b.window_start
    AND r.event_time <  b.window_end;

Step 3: Alerts and Sinks

Flag readings where the leak score crosses the threshold, then push alerts to a Kafka topic consumed by the operations platform.

CREATE MATERIALIZED VIEW leak_alerts AS
SELECT
    sensor_id,
    segment_id,
    event_time,
    pressure_bar,
    flow_lpm,
    leak_score,
    CASE
        WHEN leak_score >= 0.5 THEN 'CRITICAL'
        WHEN leak_score >= 0.25 THEN 'WARNING'
        ELSE 'INFO'
    END AS severity
FROM leak_scores
WHERE leak_score >= 0.25;

CREATE SINK leak_alerts_sink
FROM leak_alerts
WITH (
    connector  = 'kafka',
    topic      = 'water.leak.alerts',
    properties.bootstrap.server = 'kafka:9092'
)
FORMAT PLAIN ENCODE JSON;

Comparison Table

ApproachDetection LatencySetup ComplexityScalability
Manual SCADA inspectionHoursLowPoor
Batch ETL + BI dashboard15–60 minMediumMedium
Custom Flink/Spark jobSecondsHighHigh
RisingWave streaming SQLSecondsLowHigh

FAQ

Q: How many sensors can RisingWave handle? RisingWave scales horizontally. A single cluster can ingest hundreds of thousands of sensor events per second. For a typical municipal water network with a few thousand sensors reporting every 10 seconds, this is well within a modest deployment.

Q: Can I use PSI instead of bar? Yes. The SQL is unit-agnostic. Store whichever unit your sensors emit and adjust your threshold constants accordingly (1 bar ≈ 14.5 PSI).

Q: What if a sensor goes offline and stops sending data? Use a separate materialized view that checks MAX(event_time) per sensor over a tumbling window. If a sensor hasn't reported for more than N minutes, emit a "sensor offline" alert — a missing signal can itself indicate a catastrophic pressure failure.

Key Takeaways

  • RisingWave turns raw IoT pressure telemetry into actionable leak scores with plain SQL — no custom stream-processing code required.
  • A composite leak score combining pressure drop and flow excess gives better signal-to-noise than either metric alone.
  • Alerts reach operators in seconds rather than hours, enabling crews to isolate pipe segments before significant water loss or infrastructure damage occurs.
  • The same pipeline scales from a small irrigation network to a large municipal distribution system with no architectural changes.

Best-in-Class Event Streaming
for Agents, Apps, and Analytics
GitHubXLinkedInSlackYouTube
Sign up for our to stay updated.