Real-time fault detection in power distribution networks uses a PostgreSQL-compatible streaming database to continuously evaluate protection relay events, current and voltage measurements, and SCADA state changes. By correlating signals across multiple devices, the system identifies and locates faults within seconds rather than waiting for field crew reports or manual SCADA analysis.
## Why This Matters for Energy Operators
Distribution network faults are a daily operational reality. In a typical utility, a medium-voltage distribution feeder experiences several faults per year. Most are temporary (transient faults cleared by automatic reclosers), but some are sustained, requiring crew dispatch and restoration switching. The critical metric is restoration time: every minute of outage adds to customer minutes interrupted, the primary reliability measure regulators track.
The challenge is fault location. Traditional fault analysis relies on:
- Relay event records (usually reviewed post-fault)
- Customer calls to the outage management system (delayed by call volume)
- Field crew inspection (slow and expensive)
Streaming SQL adds a real-time layer: it correlates relay events, current measurements, and voltage signatures from across the network the moment they arrive, producing a fault location estimate before the first customer call reaches the call center.
## How Streaming SQL Works for Energy Data
RisingWave ingests protection system events, SCADA state changes, and smart meter last-gasp signals from Kafka. Materialized views correlate these signals by location and time to identify fault patterns, estimate affected segments, and feed isolation and restoration switching decisions. The Postgres interface exposes this intelligence directly to outage management systems (OMS) and mobile crew dispatch tools.
## Building the System: Step by Step
### Step 1: Connect the Data Sources
Ingest relay events, SCADA states, and meter last-gasp signals:
```sql
-- Protection relay events (trip, reclose, lockout)
CREATE SOURCE relay_events (
    relay_id VARCHAR,
    feeder_id VARCHAR,
    segment_id VARCHAR,
    event_type VARCHAR,        -- 'TRIP', 'RECLOSE', 'LOCKOUT', 'CLOSE'
    fault_current_a DOUBLE PRECISION,
    phase VARCHAR,             -- 'A', 'B', 'C', 'ABC', 'G'
    event_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'protection.relay.events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

-- SCADA device state (breaker/switch position)
CREATE SOURCE scada_states (
    device_id VARCHAR,
    feeder_id VARCHAR,
    segment_id VARCHAR,
    device_type VARCHAR,
    state VARCHAR,             -- 'OPEN', 'CLOSED'
    state_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'scada.device.states',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
```
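The smart meter last-gasp signals mentioned above follow the same source pattern. A minimal sketch, assuming a `metering.last.gasp` Kafka topic carrying JSON power-off notifications (the topic name and schema are illustrative, not from the original pipeline):

```sql
-- Smart meter last-gasp signals (emitted on loss of supply); hypothetical schema
CREATE SOURCE meter_last_gasp (
    meter_id VARCHAR,
    feeder_id VARCHAR,
    segment_id VARCHAR,
    gasp_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'metering.last.gasp',    -- assumed topic name
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
```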
### Step 2: Build Real-Time Aggregations
Identify relay operation clusters and correlate with device states:
```sql
-- Relay trips in the last 5 minutes grouped by feeder
CREATE MATERIALIZED VIEW recent_relay_trips AS
SELECT
    feeder_id,
    window_start,
    window_end,
    COUNT(*) AS trip_count,
    ARRAY_AGG(relay_id) AS tripped_relays,
    ARRAY_AGG(segment_id) AS affected_segments,
    MAX(fault_current_a) AS max_fault_current_a,
    MODE() WITHIN GROUP (ORDER BY phase) AS dominant_phase
FROM TUMBLE(relay_events, event_ts, INTERVAL '5' MINUTE)
WHERE event_type IN ('TRIP', 'LOCKOUT')
GROUP BY feeder_id, window_start, window_end;
```
```sql
-- Current breaker/switch positions (latest state per device),
-- expressed with the group Top-N pattern for streaming latest-per-key
CREATE MATERIALIZED VIEW current_device_states AS
SELECT device_id, feeder_id, segment_id, device_type, state, state_ts
FROM (
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY device_id ORDER BY state_ts DESC) AS rn
    FROM scada_states
) ranked
WHERE rn = 1;
```
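Because these views are exposed over the Postgres wire protocol, an operator can spot-check them from psql or any Postgres client. An illustrative ad-hoc query during a storm:

```sql
-- Feeders with the most recent relay activity
SELECT feeder_id, window_end, trip_count, max_fault_current_a, dominant_phase
FROM recent_relay_trips
ORDER BY window_end DESC, trip_count DESC
LIMIT 10;
```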
### Step 3: Detect Anomalies and Generate Alerts
Correlate relay trips with device states to identify isolated fault segments:
```sql
CREATE MATERIALIZED VIEW fault_detections AS
SELECT
    t.feeder_id,
    t.window_end AS detection_time,
    t.trip_count,
    t.tripped_relays,
    t.max_fault_current_a,
    t.dominant_phase,
    COUNT(DISTINCT d.segment_id) AS de_energized_segments,
    ARRAY_AGG(DISTINCT d.segment_id) AS isolated_segments,
    CASE
        WHEN t.trip_count >= 3 THEN 'SUSTAINED_FAULT'
        WHEN t.trip_count = 1 THEN 'POSSIBLE_TRANSIENT'
        ELSE 'MULTIPLE_TRIP_SEQUENCE'
    END AS fault_classification,
    CASE
        WHEN t.max_fault_current_a > 5000 THEN 'CRITICAL'
        WHEN t.max_fault_current_a > 2000 THEN 'HIGH'
        ELSE 'MODERATE'
    END AS severity
FROM recent_relay_trips t
LEFT JOIN current_device_states d
    ON t.feeder_id = d.feeder_id
    AND d.state = 'OPEN'
    AND d.state_ts BETWEEN t.window_start AND t.window_end + INTERVAL '2' MINUTE
GROUP BY t.feeder_id, t.window_end, t.trip_count, t.tripped_relays,
         t.max_fault_current_a, t.dominant_phase;
```
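The comparison table below notes that customer impact can be estimated by joining with a customer-segment map. A minimal sketch, assuming a hypothetical `customer_segment_map` reference table loaded from GIS (the table name and columns are illustrative):

```sql
-- Hypothetical GIS-derived mapping of customers to network segments
CREATE TABLE customer_segment_map (
    segment_id VARCHAR,
    customer_count INT
);

-- Estimated customers affected per fault detection
CREATE MATERIALIZED VIEW fault_customer_impact AS
SELECT
    f.feeder_id,
    f.detection_time,
    f.severity,
    SUM(m.customer_count) AS estimated_customers_affected
FROM fault_detections f
JOIN customer_segment_map m
    ON m.segment_id = ANY(f.isolated_segments)
GROUP BY f.feeder_id, f.detection_time, f.severity;
```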
### Step 4: Integrate with SCADA/EMS Downstream
Send fault detections to OMS for work order creation and crew dispatch:
```sql
CREATE SINK fault_detection_sink
FROM fault_detections
WITH (
    connector = 'kafka',
    topic = 'distribution.faults',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON (
    -- fault_detections is an aggregation, so its changelog is not
    -- append-only; emit inserts only for the plain JSON format
    force_append_only = 'true'
);
```
OMS systems consume the topic to auto-create outage events, pre-populate affected customer counts, and suggest isolation/restoration switching sequences based on network topology.
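Downstream tools that prefer pull over push can read the same view over a standard Postgres connection instead of consuming Kafka. An illustrative dispatch-board query:

```sql
-- Active high-severity faults for the dispatch board
SELECT feeder_id, detection_time, fault_classification, severity, isolated_segments
FROM fault_detections
WHERE severity IN ('CRITICAL', 'HIGH')
ORDER BY detection_time DESC;
```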
## Comparison: Batch vs. Streaming
| Capability | Traditional Fault Analysis | Streaming SQL |
| --- | --- | --- |
| Fault detection time | Minutes to hours (manual) | Seconds after relay events |
| Fault location estimate | Field crew or morning review | Automated segment correlation |
| Customer impact estimate | Manual GIS lookup | JOIN with customer-segment map |
| OMS integration | Manual data entry | Automatic Kafka consumer |
| Relay event correlation | Post-event log review | Real-time multi-relay JOIN |
| Transient vs. sustained classification | Manual | Automated trip-count logic |
| Crew dispatch information | Phone or radio | Structured JSON to mobile app |
## FAQ
**How does the system distinguish between a transient fault (cleared by a recloser) and a sustained fault requiring crew dispatch?**
The relay event stream includes event types for TRIP, RECLOSE, and LOCKOUT. A recloser that trips and successfully recloses produces a TRIP followed by a CLOSE (or RECLOSE) event within seconds, while a lockout produces a TRIP with no CLOSE within the observation window. The trip_count logic in the fault classification view approximates this pattern; a dedicated view can pair each TRIP with any subsequent reclose, as sketched below.
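A minimal sketch of that pairing, assuming a 60-second reclose window (the interval and view name are illustrative; tune the window to the recloser settings):

```sql
-- Pair each TRIP with any RECLOSE/CLOSE on the same relay within 60 seconds.
-- In a streaming view, a trip first appears as NO_RECLOSE_OBSERVED and is
-- updated to CLEARED_TRANSIENT if a reclose arrives inside the window.
CREATE MATERIALIZED VIEW trip_outcomes AS
SELECT
    t.relay_id,
    t.feeder_id,
    t.event_ts AS trip_ts,
    CASE
        WHEN COUNT(r.event_ts) > 0 THEN 'CLEARED_TRANSIENT'
        ELSE 'NO_RECLOSE_OBSERVED'
    END AS outcome
FROM relay_events t
LEFT JOIN relay_events r
    ON r.relay_id = t.relay_id
    AND r.event_type IN ('RECLOSE', 'CLOSE')
    AND r.event_ts BETWEEN t.event_ts AND t.event_ts + INTERVAL '60' SECOND
WHERE t.event_type = 'TRIP'
GROUP BY t.relay_id, t.feeder_id, t.event_ts;
```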
**Our protection system uses IEC 61850 GOOSE messages, not Kafka. How do we bridge this?**

An IEC 61850 to Kafka gateway (available from several industrial IoT vendors or implementable with open-source libraries) subscribes to GOOSE publications and republishes them as structured JSON on Kafka topics. The RisingWave source schema above then applies without modification.
**Can the system recommend the switching sequence for fault isolation and restoration?**

The streaming SQL layer identifies the fault segment and the currently open devices. A topology engine (implemented as a reference table join, as sketched below, or as a separate microservice consuming the fault detection Kafka topic) can translate this into a switching sequence. The sequencing logic is typically graph-based and best implemented outside the SQL layer, consuming the fault detection output.
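For the reference-table variant, a minimal sketch, assuming a hypothetical `segment_topology` table that records each segment's upstream isolating device and alternate-feed tie switch (all names are illustrative; ordering the actual switching steps remains the topology engine's job):

```sql
-- Hypothetical topology table maintained from the network model
CREATE TABLE segment_topology (
    segment_id VARCHAR,
    upstream_device_id VARCHAR,   -- device that isolates the segment
    tie_switch_id VARCHAR         -- alternate-feed switch for restoration
);

-- Candidate isolation/restoration devices per detected fault
CREATE MATERIALIZED VIEW isolation_candidates AS
SELECT
    f.feeder_id,
    f.detection_time,
    topo.segment_id,
    topo.upstream_device_id,
    topo.tie_switch_id
FROM fault_detections f
JOIN segment_topology topo
    ON topo.segment_id = ANY(f.isolated_segments);
```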
## Key Takeaways
- Streaming SQL correlates relay events and SCADA state changes in real time, producing fault location estimates within seconds of the fault event.
- Transient vs. sustained fault classification is expressible as SQL logic based on trip count and reclose event sequences.
- Kafka sinks deliver structured fault records to OMS platforms, enabling automatic outage event creation without manual data entry.
- The architecture is additive: existing protection systems and SCADA infrastructure continue to operate unchanged, with streaming SQL as an intelligence layer above them.
- PostgreSQL compatibility means OMS vendors, GIS systems, and field crew mobile applications can query fault status and affected segment data through standard database connections.