Smart grid analytics at scale means ingesting millions of smart meter interval reads continuously, applying streaming SQL aggregations in a PostgreSQL-compatible database, and delivering distribution-level grid intelligence—voltage profiles, load balancing, outage detection—to operators in milliseconds rather than the hours or days typical of batch meter data pipelines.
Why This Matters for Energy Operators
A medium-sized utility with 2 million smart meters generates roughly 192 million interval reads per day (96 reads per meter at 15-minute intervals). This data contains signals that are operationally critical but only actionable if they arrive in time:
- Voltage violations that affect equipment and customer satisfaction, detectable from meter voltage readings but typically aggregated in nightly batches
- Phase imbalance causing transformer inefficiency, hidden until monthly reports
- Outage detection possible from "last gasp" signals meters transmit when power is lost, but delayed when those signals sit in a batch queue
- Theft and tampering patterns visible in consumption time-series, traditionally found weeks after the fact
The problem is not data volume—utilities already collect this data. The problem is pipeline architecture: batch ETL that treats meter data as something to be analyzed later rather than monitored continuously.
How Streaming SQL Works for Energy Data
RisingWave, a PostgreSQL-compatible streaming database, ingests smart meter data from the AMI head-end system via Kafka and maintains continuously updated materialized views for voltage profiles, load distribution, and anomaly detection. The views are queryable from any Postgres-compatible tool, making the intelligence available to GIS systems, outage management systems (OMS), and operator dashboards simultaneously.
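For example, once the materialized views defined below exist, a dashboard or OMS adapter can read them with plain SQL from any Postgres client; the transformer ID used here is a hypothetical placeholder:

```sql
-- From psql, Grafana, or an OMS adapter: latest voltage profile
-- for one transformer ('TX-1042' is a placeholder ID)
SELECT window_start, avg_voltage_v, min_voltage_v, max_voltage_v
FROM transformer_voltage_profile
WHERE transformer_id = 'TX-1042'
ORDER BY window_start DESC
LIMIT 4;
```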
Building the System: Step by Step
Step 1: Connect the Data Source
Ingest AMI interval data from Kafka:
```sql
CREATE SOURCE ami_reads (
    meter_id VARCHAR,
    customer_id VARCHAR,
    transformer_id VARCHAR,
    feeder_id VARCHAR,
    substation_id VARCHAR,
    consumption_kwh DOUBLE PRECISION,
    demand_kw DOUBLE PRECISION,
    voltage_v DOUBLE PRECISION,
    power_factor DOUBLE PRECISION,
    tamper_flag BOOLEAN,
    read_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ami.interval.reads',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

-- Last-gasp outage signals (separate topic, high priority)
CREATE SOURCE ami_outage_signals (
    meter_id VARCHAR,
    signal_type VARCHAR, -- 'LAST_GASP', 'POWER_RESTORE'
    signal_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ami.outage.signals',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
```
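If the head-end cannot embed topology fields (transformer, feeder, substation) in every read, a small static mapping table can supply them at join time. A sketch under that assumption; the table and column names here are illustrative, not part of the pipeline above:

```sql
-- Optional meter-to-topology mapping, loaded from GIS exports,
-- for head-ends that emit bare per-meter readings
CREATE TABLE meter_topology (
    meter_id VARCHAR PRIMARY KEY,
    transformer_id VARCHAR,
    feeder_id VARCHAR,
    substation_id VARCHAR
);
```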
Step 2: Build Real-Time Aggregations
Compute transformer-level voltage profiles and load distribution:
```sql
-- Transformer-level voltage profile: 15-minute windows
CREATE MATERIALIZED VIEW transformer_voltage_profile AS
SELECT
    transformer_id,
    feeder_id,
    window_start,
    window_end,
    COUNT(DISTINCT meter_id) AS meter_count,
    AVG(voltage_v) AS avg_voltage_v,
    MIN(voltage_v) AS min_voltage_v,
    MAX(voltage_v) AS max_voltage_v,
    MAX(voltage_v) - MIN(voltage_v) AS voltage_spread_v,
    SUM(demand_kw) AS total_demand_kw,
    AVG(power_factor) AS avg_power_factor
FROM TUMBLE(ami_reads, read_ts, INTERVAL '15' MINUTE)
GROUP BY transformer_id, feeder_id, window_start, window_end;

-- Feeder load summary
CREATE MATERIALIZED VIEW feeder_load_summary AS
SELECT
    feeder_id,
    substation_id,
    window_start,
    window_end,
    SUM(demand_kw) AS total_demand_kw,
    COUNT(DISTINCT meter_id) AS active_meters,
    AVG(voltage_v) AS avg_voltage_v,
    SUM(consumption_kwh) AS total_consumption_kwh
FROM TUMBLE(ami_reads, read_ts, INTERVAL '15' MINUTE)
GROUP BY feeder_id, substation_id, window_start, window_end;
```
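Either view can be queried ad hoc while it is being maintained. For instance, a quick check of the most heavily loaded feeders in the most recent completed window:

```sql
-- Five most heavily loaded feeders in the latest 15-minute window
SELECT feeder_id, substation_id, total_demand_kw, active_meters
FROM feeder_load_summary
WHERE window_end = (SELECT MAX(window_end) FROM feeder_load_summary)
ORDER BY total_demand_kw DESC
LIMIT 5;
```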
Step 3: Detect Anomalies and Generate Alerts
Detect voltage violations, last-gasp clusters (outages), and tamper events:
```sql
-- Voltage violations. Limits assume a 230 V nominal service
-- (EN 50160 style +/-10%: 207-253 V); adjust for your territory.
CREATE MATERIALIZED VIEW voltage_violations AS
SELECT
    t.transformer_id,
    t.feeder_id,
    t.window_end AS violation_time,
    t.avg_voltage_v,
    t.min_voltage_v,
    t.max_voltage_v,
    CASE
        WHEN t.min_voltage_v < 207 THEN 'UNDER_VOLTAGE'
        WHEN t.max_voltage_v > 253 THEN 'OVER_VOLTAGE'
        WHEN t.voltage_spread_v > 20 THEN 'PHASE_IMBALANCE'
    END AS violation_type
FROM transformer_voltage_profile t
WHERE t.min_voltage_v < 207
   OR t.max_voltage_v > 253
   OR t.voltage_spread_v > 20;

-- Last-gasp outage cluster detection: bucket signals into 5-minute
-- windows so near-simultaneous last gasps group together, then flag
-- any transformer where 3+ distinct meters dropped in one window.
-- The join to ami_reads recovers transformer_id; a static
-- meter-to-transformer mapping table would bound the join state.
CREATE MATERIALIZED VIEW outage_clusters AS
SELECT
    m.transformer_id,
    s.window_start,
    COUNT(DISTINCT s.meter_id) AS affected_meters,
    MIN(s.signal_ts) AS first_signal_ts,
    MAX(s.signal_ts) AS last_signal_ts
FROM TUMBLE(ami_outage_signals, signal_ts, INTERVAL '5' MINUTE) AS s
JOIN ami_reads m ON s.meter_id = m.meter_id
WHERE s.signal_type = 'LAST_GASP'
  AND s.signal_ts > NOW() - INTERVAL '30' MINUTE  -- temporal filter keeps state bounded
GROUP BY m.transformer_id, s.window_start
HAVING COUNT(DISTINCT s.meter_id) >= 3;
```
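The tamper_flag ingested in Step 1 supports a third detection view along the same lines. A minimal sketch; the one-hour window is an illustrative choice, not a recommendation:

```sql
-- Meters raising the tamper flag, grouped hourly per transformer
CREATE MATERIALIZED VIEW tamper_events AS
SELECT
    meter_id,
    transformer_id,
    window_start,
    COUNT(*) AS flagged_reads
FROM TUMBLE(ami_reads, read_ts, INTERVAL '1' HOUR)
WHERE tamper_flag = TRUE
GROUP BY meter_id, transformer_id, window_start;
```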
Step 4: Integrate with SCADA/EMS Downstream
Push violations and outage clusters to OMS and SCADA:
```sql
CREATE SINK voltage_violation_sink
FROM voltage_violations
WITH (
    connector = 'kafka',
    topic = 'grid.violations.voltage',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;

CREATE SINK outage_cluster_sink
FROM outage_clusters
WITH (
    connector = 'kafka',
    topic = 'grid.outages.clusters',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;
```
Comparison: Batch vs Streaming
| Capability | Batch Meter Analytics | Streaming SQL |
|---|---|---|
| Voltage violation detection | Nightly reports | Within 15-minute read window |
| Outage detection (last-gasp) | Delayed by batch queue | Near-real-time clustering |
| Transformer load visibility | Daily or hourly | Continuous 15-min windows |
| Tamper alert latency | Days to weeks | Within hours of read receipt |
| Scale to 2M+ meters | Multiple ETL shards | Horizontally scaled views |
| OMS integration | File-based interface | Kafka topic consumed by OMS |
| New analytics feature | New ETL job | New materialized view |
FAQ
How does RisingWave handle the volume of 2 million meters reporting every 15 minutes?
RisingWave scales horizontally by adding compute nodes. Kafka partitioning distributes meter reads across parallel consumers, and the incremental view maintenance engine processes only new data, not full re-scans, so throughput scales with cluster size rather than accumulated data volume.
Smart meters often report late or with gaps. How does the system handle missing reads?
Watermark-based late event handling allows RisingWave to hold window state open for a configurable period (e.g., 30 minutes) to accommodate late-arriving meter reads. Views can also include a sample_count column to flag windows with fewer reads than expected, indicating communication gaps rather than zero consumption.
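In RisingWave, the lateness allowance is declared on the source itself. A sketch of the Step 1 reads source with a 30-minute watermark (column list abbreviated for brevity):

```sql
-- Reads arriving more than 30 minutes behind the watermark are
-- dropped; windows stay open until the watermark passes window_end
CREATE SOURCE ami_reads (
    meter_id VARCHAR,
    voltage_v DOUBLE PRECISION,
    read_ts TIMESTAMPTZ,
    WATERMARK FOR read_ts AS read_ts - INTERVAL '30' MINUTE
) WITH (
    connector = 'kafka',
    topic = 'ami.interval.reads',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;
```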
Can we use this data for customer-facing applications like energy usage dashboards?
Yes. Because the materialized views are queryable over the Postgres wire protocol, a customer portal application can query per-customer consumption aggregations directly, without a separate data warehouse extract. Row-level access controls ensure customers only see their own data.
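A per-customer aggregation backing such a portal might look like the following; the view name and daily granularity are assumptions for illustration:

```sql
-- Hypothetical daily consumption rollup for a customer portal
CREATE MATERIALIZED VIEW customer_daily_consumption AS
SELECT
    customer_id,
    window_start AS day_start,
    SUM(consumption_kwh) AS daily_kwh,
    MAX(demand_kw) AS peak_demand_kw
FROM TUMBLE(ami_reads, read_ts, INTERVAL '1' DAY)
GROUP BY customer_id, window_start;
```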
Key Takeaways
- Smart grid analytics requires sub-hour latency to be operationally useful; streaming SQL provides this without replacing existing AMI head-end systems.
- Voltage violation detection, last-gasp outage clustering, and phase imbalance monitoring are expressible as SQL views over AMI data streams.
- The architecture handles millions of meters through Kafka partitioning and RisingWave's horizontal scaling.
- OMS, SCADA, and GIS systems integrate via Kafka sinks, making streaming analytics additive to the existing technology stack.
- The Postgres interface means no new query languages or custom APIs—distribution engineers work with familiar SQL tools.