Smart grid analytics at scale means ingesting millions of smart meter interval reads continuously, applying streaming SQL aggregations in a PostgreSQL-compatible database, and delivering distribution-level grid intelligence—voltage profiles, load balancing, outage detection—to operators in milliseconds rather than the hours or days typical of batch meter data pipelines.
Why This Matters for Energy Operators
A medium-sized utility with 2 million smart meters generates roughly 192 million interval reads per day (96 reads per meter at 15-minute intervals). This data contains signals that are operationally critical but only actionable if they arrive in time:
- Voltage violations that affect equipment and customer satisfaction, detectable from meter voltage readings but typically aggregated in nightly batches
- Phase imbalance causing transformer inefficiency, hidden until monthly reports
- Outage detection possible from "last gasp" signals meters transmit when power is lost, but delayed when those signals sit in a batch queue
- Theft and tampering patterns visible in consumption time-series, traditionally found weeks after the fact
The problem is not data volume—utilities already collect this data. The problem is pipeline architecture: batch ETL that treats meter data as something to be analyzed later rather than monitored continuously.
How Streaming SQL Works for Energy Data
RisingWave, a PostgreSQL-compatible streaming database, ingests smart meter data from the AMI head-end system via Kafka and maintains continuously updated materialized views for voltage profiles, load distribution, and anomaly detection. The views are queryable from any Postgres-compatible tool, making the intelligence available to GIS systems, outage management systems (OMS), and operator dashboards simultaneously.
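For example, once the materialized views defined below exist, a dashboard or OMS adapter can read them with plain SQL from any Postgres client; the transformer ID used here is a hypothetical placeholder:

```sql
-- From psql, Grafana, or an OMS adapter: latest voltage profile
-- for one transformer ('TX-1042' is a placeholder ID)
SELECT window_start, avg_voltage_v, min_voltage_v, max_voltage_v
FROM transformer_voltage_profile
WHERE transformer_id = 'TX-1042'
ORDER BY window_start DESC
LIMIT 4;
```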
Building the System: Step by Step
Step 1: Connect the Data Source
Ingest AMI interval data from Kafka:
```sql
CREATE SOURCE ami_reads (
    meter_id VARCHAR,
    customer_id VARCHAR,
    transformer_id VARCHAR,
    feeder_id VARCHAR,
    substation_id VARCHAR,
    consumption_kwh DOUBLE PRECISION,
    demand_kw DOUBLE PRECISION,
    voltage_v DOUBLE PRECISION,
    power_factor DOUBLE PRECISION,
    tamper_flag BOOLEAN,
    read_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ami.interval.reads',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

-- Last-gasp outage signals (separate topic, high priority)
CREATE SOURCE ami_outage_signals (
    meter_id VARCHAR,
    signal_type VARCHAR, -- 'LAST_GASP', 'POWER_RESTORE'
    signal_ts TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'ami.outage.signals',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
```
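If the head-end cannot embed topology fields (transformer, feeder, substation) in every read, a small static mapping table can supply them at join time. A sketch under that assumption; the table and column names here are illustrative, not part of the pipeline above:

```sql
-- Optional meter-to-topology mapping, loaded from GIS exports,
-- for head-ends that emit bare per-meter readings
CREATE TABLE meter_topology (
    meter_id VARCHAR PRIMARY KEY,
    transformer_id VARCHAR,
    feeder_id VARCHAR,
    substation_id VARCHAR
);
```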
Step 2: Build Real-Time Aggregations
Compute transformer-level voltage profiles and load distribution:
```sql
-- Transformer-level voltage profile: 15-minute windows
CREATE MATERIALIZED VIEW transformer_voltage_profile AS
SELECT
    transformer_id,
    feeder_id,
    window_start,
    window_end,
    COUNT(DISTINCT meter_id) AS meter_count,
    AVG(voltage_v) AS avg_voltage_v,
    MIN(voltage_v) AS min_voltage_v,
    MAX(voltage_v) AS max_voltage_v,
    MAX(voltage_v) - MIN(voltage_v) AS voltage_spread_v,
    SUM(demand_kw) AS total_demand_kw,
    AVG(power_factor) AS avg_power_factor
FROM TUMBLE(ami_reads, read_ts, INTERVAL '15' MINUTE)
GROUP BY transformer_id, feeder_id, window_start, window_end;

-- Feeder load summary
CREATE MATERIALIZED VIEW feeder_load_summary AS
SELECT
    feeder_id,
    substation_id,
    window_start,
    window_end,
    SUM(demand_kw) AS total_demand_kw,
    COUNT(DISTINCT meter_id) AS active_meters,
    AVG(voltage_v) AS avg_voltage_v,
    SUM(consumption_kwh) AS total_consumption_kwh
FROM TUMBLE(ami_reads, read_ts, INTERVAL '15' MINUTE)
GROUP BY feeder_id, substation_id, window_start, window_end;
```
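Either view can be queried ad hoc while it is being maintained. For instance, a quick check of the most heavily loaded feeders in the most recent completed window:

```sql
-- Five most heavily loaded feeders in the latest 15-minute window
SELECT feeder_id, substation_id, total_demand_kw, active_meters
FROM feeder_load_summary
WHERE window_end = (SELECT MAX(window_end) FROM feeder_load_summary)
ORDER BY total_demand_kw DESC
LIMIT 5;
```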
Step 3: Detect Anomalies and Generate Alerts
Detect voltage violations, last-gasp clusters (outages), and tamper events:
```sql
-- Voltage violations. Limits assume a 230 V nominal service
-- (EN 50160 style +/-10%: 207-253 V); adjust for your territory.
CREATE MATERIALIZED VIEW voltage_violations AS
SELECT
    t.transformer_id,
    t.feeder_id,
    t.window_end AS violation_time,
    t.avg_voltage_v,
    t.min_voltage_v,
    t.max_voltage_v,
    CASE
        WHEN t.min_voltage_v < 207 THEN 'UNDER_VOLTAGE'
        WHEN t.max_voltage_v > 253 THEN 'OVER_VOLTAGE'
        WHEN t.voltage_spread_v > 20 THEN 'PHASE_IMBALANCE'
    END AS violation_type
FROM transformer_voltage_profile t
WHERE t.min_voltage_v < 207
   OR t.max_voltage_v > 253
   OR t.voltage_spread_v > 20;

-- Last-gasp outage cluster detection: bucket signals into 5-minute
-- windows so near-simultaneous last gasps group together, then flag
-- any transformer where 3+ distinct meters dropped in one window.
-- The join to ami_reads recovers transformer_id; a static
-- meter-to-transformer mapping table would bound the join state.
CREATE MATERIALIZED VIEW outage_clusters AS
SELECT
    m.transformer_id,
    s.window_start,
    COUNT(DISTINCT s.meter_id) AS affected_meters,
    MIN(s.signal_ts) AS first_signal_ts,
    MAX(s.signal_ts) AS last_signal_ts
FROM TUMBLE(ami_outage_signals, signal_ts, INTERVAL '5' MINUTE) AS s
JOIN ami_reads m ON s.meter_id = m.meter_id
WHERE s.signal_type = 'LAST_GASP'
  AND s.signal_ts > NOW() - INTERVAL '30' MINUTE  -- temporal filter keeps state bounded
GROUP BY m.transformer_id, s.window_start
HAVING COUNT(DISTINCT s.meter_id) >= 3;
```
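The tamper_flag ingested in Step 1 supports a third detection view along the same lines. A minimal sketch; the one-hour window is an illustrative choice, not a recommendation:

```sql
-- Meters raising the tamper flag, grouped hourly per transformer
CREATE MATERIALIZED VIEW tamper_events AS
SELECT
    meter_id,
    transformer_id,
    window_start,
    COUNT(*) AS flagged_reads
FROM TUMBLE(ami_reads, read_ts, INTERVAL '1' HOUR)
WHERE tamper_flag = TRUE
GROUP BY meter_id, transformer_id, window_start;
```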
Step 4: Integrate with SCADA/EMS Downstream
Push violations and outage clusters to OMS and SCADA:
```sql
CREATE SINK voltage_violation_sink
FROM voltage_violations
WITH (
    connector = 'kafka',
    topic = 'grid.violations.voltage',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;

CREATE SINK outage_cluster_sink
FROM outage_clusters
WITH (
    connector = 'kafka',
    topic = 'grid.outages.clusters',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;
```
Comparison: Batch vs Streaming
| Capability | Batch Meter Analytics | Streaming SQL |
|---|---|---|
| Voltage violation detection | Nightly reports | Within 15-minute read window |
| Outage detection (last-gasp) | Delayed by batch queue | Near-real-time clustering |
| Transformer load visibility | Daily or hourly | Continuous 15-min windows |
| Tamper alert latency | Days to weeks | Within hours of read receipt |
| Scale to 2M+ meters | Multiple ETL shards | Horizontally scaled views |
| OMS integration | File-based interface | Kafka topic consumed by OMS |
| New analytics feature | New ETL job | New materialized view |
FAQ
How does RisingWave handle the volume of 2 million meters reporting every 15 minutes?
RisingWave scales horizontally by adding compute nodes. Kafka partitioning distributes meter reads across parallel consumers, and the incremental view maintenance engine processes only new data, not full re-scans, so throughput scales with cluster size rather than accumulated data volume.
Smart meters often report late or with gaps. How does the system handle missing reads?
Watermark-based late event handling allows RisingWave to hold window state open for a configurable period (e.g., 30 minutes) to accommodate late-arriving meter reads. Views can also include a sample_count column to flag windows with fewer reads than expected, indicating communication gaps rather than zero consumption.
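In RisingWave, the lateness allowance is declared on the source itself. A sketch of the Step 1 reads source with a 30-minute watermark (column list abbreviated for brevity):

```sql
-- Reads arriving more than 30 minutes behind the watermark are
-- dropped; windows stay open until the watermark passes window_end
CREATE SOURCE ami_reads (
    meter_id VARCHAR,
    voltage_v DOUBLE PRECISION,
    read_ts TIMESTAMPTZ,
    WATERMARK FOR read_ts AS read_ts - INTERVAL '30' MINUTE
) WITH (
    connector = 'kafka',
    topic = 'ami.interval.reads',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;
```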
Can we use this data for customer-facing applications like energy usage dashboards?
Yes. Because the materialized views are queryable over the Postgres wire protocol, a customer portal application can query per-customer consumption aggregations directly, without a separate data warehouse extract. Row-level access controls ensure customers only see their own data.
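A per-customer aggregation backing such a portal might look like the following; the view name and daily granularity are assumptions for illustration:

```sql
-- Hypothetical daily consumption rollup for a customer portal
CREATE MATERIALIZED VIEW customer_daily_consumption AS
SELECT
    customer_id,
    window_start AS day_start,
    SUM(consumption_kwh) AS daily_kwh,
    MAX(demand_kw) AS peak_demand_kw
FROM TUMBLE(ami_reads, read_ts, INTERVAL '1' DAY)
GROUP BY customer_id, window_start;
```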
Key Takeaways
- Smart grid analytics requires sub-hour latency to be operationally useful; streaming SQL provides this without replacing existing AMI head-end systems.
- Voltage violation detection, last-gasp outage clustering, and phase imbalance monitoring are expressible as SQL views over AMI data streams.
- The architecture handles millions of meters through Kafka partitioning and RisingWave's horizontal scaling.
- OMS, SCADA, and GIS systems integrate via Kafka sinks, making streaming analytics additive to the existing technology stack.
- The Postgres interface means no new query languages or custom APIs—distribution engineers work with familiar SQL tools.