Customer Story

Siemens: From Nightly Batch Jobs to Real-Time Streaming

Siemens is one of the world's largest technology companies. They replaced nightly batch jobs with RisingWave's streaming Medallion architecture, cutting data latency from hours to seconds and infrastructure costs by over 50%.

~1000x Faster Data
>50% Cost Reduction
3 Streaming Layers
SQL Replaced Scripts

The Challenge

What challenge did Siemens face with batch data processing?

Siemens relied on nightly batch jobs to process data from thousands of field devices and sensors, creating hours of latency between data generation and availability. Complex ETL script stacks across multiple languages required specialized maintenance, and dedicated scheduling clusters added significant infrastructure costs.

Business teams could only access next-day reports, limiting decision-making speed. Massive volumes of operational data were trapped in a batch bottleneck that could not keep pace with real-time business needs.

The Solution

How did Siemens implement a streaming Medallion architecture?

Working with Hivemind Technologies, Siemens built a three-layer streaming Medallion architecture using RisingWave. After evaluating Flink, Kafka Streams, and Spark Structured Streaming, they chose RisingWave for its PostgreSQL-compatible SQL, native materialized views, and built-in storage.

Bronze Layer

Raw data is ingested from Kafka topics directly into RisingWave sources. Nothing is landed on disk or pre-processed first, so every data point is immediately available.

Silver Layer

SQL-based cleaning, deduplication, and standardization using materialized views. This layer unifies field names, converts units, and filters out invalid data.

Gold Layer

Real-time business metrics computed with windowed materialized views. Serves dashboards, Iceberg tables, and alerting systems simultaneously.

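-- Bronze: raw ingestion from Kafka (column list and connector options elided)
CREATE SOURCE sensor_raw (...)
WITH (connector = 'kafka', ...)
FORMAT PLAIN ENCODE JSON;

-- Silver: clean and standardize
CREATE MATERIALIZED VIEW sensor_cleaned AS
SELECT device_id,
       value AS standardized_value,  -- unit conversion would be applied here
       location,
       event_time
FROM sensor_raw
WHERE value IS NOT NULL;

-- Gold: real-time metrics per location over 5-minute tumbling windows
-- (threshold is a placeholder for the alert limit, not a column defined above)
CREATE MATERIALIZED VIEW device_health AS
SELECT location,
       window_start,
       AVG(standardized_value) AS avg_value,
       COUNT(*) FILTER (WHERE standardized_value > threshold) AS over_threshold_count
FROM TUMBLE(sensor_cleaned, event_time, INTERVAL '5 MINUTES')
GROUP BY location, window_start;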
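
The Silver layer description mentions deduplication and unit conversion, which the short example does not show. The following is a minimal sketch of how those two steps might look in a single materialized view. It is not Siemens' actual pipeline: the table sensor_raw_demo, the unit column, the Fahrenheit-to-Celsius conversion, and the choice of (device_id, event_time) as the duplicate key are all assumptions made for illustration. A plain table stands in for the Kafka source so the sketch can be tried without a broker.

```sql
-- Hypothetical stand-in for the Kafka-backed bronze source.
-- Column names and the unit encoding are assumptions, not from the case study.
CREATE TABLE sensor_raw_demo (
    device_id  VARCHAR,
    location   VARCHAR,
    value      DOUBLE PRECISION,
    unit       VARCHAR,            -- assumed to be 'C' or 'F'
    event_time TIMESTAMP
);

-- Silver sketch: filter invalid rows, convert Fahrenheit readings to Celsius,
-- and keep one row per (device_id, event_time) to drop duplicate deliveries.
CREATE MATERIALIZED VIEW sensor_cleaned_demo AS
SELECT device_id, location, standardized_value, event_time
FROM (
    SELECT device_id,
           location,
           CASE WHEN unit = 'F' THEN (value - 32) * 5.0 / 9.0
                ELSE value
           END AS standardized_value,
           event_time,
           -- Number the rows that share a duplicate key; for exact
           -- duplicates, which copy gets rn = 1 is arbitrary.
           ROW_NUMBER() OVER (
               PARTITION BY device_id, event_time
               ORDER BY value
           ) AS rn
    FROM sensor_raw_demo
    WHERE value IS NOT NULL
) AS ranked
WHERE rn = 1;
```

The ROW_NUMBER() filter is a common SQL deduplication pattern. Whether it is the right key and tie-break for a given device fleet depends on how duplicates actually arise upstream.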

Results

What results did Siemens achieve with RisingWave?

Siemens achieved an approximately 1000x reduction in data latency, from hours to seconds, and cut infrastructure costs by more than 50% by eliminating scheduling clusters and intermediate storage. Business teams now use real-time dashboards that query materialized views directly.

Metric            | Before                                     | After (RisingWave)
Data latency      | Hours (overnight batch)                    | Seconds
Cleaning logic    | Complex script stacks                      | SQL materialized views
Infrastructure    | Scheduling clusters + intermediate storage | Single platform (>50% reduction)
Data availability | Next-day reports                           | Real-time views
Maintenance       | High (scripts, schedulers, landing zones)  | Low (SQL, no scheduling)

Transform Your Data Pipeline

Replace nightly batch jobs with real-time streaming in minutes.
