Customer Story
Siemens is one of the world's largest technology companies. They replaced nightly batch jobs with RisingWave's streaming Medallion architecture, cutting data latency from hours to seconds and infrastructure costs by over 50%.
The Challenge
Siemens relied on nightly batch jobs to process data from thousands of field devices and sensors, creating hours of latency between data generation and availability. Complex ETL script stacks across multiple languages required specialized maintenance, and dedicated scheduling clusters added significant infrastructure costs.
Business teams could only access next-day reports, limiting decision-making speed. Massive volumes of operational data were trapped in a batch bottleneck that could not keep pace with real-time business needs.
The Solution
Working with Hivemind Technologies, Siemens built a three-layer streaming Medallion architecture using RisingWave. After evaluating Flink, Kafka Streams, and Spark Structured Streaming, they chose RisingWave for its PostgreSQL-compatible SQL, native materialized views, and built-in storage.
Bronze layer: Raw data is ingested from Kafka topics into RisingWave sources. There is no disk landing and no pre-processing, so every data point is immediately available.
Silver layer: Materialized views handle SQL-based cleaning, deduplication, and standardization: unifying field names, converting units, and filtering out invalid data.
Gold layer: Windowed materialized views compute real-time business metrics and serve dashboards, Iceberg tables, and alerting systems simultaneously.
    -- Bronze: Raw ingestion
    CREATE SOURCE sensor_raw FROM KAFKA (...) FORMAT PLAIN ENCODE JSON;

    -- Silver: Clean and standardize
    CREATE MATERIALIZED VIEW sensor_cleaned AS
    SELECT device_id, standardized_value, location, event_time
    FROM sensor_raw
    WHERE value IS NOT NULL;

    -- Gold: Real-time metrics
    -- (uses standardized_value, the column sensor_cleaned exposes;
    --  threshold is a placeholder for an alert limit)
    CREATE MATERIALIZED VIEW device_health AS
    SELECT location,
           AVG(standardized_value),
           COUNT(*) FILTER (WHERE standardized_value > threshold)
    FROM TUMBLE(sensor_cleaned, event_time, INTERVAL '5 MINUTES')
    GROUP BY location, window_start;
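The Bronze and Silver steps described above could look roughly like the following in RisingWave SQL. This is a minimal sketch, not Siemens' actual pipeline: the topic name, broker address, the `unit` column, the Fahrenheit-to-Celsius rule, and the `_sketch` object names are all assumptions made for illustration.

```sql
-- Bronze (sketch): read JSON readings straight from a Kafka topic.
-- Topic, broker address, and column list are hypothetical.
CREATE SOURCE sensor_raw_sketch (
    device_id  VARCHAR,
    value      DOUBLE PRECISION,
    unit       VARCHAR,        -- assumed column: 'C' or 'F'
    location   VARCHAR,
    event_time TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'sensor-readings',
    properties.bootstrap.server = 'localhost:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

-- Silver (sketch): drop invalid rows, convert units, remove exact duplicates.
CREATE MATERIALIZED VIEW sensor_cleaned_sketch AS
SELECT DISTINCT                          -- deduplicates identical readings
    device_id,
    location,
    event_time,
    CASE
        WHEN unit = 'F' THEN (value - 32) * 5.0 / 9.0   -- assumed conversion rule
        ELSE value
    END AS standardized_value
FROM sensor_raw_sketch
WHERE value IS NOT NULL;                 -- filter invalid data
```

`SELECT DISTINCT` only removes readings that are identical in every selected column; a pipeline that needs to keep one row per device and timestamp would use a different deduplication rule.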
Results
Siemens achieved an approximately 1,000x reduction in data latency, from hours to seconds, and cut infrastructure costs by more than 50% by eliminating scheduling clusters and intermediate storage. Business teams now access real-time dashboards powered by direct materialized view queries.
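As an illustration of such a direct query, a dashboard backend connected through any PostgreSQL-compatible client could read the Gold view with an ordinary SELECT. `device_health` is the view from the example in this story; the location value is a hypothetical placeholder.

```sql
-- Read the current, incrementally maintained metrics for one site.
-- 'plant-a' is an illustrative value, not taken from the Siemens deployment.
SELECT *
FROM device_health
WHERE location = 'plant-a';
```

Because the materialized view is kept up to date as events arrive, the query reads precomputed results rather than re-running the aggregation.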
| Metric | Before | After (RisingWave) |
|---|---|---|
| Data latency | Hours (overnight batch) | Seconds |
| Cleaning logic | Complex script stacks | SQL materialized views |
| Infrastructure | Scheduling clusters + intermediate storage | Single platform (>50% reduction) |
| Data availability | Next-day reports | Real-time views |
| Maintenance | High (scripts, schedulers, landing zones) | Low (SQL, no scheduling) |