Building a Live Service Game Data Platform with RisingWave

A live service game data platform built on RisingWave unifies event ingestion, stream processing, and analytical serving in a single PostgreSQL-compatible layer. Instead of stitching together Kafka, Spark, a warehouse, and separate serving databases, you write SQL materialized views that continuously process player events, monetization data, and server telemetry — giving liveops, user acquisition, and design teams a shared source of truth that updates in seconds.

What a Live Service Game Platform Must Do

Live-service games — battle passes, seasonal content, persistent progression — generate continuous data streams across multiple systems: match servers, payment processors, CDNs, ad networks, and social features. The data platform must:

  1. Ingest events reliably at peak scale (millions of players, thousands of events per second)
  2. Process events into business metrics (retention, ROAS, balance, server health) with minimal latency
  3. Serve results to downstream consumers: dashboards, game backends, marketing tools, data scientists
  4. Store historical data for analysis, compliance, and ML model training

Traditional architectures split this across four or more separate systems with complex orchestration. RisingWave handles the ingest, process, and serve layers in one tool, while sinking to external storage for the historical layer.

Core Data Sources

A complete live service platform ingests from several sources simultaneously:

-- Player gameplay events from game servers
CREATE SOURCE gameplay_events (
    player_id     BIGINT,
    session_id    VARCHAR,
    event_type    VARCHAR,
    level         INT,
    region        VARCHAR,
    event_time    TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'gameplay-events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

-- Revenue events from payment processor
CREATE SOURCE revenue_events (
    player_id     BIGINT,
    product_id    VARCHAR,
    revenue_usd   NUMERIC,
    currency      VARCHAR,
    event_time    TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'revenue-events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
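Before building anything on top of a new source, it helps to confirm events are actually arriving. A minimal sanity-check materialized view (the view name here is illustrative, not part of the platform design) counts gameplay events per minute:

CREATE MATERIALIZED VIEW gameplay_event_rate AS
SELECT
    window_start,
    COUNT(*) AS events
FROM TUMBLE(gameplay_events, event_time, INTERVAL '1 minute')
GROUP BY window_start;

A quick SELECT * FROM gameplay_event_rate ORDER BY window_start DESC LIMIT 5; from any PostgreSQL client then shows whether the pipeline is flowing.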

Pull player account data from the live PostgreSQL player database via CDC:

CREATE TABLE player_accounts (
    player_id         BIGINT PRIMARY KEY,
    signup_date       DATE,
    country           VARCHAR,
    acquisition_channel VARCHAR,
    vip_tier          INT
) WITH (
    connector = 'postgres-cdc',
    hostname = 'gamedb-primary',
    port = '5432',
    username = 'rw_replicator',
    password = 'secret',
    database.name = 'gamedb',
    schema.name = 'public',
    table.name = 'player_accounts'
);
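Once the player_accounts data is replicated into RisingWave, it is queryable with plain SQL, which is a quick way to verify the CDC connection. For example (an illustrative ad hoc query, not part of the pipeline):

SELECT acquisition_channel, COUNT(*) AS players
FROM player_accounts
GROUP BY acquisition_channel;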

Unified Player Activity View

Build the foundation of the platform: a per-player, per-day activity view. Revenue is pre-aggregated into its own view first, so joining it to gameplay events does not double-count a player's spend once per gameplay event:

-- Daily revenue per player, aggregated once to avoid join fan-out
CREATE MATERIALIZED VIEW player_daily_revenue AS
SELECT
    player_id,
    event_time::DATE  AS revenue_date,
    SUM(revenue_usd)  AS daily_revenue
FROM revenue_events
GROUP BY player_id, event_time::DATE;

CREATE MATERIALIZED VIEW player_daily_activity AS
SELECT
    g.player_id,
    p.country,
    p.vip_tier,
    p.acquisition_channel,
    g.window_start::DATE                                       AS activity_date,
    COUNT(*) FILTER (WHERE g.event_type = 'session_start')    AS sessions,
    COUNT(*) FILTER (WHERE g.event_type = 'level_complete')   AS levels_completed,
    MAX(g.level)                                               AS max_level_reached,
    -- daily_revenue is constant per player-day, so MAX picks the single value
    COALESCE(MAX(r.daily_revenue), 0)                          AS daily_revenue
FROM TUMBLE(gameplay_events, event_time, INTERVAL '1 day') AS g
JOIN player_accounts FOR SYSTEM_TIME AS OF PROCTIME() AS p
    ON g.player_id = p.player_id
LEFT JOIN player_daily_revenue r
    ON g.player_id = r.player_id
   AND r.revenue_date = g.window_start::DATE
GROUP BY g.player_id, p.country, p.vip_tier, p.acquisition_channel, g.window_start;
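With the activity view in place, serving queries are ordinary SQL. For example, a liveops dashboard might pull today's top spenders (an illustrative query; the columns follow the view above):

SELECT player_id, country, daily_revenue
FROM player_daily_activity
WHERE activity_date = CURRENT_DATE
ORDER BY daily_revenue DESC
LIMIT 20;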

Platform-Wide KPI Dashboard View

Build a single view that computes the platform-level KPIs your executive dashboard needs. Because player_daily_activity is daily-grained, the hourly view aggregates the raw event streams directly, rolling each stream up to one row per player per hour before joining so revenue is not fanned out:

CREATE MATERIALIZED VIEW platform_kpis_hourly AS
WITH hourly_activity AS (
    SELECT window_start, window_end, player_id,
           COUNT(*) FILTER (WHERE event_type = 'session_start') AS sessions
    FROM TUMBLE(gameplay_events, event_time, INTERVAL '1 hour')
    GROUP BY window_start, window_end, player_id
),
hourly_revenue AS (
    SELECT window_start, player_id, SUM(revenue_usd) AS revenue
    FROM TUMBLE(revenue_events, event_time, INTERVAL '1 hour')
    GROUP BY window_start, player_id
)
SELECT
    a.window_start,
    a.window_end,
    COUNT(*)                                                    AS active_players,
    SUM(a.sessions)                                             AS total_sessions,
    AVG(a.sessions)                                             AS avg_sessions_per_player,
    SUM(COALESCE(r.revenue, 0))                                 AS total_revenue,
    SUM(COALESCE(r.revenue, 0)) / COUNT(*)                      AS arpu,
    COUNT(*) FILTER (WHERE COALESCE(r.revenue, 0) > 0)          AS paying_players,
    COUNT(*) FILTER (WHERE COALESCE(r.revenue, 0) > 0)::NUMERIC
        / COUNT(*)                                              AS conversion_rate
FROM hourly_activity a
LEFT JOIN hourly_revenue r
    ON a.player_id = r.player_id
   AND a.window_start = r.window_start
GROUP BY a.window_start, a.window_end;

This KPI view updates continuously as player activity flows in. Your BI tool queries it over the standard PostgreSQL wire protocol and always sees the current-hour state of the platform.
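The dashboard query itself is equally plain. For instance, the last 24 hourly windows (illustrative):

SELECT *
FROM platform_kpis_hourly
ORDER BY window_start DESC
LIMIT 24;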

Multi-Layer Data Platform Architecture Comparison

| Layer                  | Separate Tool Stack      | RisingWave-Unified Platform     |
|------------------------|--------------------------|---------------------------------|
| Event ingestion        | Kafka + Kafka Connect    | RisingWave CREATE SOURCE        |
| Stream processing      | Flink or Spark Streaming | RisingWave materialized views   |
| Analytical serving     | Redis / Druid            | RisingWave PostgreSQL interface |
| Historical storage     | S3 + Iceberg             | Iceberg sink from RisingWave    |
| Operational complexity | Very high                | Moderate (SQL-centric)          |
| Time to first insight  | Days to set up           | Minutes to first query          |

Sinking to the Data Lake

For long-term retention, ML training data, and historical analysis, sink processed views to Apache Iceberg:

CREATE SINK platform_kpis_to_iceberg
FROM platform_kpis_hourly
WITH (
    connector = 'iceberg',
    type = 'upsert',
    primary_key = 'window_start',
    warehouse.path = 's3://game-data-lake/platform-kpis/',
    catalog.type = 'rest',
    catalog.uri = 'http://iceberg-catalog:8181',
    database.name = 'analytics',
    table.name = 'platform_kpis_hourly'
);

Data scientists and ML engineers query historical KPIs from the Iceberg table using Spark or Trino, while the game backend and dashboards continue reading from RisingWave's always-fresh materialized views.
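On the lake side, a batch engine reads the same table with its own SQL dialect. A sketch of what that might look like in Trino, assuming a Trino catalog named iceberg that points at the REST catalog above:

SELECT date_trunc('week', window_start) AS week,
       SUM(total_revenue)               AS weekly_revenue
FROM iceberg.analytics.platform_kpis_hourly
GROUP BY 1
ORDER BY 1;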

FAQ

Q: How does RisingWave compare to Apache Flink for a live service game platform? A: Flink is a powerful stream processor, and Flink SQL covers many of the same use cases, but it runs on a JVM-centric stack that is typically operated by dedicated data engineers. RisingWave is wire-compatible with PostgreSQL, so any team member who knows SQL can write and query materialized views using standard Postgres tools, without learning a new framework. For gaming teams where SQL skills are more common than JVM development, that is a significant productivity advantage.

Q: Can the same RisingWave cluster serve both real-time operational queries and heavy analytical queries? A: Workload isolation is achieved by separating materialized view computation nodes from serving nodes. For heavy analytical workloads, direct them to the Iceberg sink and run them in Spark or Trino. RisingWave serves low-latency point queries and pre-aggregated views efficiently.

Q: How do we handle schema evolution as the game adds new event types over time? A: Add new fields to the Kafka schema using a schema registry with backward-compatible evolution. RisingWave CDC sources support schema changes in the upstream database automatically. For Kafka sources, re-create sources with the updated schema as needed.

Q: What is the disaster recovery story for a RisingWave-based platform? A: RisingWave persists state to object storage (S3, GCS, or Azure Blob). In a failure, it replays from the last checkpoint and from Kafka's offset retention. RTO depends on checkpoint frequency and Kafka retention settings, typically minutes.

Q: Can we run RisingWave on Kubernetes for our cloud-native game infrastructure? A: Yes. RisingWave is designed for Kubernetes deployment with separate compute and storage nodes. The RisingWave Kubernetes operator simplifies cluster management.

Consolidate Your Game Data Stack

A live service game data platform should not require a team of data engineers to maintain six separate systems. RisingWave consolidates ingest, processing, and serving into SQL materialized views, freeing your team to build features rather than pipelines.
