Building a Real-Time Data Quality Monitoring System

Your analytics dashboard reports a 15% drop in checkout conversions. You escalate to the data team. Three hours later, the root cause turns out to be an upstream ETL job that started emitting orders with null customer_id values two days ago. The data was flowing, dashboards were updating, and nobody noticed because nobody was watching the data itself.

Data quality failures are silent killers. Unlike infrastructure outages that trigger obvious alerts, bad data silently corrupts downstream reports, machine learning models, and business decisions. By the time someone spots an anomaly in a KPI, the bad data has already spread.

The fix is not more manual audits. It is continuous, automated data quality monitoring that runs the same checks your data team would run manually, but in real time, against every row as it arrives. RisingWave makes this possible with materialized views that evaluate your quality rules incrementally, updating in milliseconds as new data flows in.

In this guide, you will build a complete real-time data quality monitoring system using RisingWave. You will track null rates, detect schema violations, verify referential integrity, enforce value range constraints, and catch duplicate records, all in SQL, all verified against RisingWave 2.8.0.

Why Streaming SQL for Data Quality Monitoring?

Traditional data quality tools run batch checks on a schedule: every hour, every night, or before a report refresh. This creates an unacceptably wide detection window for production systems where data flows continuously.

Streaming SQL closes that window. RisingWave's materialized views evaluate quality rules incrementally: when a new row arrives in a source table, only the affected views are updated rather than the entire dataset being rescanned. The result is quality violations surfaced in milliseconds, not hours.

This approach has several advantages over batch-based quality tools:

  • Continuous coverage: every row is checked at ingestion time, not sampled on a schedule
  • Single language: SQL handles ingestion, quality logic, and alerting without separate tools
  • Composable checks: materialized views build on each other, creating layered quality pipelines
  • Zero operational overhead: no separate quality service to deploy, monitor, or scale

Batch vs. Streaming Data Quality

 Aspect                  | Batch Quality Tools    | Streaming SQL (RisingWave)
-------------------------+------------------------+------------------------------
 Detection latency       | Minutes to hours       | Milliseconds
 Coverage                | Sampled or scheduled   | Every row, continuously
 Rule language           | Custom DSL or Python   | Standard SQL
 State management        | External scheduler     | Built-in incremental updates
 Alerting integration    | Separate system        | SQL UNION + Kafka sink
 Operational complexity  | Multiple services      | Single streaming database

Data Model: The Tables We Will Monitor

For this guide, we will monitor three tables: a dq_customers reference table, a dq_orders fact table, and a dq_events behavioral log. In production, these would be backed by Kafka sources; here we use regular tables so you can insert test data directly.

CREATE TABLE dq_customers (
    customer_id VARCHAR,
    name        VARCHAR,
    email       VARCHAR,
    created_at  TIMESTAMPTZ
);

CREATE TABLE dq_orders (
    order_id    VARCHAR,
    customer_id VARCHAR,
    product_id  VARCHAR,
    amount      DECIMAL,
    quantity    INT,
    status      VARCHAR,
    country     VARCHAR,
    event_time  TIMESTAMPTZ
);

CREATE TABLE dq_events (
    event_id    VARCHAR,
    user_id     VARCHAR,
    event_type  VARCHAR,
    payload     VARCHAR,
    session_id  VARCHAR,
    received_at TIMESTAMPTZ
);

Now insert some sample data. Quality issues are seeded in intentionally: missing names and emails, null and orphaned foreign keys, a negative amount, a zero quantity, an outsized amount, and duplicate event IDs.

-- Reference customers
INSERT INTO dq_customers VALUES
    ('C001', 'Alice Smith', 'alice@example.com', '2026-01-01 00:00:00+00'),
    ('C002', 'Bob Jones',   NULL,                '2026-01-02 00:00:00+00'),
    ('C003', 'Carol White', 'carol@example.com', '2026-01-03 00:00:00+00'),
    ('C004', NULL,          'dan@example.com',   '2026-01-04 00:00:00+00'),
    ('C005', 'Eve Brown',   'eve@example.com',   '2026-01-05 00:00:00+00');

-- Orders with quality issues seeded in
INSERT INTO dq_orders VALUES
    ('O001', 'C001', 'P100',   49.99, 1, 'completed', 'US', '2026-04-01 10:00:00+00'),
    ('O002', 'C002', 'P101',   -5.00, 2, 'completed', 'UK', '2026-04-01 10:01:00+00'),
    ('O003', 'C003', 'P102',  199.00, 0, 'shipped',   'DE', '2026-04-01 10:02:00+00'),
    ('O004',  NULL,  'P103',   75.00, 1, 'pending',   'FR', '2026-04-01 10:03:00+00'),
    ('O005', 'C999', 'P104',   25.00, 1, 'completed', 'US', '2026-04-01 10:04:00+00'),
    ('O006', 'C001',  NULL,   150.00, 1, 'completed', 'US', '2026-04-01 10:05:00+00'),
    ('O007', 'C002', 'P105',   89.99, 3, 'refunded',  'UK', '2026-04-01 10:06:00+00'),
    ('O008', 'C003', 'P106', 9999.00, 1, 'completed', 'DE', '2026-04-01 10:07:00+00'),
    ('O009', 'C005', 'P107',   39.99, 1, 'pending',   'AU', '2026-04-01 10:08:00+00'),
    ('O010', 'C001', 'P100',   49.99, 1, 'completed', 'US', '2026-04-01 10:09:00+00');

-- Events with duplicates and a null user_id
INSERT INTO dq_events VALUES
    ('E001', 'U001', 'page_view', '{"page":"/home"}',   'S001', '2026-04-01 10:00:00+00'),
    ('E002', 'U002', 'purchase',  '{"item":"P100"}',    'S002', '2026-04-01 10:01:00+00'),
    ('E001', 'U001', 'page_view', '{"page":"/home"}',   'S001', '2026-04-01 10:00:00+00'),
    ('E003', 'U003', 'click',     '{"btn":"checkout"}', 'S003', '2026-04-01 10:02:00+00'),
    ('E002', 'U002', 'purchase',  '{"item":"P100"}',    'S002', '2026-04-01 10:01:00+00'),
    ('E004', 'U004', 'page_view', '{"page":"/cart"}',   'S004', '2026-04-01 10:03:00+00'),
    ('E005',  NULL,  'click',     '{"btn":"buy"}',      'S005', '2026-04-01 10:04:00+00');

Check 1: Null Rate Monitoring

Null values in columns that should be populated are one of the most common data quality failures. A null customer_id on an order, or a null email on a customer record, means that downstream joins will silently drop rows and aggregations will undercount.

The following materialized view tracks null rates continuously across the columns you care about:

CREATE MATERIALIZED VIEW dq_null_rate_monitor AS
SELECT
    'dq_orders'           AS table_name,
    'customer_id'         AS column_name,
    COUNT(*)              AS total_rows,
    COUNT(customer_id)    AS non_null_count,
    COUNT(*) - COUNT(customer_id) AS null_count,
    ROUND(
        (COUNT(*) - COUNT(customer_id))::NUMERIC * 100.0 / NULLIF(COUNT(*), 0),
        2
    )                     AS null_rate_pct
FROM dq_orders
UNION ALL
SELECT
    'dq_orders'           AS table_name,
    'product_id'          AS column_name,
    COUNT(*)              AS total_rows,
    COUNT(product_id)     AS non_null_count,
    COUNT(*) - COUNT(product_id)  AS null_count,
    ROUND(
        (COUNT(*) - COUNT(product_id))::NUMERIC * 100.0 / NULLIF(COUNT(*), 0),
        2
    )                     AS null_rate_pct
FROM dq_orders
UNION ALL
SELECT
    'dq_customers'        AS table_name,
    'email'               AS column_name,
    COUNT(*)              AS total_rows,
    COUNT(email)          AS non_null_count,
    COUNT(*) - COUNT(email)       AS null_count,
    ROUND(
        (COUNT(*) - COUNT(email))::NUMERIC * 100.0 / NULLIF(COUNT(*), 0),
        2
    )                     AS null_rate_pct
FROM dq_customers
UNION ALL
SELECT
    'dq_customers'        AS table_name,
    'name'                AS column_name,
    COUNT(*)              AS total_rows,
    COUNT(name)           AS non_null_count,
    COUNT(*) - COUNT(name)        AS null_count,
    ROUND(
        (COUNT(*) - COUNT(name))::NUMERIC * 100.0 / NULLIF(COUNT(*), 0),
        2
    )                     AS null_rate_pct
FROM dq_customers;

Query it:

SELECT table_name, column_name, total_rows, null_count, null_rate_pct
FROM dq_null_rate_monitor
ORDER BY null_rate_pct DESC;
  table_name  | column_name | total_rows | null_count | null_rate_pct
--------------+-------------+------------+------------+---------------
 dq_customers | name        |          5 |          1 |          20.0
 dq_customers | email       |          5 |          1 |          20.0
 dq_orders    | product_id  |         10 |          1 |          10.0
 dq_orders    | customer_id |         10 |          1 |          10.0

The NULLIF(COUNT(*), 0) guard prevents a division-by-zero error if the table is empty, which matters when the stream first starts and no rows have arrived yet. Because this is a materialized view, these percentages update automatically every time a new row is inserted or deleted from the source tables.
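If you want to watch the incremental update happen, you can insert a throwaway row and remove it again so the outputs later in this guide still match the seeded data. Order O011 below is hypothetical and not used anywhere else:

-- Hypothetical throwaway row: another order with a null customer_id
INSERT INTO dq_orders VALUES
    ('O011', NULL, 'P108', 19.99, 1, 'pending', 'US', '2026-04-01 10:10:00+00');

-- The null rate for dq_orders.customer_id should now reflect 2 nulls out of
-- 11 rows (about 18.18%), with no manual refresh needed.
SELECT null_count, total_rows, null_rate_pct
FROM dq_null_rate_monitor
WHERE table_name = 'dq_orders' AND column_name = 'customer_id';

-- Remove the throwaway row; the view drops back to the original figures.
DELETE FROM dq_orders WHERE order_id = 'O011';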

Check 2: Schema Violation Detection

Schema violations are more subtle than nulls. A field may be present but contain a value in the wrong format: an email address missing the @ sign, a name with a single character, or a phone number with letters in it. These pass NOT NULL constraints but corrupt downstream lookups.

CREATE MATERIALIZED VIEW dq_schema_violations AS
SELECT
    customer_id,
    email,
    name,
    created_at,
    CASE
        WHEN email IS NOT NULL AND email NOT LIKE '%@%.%' THEN 'invalid_email_format'
        WHEN name  IS NOT NULL AND LENGTH(name) < 2       THEN 'name_too_short'
        ELSE NULL
    END AS violation_type
FROM dq_customers
WHERE
    (email IS NOT NULL AND email NOT LIKE '%@%.%')
    OR (name IS NOT NULL AND LENGTH(name) < 2);

Insert two customers with schema problems and query the view:

INSERT INTO dq_customers VALUES
    ('C006', 'Frank Lee', 'frankATbadformat', '2026-04-01 10:00:00+00'),
    ('C007', 'G',         'g@example.com',    '2026-04-01 10:00:00+00');

SELECT customer_id, email, name, violation_type
FROM dq_schema_violations
ORDER BY customer_id;
 customer_id |      email       |   name    |    violation_type
-------------+------------------+-----------+----------------------
 C006        | frankATbadformat | Frank Lee | invalid_email_format
 C007        | g@example.com    | G         | name_too_short

The view continuously maintains the list of records that violate format rules. When a customer record is corrected upstream (for example, the email field is updated to a valid address), that row disappears from dq_schema_violations as soon as the update is processed.

You can extend this pattern to any format validation your domain requires: phone number patterns with SIMILAR TO, date range constraints, or field length limits.
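As a sketch of the same pattern with SIMILAR TO, the view below flags country codes that are not two uppercase letters. The two-letter rule is an assumed business constraint, not one defined earlier in this guide:

-- Sketch: flag country codes in dq_orders that are not exactly two uppercase letters.
CREATE MATERIALIZED VIEW dq_country_format_violations AS
SELECT
    order_id,
    country,
    'invalid_country_code' AS violation_type
FROM dq_orders
WHERE country IS NOT NULL
  AND country NOT SIMILAR TO '[A-Z]{2}';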

Check 3: Referential Integrity Verification

Foreign key enforcement is not always possible at the source layer, especially in event-driven architectures where events arrive before their parent records, or where the message broker does not enforce schema relationships. A continuous LEFT JOIN catches these orphans as they arrive.

CREATE MATERIALIZED VIEW dq_referential_integrity AS
SELECT
    o.order_id,
    o.customer_id         AS referenced_id,
    o.amount,
    o.event_time,
    'customer_id -> dq_customers' AS fk_relationship,
    CASE
        WHEN o.customer_id IS NULL   THEN 'null_foreign_key'
        WHEN c.customer_id IS NULL   THEN 'orphaned_record'
        ELSE NULL
    END                   AS violation_type
FROM dq_orders o
LEFT JOIN dq_customers c ON o.customer_id = c.customer_id
WHERE
    o.customer_id IS NULL
    OR c.customer_id IS NULL;

Query it:

SELECT order_id, referenced_id, fk_relationship, violation_type
FROM dq_referential_integrity
ORDER BY order_id;
 order_id | referenced_id |       fk_relationship       |  violation_type
----------+---------------+-----------------------------+------------------
 O004     |               | customer_id -> dq_customers | null_foreign_key
 O005     | C999          | customer_id -> dq_customers | orphaned_record

Order O004 has a null customer_id, and Order O005 references customer C999, which does not exist in dq_customers. Both are caught immediately at insertion time.

The key insight here is that RisingWave maintains the LEFT JOIN incrementally. When a new order arrives, RisingWave looks up the customer in the current state of dq_customers. If the customer does not exist, the join produces a null on the right side and the WHERE clause fires. No full table scan needed.

Check 4: Value Range Enforcement

Business rules often define acceptable ranges for numeric fields: order amounts must be positive, quantities must be greater than zero, and amounts above a certain threshold may need additional review. Encoding these as a materialized view gives you a continuously maintained list of out-of-range records.

CREATE MATERIALIZED VIEW dq_range_violations AS
SELECT
    order_id,
    customer_id,
    amount,
    quantity,
    status,
    event_time,
    CASE
        WHEN amount <= 0   THEN 'negative_or_zero_amount'
        WHEN amount > 5000 THEN 'amount_exceeds_max'
        WHEN quantity <= 0 THEN 'non_positive_quantity'
        WHEN status NOT IN ('pending','completed','shipped','refunded','cancelled')
                           THEN 'invalid_status_value'
        ELSE NULL
    END AS violation_type
FROM dq_orders
WHERE
    amount <= 0
    OR amount > 5000
    OR quantity <= 0
    OR status NOT IN ('pending','completed','shipped','refunded','cancelled');

Query it:

SELECT order_id, amount, quantity, status, violation_type
FROM dq_range_violations
ORDER BY order_id;
 order_id | amount  | quantity |  status   |     violation_type
----------+---------+----------+-----------+-------------------------
 O002     |   -5.00 |        2 | completed | negative_or_zero_amount
 O003     |  199.00 |        0 | shipped   | non_positive_quantity
 O008     | 9999.00 |        1 | completed | amount_exceeds_max

Three violations surface immediately: a negative amount on O002, a zero quantity on O003, and an amount far above the $5,000 maximum on O008. Each of these could be a data pipeline bug, a fraudulent transaction, or an upstream schema change.

Because the CASE expression handles multiple violation types in a single pass, the view is efficient even with many rules. Add more conditions to the WHERE clause and the matching CASE branch as your business rules evolve.

Check 5: Duplicate Detection

Duplicates are particularly dangerous in event streams. At-least-once delivery guarantees in Kafka mean producers can retry and send the same event multiple times. Without deduplication, your counts, totals, and user-level aggregations will be inflated.

CREATE MATERIALIZED VIEW dq_duplicate_detection AS
SELECT
    event_id,
    user_id,
    event_type,
    received_at,
    COUNT(*) OVER (PARTITION BY event_id) AS occurrence_count
FROM dq_events
WHERE event_id IN (
    SELECT event_id
    FROM dq_events
    GROUP BY event_id
    HAVING COUNT(*) > 1
);

Query it:

SELECT event_id, user_id, event_type, occurrence_count
FROM dq_duplicate_detection
ORDER BY event_id, received_at;
 event_id | user_id | event_type | occurrence_count
----------+---------+------------+------------------
 E001     | U001    | page_view  |                2
 E001     | U001    | page_view  |                2
 E002     | U002    | purchase   |                2
 E002     | U002    | purchase   |                2

Events E001 and E002 each appear twice. The occurrence_count column gives you the exact count, making it easy to build a downstream alert: any event ID with occurrence_count > 1 needs investigation.

For high-volume event streams, you may want to scope duplicate detection to a time window rather than scanning the entire history. Use a tumbling window with TUMBLE in your query to check for duplicates only within recent batches. See the RisingWave windowing functions documentation for details.
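A windowed sketch might look like this; the one-hour window is an arbitrary example size:

-- Sketch: duplicate detection scoped to one-hour tumbling windows.
-- Only event IDs repeated within the same window are flagged.
CREATE MATERIALIZED VIEW dq_duplicates_windowed AS
SELECT
    event_id,
    window_start,
    window_end,
    COUNT(*) AS occurrence_count
FROM TUMBLE(dq_events, received_at, INTERVAL '1 hour')
GROUP BY event_id, window_start, window_end
HAVING COUNT(*) > 1;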

Assembling a Unified Quality Dashboard

Five separate materialized views are a good start, but your on-call data engineer should not have to query five views to understand the state of data quality. A UNION ALL view combines the checks into a single dashboard with a consistent schema (the schema-violation view can be folded in with one more branch).

CREATE MATERIALIZED VIEW dq_unified_quality_dashboard AS

-- Null rate violations (threshold: > 5%)
SELECT
    'null_rate'        AS check_type,
    table_name,
    column_name        AS subject,
    null_rate_pct::VARCHAR AS metric_value,
    CASE
        WHEN null_rate_pct >= 20 THEN 'CRITICAL'
        WHEN null_rate_pct >= 10 THEN 'WARNING'
        ELSE 'OK'
    END                AS severity
FROM dq_null_rate_monitor
WHERE null_rate_pct > 5

UNION ALL

-- Range violations
SELECT
    'range_violation'  AS check_type,
    'dq_orders'        AS table_name,
    violation_type     AS subject,
    COUNT(*)::VARCHAR  AS metric_value,
    'WARNING'          AS severity
FROM dq_range_violations
GROUP BY violation_type

UNION ALL

-- Referential integrity violations
SELECT
    'ref_integrity'    AS check_type,
    'dq_orders'        AS table_name,
    violation_type     AS subject,
    COUNT(*)::VARCHAR  AS metric_value,
    'WARNING'          AS severity
FROM dq_referential_integrity
GROUP BY violation_type

UNION ALL

-- Duplicate events
SELECT
    'duplicate'        AS check_type,
    'dq_events'        AS table_name,
    'duplicate_event_ids' AS subject,
    COUNT(DISTINCT event_id)::VARCHAR AS metric_value,
    'WARNING'          AS severity
FROM dq_duplicate_detection;

Query the dashboard:

SELECT check_type, table_name, subject, metric_value, severity
FROM dq_unified_quality_dashboard
ORDER BY severity, check_type;
   check_type    |  table_name  |         subject         | metric_value | severity
-----------------+--------------+-------------------------+--------------+----------
 null_rate       | dq_customers | name                    | 20.0         | CRITICAL
 null_rate       | dq_customers | email                   | 20.0         | CRITICAL
 duplicate       | dq_events    | duplicate_event_ids     | 2            | WARNING
 null_rate       | dq_orders    | product_id              | 10.0         | WARNING
 null_rate       | dq_orders    | customer_id             | 10.0         | WARNING
 range_violation | dq_orders    | negative_or_zero_amount | 1            | WARNING
 range_violation | dq_orders    | amount_exceeds_max      | 1            | WARNING
 range_violation | dq_orders    | non_positive_quantity   | 1            | WARNING
 ref_integrity   | dq_orders    | null_foreign_key        | 1            | WARNING
 ref_integrity   | dq_orders    | orphaned_record         | 1            | WARNING

CRITICAL rows sort first. At a glance, the dashboard shows that 20% of customer records are missing a name or email, two distinct event IDs have duplicates, and multiple order-level violations exist across range and referential integrity checks.

Routing Alerts When Quality Degrades

A quality dashboard is only useful if someone is notified when quality degrades past an acceptable threshold. Route CRITICAL and WARNING rows from dq_unified_quality_dashboard to a Kafka topic using a RisingWave Kafka sink:

CREATE SINK dq_alerts_to_kafka
FROM dq_unified_quality_dashboard
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'kafka:9092',
    topic = 'data-quality-alerts'
)
FORMAT PLAIN ENCODE JSON (force_append_only = 'true');

Each row that appears in dq_unified_quality_dashboard is emitted as a JSON message on the data-quality-alerts topic. The force_append_only option is needed because the dashboard view is not append-only (its counts change in place as violations arrive); with it, RisingWave emits inserts and updated values as plain messages. A downstream consumer can route CRITICAL severity rows to PagerDuty and WARNING rows to a Slack channel.

The JSON payload looks like this:

{
  "check_type": "null_rate",
  "table_name": "dq_customers",
  "subject": "name",
  "metric_value": "20.0",
  "severity": "CRITICAL"
}

A lightweight Python consumer handles the Slack side of that routing; branching on severity to call PagerDuty instead is a straightforward extension:

import json
import requests
from kafka import KafkaConsumer

# Consume quality alerts emitted by the RisingWave Kafka sink.
consumer = KafkaConsumer(
    'data-quality-alerts',
    bootstrap_servers='kafka:9092',
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)

SLACK_WEBHOOK = "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

for message in consumer:
    alert = message.value
    severity = alert.get("severity", "WARNING")
    # Format a human-readable summary and post it to the Slack webhook.
    text = (
        f"[{severity}] Data quality issue detected\n"
        f"Check: {alert['check_type']} | Table: {alert['table_name']}\n"
        f"Subject: {alert['subject']} | Value: {alert['metric_value']}"
    )
    requests.post(SLACK_WEBHOOK, json={"text": text})

RisingWave handles the hard part: continuously evaluating quality rules and emitting only the rows that represent new violations. The consumer is stateless and trivial to replace with any notification endpoint.

Architecture Overview

The complete pipeline looks like this:

graph LR
    A[dq_customers table] --> B[RisingWave]
    C[dq_orders table] --> B
    D[dq_events table] --> B
    B --> E[dq_null_rate_monitor MV]
    B --> F[dq_schema_violations MV]
    B --> G[dq_referential_integrity MV]
    B --> H[dq_range_violations MV]
    B --> I[dq_duplicate_detection MV]
    E --> J[dq_unified_quality_dashboard MV]
    F --> J
    G --> J
    H --> J
    I --> J
    J --> K[Kafka Sink: data-quality-alerts]
    K --> L[Slack Consumer]
    K --> M[PagerDuty Consumer]

In production, replace the three tables with Kafka sources or CDC sources from PostgreSQL or MySQL. The materialized views and sink remain identical.
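For example, the orders table could be swapped for a Kafka-backed source with the same schema. The topic name, broker address, and startup mode below are placeholders for your environment:

CREATE SOURCE dq_orders (
    order_id    VARCHAR,
    customer_id VARCHAR,
    product_id  VARCHAR,
    amount      DECIMAL,
    quantity    INT,
    status      VARCHAR,
    country     VARCHAR,
    event_time  TIMESTAMPTZ
) WITH (
    connector = 'kafka',
    topic = 'orders',                            -- placeholder topic name
    properties.bootstrap.server = 'kafka:9092',  -- placeholder broker address
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;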

Production Considerations

Thresholds evolve with your data. Start with conservative thresholds (flag anything above 1% null rate as WARNING, 5% as CRITICAL) and tune them based on observed baseline rates. A table that normally runs at 0.1% null rate needs a tighter threshold than one where nulls are expected for optional fields.

Add event_time filtering for high-volume tables. If your order table receives millions of rows per day, the range and referential integrity views scan all historical rows. Add a filter like WHERE event_time >= NOW() - INTERVAL '24 hours' to scope checks to recent data and reduce materialized view maintenance cost.
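A recency-scoped variant of the range check might look like the sketch below. The 24-hour horizon is an arbitrary example, and the NOW() comparison acts as a temporal filter that ages old rows out of the view:

-- Sketch: range violations scoped to the last 24 hours via a temporal filter.
CREATE MATERIALIZED VIEW dq_range_violations_recent AS
SELECT
    order_id,
    customer_id,
    amount,
    quantity,
    status,
    event_time
FROM dq_orders
WHERE event_time >= NOW() - INTERVAL '24 hours'
  AND (
      amount <= 0
      OR amount > 5000
      OR quantity <= 0
  );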

Use separate views per severity level. Rather than putting severity logic inside the unified dashboard view, consider creating a dq_critical_violations view that filters only CRITICAL rows and sinks that to a high-priority alert channel. Reserve the unified dashboard for read-only querying from a BI tool.
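A minimal sketch of that split, layered on the existing dashboard view (the data-quality-critical topic name is a placeholder):

-- Sketch: a CRITICAL-only view, sunk to its own high-priority topic.
CREATE MATERIALIZED VIEW dq_critical_violations AS
SELECT check_type, table_name, subject, metric_value, severity
FROM dq_unified_quality_dashboard
WHERE severity = 'CRITICAL';

CREATE SINK dq_critical_alerts_to_kafka
FROM dq_critical_violations
WITH (
    connector = 'kafka',
    properties.bootstrap.server = 'kafka:9092',
    topic = 'data-quality-critical'
)
FORMAT PLAIN ENCODE JSON (force_append_only = 'true');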

Combine with schema registry enforcement. For Kafka-backed sources, pair RisingWave's runtime quality checks with a schema registry like Confluent Schema Registry or Apicurio to catch format violations before they reach RisingWave. RisingWave's checks then act as a second layer of defense for semantic violations that schema registries cannot express.

What Is Real-Time Data Quality Monitoring with Streaming SQL?

Real-time data quality monitoring with streaming SQL is the practice of defining data quality rules as continuous SQL queries that evaluate every row as it arrives, rather than running batch audits on a schedule. In RisingWave, these rules are implemented as materialized views that are maintained incrementally: when a new row is inserted or updated in a source table, only the affected portions of the view are recomputed. This means quality violations surface within milliseconds of the bad data arriving, enabling data teams to detect and route alerts to on-call channels before bad data propagates downstream.

How Do Materialized Views Differ from Scheduled Quality Jobs?

A scheduled quality job queries the database on a fixed interval (say, every 15 minutes) and scans all rows within that window. If bad data arrives at minute 1 and the job runs at minute 15, your detection latency is up to 14 minutes. A materialized view, by contrast, is maintained incrementally: RisingWave updates the view the moment a new row arrives in the source. The view always reflects the current state of the data, so querying it at any moment returns the up-to-date list of violations with no additional computation overhead.

Can I Monitor Data Quality on Kafka Topics Directly?

Yes. Replace the CREATE TABLE statements in this guide with CREATE SOURCE statements that read from Kafka topics. The materialized views that define your quality checks are identical whether the source is a table or a Kafka-backed source. In fact, monitoring Kafka topics directly is the preferred production pattern, because it means quality checks run on the raw event stream before any transformation occurs. See the RisingWave Kafka source documentation for the connector configuration.

How Do I Handle False Positives in Quality Alerts?

False positives usually fall into two categories. The first is threshold misconfiguration: your null rate threshold is set too tight for a field where nulls are acceptable (for example, a middle name field). Fix this by adjusting the threshold in the dq_unified_quality_dashboard WHERE clause. The second is timing issues in event-driven systems: an order arrives before its parent customer record because two producers are writing to Kafka at slightly different rates. Handle this by adding a grace period to your referential integrity check using a time-windowed filter, for example only flagging orphaned records where the order is older than five minutes. This gives late-arriving parent records time to land before triggering an alert.
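A sketch of that grace period, assuming your RisingWave version accepts a NOW()-based upper bound as a temporal filter (the five-minute interval is an example value):

-- Sketch: only flag orphaned orders older than five minutes, giving
-- late-arriving customer records time to land before an alert fires.
CREATE MATERIALIZED VIEW dq_orphaned_orders_graced AS
SELECT
    o.order_id,
    o.customer_id AS referenced_id,
    o.event_time
FROM dq_orders o
LEFT JOIN dq_customers c ON o.customer_id = c.customer_id
WHERE o.customer_id IS NOT NULL
  AND c.customer_id IS NULL
  AND o.event_time < NOW() - INTERVAL '5 minutes';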

Conclusion

Data quality failures are silent until they are catastrophic. Building a real-time monitoring system with RisingWave and streaming SQL gives your data team continuous visibility without the operational overhead of scheduling, orchestrating, and maintaining separate quality jobs.

The pattern in this guide scales from a single table to an entire data platform:

  • Null rate monitors surface missing values the moment bad rows arrive, with percentage thresholds that adapt as data volumes grow
  • Schema violation checks catch format problems that pass NOT NULL constraints but corrupt downstream lookups
  • Referential integrity views expose orphaned records and null foreign keys using incremental LEFT JOINs
  • Range violation views encode business rules directly in SQL, producing a continuously updated list of out-of-bounds records
  • Duplicate detection identifies repeated event IDs in streaming event logs before they inflate aggregations
  • A unified dashboard view consolidates all checks into a single queryable surface with severity-based ordering
  • A Kafka sink routes violations to Slack, PagerDuty, or any downstream consumer in real time

All of this runs in SQL. No custom application code for the quality logic, no separate scheduling layer, and no batch window during which bad data goes undetected.


Ready to add real-time data quality monitoring to your pipeline? Try RisingWave Cloud free, no credit card required. Sign up here.

Join our Slack community to share your data quality patterns and connect with other data engineers building on streaming SQL.
