Real-Time Fraud Detection in Banking with Streaming SQL

Real-time fraud detection in banking requires analyzing every transaction the moment it arrives — comparing it against customer history, velocity rules, and behavioral patterns simultaneously. With streaming SQL, banks can run continuous queries over live transaction streams and flag suspicious activity in under 100 milliseconds, before funds leave the account.

Why Batch Processing Falls Short for Fraud Detection

Traditional fraud detection relied on nightly batch jobs or rule engines that processed transactions after the fact. By the time an anomaly surfaced, the damage was done. Card-not-present fraud, account takeovers, and synthetic identity attacks all exploit the window between transaction and review.

Modern fraud prevention requires sub-second decisioning. A customer's card being used in two countries within five minutes is a textbook fraud signal — but only if your system can compute that in real time, across millions of active accounts.

Streaming databases like RisingWave close that gap. They maintain continuously updated materialized views over live data streams, so fraud rules run as SQL queries that are always current.

Architecture Overview

A typical real-time fraud detection pipeline with RisingWave looks like this:

  1. Ingest: Transaction events flow from Kafka (produced by payment processors or core banking systems).
  2. Enrich: Reference data (customer profiles, merchant categories, spending limits) lives in RisingWave tables.
  3. Detect: Materialized views continuously evaluate fraud rules — velocity checks, geo-anomalies, amount thresholds.
  4. Alert: Suspicious transactions are written to a downstream Kafka topic for the fraud operations team or an automated block system.

Setting Up the Transaction Stream

First, create a source connected to your Kafka topic where payment transactions are published:

CREATE SOURCE transactions (
    transaction_id VARCHAR,
    account_id     VARCHAR,
    merchant_id    VARCHAR,
    amount         NUMERIC,
    currency       VARCHAR,
    country_code   VARCHAR,
    txn_time       TIMESTAMPTZ,
    channel        VARCHAR   -- 'online', 'pos', 'atm'
)
WITH (
    connector = 'kafka',
    topic = 'payment.transactions',
    properties.bootstrap.server = 'broker:9092',
    scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
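Before building views on top of the source, it can help to verify that events are actually arriving. A RisingWave source can be read ad hoc with a batch query (results depend on live topic traffic):

```sql
-- Ad-hoc batch read of the Kafka source to verify ingestion
SELECT transaction_id, account_id, amount, country_code, txn_time
FROM transactions
LIMIT 10;
```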

Next, maintain a reference table of known customer profiles:

CREATE TABLE customer_profiles (
    account_id        VARCHAR PRIMARY KEY,
    home_country      VARCHAR,
    avg_daily_spend   NUMERIC,
    typical_channels  VARCHAR[],
    risk_tier         VARCHAR
);
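In production this table would typically be fed by CDC from the core banking system or a periodic batch upsert; for local testing, a manual insert works. The row below is a hypothetical sample:

```sql
-- Hypothetical sample profile; real data would arrive via CDC or batch load
INSERT INTO customer_profiles
VALUES ('acct-1001', 'US', 250.00, ARRAY['online', 'pos'], 'low');
```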

Velocity and Geo-Anomaly Detection with Materialized Views

The core of fraud detection is velocity checking — how many transactions has this account made in the last few minutes, and do the locations make sense?

-- Detect accounts with high transaction velocity in a 5-minute window
CREATE MATERIALIZED VIEW fraud_velocity_alerts AS
SELECT
    account_id,
    COUNT(*)                          AS txn_count,
    SUM(amount)                       AS total_amount,
    COUNT(DISTINCT country_code)      AS distinct_countries,
    MAX(txn_time)                     AS last_txn_time,
    window_start,
    window_end
FROM TUMBLE(
    transactions,
    txn_time,
    INTERVAL '5 MINUTES'
)
GROUP BY
    account_id,
    window_start,
    window_end
HAVING
    COUNT(*) > 10
    OR COUNT(DISTINCT country_code) > 1
    OR SUM(amount) > 5000;

This materialized view is continuously maintained. As new transactions arrive, RisingWave incrementally updates the aggregates — no full re-scans. Any account triggering the HAVING clause immediately becomes visible to downstream consumers.
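Because the view is an ordinary relation, analysts can also query it directly from any PostgreSQL client. For example (illustrative lookback window):

```sql
-- Most recent alert windows, highest spend first
SELECT account_id, txn_count, total_amount, distinct_countries, window_end
FROM fraud_velocity_alerts
WHERE window_end > NOW() - INTERVAL '30 MINUTES'
ORDER BY total_amount DESC
LIMIT 20;
```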

For enrichment — joining each live transaction against the current customer profile — use a process-time temporal join:

CREATE MATERIALIZED VIEW enriched_fraud_signals AS
SELECT
    t.transaction_id,
    t.account_id,
    t.amount,
    t.country_code,
    t.txn_time,
    cp.home_country,
    cp.avg_daily_spend,
    cp.risk_tier,
    t.country_code <> cp.home_country  AS geo_mismatch,
    t.amount > cp.avg_daily_spend * 3  AS amount_anomaly
FROM transactions t
JOIN customer_profiles FOR SYSTEM_TIME AS OF PROCTIME() AS cp
    ON t.account_id = cp.account_id
WHERE
    t.country_code <> cp.home_country
    OR t.amount > cp.avg_daily_spend * 3;
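The boolean signals can be combined into a simple additive risk score in a downstream view, which gives the response layer one number to threshold on. The weights below are illustrative, not tuned:

```sql
-- Additive risk score over the enriched signals (illustrative weights)
CREATE MATERIALIZED VIEW scored_fraud_signals AS
SELECT
    transaction_id,
    account_id,
    amount,
    txn_time,
    (CASE WHEN geo_mismatch THEN 50 ELSE 0 END
     + CASE WHEN amount_anomaly THEN 30 ELSE 0 END
     + CASE WHEN risk_tier = 'high' THEN 20 ELSE 0 END) AS risk_score
FROM enriched_fraud_signals;
```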

Streaming Results to Fraud Operations

Once fraud signals are detected, they need to reach the response layer immediately. Sink the enriched alerts back to Kafka:

CREATE SINK fraud_alerts_sink
FROM enriched_fraud_signals
WITH (
    connector = 'kafka',
    topic = 'fraud.alerts',
    properties.bootstrap.server = 'broker:9092'
)
FORMAT PLAIN ENCODE JSON;
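If downstream consumers need exactly one current record per transaction (for example, to feed a compacted topic), an upsert-formatted sink keyed on the transaction ID is an option. This is a sketch; adjust the topic name and key to your alerting semantics:

```sql
-- Upsert sink: one current record per transaction_id
CREATE SINK fraud_alerts_upsert_sink
FROM enriched_fraud_signals
WITH (
    connector = 'kafka',
    topic = 'fraud.alerts.compacted',
    properties.bootstrap.server = 'broker:9092',
    primary_key = 'transaction_id'
)
FORMAT UPSERT ENCODE JSON;
```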

Fraud operations dashboards and automated block systems subscribe to this topic and act within milliseconds of a signal being generated.

Batch vs. Streaming Fraud Detection: A Comparison

Dimension                      | Batch Processing          | Streaming SQL (RisingWave)
-------------------------------|---------------------------|---------------------------
Detection latency              | Minutes to hours          | Milliseconds
Fraud caught before settlement | Rarely                    | Consistently
Rule update mechanism          | Re-deploy ETL job         | Redefine a materialized view
Infrastructure complexity      | High (Spark + schedulers) | Low (SQL only)
Historical backfill            | Native                    | Via bounded source
False positive tuning          | Slow feedback loop        | Immediate feedback
Operational cost               | High (batch clusters)     | Low (always-on stream)

FAQ

Q: Can RisingWave handle the transaction volumes of a large bank? A: RisingWave is horizontally scalable and can ingest millions of events per second across distributed compute nodes. It is deployed in production at financial institutions processing tens of millions of daily transactions.

Q: How do we handle schema changes in the transaction payload? A: RisingWave supports schema evolution for Kafka sources using schema registries. You can use FORMAT PLAIN ENCODE AVRO or FORMAT PLAIN ENCODE PROTOBUF with a Confluent Schema Registry endpoint to handle evolving schemas automatically.
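A sketch of an Avro-backed source follows; with a schema registry, the column list is derived from the registered schema rather than declared inline. The registry URL is a placeholder:

```sql
-- Columns are resolved from the registry, so no column list is declared
CREATE SOURCE transactions_avro
WITH (
    connector = 'kafka',
    topic = 'payment.transactions',
    properties.bootstrap.server = 'broker:9092'
)
FORMAT PLAIN ENCODE AVRO (
    schema.registry = 'http://schema-registry:8081'
);
```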

Q: What happens if the streaming pipeline has downtime? A: RisingWave maintains consumer offsets for Kafka sources. On restart, it resumes from the last committed offset, ensuring no transactions are missed and aggregates remain consistent.

Q: Can we backtest fraud rules against historical data? A: Yes. Point a bounded Kafka source or a PostgreSQL/S3 connector at historical data to replay events through the same materialized view definitions, allowing rule validation before production deployment.
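One way to sketch this replay: create a second source over the same topic that starts from the earliest retained offset instead of the latest, then point the same view definitions at it:

```sql
-- Same schema as the live source, but replaying from the start of retention
CREATE SOURCE transactions_history (
    transaction_id VARCHAR,
    account_id     VARCHAR,
    merchant_id    VARCHAR,
    amount         NUMERIC,
    currency       VARCHAR,
    country_code   VARCHAR,
    txn_time       TIMESTAMPTZ,
    channel        VARCHAR
)
WITH (
    connector = 'kafka',
    topic = 'payment.transactions',
    properties.bootstrap.server = 'broker:9092',
    scan.startup.mode = 'earliest'
)
FORMAT PLAIN ENCODE JSON;
```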

Q: Is the PostgreSQL interface useful for fraud analysts? A: Absolutely. Fraud analysts can query materialized views directly using any PostgreSQL-compatible client — psql, DBeaver, Metabase — without learning a new query language. This dramatically shortens the time from hypothesis to tested rule.

Get Started

Real-time fraud detection is one of the highest-ROI applications of streaming SQL in finance. The combination of sub-second detection, familiar SQL syntax, and seamless Kafka integration makes RisingWave an ideal foundation.
