Real-Time Fraud Detection: Flink Java vs RisingWave SQL

Real-time fraud detection with Flink Java requires writing, testing, and deploying roughly 100 lines of boilerplate code per pattern. RisingWave SQL expresses the same patterns in about 10 lines of standard SQL that a data engineer can read, review, and change without touching a JVM.

This article builds two concrete fraud detection patterns side by side: a velocity check (five or more transactions from the same user in a five-minute window) and a merchant anomaly (a user hitting the same merchant four or more times in ten minutes). Every SQL statement was verified against RisingWave 2.8.0. Every Flink snippet represents production-grade DataStream API code you would actually write for a real deployment.

Why Fraud Detection Is a Streaming Problem

Fraud happens in seconds. Authorization networks give payment processors roughly 100 to 200 milliseconds to decide whether to approve a transaction. Batch detection that runs every hour or overnight catches fraud after the damage is done.

The right architecture processes each transaction the moment it arrives, evaluates it against known patterns, and emits an alert before the authorization window closes. That means a stream processing system, not a nightly ETL job.

Both Apache Flink and RisingWave can do this. The question is how much code it takes, and what the operational complexity looks like after it ships.

The Setup: One Source, Two Patterns

The source data is a stream of payment transactions. Each event has a transaction ID, a user ID, an amount, a merchant ID, and a timestamp.

In production this stream comes from Apache Kafka. For local development and testing, a simple table works identically.

RisingWave: create the source

CREATE TABLE fjava_transactions (
    transaction_id    VARCHAR,
    user_id           VARCHAR,
    amount            DOUBLE PRECISION,
    merchant_id       VARCHAR,
    merchant_category VARCHAR,
    ts                TIMESTAMP
);

When you are ready to connect to Kafka, replace CREATE TABLE with CREATE SOURCE and add the connector block. The materialized views do not change.
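As a sketch, the Kafka-backed version of the same source might look like the following. The topic and broker names are examples matching the rest of this article, not required values:

CREATE SOURCE fjava_transactions (
    transaction_id    VARCHAR,
    user_id           VARCHAR,
    amount            DOUBLE PRECISION,
    merchant_id       VARCHAR,
    merchant_category VARCHAR,
    ts                TIMESTAMP
) WITH (
    connector = 'kafka',
    topic     = 'transactions',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

Everything downstream, including the materialized views in this article, reads from the source by name, so swapping the table for this source requires no other changes.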

The Flink equivalent requires a Kafka consumer (FlinkKafkaConsumer here; it is deprecated in Flink 1.14 and later in favor of KafkaSource, though the shape of the code is similar), a schema class, and a deserialization layer before you write a single line of business logic:

// Transaction POJO
public class Transaction {
    public String transactionId;
    public String userId;
    public double amount;
    public String merchantId;
    public String merchantCategory;
    public long   tsMillis;
}

// Environment and source setup
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(4);
env.enableCheckpointing(10_000); // 10 s checkpoint interval

Properties kafkaProps = new Properties();
kafkaProps.setProperty("bootstrap.servers", "kafka:9092");
kafkaProps.setProperty("group.id", "fraud-detector");

FlinkKafkaConsumer<Transaction> source = new FlinkKafkaConsumer<>(
    "transactions",
    new TransactionDeserializationSchema(), // custom class, ~30 lines
    kafkaProps
);
source.setStartFromLatest();

DataStream<Transaction> txnStream =
    env.addSource(source)
       .assignTimestampsAndWatermarks(
           WatermarkStrategy
               .<Transaction>forBoundedOutOfOrderness(Duration.ofSeconds(5))
               .withTimestampAssigner(
                   (t, ts) -> t.tsMillis
               )
       );

This is before any fraud logic runs. You have already written roughly 30 lines (excluding the deserialization schema), and none of it encodes a business rule.

Pattern 1: Velocity Check

A velocity check flags users who make five or more transactions within a five-minute window. Fraudsters use stolen cards in rapid succession before the card is blocked.

RisingWave SQL (8 lines)

CREATE MATERIALIZED VIEW fjava_velocity_alerts AS
SELECT
    user_id,
    window_start,
    window_end,
    COUNT(*)    AS txn_count,
    SUM(amount) AS total_amount
FROM TUMBLE(fjava_transactions, ts, INTERVAL '5 MINUTES')
GROUP BY user_id, window_start, window_end
HAVING COUNT(*) >= 5;

The TUMBLE function partitions the stream into five-minute non-overlapping windows. HAVING COUNT(*) >= 5 filters to only the user/window combinations that crossed the threshold. The materialized view updates incrementally as new transactions arrive.
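To exercise the view locally, you can insert a handful of synthetic transactions. The values below are illustrative, chosen so that one user crosses the five-transaction threshold inside a single window:

-- Illustrative test data: five rapid transactions from one user
INSERT INTO fjava_transactions VALUES
    ('t1', 'user_42', 120, 'm_01', 'electronics', '2026-04-01 10:00:10'),
    ('t2', 'user_42', 180, 'm_02', 'electronics', '2026-04-01 10:01:05'),
    ('t3', 'user_42',  95, 'm_01', 'grocery',     '2026-04-01 10:02:30'),
    ('t4', 'user_42', 215, 'm_03', 'travel',      '2026-04-01 10:03:45'),
    ('t5', 'user_42', 150, 'm_02', 'electronics', '2026-04-01 10:04:50');

FLUSH; -- ensure the writes are visible to subsequent reads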

Querying it:

SELECT * FROM fjava_velocity_alerts;
 user_id |    window_start     |     window_end      | txn_count | total_amount
---------+---------------------+---------------------+-----------+--------------
 user_42 | 2026-04-01 10:00:00 | 2026-04-01 10:05:00 |         5 |           760
(1 row)

user_42 made five transactions totaling $760 in a single five-minute window. This view is always current. The moment a sixth transaction from user_42 arrives in the same window, the row updates automatically.

The Flink equivalent uses a ProcessWindowFunction with a TumblingEventTimeWindows assigner:

// Keyed stream + 5-minute tumbling window
txnStream
    .keyBy(t -> t.userId)
    .window(TumblingEventTimeWindows.of(Time.minutes(5)))
    .process(new VelocityWindowFunction())
    .print();

// The window function itself
public static class VelocityWindowFunction
    extends ProcessWindowFunction<Transaction, VelocityAlert, String, TimeWindow> {

    @Override
    public void process(
        String userId,
        Context ctx,
        Iterable<Transaction> elements,
        Collector<VelocityAlert> out
    ) {
        long   count       = 0;
        double totalAmount = 0.0;

        for (Transaction t : elements) {
            count++;
            totalAmount += t.amount;
        }

        if (count >= 5) {
            VelocityAlert alert = new VelocityAlert();
            alert.userId      = userId;
            alert.windowStart = new Timestamp(ctx.window().getStart());
            alert.windowEnd   = new Timestamp(ctx.window().getEnd());
            alert.txnCount    = count;
            alert.totalAmount = totalAmount;
            out.collect(alert);
        }
    }
}

// VelocityAlert output POJO
public static class VelocityAlert {
    public String    userId;
    public Timestamp windowStart;
    public Timestamp windowEnd;
    public long      txnCount;
    public double    totalAmount;
}

To see results during development, you need a sink (print or JDBC). To query results from an application, you must write to an external database and query that database separately; the data is not directly queryable inside Flink.

The velocity check alone is about 55 to 65 lines of Java across the window function, the alert POJO, and the pipeline wiring. The equivalent SQL is 8 lines.

Pattern 2: Merchant Anomaly Detection

A merchant anomaly fires when a user hits the same merchant four or more times within ten minutes. This catches behavior like a fraudster using a compromised card at a single merchant point-of-sale terminal repeatedly.

RisingWave SQL (9 lines)

CREATE MATERIALIZED VIEW fjava_merchant_anomaly AS
SELECT
    user_id,
    merchant_id,
    window_start,
    window_end,
    COUNT(*)    AS txn_count,
    SUM(amount) AS total_amount
FROM TUMBLE(fjava_transactions, ts, INTERVAL '10 MINUTES')
GROUP BY user_id, merchant_id, window_start, window_end
HAVING COUNT(*) >= 4;

The only change from the velocity check: add merchant_id to the GROUP BY clause and adjust the window size and threshold. Nine lines.

SELECT * FROM fjava_merchant_anomaly;
 user_id | merchant_id |    window_start     |     window_end      | txn_count | total_amount
---------+-------------+---------------------+---------------------+-----------+--------------
 user_77 | merchant_X  | 2026-04-01 11:00:00 | 2026-04-01 11:10:00 |         4 |         3846
(1 row)

user_77 hit merchant_X four times in ten minutes, spending $3,846. The pattern is flagged immediately.

The merchant anomaly requires a composite key and a new output POJO. Everything else follows the same structure:

// Composite keying on userId + merchantId
txnStream
    .keyBy(t -> t.userId + "|" + t.merchantId)
    .window(TumblingEventTimeWindows.of(Time.minutes(10)))
    .process(new MerchantAnomalyFunction())
    .print();

public static class MerchantAnomalyFunction
    extends ProcessWindowFunction<Transaction, MerchantAlert, String, TimeWindow> {

    @Override
    public void process(
        String compositeKey,
        Context ctx,
        Iterable<Transaction> elements,
        Collector<MerchantAlert> out
    ) {
        long   count       = 0;
        double totalAmount = 0.0;
        String merchantId  = null;
        String userId      = null;

        for (Transaction t : elements) {
            count++;
            totalAmount += t.amount;
            merchantId   = t.merchantId;
            userId       = t.userId;
        }

        if (count >= 4) {
            MerchantAlert alert = new MerchantAlert();
            alert.userId      = userId;
            alert.merchantId  = merchantId;
            alert.windowStart = new Timestamp(ctx.window().getStart());
            alert.windowEnd   = new Timestamp(ctx.window().getEnd());
            alert.txnCount    = count;
            alert.totalAmount = totalAmount;
            out.collect(alert);
        }
    }
}

public static class MerchantAlert {
    public String    userId;
    public String    merchantId;
    public Timestamp windowStart;
    public Timestamp windowEnd;
    public long      txnCount;
    public double    totalAmount;
}

About 50 to 55 lines for the second pattern, because you need a new output POJO and a new window function class even though the logic is nearly identical to the velocity check.

In RisingWave, the second pattern took an extra merchant_id in two places. That is it.

Side-by-Side Comparison

 Dimension                              | Flink Java                           | RisingWave SQL
----------------------------------------+--------------------------------------+------------------------------------
 Lines of code (velocity check)         | ~60                                  | 8
 Lines of code (merchant anomaly)       | ~55                                  | 9
 New class per pattern                  | Yes (ProcessWindowFunction + POJO)   | No
 Results queryable without external DB  | No                                   | Yes (SELECT on materialized view)
 Adding a new pattern                   | New class, compile, redeploy         | New CREATE MATERIALIZED VIEW
 Alerting threshold change              | Code change, redeploy                | DROP + CREATE, ~30 seconds
 JVM required                           | Yes                                  | No (Rust-based)
 Operational model                      | Cluster (JobManager + TaskManagers)  | Single system, PostgreSQL protocol
 Checkpoint / state                     | RocksDB, local disk                  | S3-native (Hummock)
 Recovery time after failure            | Proportional to state size           | Seconds regardless of state size

Why the Code Count Difference Matters

The line count difference is not just a style preference. It has operational consequences.

A fraud team typically needs to iterate quickly. When a new fraud pattern is discovered, the question is: how fast can you deploy a new detection rule? With Flink Java, the cycle is write the class, write the POJO, wire into the DAG, write a unit test, package the JAR, submit the job to the cluster, and verify the new job appears healthy. That cycle is measured in hours.

With RisingWave SQL, you write a CREATE MATERIALIZED VIEW statement and run it. The pattern is live in seconds. If you need to adjust the threshold, you drop and recreate the view. No redeployment, no downtime, no JAR packaging.
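Concretely, raising the velocity threshold from five to six transactions looks like this (if a sink already reads from the view, drop the sink first):

DROP MATERIALIZED VIEW fjava_velocity_alerts;

CREATE MATERIALIZED VIEW fjava_velocity_alerts AS
SELECT
    user_id,
    window_start,
    window_end,
    COUNT(*)    AS txn_count,
    SUM(amount) AS total_amount
FROM TUMBLE(fjava_transactions, ts, INTERVAL '5 MINUTES')
GROUP BY user_id, window_start, window_end
HAVING COUNT(*) >= 6;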

The broader comparison between Flink and RisingWave covers this operational cost difference in depth across state management, recovery, and infrastructure.

Shipping Alerts Downstream

Both patterns produce results you can act on. In RisingWave, you push fraud alerts to Kafka or PostgreSQL with a CREATE SINK statement:

CREATE SINK fjava_fraud_alerts_sink
FROM fjava_velocity_alerts
WITH (
    connector = 'kafka',
    topic     = 'fraud-alerts-velocity',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;

Every row that matches the velocity pattern is emitted to the fraud-alerts-velocity topic automatically. Your downstream notification service, case management system, or model scoring pipeline reads from that topic.
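If the downstream consumer is PostgreSQL rather than Kafka, the same pattern applies with a JDBC sink. This is a sketch; the database name, table name, and credentials below are placeholders, and the target table must already exist in PostgreSQL with matching columns:

CREATE SINK fjava_velocity_alerts_pg
FROM fjava_velocity_alerts
WITH (
    connector   = 'jdbc',
    jdbc.url    = 'jdbc:postgresql://postgres:5432/frauddb?user=alerts&password=alerts', -- placeholder connection string
    table.name  = 'velocity_alerts',
    type        = 'upsert',
    primary_key = 'user_id,window_start,window_end'
);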

In Flink, shipping to Kafka requires a FlinkKafkaProducer and a separate serialization schema. The pattern is the same conceptually, but it adds another 20 to 30 lines and another class to maintain.

What About More Complex Patterns?

Windowed aggregations like velocity checks and merchant anomalies cover a large share of rule-based fraud detection. But some patterns require looking at sequences of events across time, for example, "a card used in two different countries within 30 minutes." That kind of pattern can also be handled in RisingWave using a stream-to-stream join or a sliding window, which the streaming SQL fraud detection guide covers in detail.
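As a sketch of the join-based approach, assume the transaction stream also carries a country column (not part of the schema above). A self-join pairing each transaction with any later transaction from the same user in a different country within 30 minutes might look like the following; without a watermark to bound the join, state retention can grow, so treat this as illustrative rather than production-ready:

-- Sketch only: assumes a hypothetical "country" column on the stream
CREATE MATERIALIZED VIEW fjava_geo_anomaly AS
SELECT
    a.user_id,
    a.country AS first_country,
    b.country AS second_country,
    a.ts      AS first_ts,
    b.ts      AS second_ts
FROM fjava_transactions a
JOIN fjava_transactions b
  ON a.user_id = b.user_id
 AND a.country <> b.country
 AND b.ts BETWEEN a.ts AND a.ts + INTERVAL '30 MINUTES';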

For patterns that require complex event processing with MATCH_RECOGNIZE (detecting specific ordered sequences like A followed by B then C), Apache Flink is currently the better choice. RisingWave does not yet support MATCH_RECOGNIZE. That said, the majority of production fraud rules are threshold-based and window-based, which SQL handles well.

If you are evaluating whether to move an existing Flink pipeline to streaming SQL, the Flink SQL vs RisingWave SQL syntax comparison article walks through the exact translation steps for sources, windows, joins, and sinks.

Frequently Asked Questions

Can RisingWave replace Flink for fraud detection?

For threshold-based and window-based fraud patterns, yes. Velocity checks, merchant anomalies, geographic anomalies, and card testing detection all map directly to TUMBLE or HOP materialized views in RisingWave. Where Flink retains an advantage is in MATCH_RECOGNIZE for complex ordered-sequence detection and in the breadth of its connector ecosystem. For most teams doing SQL-expressible fraud detection, RisingWave's shorter development cycle and lower operational overhead make it the more practical choice.

How does RisingWave handle fraud detection at scale?

RisingWave uses incremental computation. When a new transaction arrives, only the affected windows are updated rather than re-scanning all historical data. The compute and storage layers scale independently: you add compute nodes to handle more transactions per second without touching the object storage layer where state lives. For fraud detection specifically, this means processing latency stays in the single-digit millisecond range even as throughput grows.

What happens to in-flight alerts when RisingWave restarts?

Because RisingWave stores all state in object storage (S3, GCS, or compatible), recovery does not require downloading state to local disk. A restarted compute node picks up the latest checkpoint metadata, resumes from where it left off, and continues updating materialized views. Recovery typically completes in seconds regardless of state size, which is critical for fraud detection systems where gaps in coverage create risk.

How does RisingWave handle late or out-of-order events?

RisingWave supports watermark definitions on sources and tables using the same WATERMARK FOR ts AS ts - INTERVAL '5 SECONDS' syntax that Flink uses. The difference is that watermark management is optional for many workloads. If your transactions arrive in near-real-time order, you can omit the watermark and rely on processing-time windows. If out-of-order events are a concern (which they often are with distributed payment networks), adding a watermark clause to the source definition is a one-line change.
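For reference, a watermarked version of this article's table might look like the following. This is a sketch: in RisingWave, watermarks are defined on sources and append-only tables, so the APPEND ONLY clause is part of that assumption:

CREATE TABLE fjava_transactions (
    transaction_id    VARCHAR,
    user_id           VARCHAR,
    amount            DOUBLE PRECISION,
    merchant_id       VARCHAR,
    merchant_category VARCHAR,
    ts                TIMESTAMP,
    WATERMARK FOR ts AS ts - INTERVAL '5 SECONDS'
) APPEND ONLY;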

Conclusion

Building real-time fraud detection with Flink Java and RisingWave SQL produces the same results from the same transaction stream, but the implementation distance between them is large.

Flink requires you to write a ProcessWindowFunction class, an output POJO, watermark assignment, serialization logic, and a job submission pipeline for each new pattern. A pair of patterns (velocity check and merchant anomaly) runs to roughly 120 lines before counting infrastructure configuration.

RisingWave requires 8 and 9 lines respectively for the same two patterns. Results are directly queryable without a separate serving database. New patterns go live in seconds. Threshold changes take a drop and a recreate.

If your fraud detection rules are SQL-expressible, which the vast majority are, the case for RisingWave is the time your team saves on every pattern, every iteration, and every on-call incident.


Try it yourself. Get RisingWave running locally in under five minutes with the quickstart guide, then paste the CREATE TABLE and CREATE MATERIALIZED VIEW statements from this article directly into psql.

Join the RisingWave Slack community to ask questions and see how other teams are building real-time detection pipelines.
