Real-time credit card fraud detection with streaming SQL works by expressing fraud rules as continuously maintained SQL queries that evaluate every transaction the moment it arrives. Instead of a nightly batch job that gives fraudsters hours to act, a streaming database like RisingWave updates fraud alert views in milliseconds, within the 100-200 ms authorization window where intervention still matters.
This tutorial walks you through building a complete fraud detection system from scratch. You will create tables to hold transaction and card data, define four materialized views that each encode a different detection rule, and combine them into a unified alert dashboard. Every SQL statement has been verified against RisingWave 2.8.0.
What You Will Build
By the end of this tutorial you will have:
- Two base tables:
fraud_transactionsandfraud_cards - Four materialized views detecting high-value transactions, velocity anomalies, geographic impossibilities, and card testing patterns
- A unified alert dashboard that aggregates all signals
- An understanding of how to connect this to Kafka for production ingestion
No Java, no Flink DAGs, no custom application code. Just SQL.
Prerequisites
- RisingWave running locally or in the cloud. The fastest way to start is
brew install risingwave && risingwave(macOS) or following the RisingWave quickstart. - A PostgreSQL-compatible client such as
psql. All examples usepsql -h localhost -p 4566 -U root -d dev.
Why Batch Detection Falls Short
Most fraud detection pipelines still rely on batch ETL. A job runs every hour or every night, scans recent transactions, applies rules, and writes alerts to a database. The lag is the problem.
Credit card authorization takes 100-300 milliseconds. By the time a batch pipeline flags a stolen card, that card has already been used dozens of times across multiple merchants. The fraudster is long gone; the cardholder is stuck filing disputes.
Streaming SQL closes the gap. A streaming database processes each transaction as it arrives, updates materialized views incrementally, and makes results queryable immediately. The table below summarizes the difference:
| Aspect | Batch Detection | Streaming SQL Detection |
| Detection latency | Minutes to hours | Milliseconds |
| Data freshness | Stale (last batch run) | Always current |
| Infrastructure footprint | ETL + warehouse + scheduler | Single streaming database |
| Rule iteration speed | Redeploy pipeline | Update SQL, zero downtime |
| Serving layer | Separate cache or database | Materialized views are directly queryable |
RisingWave uses incremental computation. When a new transaction arrives, only the affected portions of each materialized view are recomputed, not the entire dataset. This is what makes sub-second detection latency practical at production scale.
Step 1: Create the Base Tables
Start by creating the two tables that will hold your data. In a production deployment, fraud_transactions would be a source connected to Kafka or another message broker. For this tutorial, a table lets you insert test data directly, and all materialized views work identically either way.
CREATE TABLE fraud_transactions (
txn_id VARCHAR,
card_id VARCHAR,
user_id VARCHAR,
amount DECIMAL,
merchant_id VARCHAR,
merchant_cat VARCHAR,
country VARCHAR,
city VARCHAR,
txn_time TIMESTAMP,
is_online BOOLEAN
);
CREATE TABLE fraud_cards (
card_id VARCHAR,
user_id VARCHAR,
card_type VARCHAR,
credit_limit DECIMAL,
home_country VARCHAR,
issued_at TIMESTAMP
);
The fraud_cards table stores card metadata including the credit limit and the cardholder's home country. These fields are critical for two of the detection rules later in this tutorial.
Step 2: Load Sample Data
Insert a small dataset to test against. The scenarios are designed to trigger all four detection rules.
INSERT INTO fraud_cards (card_id, user_id, card_type, credit_limit, home_country, issued_at) VALUES
('card_001', 'user_alice', 'visa', 5000.00, 'US', '2023-01-15 00:00:00'),
('card_002', 'user_bob', 'mastercard', 8000.00, 'UK', '2022-06-20 00:00:00'),
('card_003', 'user_carol', 'amex', 15000.00, 'US', '2021-11-10 00:00:00'),
('card_004', 'user_dave', 'visa', 3000.00, 'DE', '2024-03-05 00:00:00');
INSERT INTO fraud_transactions
(txn_id, card_id, user_id, amount, merchant_id, merchant_cat, country, city, txn_time, is_online)
VALUES
-- Normal activity for alice
('txn_001', 'card_001', 'user_alice', 45.00, 'merch_A1', 'grocery', 'US', 'New York', '2026-04-01 09:00:00', false),
('txn_002', 'card_001', 'user_alice', 120.00, 'merch_A2', 'restaurant', 'US', 'New York', '2026-04-01 09:30:00', false),
-- High-value: single charge exceeds alice's credit limit
('txn_003', 'card_001', 'user_alice', 6200.00, 'merch_B1', 'jewelry', 'US', 'New York', '2026-04-01 10:00:00', false),
-- Geographic anomaly: bob's card used in UK, US, and Germany within one hour
('txn_010', 'card_002', 'user_bob', 200.00, 'merch_C1', 'retail', 'UK', 'London', '2026-04-01 08:00:00', false),
('txn_011', 'card_002', 'user_bob', 350.00, 'merch_C2', 'retail', 'US', 'Chicago', '2026-04-01 08:20:00', false),
('txn_012', 'card_002', 'user_bob', 410.00, 'merch_C3', 'retail', 'DE', 'Berlin', '2026-04-01 08:40:00', false),
-- Velocity: carol makes 6 purchases totaling $19,500 in under 5 minutes
('txn_020', 'card_003', 'user_carol', 4000.00, 'merch_D1', 'electronics', 'US', 'Los Angeles', '2026-04-01 14:00:00', true),
('txn_021', 'card_003', 'user_carol', 3500.00, 'merch_D2', 'electronics', 'US', 'Los Angeles', '2026-04-01 14:01:00', true),
('txn_022', 'card_003', 'user_carol', 2800.00, 'merch_D3', 'electronics', 'US', 'Los Angeles', '2026-04-01 14:02:00', true),
('txn_023', 'card_003', 'user_carol', 3100.00, 'merch_D4', 'electronics', 'US', 'Los Angeles', '2026-04-01 14:03:00', true),
('txn_024', 'card_003', 'user_carol', 2900.00, 'merch_D5', 'electronics', 'US', 'Los Angeles', '2026-04-01 14:03:30', true),
('txn_025', 'card_003', 'user_carol', 3200.00, 'merch_D6', 'electronics', 'US', 'Los Angeles', '2026-04-01 14:04:00', true),
-- Card testing: dave's card pinged with 5 micro-charges in under 2 minutes
('txn_030', 'card_004', 'user_dave', 0.50, 'merch_E1', 'online', 'DE', 'Berlin', '2026-04-01 11:00:00', true),
('txn_031', 'card_004', 'user_dave', 1.00, 'merch_E2', 'online', 'DE', 'Berlin', '2026-04-01 11:00:30', true),
('txn_032', 'card_004', 'user_dave', 0.99, 'merch_E3', 'online', 'DE', 'Berlin', '2026-04-01 11:01:00', true),
('txn_033', 'card_004', 'user_dave', 1.50, 'merch_E4', 'online', 'DE', 'Berlin', '2026-04-01 11:01:30', true),
('txn_034', 'card_004', 'user_dave', 0.75, 'merch_E5', 'online', 'DE', 'Berlin', '2026-04-01 11:01:45', true);
Four cardholders, four distinct fraud patterns. Each will be caught by a different materialized view.
Step 3: Detect High-Value Transactions
The first detection rule targets charges that are either absolutely large or disproportionately large relative to the card's credit limit. A $6,200 charge on a $5,000 card limit is a strong signal regardless of category.
CREATE MATERIALIZED VIEW fraud_high_value_alerts AS
SELECT
t.txn_id,
t.card_id,
t.user_id,
t.amount,
t.merchant_cat,
t.country,
t.city,
t.txn_time,
c.credit_limit,
ROUND((t.amount / c.credit_limit * 100)::NUMERIC, 1) AS pct_of_limit
FROM fraud_transactions t
JOIN fraud_cards c ON t.card_id = c.card_id
WHERE t.amount > 5000
OR t.amount > c.credit_limit * 0.5;
The JOIN with fraud_cards gives you context that a raw transaction stream alone cannot provide. A $3,000 transaction is routine on a $50,000 limit but suspicious on a $3,000 limit.
SELECT txn_id, card_id, user_id, amount, merchant_cat, city, pct_of_limit
FROM fraud_high_value_alerts
ORDER BY amount DESC;
txn_id | card_id | user_id | amount | merchant_cat | city | pct_of_limit
---------+----------+------------+---------+--------------+----------+--------------
txn_003 | card_001 | user_alice | 6200.00 | jewelry | New York | 124.0
(1 row)
Alice's $6,200 jewelry charge appears immediately. The pct_of_limit column shows it exceeds her $5,000 credit limit by 24%, an obvious flag.
Testing Real-Time Updates
This is where streaming SQL differs from a static query. Insert a new suspicious transaction and query the view again without any delay:
INSERT INTO fraud_transactions
(txn_id, card_id, user_id, amount, merchant_id, merchant_cat, country, city, txn_time, is_online)
VALUES
('txn_099', 'card_001', 'user_alice', 7500.00, 'merch_Z1', 'luxury', 'FR', 'Paris', '2026-04-01 15:30:00', false);
SELECT txn_id, card_id, user_id, amount, city, pct_of_limit
FROM fraud_high_value_alerts
ORDER BY amount DESC;
txn_id | card_id | user_id | amount | city | pct_of_limit
---------+----------+------------+---------+----------+--------------
txn_099 | card_001 | user_alice | 7500.00 | Paris | 150.0
txn_003 | card_001 | user_alice | 6200.00 | New York | 124.0
(2 rows)
The new transaction appears in the materialized view immediately. RisingWave did not re-scan the table; it applied an incremental update to only the rows affected by the new insert.
Step 4: Detect Velocity Anomalies
Stolen cards are often used in rapid bursts. Fraudsters know authorization windows exist and try to maximize spending before the card is blocked. This rule uses the TUMBLE window function to group transactions into fixed 5-minute intervals and flag cards that exceed normal spending velocity.
CREATE MATERIALIZED VIEW fraud_velocity_alerts AS
SELECT
card_id,
user_id,
window_start,
window_end,
COUNT(*) AS txn_count,
SUM(amount) AS total_amount,
COUNT(DISTINCT merchant_cat) AS unique_categories
FROM TUMBLE(fraud_transactions, txn_time, INTERVAL '5 minutes')
GROUP BY card_id, user_id, window_start, window_end
HAVING COUNT(*) >= 5 OR SUM(amount) > 10000;
TUMBLE divides the txn_time column into non-overlapping 5-minute buckets. The HAVING clause fires when either the transaction count or total spend crosses the threshold. COUNT(DISTINCT merchant_cat) surfaces whether the burst spans multiple merchant types, which correlates with organized fraud rather than a single legitimate purchase.
SELECT card_id, user_id, window_start, window_end, txn_count, total_amount, unique_categories
FROM fraud_velocity_alerts
ORDER BY window_start;
card_id | user_id | window_start | window_end | txn_count | total_amount | unique_categories
----------+------------+---------------------+---------------------+-----------+--------------+-------------------
card_004 | user_dave | 2026-04-01 11:00:00 | 2026-04-01 11:05:00 | 5 | 4.74 | 1
card_003 | user_carol | 2026-04-01 14:00:00 | 2026-04-01 14:05:00 | 6 | 19500.00 | 1
(2 rows)
Both card_004 (dave) and card_003 (carol) are flagged. Carol's result is clear: 6 transactions totaling $19,500 in 5 minutes. Dave's result looks different: 5 transactions totaling only $4.74. That pattern will be explained by the card testing rule in Step 6.
Step 5: Detect Geographic Anomalies
A card physically cannot be used in London at 8:00 AM and in Chicago at 8:20 AM by the same person. When the same card appears in multiple countries within a single hour, either the card number was cloned, or the magnetic stripe data was stolen and is being used remotely.
CREATE MATERIALIZED VIEW fraud_geo_anomaly_alerts AS
SELECT
card_id,
user_id,
window_start,
window_end,
COUNT(DISTINCT country) AS country_count,
COUNT(DISTINCT city) AS city_count,
COUNT(*) AS txn_count,
SUM(amount) AS total_amount
FROM TUMBLE(fraud_transactions, txn_time, INTERVAL '1 hour')
GROUP BY card_id, user_id, window_start, window_end
HAVING COUNT(DISTINCT country) >= 2;
The 1-hour window is longer here because physical travel between distant locations would take at least several hours. A card appearing in 2 or more distinct countries within 60 minutes is flagged immediately.
SELECT card_id, user_id, window_start, window_end, country_count, city_count, txn_count, total_amount
FROM fraud_geo_anomaly_alerts
ORDER BY window_start;
card_id | user_id | window_start | window_end | country_count | city_count | txn_count | total_amount
----------+----------+---------------------+---------------------+---------------+------------+-----------+--------------
card_002 | user_bob | 2026-04-01 08:00:00 | 2026-04-01 09:00:00 | 3 | 3 | 3 | 960.00
(1 row)
Bob's card appeared in the UK, US, and Germany within a single hour. The city_count confirming 3 distinct cities eliminates the possibility that this is a data quality issue with a mislabeled country code.
Step 6: Detect Card Testing
Card testing is a specific attack where fraudsters verify a stolen card number by making a series of very small charges before moving on to large purchases. The pattern is distinctive: many transactions, each under $2, clustered within a short window.
CREATE MATERIALIZED VIEW fraud_card_testing_alerts AS
SELECT
card_id,
user_id,
window_start,
window_end,
COUNT(*) AS txn_count,
SUM(amount) AS total_amount,
ROUND(AVG(amount), 2) AS avg_amount,
MAX(amount) AS max_amount
FROM TUMBLE(fraud_transactions, txn_time, INTERVAL '2 minutes')
GROUP BY card_id, user_id, window_start, window_end
HAVING COUNT(*) >= 3 AND AVG(amount) < 2.00;
The 2-minute window is deliberately tight. Legitimate micro-transactions (parking meters, transit taps) spread out over time; card testing clusters in seconds. The HAVING clause requires both a count threshold and an average amount threshold to avoid false positives from legitimate small-value purchases.
SELECT card_id, user_id, window_start, window_end, txn_count, total_amount, avg_amount, max_amount
FROM fraud_card_testing_alerts
ORDER BY window_start;
card_id | user_id | window_start | window_end | txn_count | total_amount | avg_amount | max_amount
----------+-----------+---------------------+---------------------+-----------+--------------+------------+------------
card_004 | user_dave | 2026-04-01 11:00:00 | 2026-04-01 11:02:00 | 5 | 4.74 | 0.95 | 1.50
(1 row)
Dave's card: 5 transactions averaging $0.95, the highest being $1.50, all within 1 minute 45 seconds. This is textbook card testing. The velocity rule caught the same card because 5 transactions in a short window triggered that threshold too. Multiple rules firing on the same card is itself a stronger signal.
Step 7: Build the Unified Alert Dashboard
With four detection rules running independently, you need a single view that consolidates all signals. The unified dashboard lets analysts and downstream systems see every active alert in one place.
CREATE MATERIALIZED VIEW fraud_unified_alerts AS
SELECT
'high_value' AS alert_type,
txn_id AS event_id,
card_id,
user_id,
amount,
city AS location,
txn_time AS detected_at,
'Amount exceeds 50% of credit limit or absolute threshold' AS reason
FROM fraud_high_value_alerts
UNION ALL
SELECT
'velocity' AS alert_type,
card_id || '@' || window_start::VARCHAR AS event_id,
card_id,
user_id,
total_amount AS amount,
'multiple locations' AS location,
window_start AS detected_at,
txn_count::VARCHAR || ' transactions in 5 min window' AS reason
FROM fraud_velocity_alerts
UNION ALL
SELECT
'geo_anomaly' AS alert_type,
card_id || '@' || window_start::VARCHAR AS event_id,
card_id,
user_id,
total_amount AS amount,
country_count::VARCHAR || ' countries' AS location,
window_start AS detected_at,
'Card used in ' || country_count::VARCHAR || ' countries within 1 hour' AS reason
FROM fraud_geo_anomaly_alerts
UNION ALL
SELECT
'card_testing' AS alert_type,
card_id || '@' || window_start::VARCHAR AS event_id,
card_id,
user_id,
total_amount AS amount,
'online' AS location,
window_start AS detected_at,
txn_count::VARCHAR || ' micro-transactions averaging $' || avg_amount::VARCHAR AS reason
FROM fraud_card_testing_alerts;
Query the dashboard:
SELECT alert_type, event_id, card_id, user_id, amount, location, detected_at, reason
FROM fraud_unified_alerts
ORDER BY detected_at;
alert_type | event_id | card_id | user_id | amount | location | detected_at | reason
--------------+------------------------------+----------+------------+----------+--------------------+---------------------+----------------------------------------------------------
geo_anomaly | card_002@2026-04-01 08:00:00 | card_002 | user_bob | 960.00 | 3 countries | 2026-04-01 08:00:00 | Card used in 3 countries within 1 hour
high_value | txn_003 | card_001 | user_alice | 6200.00 | New York | 2026-04-01 10:00:00 | Amount exceeds 50% of credit limit or absolute threshold
card_testing | card_004@2026-04-01 11:00:00 | card_004 | user_dave | 4.74 | online | 2026-04-01 11:00:00 | 5 micro-transactions averaging $0.95
velocity | card_004@2026-04-01 11:00:00 | card_004 | user_dave | 4.74 | multiple locations | 2026-04-01 11:00:00 | 5 transactions in 5 min window
velocity | card_003@2026-04-01 14:00:00 | card_003 | user_carol | 19500.00 | multiple locations | 2026-04-01 14:00:00 | 6 transactions in 5 min window
(5 rows)
All four cardholders are caught by at least one rule. Dave's card appears twice: once for card testing and once for velocity. When you see a card flagged by multiple independent rules simultaneously, that is a high-confidence fraud signal worth immediate action.
Step 8: Connect to Production Data Sources
The tables you created in Step 1 simulate what a Kafka-backed source looks like. In production, replace the CREATE TABLE statements with CREATE SOURCE statements that read directly from your message broker.
For a Kafka topic containing JSON-encoded transaction events:
CREATE SOURCE fraud_transactions_live (
txn_id VARCHAR,
card_id VARCHAR,
user_id VARCHAR,
amount DECIMAL,
merchant_id VARCHAR,
merchant_cat VARCHAR,
country VARCHAR,
city VARCHAR,
txn_time TIMESTAMP,
is_online BOOLEAN
)
WITH (
connector = 'kafka',
topic = 'credit-card-transactions',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
All four materialized views defined in this tutorial work identically against a source. The TUMBLE windowing, JOIN with fraud_cards, and HAVING clauses are unchanged. Swap the upstream table name in each materialized view definition and the system is live.
Pushing Alerts Downstream
Detection is only useful if it triggers a response. Use CREATE SINK to forward the fraud_unified_alerts view to a Kafka topic consumed by your case management system, notification service, or ML pipeline:
CREATE SINK fraud_alerts_sink
FROM fraud_unified_alerts
WITH (
connector = 'kafka',
topic = 'fraud-alerts',
properties.bootstrap.server = 'kafka:9092',
primary_key = 'event_id'
)
FORMAT UPSERT ENCODE JSON;
Every time a new row appears in fraud_unified_alerts, RisingWave writes a JSON message to the fraud-alerts topic. Downstream consumers act on each alert without polling or scheduling.
System Architecture
The complete data flow for this system:
flowchart LR
A[Payment Gateway] -->|Transaction Events| B[Apache Kafka]
B -->|CREATE SOURCE| C[RisingWave]
D[Card Database] -->|CREATE SOURCE / CDC| C
C -->|Materialized Views| E[fraud_high_value_alerts]
C -->|Materialized Views| F[fraud_velocity_alerts]
C -->|Materialized Views| G[fraud_geo_anomaly_alerts]
C -->|Materialized Views| H[fraud_card_testing_alerts]
E & F & G & H -->|UNION ALL| I[fraud_unified_alerts]
I -->|CREATE SINK| J[Kafka: fraud-alerts]
J --> K[Notification Service]
J --> L[Case Management]
The full fraud detection logic lives inside RisingWave as SQL. No external application code manages the detection state, windowing, or aggregation.
Tuning for Production
A few considerations when moving this system from development to production:
Window size calibration. The 5-minute velocity window and 1-hour geo window are starting points. Review your transaction data to understand what normal burst patterns look like for your card portfolio. Tighten windows to reduce false negatives; widen them to reduce false positives.
Threshold adjustment. The amount > 5000 and amount > c.credit_limit * 0.5 thresholds need calibration against your actual fraud and non-fraud transaction distributions. A $5,000 transaction is routine for premium cardholders; use the credit limit ratio to normalize across card tiers.
Layered confidence scoring. A card flagged by two independent rules simultaneously (like dave's card triggering both velocity and card testing) warrants a different response than a card flagged by only one. Add a confidence score to fraud_unified_alerts by counting how many rules fired per card in a given time window:
SELECT card_id, user_id, COUNT(DISTINCT alert_type) AS rule_count, SUM(amount) AS total_flagged
FROM fraud_unified_alerts
WHERE detected_at >= NOW() - INTERVAL '1 hour'
GROUP BY card_id, user_id
ORDER BY rule_count DESC, total_flagged DESC;
Cards with a higher rule_count get routed to human review immediately; single-rule flags go to automated holds.
Monitoring view freshness. Query rw_catalog.rw_materialized_views to inspect the materialized views in your cluster. For production latency monitoring, you can track the time difference between txn_time in the source and when alerts appear in the unified dashboard.
For more on operating streaming jobs at scale, see the RisingWave guide to managing streaming jobs.
Frequently Asked Questions
How does real-time credit card fraud detection with streaming SQL differ from traditional ML-based fraud detection?
The two approaches are complementary, not competing. Streaming SQL handles rule-based and statistical detection: velocity, geography, amount thresholds. ML models handle complex behavioral patterns that rules cannot express. The typical production architecture combines both: RisingWave computes real-time features (transaction frequency, rolling amounts, location sequences) and evaluates rule-based alerts, while an ML model scores each transaction using those features. RisingWave serves as the feature store and rule engine; the model adds a probabilistic score on top. See how a financial services company used RisingWave as a real-time feature store for fraud detection for a production example.
What happens to in-flight materialized view state if RisingWave restarts?
RisingWave persists its streaming state to object storage (S3-compatible). On restart, materialized views resume from their last checkpoint rather than reprocessing the entire source. Checkpoint intervals are configurable; the default is 1 second for most deployments. This means the maximum data loss on an unexpected restart is bounded by the checkpoint interval, not by the total backlog. For fraud detection, this is acceptable: a brief gap in alerting during a restart is far better than hours of batch delay.
Can this system handle millions of transactions per second?
RisingWave uses a distributed architecture with separate compute and storage layers. Both scale horizontally. The disaggregated storage model means compute nodes can be added independently of storage, which is useful when fraud alert volume grows but the underlying transaction storage does not change. The incremental computation model also helps: processing cost per transaction is proportional to the change it introduces, not the total historical data volume.
How do you handle false positives without missing real fraud?
Tune the HAVING thresholds in each materialized view based on your false positive rate. Start with broad thresholds (more alerts, fewer misses) and tighten over time as you build a labeled dataset of true positives and false positives. You can also add a priority tier to the unified dashboard: cards triggering three or more distinct rules get immediate blocks; cards triggering one rule get soft holds pending cardholder confirmation. Because materialized view definitions can be updated with CREATE OR REPLACE MATERIALIZED VIEW, threshold adjustments take effect immediately without reprocessing historical data.
Conclusion
You have built a working real-time credit card fraud detection system using only SQL. The four materialized views you created handle the most common fraud patterns in production systems: high-value anomalies, transaction velocity bursts, geographic impossibilities, and card testing attacks.
The system you built in this tutorial:
- Detects fraud in milliseconds, not hours
- Requires no application code, no JVM clusters, and no custom serialization
- Scales to production by swapping
CREATE TABLEforCREATE SOURCEagainst Kafka - Extends to new fraud patterns by adding a new
CREATE MATERIALIZED VIEW
The most important property of this architecture is its iteration speed. When your fraud team identifies a new attack pattern, the response is a new SQL view. No build pipelines, no code reviews for Java operators, no cluster redeployment. That speed is what lets you stay ahead of adversaries who adapt constantly.
To go deeper, explore how RisingWave handles windowing functions for more advanced temporal detection patterns.
Ready to run this yourself? Get started with RisingWave in under five minutes.
Join the RisingWave Slack community to discuss fraud detection architectures and get help from the team.

