Fraud detection in e-commerce is a race: legitimate orders should complete in milliseconds, while fraudulent patterns — velocity attacks, account takeovers, synthetic identity fraud — must be flagged before fulfillment begins. RisingWave, a PostgreSQL-compatible streaming database, enables real-time fraud signal computation using standard SQL, without building separate machine learning pipelines for each detection rule.
The Problem: Batch Fraud Analysis Misses the Window
E-commerce fraud evolves faster than batch detection systems can respond. By the time a nightly fraud report surfaces a pattern, the fraudster has already placed hundreds of orders, collected digital goods or triggered COD deliveries, and moved on.
The specific failure modes of batch fraud detection:
Velocity attacks: A fraudster tests stolen card numbers with small transactions across many accounts. Velocity checks (N transactions from the same device or IP within M minutes) require sliding-window computation that batch systems cannot provide in real time.
Account takeover: A compromised account places an unusual order — different shipping address, new payment method, atypical order value. Batch systems compare to historical patterns overnight. By then, the order has shipped.
COD fraud rings: In Southeast Asia and South Asia, cash on delivery fraud involves placing orders with false delivery addresses, then refusing delivery or providing fake IDs. Detection requires pattern matching across multiple orders from the same device fingerprint or IP range.
Chargeback pattern detection: Certain customer IDs, email domains, or IP ranges show elevated chargeback rates. Identifying these patterns in real time — before new orders from the same signals are fulfilled — requires continuously updated aggregations.
How Streaming SQL Solves This
RisingWave maintains sliding window aggregations over the transaction stream. For each new transaction event, velocity counters (transactions per IP, per device fingerprint, per customer ID) are updated in real time. Anomaly rules expressed as SQL predicates fire immediately when thresholds are crossed.
Because these computations are incremental materialized views, they add negligible latency to the transaction processing path — they run alongside the normal order flow, not in the critical path.
Building Real-Time Fraud Detection
Step 1: Data Source Setup
Fraud detection requires both order events and payment events, with device and network signals included:
CREATE SOURCE orders (
order_id VARCHAR,
merchant_id VARCHAR,
customer_id VARCHAR,
channel VARCHAR, -- 'web', 'mobile', 'pos', 'marketplace'
payment_method VARCHAR, -- 'card', 'alipay', 'paypal', 'cod'
subtotal NUMERIC,
discount NUMERIC,
total NUMERIC,
status VARCHAR,
created_at TIMESTAMPTZ
) WITH (
connector = 'kafka',
topic = 'ecommerce.orders',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
CREATE SOURCE transaction_signals (
signal_id VARCHAR,
order_id VARCHAR,
customer_id VARCHAR,
device_fingerprint VARCHAR,
ip_address VARCHAR,
ip_country VARCHAR,
email_domain VARCHAR,
shipping_address_hash VARCHAR,
billing_country VARCHAR,
is_proxy BOOLEAN,
is_vpn BOOLEAN,
occurred_at TIMESTAMPTZ
) WITH (
connector = 'kafka',
topic = 'ecommerce.transaction_signals',
properties.bootstrap.server = 'kafka:9092',
scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
Step 2: Core Materialized View
The velocity check view computes transaction counts per device fingerprint, IP address, and customer ID in a sliding 1-hour window. This catches card testing attacks and account reuse patterns:
-- Velocity counters: transactions per device fingerprint in 1 hour
CREATE MATERIALIZED VIEW mv_device_velocity_1h AS
SELECT
device_fingerprint,
WINDOW_START AS window_start,
COUNT(DISTINCT order_id) AS order_count,
COUNT(DISTINCT customer_id) AS unique_customers,
COUNT(DISTINCT ip_address) AS unique_ips,
SUM(o.total) AS total_transaction_value
FROM HOP(transaction_signals, occurred_at, INTERVAL '5 MINUTES', INTERVAL '1 HOUR')
JOIN orders o ON transaction_signals.order_id = o.order_id
GROUP BY device_fingerprint, WINDOW_START;
-- Velocity counters: transactions per IP in 1 hour
CREATE MATERIALIZED VIEW mv_ip_velocity_1h AS
SELECT
ip_address,
WINDOW_START AS window_start,
COUNT(DISTINCT order_id) AS order_count,
COUNT(DISTINCT customer_id) AS unique_customers,
COUNT(DISTINCT device_fingerprint) AS unique_devices
FROM HOP(transaction_signals, occurred_at, INTERVAL '5 MINUTES', INTERVAL '1 HOUR')
GROUP BY ip_address, WINDOW_START;
Step 3: Aggregations — Fraud Signal Scoring
Compute a composite fraud risk score by combining multiple signals. Flag transactions that exceed individual thresholds or combine multiple moderate signals:
-- Flags: high-velocity devices
CREATE MATERIALIZED VIEW mv_high_velocity_devices AS
SELECT
device_fingerprint,
window_start,
order_count,
unique_customers,
'high_device_velocity' AS fraud_signal,
CASE
WHEN order_count >= 20 THEN 'critical'
WHEN order_count >= 10 THEN 'high'
ELSE 'medium'
END AS severity
FROM mv_device_velocity_1h
WHERE order_count >= 5;
-- Flags: shared device across multiple customers (account takeover signal)
CREATE MATERIALIZED VIEW mv_device_sharing_flags AS
SELECT
device_fingerprint,
window_start,
unique_customers,
order_count,
'device_shared_across_customers' AS fraud_signal,
'high' AS severity
FROM mv_device_velocity_1h
WHERE unique_customers >= 3;
-- Flags: orders with IP/billing country mismatch
CREATE MATERIALIZED VIEW mv_geolocation_mismatches AS
SELECT
ts.order_id,
ts.customer_id,
ts.device_fingerprint,
ts.ip_country,
ts.billing_country,
ts.is_proxy,
ts.is_vpn,
'geo_mismatch' AS fraud_signal,
CASE WHEN ts.is_proxy OR ts.is_vpn THEN 'high' ELSE 'medium' END AS severity
FROM transaction_signals ts
WHERE ts.ip_country != ts.billing_country
OR ts.is_proxy = TRUE
OR ts.is_vpn = TRUE;
Track chargeback rates by email domain over a rolling 7-day window to identify synthetic identity patterns:
CREATE MATERIALIZED VIEW mv_email_domain_risk_7d AS
SELECT
ts.email_domain,
WINDOW_START AS window_start,
COUNT(DISTINCT ts.order_id) AS total_orders,
SUM(CASE WHEN o.status = 'charged_back'
THEN 1 ELSE 0 END) AS chargebacks,
ROUND(
SUM(CASE WHEN o.status = 'charged_back'
THEN 1 ELSE 0 END)::NUMERIC /
NULLIF(COUNT(DISTINCT ts.order_id), 0) * 100,
2
) AS chargeback_rate_pct
FROM HOP(transaction_signals, occurred_at,
INTERVAL '1 HOUR', INTERVAL '7 DAYS') ts
JOIN orders o ON ts.order_id = o.order_id
GROUP BY ts.email_domain, WINDOW_START
HAVING COUNT(DISTINCT ts.order_id) >= 5; -- minimum sample size
Step 4: Downstream / Serving
Publish fraud signals to a risk scoring API and a block list management system:
-- Real-time fraud signals to risk API
CREATE SINK sink_fraud_signals
FROM mv_high_velocity_devices
WITH (
connector = 'kafka',
properties.bootstrap.server = 'kafka:9092',
topic = 'fraud.velocity_signals'
) FORMAT PLAIN ENCODE JSON;
-- Device sharing flags to block list topic
CREATE SINK sink_device_flags
FROM mv_device_sharing_flags
WITH (
connector = 'kafka',
properties.bootstrap.server = 'kafka:9092',
topic = 'fraud.device_flags'
) FORMAT PLAIN ENCODE JSON;
-- Email domain risk scores to application DB for checkout pre-screening
CREATE SINK sink_domain_risk
FROM mv_email_domain_risk_7d
WITH (
connector = 'jdbc',
jdbc.url = 'jdbc:postgresql://fraud-db:5432/risk',
table.name = 'email_domain_risk',
type = 'upsert',
primary_key = 'email_domain,window_start'
);
Fraud Detection: Batch vs. Streaming SQL
| Signal Type | Batch Detection | Streaming SQL (RisingWave) |
| Velocity check (device) | Daily report | 1-hour sliding window, continuous |
| Velocity check (IP) | Daily report | 1-hour sliding window, continuous |
| Account sharing | Overnight analysis | Real-time device × customer count |
| Geo mismatch | At order review | At order placement |
| Chargeback rate by domain | Weekly report | 7-day rolling window, continuously updated |
| COD fraud ring | Manual investigation | Pattern match in streaming SQL |
| Response time | Hours after fraud occurs | Within the fraud event window |
FAQ
Q: Can RisingWave replace a dedicated fraud ML platform? RisingWave computes the real-time features (velocity signals, pattern flags, risk scores) that ML models consume. The ML scoring itself typically runs in a separate inference service. RisingWave feeds that service with continuously fresh features — effectively replacing the batch feature computation pipeline.
Q: How do I handle false positives without blocking legitimate high-value customers? Add a customer reputation table (maintained via CDC from your customer database) and join it into the fraud signal views. A customer with a long, clean transaction history on the same device gets a lower fraud score even if their IP triggers a moderate flag.
Q: How do I maintain a real-time block list (banned devices, IPs) in RisingWave? Use a CDC source connected to your block list database (PostgreSQL or MySQL). RisingWave will maintain the join between the block list and incoming transaction signals in real time — any transaction from a newly added blocked device is flagged within seconds.
Key Takeaways
- Velocity checks, the foundation of real-time fraud detection, require sliding window aggregations that only streaming SQL can compute continuously.
- HOP windows in RisingWave update every 5 minutes while maintaining a 1-hour lookback — giving you a balance between freshness and detection window size.
- Device sharing detection (multiple customers on one device fingerprint) is a native sliding window aggregation —
COUNT(DISTINCT customer_id)per device per hour. - Chargeback rate by email domain, computed over a 7-day rolling window, surfaces synthetic identity fraud patterns that emerge slowly over many transactions.
- Fraud signals published to Kafka enable downstream systems (risk API, block list management, order hold workflows) to act within the fraud window.

