Transaction Velocity Fraud Detection with SQL Windows
Transaction velocity fraud detection works by counting card or account activity within rolling time windows and flagging sudden spikes before they cause financial loss. With RisingWave, you can write the entire pipeline, from raw Kafka events to composite fraud scores, in standard SQL, with no Java, no custom connectors, and no separate serving layer.
What Is Transaction Velocity and Why Does It Matter for Fraud?
A fraudster testing a stolen card typically runs a burst of small transactions in rapid succession to verify the card is active before making larger purchases. Legitimate cardholders almost never behave this way. Velocity checks formalize this intuition: if a card fires 15 transactions in 10 minutes, that is an anomaly worth investigating.
Velocity signals catch several fraud patterns:
- Card testing: Many small authorizations in a short burst
- Account takeover: A dormant account suddenly transacting at high frequency
- Merchant collusion: Unusually concentrated spend at a single merchant category
- Geographic impossibility: High spend velocity combined with location mismatch (handled in a companion pipeline)
The challenge is computing these counts in real time, continuously, at scale, across millions of cards. Batch jobs are too slow. Custom stream processing code is expensive to maintain. SQL windows on a streaming database are the practical answer.
Setting Up the Source: Kafka to RisingWave
The first step is creating a source that reads raw transaction events from a Kafka topic. RisingWave supports Kafka natively with a simple DDL statement.
CREATE SOURCE transactions (
transaction_id VARCHAR,
card_id VARCHAR,
account_id VARCHAR,
merchant_id VARCHAR,
merchant_category VARCHAR,
amount_usd DECIMAL(12, 2),
currency VARCHAR,
status VARCHAR,
event_time TIMESTAMPTZ
)
WITH (
connector = 'kafka',
topic = 'transactions',
properties.bootstrap.server = 'broker:9092',
scan.startup.mode = 'latest'
)
FORMAT PLAIN ENCODE JSON;
Each message contains a card identifier, a merchant category, an amount, and a timestamp. The event_time column drives all of the window operations below. In production you would typically also declare a watermark on it in the CREATE SOURCE statement (for example, WATERMARK FOR event_time AS event_time - INTERVAL '5 seconds') so RisingWave knows how long to wait for out-of-order events.
How Does a TUMBLE Window Velocity Check Work?
A TUMBLE window splits the timeline into non-overlapping buckets of equal length. Every event falls into exactly one bucket. This makes TUMBLE ideal for reporting and alerting on discrete intervals, such as "how many transactions did card X make in the 10-minute slot from 14:00 to 14:10?"
CREATE MATERIALIZED VIEW card_velocity_tumble_10min AS
SELECT
card_id,
window_start,
window_end,
COUNT(*) AS txn_count,
COUNT(DISTINCT merchant_id) AS unique_merchants,
SUM(amount_usd) AS total_amount
FROM TUMBLE(
transactions,
event_time,
INTERVAL '10 minutes'
)
GROUP BY
card_id,
window_start,
window_end;
RisingWave computes this incrementally. As each transaction arrives from Kafka, the database updates only the affected window bucket without reprocessing historical data. You can query this view at any time with a standard SELECT statement.
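The bucketing rule itself is simple to model. The following Python sketch (illustrative only, not how RisingWave implements it, and with made-up events) assigns each event to its 10-minute bucket and counts per card, mirroring the materialized view's grouping:

```python
from datetime import datetime, timedelta, timezone

def tumble_start(ts: datetime, size: timedelta) -> datetime:
    """Align a timestamp to the start of its non-overlapping bucket."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    buckets = (ts - epoch) // size  # whole buckets elapsed since the epoch
    return epoch + buckets * size

size = timedelta(minutes=10)
events = [
    ("card-1", datetime(2024, 5, 1, 14, 1, tzinfo=timezone.utc)),
    ("card-1", datetime(2024, 5, 1, 14, 9, tzinfo=timezone.utc)),
    ("card-1", datetime(2024, 5, 1, 14, 12, tzinfo=timezone.utc)),
]

# Count transactions per (card, bucket), like GROUP BY card_id, window_start.
counts: dict[tuple, int] = {}
for card, ts in events:
    key = (card, tumble_start(ts, size))
    counts[key] = counts.get(key, 0) + 1
# The 14:01 and 14:09 events share the 14:00-14:10 bucket; 14:12 opens a new one.
```

Every event lands in exactly one bucket, which is why a TUMBLE count maps cleanly onto discrete reporting intervals.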
To identify cards that crossed a threshold within any 10-minute window:
SELECT
card_id,
window_start,
txn_count,
total_amount
FROM card_velocity_tumble_10min
WHERE txn_count > 10
OR total_amount > 500.00
ORDER BY window_start DESC;
How Do HOP (Sliding) Windows Provide Continuous Velocity Coverage?
TUMBLE windows have a blind spot: a burst that starts at 14:09 and ends at 14:11 straddles two windows and might not breach the threshold in either one individually. HOP windows solve this by overlapping.
A HOP window is defined by two intervals: the window size (how long each window spans) and the slide interval (how often a new window starts). A card event participates in multiple overlapping windows simultaneously.
CREATE MATERIALIZED VIEW card_velocity_hop_1h AS
SELECT
card_id,
window_start,
window_end,
COUNT(*) AS txn_count,
SUM(amount_usd) AS total_amount,
MIN(amount_usd) AS min_txn,
MAX(amount_usd) AS max_txn
FROM HOP(
transactions,
event_time,
INTERVAL '5 minutes', -- slide: a new window starts every 5 min
INTERVAL '1 hour' -- size: each window covers 1 hour
)
GROUP BY
card_id,
window_start,
window_end;
With a 5-minute slide and a 1-hour window, 12 overlapping windows cover any given moment for each active card, so a burst that straddles one window boundary still lands fully inside another. A downstream query or sink can therefore flag the breach as soon as any window's count crosses the threshold. There are no gaps in coverage.
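The overlap arithmetic can be sketched in a few lines of Python (a model of the window semantics, not RisingWave internals):

```python
from datetime import datetime, timedelta, timezone

def hop_windows(ts: datetime, slide: timedelta, size: timedelta) -> list:
    """Return (start, end) for every sliding window that contains ts.
    Window starts fall on slide boundaries, mimicking HOP(..., slide, size)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Latest window start at or before ts:
    start = epoch + ((ts - epoch) // slide) * slide
    out = []
    while start + size > ts:  # window [start, start + size) still covers ts
        out.append((start, start + size))
        start -= slide
    return sorted(out)

ts = datetime(2024, 5, 1, 14, 9, tzinfo=timezone.utc)
wins = hop_windows(ts, slide=timedelta(minutes=5), size=timedelta(hours=1))
len(wins)  # 12 windows cover this instant, matching the count above
```

Because every event contributes to size / slide windows, shrinking the slide tightens detection latency at the cost of more state per event.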
Querying for cards currently flagged across any active window:
SELECT
card_id,
MAX(txn_count) AS peak_count_in_1h,
MAX(total_amount) AS peak_spend_in_1h
FROM card_velocity_hop_1h
WHERE window_end > NOW() - INTERVAL '5 minutes'
GROUP BY card_id
HAVING MAX(txn_count) > 20
OR MAX(total_amount) > 2000.00;
How Do You Track Amount-Based Velocity Over 24 Hours?
Count velocity catches card testing. Amount velocity catches high-value fraud, where a compromised card makes a small number of large purchases. A 24-hour spend limit check covers both same-day and overnight attacks.
CREATE MATERIALIZED VIEW card_spend_velocity_24h AS
SELECT
card_id,
window_start,
window_end,
COUNT(*) AS txn_count,
SUM(amount_usd) AS total_spend_24h,
AVG(amount_usd) AS avg_txn_amount,
COUNT(DISTINCT merchant_category) AS categories_used
FROM HOP(
transactions,
event_time,
INTERVAL '1 hour',
INTERVAL '24 hours'
)
GROUP BY
card_id,
window_start,
window_end;
A separate threshold table lets risk analysts tune limits per card tier without redeploying the pipeline:
CREATE TABLE card_spend_limits (
card_tier VARCHAR PRIMARY KEY,
max_spend_24h DECIMAL(12, 2),
max_txn_count INT
);
INSERT INTO card_spend_limits VALUES
('standard', 1500.00, 30),
('premium', 5000.00, 60),
('corporate', 20000.00, 150);
Joining the velocity view against the limit table gives a live breach query that risk systems can poll or subscribe to:
SELECT
v.card_id,
v.total_spend_24h,
l.max_spend_24h,
v.txn_count,
l.max_txn_count
FROM card_spend_velocity_24h v
JOIN card_spend_limits l ON l.card_tier = 'standard' -- simplification: treats every card as standard tier; in practice, join a card dimension table to resolve each card's actual tier
WHERE v.window_end > NOW() - INTERVAL '1 hour'
AND (
v.total_spend_24h > l.max_spend_24h
OR v.txn_count > l.max_txn_count
);
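The breach condition itself reduces to two comparisons per tier. Here is a Python sketch of the same logic, with limits copied from the rows above; the per-card tier lookup is the part the SQL example simplifies away:

```python
# Per-tier limits mirroring the card_spend_limits rows above.
LIMITS = {
    "standard":  {"max_spend_24h": 1500.00,  "max_txn_count": 30},
    "premium":   {"max_spend_24h": 5000.00,  "max_txn_count": 60},
    "corporate": {"max_spend_24h": 20000.00, "max_txn_count": 150},
}

def breaches(tier: str, spend_24h: float, txn_count: int) -> bool:
    """True when either the 24h spend or the transaction-count limit is exceeded."""
    lim = LIMITS[tier]
    return spend_24h > lim["max_spend_24h"] or txn_count > lim["max_txn_count"]

breaches("standard", 1600.00, 5)   # True: spend limit exceeded
breaches("premium", 1600.00, 5)    # False: within premium limits
```

Keeping the limits in a table (or dict) rather than in the query text is what lets analysts retune them without touching the pipeline.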
How Do You Build Multi-Dimensional Velocity Checks?
Single-signal velocity checks miss sophisticated fraud that stays just below per-card thresholds by spreading activity across merchants or time zones. Multi-dimensional velocity adds a second grouping key, such as merchant category, to catch concentrated attacks against a single retailer type.
CREATE MATERIALIZED VIEW card_merchant_category_velocity AS
SELECT
card_id,
merchant_category,
EXTRACT(HOUR FROM window_start) AS hour_of_day,
window_start,
window_end,
COUNT(*) AS txn_count,
SUM(amount_usd) AS category_spend,
COUNT(DISTINCT merchant_id) AS merchant_count
FROM HOP(
transactions,
event_time,
INTERVAL '15 minutes',
INTERVAL '2 hours'
)
GROUP BY
card_id,
merchant_category,
EXTRACT(HOUR FROM window_start),
window_start,
window_end;
This view reveals patterns like "card X made 18 transactions at online electronics merchants between 2 AM and 4 AM," a signature that is invisible when you only aggregate by card.
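The concentration signal boils down to grouping on a composite key. A minimal Python sketch with made-up events shows why the extra dimension matters:

```python
from collections import Counter

# Hypothetical recent events for two cards: (card_id, merchant_category)
events = [
    ("card-X", "electronics"), ("card-X", "electronics"),
    ("card-X", "electronics"), ("card-X", "grocery"),
    ("card-Y", "travel"),
]

# Grouping by the composite (card, category) key exposes concentration
# that a per-card count alone would hide.
by_card_category = Counter(events)
by_card_category[("card-X", "electronics")]  # 3 of card-X's 4 txns hit one category
```

A per-card total of 4 transactions looks benign; 3 of them hitting a single category in a short window is the pattern worth scoring.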
How Do You Combine Multiple Velocity Signals Into a Composite Fraud Score?
Individual velocity signals each carry noise. A cardholder returning from a trip might genuinely spike transaction count without committing fraud. Combining signals into a weighted score reduces false positives while preserving recall.
CREATE MATERIALIZED VIEW fraud_velocity_score AS
WITH recent_cards AS (
-- Get cards active in the last 5 minutes across any window
SELECT DISTINCT card_id
FROM card_velocity_hop_1h
WHERE window_end > NOW() - INTERVAL '5 minutes'
),
count_score AS (
SELECT
card_id,
CASE
WHEN MAX(txn_count) >= 30 THEN 40
WHEN MAX(txn_count) >= 20 THEN 25
WHEN MAX(txn_count) >= 10 THEN 10
ELSE 0
END AS score
FROM card_velocity_hop_1h
WHERE window_end > NOW() - INTERVAL '5 minutes'
GROUP BY card_id
),
spend_score AS (
SELECT
card_id,
CASE
WHEN MAX(total_spend_24h) >= 5000 THEN 40
WHEN MAX(total_spend_24h) >= 2000 THEN 25
WHEN MAX(total_spend_24h) >= 1000 THEN 10
ELSE 0
END AS score
FROM card_spend_velocity_24h
WHERE window_end > NOW() - INTERVAL '1 hour'
GROUP BY card_id
),
category_score AS (
SELECT
card_id,
CASE
WHEN MAX(txn_count) >= 10
AND COUNT(DISTINCT merchant_category) <= 2 THEN 20
WHEN MAX(txn_count) >= 5
AND COUNT(DISTINCT merchant_category) = 1 THEN 10
ELSE 0
END AS score
FROM card_merchant_category_velocity
WHERE window_end > NOW() - INTERVAL '15 minutes'
GROUP BY card_id
)
SELECT
r.card_id,
COALESCE(cs.score, 0) AS count_velocity_score,
COALESCE(ss.score, 0) AS spend_velocity_score,
COALESCE(cats.score, 0) AS category_concentration_score,
COALESCE(cs.score, 0) + COALESCE(ss.score, 0) + COALESCE(cats.score, 0) AS composite_score
-- note: now() is not allowed in a materialized view's SELECT list (only in temporal
-- filter WHERE clauses), so record the scoring timestamp downstream if you need it
FROM recent_cards r
LEFT JOIN count_score cs ON cs.card_id = r.card_id
LEFT JOIN spend_score ss ON ss.card_id = r.card_id
LEFT JOIN category_score cats ON cats.card_id = r.card_id;
A composite score of 70 or above triggers an immediate block. Scores between 40 and 69 route to step-up authentication. Below 40, the transaction proceeds normally. The thresholds live in a separate configuration table so the risk team can adjust them without touching the pipeline.
CREATE TABLE fraud_action_thresholds (
action VARCHAR PRIMARY KEY,
min_score INT,
max_score INT
);
INSERT INTO fraud_action_thresholds VALUES
('block', 70, 100),
('step_up_auth', 40, 69),
('allow', 0, 39);
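The scoring and routing logic reduces to a handful of threshold functions. This Python sketch mirrors the CASE expressions and threshold rows above (same cut-offs; the function names are mine, not part of the pipeline):

```python
def count_score(peak_txn_1h: int) -> int:
    """Score from 1-hour peak transaction count (mirrors count_score CTE)."""
    if peak_txn_1h >= 30: return 40
    if peak_txn_1h >= 20: return 25
    if peak_txn_1h >= 10: return 10
    return 0

def spend_score(peak_spend_24h: float) -> int:
    """Score from 24-hour peak spend (mirrors spend_score CTE)."""
    if peak_spend_24h >= 5000: return 40
    if peak_spend_24h >= 2000: return 25
    if peak_spend_24h >= 1000: return 10
    return 0

def category_score(peak_cat_txn: int, distinct_categories: int) -> int:
    """Score from category concentration (mirrors category_score CTE)."""
    if peak_cat_txn >= 10 and distinct_categories <= 2: return 20
    if peak_cat_txn >= 5 and distinct_categories == 1: return 10
    return 0

def action(score: int) -> str:
    """Route a composite score per the fraud_action_thresholds rows."""
    if score >= 70: return "block"
    if score >= 40: return "step_up_auth"
    return "allow"

total = count_score(22) + spend_score(2500) + category_score(12, 1)  # 25 + 25 + 20 = 70
action(total)  # "block"
```

Note that no single signal here reaches the block threshold on its own; it is the combination that pushes the card over 70, which is exactly the false-positive reduction the weighted design is after.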
How Does RisingWave Compare to Apache Flink for Velocity Fraud Detection?
Both RisingWave and Apache Flink can power velocity-based fraud detection, but they differ significantly in how rules are expressed, deployed, and maintained.
Rule authoring: RisingWave uses standard SQL with built-in TUMBLE, HOP, and SESSION window functions. Non-overlapping windows, sliding windows, and gap-based session windows all have dedicated syntax. Apache Flink's richest windowing features require Java or Scala DataStream API code. Flink SQL covers the basics, but complex multi-dimensional joins and late-event handling often push teams into the Java API.
State management: RisingWave stores incremental state in S3-compatible object storage. There is no local disk dependency. Apache Flink uses RocksDB for hot state on the TaskManager's local disk. Durability requires careful checkpoint configuration, and recovery after a node failure involves replaying from the last checkpoint plus any buffered Kafka lag.
Rule updates: Changing a velocity threshold or adding a new signal in RisingWave is a SQL ALTER statement or a new CREATE MATERIALIZED VIEW. The system routes new data through the updated logic without interrupting the rest of the pipeline. In Apache Flink, most logic changes require a full job redeployment, which involves serializing the current job state, stopping the job, redeploying the new JAR, and restoring from a savepoint.
Serving layer: RisingWave is PostgreSQL-compatible. Fraud scoring applications connect over the standard Postgres wire protocol and issue plain SELECT statements. There is no need to sink results to a separate serving database. Flink writes output to an external store, typically Redis, Cassandra, or PostgreSQL, before applications can query it, adding a hop that increases latency and operational complexity.
When Flink is the right choice: If your team has deep Flink expertise, runs on a managed Flink service, and needs custom processing logic that does not fit SQL well, Flink is a mature and proven option. For teams that want to express fraud rules in SQL, maintain them like database views, and serve scores directly to applications, RisingWave removes substantial infrastructure overhead.
Frequently Asked Questions
Does RisingWave handle late-arriving transactions?
Yes. RisingWave supports watermarks and configurable late-event handling. You set a watermark delay on the source to define how long the system waits before closing a window. Transactions arriving within the watermark delay are included in the correct window. Transactions arriving after the window closes can be routed to a side output table for manual review.
How do HOP windows affect storage?
Because a single event participates in multiple overlapping windows, HOP windows use more state than TUMBLE windows with the same window size. A 1-hour window with a 5-minute slide keeps 12 overlapping windows per group key. RisingWave stores this state in S3, so there is no local disk pressure, though you should account for increased storage costs when choosing very small slide intervals over large windows.
Can velocity rules trigger actions in real time?
Yes. You can use RisingWave's sink connectors to push high-score records to a Kafka topic, a webhook, or a PostgreSQL table that your fraud decisioning service consumes. For sub-100ms block decisions, many teams use RisingWave to maintain a materialized scoring view and query it synchronously from the authorization path.
What happens when a fraud rule needs to change mid-day?
In RisingWave, you DROP and CREATE a materialized view to change the rule logic, or you update a threshold in a configuration table if the change is parameter-only. RisingWave recomputes state from the source as needed. There is no job savepoint to manage, no JAR to rebuild, and no deployment window required.
Start Building Real-Time Fraud Detection
Transaction velocity checks are the first layer in any robust fraud detection system. With RisingWave, you can implement count velocity, amount velocity, multi-dimensional concentration checks, and composite scoring entirely in SQL, with results available via the Postgres protocol within seconds of each transaction.
Learn more about building production fraud detection pipelines at risingwave.com/real-time-fraud-detection/.

