Buy Now Pay Later has a credit problem. Not a credit quality problem -- a credit data problem.
When a first-time user clicks "Pay in 4" at checkout, traditional lenders would run a bureau pull, check a FICO score, and make a decision in a few seconds. BNPL lenders often can't do that. Thin-file consumers -- young adults, recent immigrants, people who've stayed off the credit grid -- have no meaningful bureau record. International markets may lack a centralized bureau entirely. And even where bureau data exists, the hard inquiry itself can ding a consumer's score, creating friction that kills conversion.
So BNPL lenders have built something different: behavioral underwriting. Instead of asking "what does Experian say about this person?", they ask "what has this person actually done?" Transaction history, repayment behavior, device consistency, velocity signals -- these become the inputs to a credit model built entirely from first-party and consortium data.
The catch: behavioral signals go stale fast. A user who made their last three payments on time a month ago is not the same risk as a user who missed a payment yesterday. If you're batch-computing features nightly, you're underwriting with a blindfold on.
This article shows how to build a real-time behavioral feature set for BNPL underwriting using streaming SQL -- specifically the kind of continuously-maintained materialized views that make sub-100ms feature serving practical.
Why BNPL Risk Is Different from Traditional Credit
Traditional credit underwriting is built around a periodic, centralized model. Borrowers accumulate a history over years. Lenders report to bureaus monthly. Risk models run on data that's 30 to 60 days old at the point of decision.
BNPL breaks every one of those assumptions:
Thin-file borrowers. A significant portion of BNPL users are "credit invisibles" -- the CFPB estimates roughly 26 million Americans have no credit file at all, and another 19 million have files too thin or stale to score. Bureau-dependent underwriting simply can't serve this segment. BNPL platforms that want to grow must build their own signal.
High-velocity decisions. A checkout flow that takes 30 seconds to approve a $150 basket is a checkout flow that loses sales. Users will abandon before the approval comes back. The model has to run fast, which means the features have to be precomputed and ready to serve, not computed on demand.
Frequent, small exposures. A traditional mortgage lender makes thousands of decisions per year. A mid-scale BNPL platform makes millions. Each individual exposure is small enough that any one bad decision doesn't matter much -- but model drift or stale features can shift the portfolio's risk profile in days, not quarters.
Correlated deterioration. When a consumer starts missing BNPL payments, they often miss them at multiple lenders simultaneously. This is the BNPL-specific version of the credit spiral problem. A bureau-based model would catch this slowly (lenders report monthly). A behavioral model with real-time data from multiple sources can catch it within hours.
The regulatory environment is also distinct. In the US, the CFPB has been expanding oversight of BNPL products, with increasing pressure around adverse action notices and explainability requirements. In the UK, the FCA has moved BNPL into regulated territory. Any behavioral feature set needs to be built with explainability in mind from the start -- you can't just throw a 500-feature gradient-boosted model at a regulator and call it a day.
Key Feature Signals
Before writing any SQL, it helps to understand which signals actually matter for BNPL risk. The academic literature and practitioner experience both converge on a few categories:
Repayment behavior is the most predictive signal, by a wide margin. Whether someone pays on time, how late they pay when they miss, and whether lateness is trending in either direction -- these features carry more information than almost anything else. The recency of the last late payment matters enormously. A user who was chronically late two years ago but has been perfect for the last 18 months is a different risk profile than someone whose first missed payment was last week.
Application velocity is a strong adverse signal. When someone is applying to multiple BNPL lenders in a short window, they're either rate-shopping (unlikely, given how similar BNPL terms are) or they're in financial distress and looking for available credit anywhere. Multiple applications across multiple lenders within 7 days is one of the strongest early-warning signals available.
Purchase pattern features capture behavioral consistency. Does the user's basket size stay relatively stable, or are they suddenly requesting much larger amounts? Are they buying in categories consistent with their history? Sudden shifts in purchase behavior can precede credit deterioration.
Device and session consistency provides fraud signal more than credit signal, but the two often correlate. Users who suddenly switch devices, rotate through VPNs, or show unusual session behavior warrant additional scrutiny -- not because they're necessarily fraudulent, but because behavioral inconsistency across dimensions is itself a risk flag.
Data Architecture: What Events to Capture
A behavioral feature pipeline needs three primary event streams:
Purchase events -- fired when a user initiates or completes a BNPL transaction. At minimum: user_id, merchant_id, requested_amount, approved_amount, product_category, device_id, session_id, event_time, lender_id.
Repayment events -- fired when an installment is due or paid. At minimum: user_id, loan_id, installment_number, due_date, paid_date, amount_due, amount_paid, paid_on_time (boolean), days_late, lender_id, event_time.
Application events -- fired when a user applies for BNPL credit at any participating lender. At minimum: user_id, lender_id, requested_amount, decision (approved/denied/pending), event_time.
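Concretely, these events can be encoded as typed records. The sketch below follows the repayment schema above; the field types are assumptions (the article specifies only field names), and the `is_adverse` helper is illustrative:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Typed record for the repayment stream. Field names follow the schema in
# the text; types are reasonable guesses, not a wire-format specification.
@dataclass
class RepaymentEvent:
    user_id: str
    loan_id: str
    installment_number: int
    due_date: datetime
    paid_date: Optional[datetime]  # None if the installment is still unpaid
    amount_due: float
    amount_paid: float
    paid_on_time: bool
    days_late: int
    lender_id: str
    event_time: datetime

def is_adverse(event: RepaymentEvent) -> bool:
    """An installment paid late (or not at all) is an adverse event."""
    return not event.paid_on_time

evt = RepaymentEvent(
    user_id="u1", loan_id="l1", installment_number=2,
    due_date=datetime(2024, 5, 1), paid_date=datetime(2024, 5, 4),
    amount_due=50.0, amount_paid=50.0, paid_on_time=False,
    days_late=3, lender_id="lender_a", event_time=datetime(2024, 5, 4),
)
print(is_adverse(evt))  # True
```

The purchase and application streams would get analogous record types; what matters is that every stream shares user_id, lender_id, and event_time so the views below can key and window consistently.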
These events flow into a Kafka topic (or equivalent) and are consumed by the streaming SQL layer. The key architectural principle is that features should be maintained incrementally -- every new event updates the relevant materialized views immediately, so that the feature store is always within seconds of current.
```
    Kafka Topics
repayment_events   ──┐
purchase_events    ──┼──► Streaming SQL (RisingWave) ──► Feature Store    ──► Model Serving
application_events ──┘    (Materialized Views)          (Postgres/Redis)      (< 100ms)
```
SQL Implementation
Here are the core materialized views for a BNPL feature pipeline.
Repayment Behavior
Repayment history is the anchor feature. This view looks back 12 months, which is long enough to capture behavioral patterns while keeping the signal reasonably current.
```sql
CREATE MATERIALIZED VIEW repayment_behavior AS
SELECT
    user_id,
    COUNT(*) AS total_installments_due,
    COUNT(*) FILTER (WHERE paid_on_time = true) AS on_time_payments,
    COUNT(*) FILTER (WHERE paid_on_time = false AND days_late <= 7) AS slightly_late,
    COUNT(*) FILTER (WHERE paid_on_time = false AND days_late > 7) AS seriously_late,
    AVG(days_late) FILTER (WHERE days_late > 0) AS avg_days_late,
    MAX(event_time) FILTER (WHERE paid_on_time = false) AS last_late_payment_at,
    -- On-time rate: null if no history (treat as unknown, not zero risk)
    CASE
        WHEN COUNT(*) > 0
        THEN COUNT(*) FILTER (WHERE paid_on_time = true)::FLOAT / COUNT(*)
        ELSE NULL
    END AS on_time_rate
FROM repayment_events
WHERE event_time >= NOW() - INTERVAL '12 months'
GROUP BY user_id;
```
A few design choices worth noting:
- The CASE expression guarding on_time_rate returns NULL rather than dividing by zero. This is intentional. A null on-time rate is not the same as a zero on-time rate -- it means the model should treat this user as unknown, not as high-risk. Users with no repayment history at all never appear in this view, so their feature lookup misses entirely; your model should handle null and missing features explicitly, and the same way.
- last_late_payment_at gives you the recency of the most recent adverse event. The time elapsed since this timestamp is one of the most useful derived features you can generate at serving time.
- Separating slightly_late from seriously_late matters. A user with 10% of payments 1-3 days late is behaviorally different from a user with 10% of payments 30+ days late. Collapsing these into a single "late payment rate" throws away information.
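The recency derivation is trivial but easy to get wrong at the edges. A sketch of the serving-time helper, assuming the feature store returns the raw last_late_payment_at timestamp (or nothing):

```python
from datetime import datetime, timezone
from typing import Optional

def days_since_last_late(last_late_payment_at: Optional[datetime],
                         now: Optional[datetime] = None) -> Optional[float]:
    """Recency of the most recent adverse event, derived at serving time.

    Returns None when the user has never been late (or has no history):
    'no adverse event on record' is a distinct state the model should see
    as unknown, not as some arbitrarily large number of days.
    """
    if last_late_payment_at is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - last_late_payment_at).total_seconds() / 86400.0

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
late = datetime(2024, 5, 2, tzinfo=timezone.utc)
print(days_since_last_late(late, now))   # 30.0
print(days_since_last_late(None, now))   # None
```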
Application Velocity
Application velocity is the real-time early-warning signal. It has a short lookback window by design.
```sql
CREATE MATERIALIZED VIEW application_velocity AS
SELECT
    user_id,
    COUNT(*) AS applications_7d,
    COUNT(DISTINCT lender_id) AS unique_lenders_7d,
    SUM(requested_amount) AS total_requested_7d,
    COUNT(*) FILTER (WHERE decision = 'denied') AS denials_7d,
    -- Denial rate: high denial rate compounds the velocity signal
    CASE
        WHEN COUNT(*) > 0
        THEN COUNT(*) FILTER (WHERE decision = 'denied')::FLOAT / COUNT(*)
        ELSE 0
    END AS denial_rate_7d
FROM application_events
WHERE event_time >= NOW() - INTERVAL '7 days'
GROUP BY user_id;
```
unique_lenders_7d is particularly valuable. A single lender with multiple applications might be a technical retry. Multiple distinct lenders in one week is a much stronger signal.
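One way to consume these columns downstream is a simple early-warning flag combining breadth and denials. The thresholds below are illustrative assumptions, not tuned values -- in practice they'd come from model calibration:

```python
def velocity_flag(applications_7d: int, unique_lenders_7d: int,
                  denial_rate_7d: float,
                  app_threshold: int = 3, lender_threshold: int = 2) -> bool:
    """Heuristic early-warning flag from the application_velocity columns.

    Fires when the user is shopping widely (several applications across
    several distinct lenders) or being repeatedly declined elsewhere.
    """
    shopping_widely = (applications_7d >= app_threshold
                       and unique_lenders_7d >= lender_threshold)
    being_declined = denial_rate_7d >= 0.5 and applications_7d >= 2
    return shopping_widely or being_declined

print(velocity_flag(4, 3, 0.0))  # True  -- four applications, three lenders
print(velocity_flag(1, 1, 0.0))  # False -- a single application is normal
```

Note that repeated applications to one lender (a likely technical retry) don't trip the first condition, which is exactly the distinction unique_lenders_7d exists to make.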
Purchase Pattern
Purchase pattern features capture behavioral consistency over a longer window.
```sql
CREATE MATERIALIZED VIEW purchase_pattern AS
SELECT
    user_id,
    COUNT(*) AS total_purchases_90d,
    AVG(approved_amount) AS avg_basket_size_90d,
    STDDEV(approved_amount) AS basket_size_stddev_90d,
    MAX(approved_amount) AS max_basket_size_90d,
    -- Most recent 30 days vs prior 60 days: are baskets growing?
    AVG(approved_amount) FILTER (
        WHERE event_time >= NOW() - INTERVAL '30 days'
    ) AS avg_basket_last_30d,
    AVG(approved_amount) FILTER (
        WHERE event_time >= NOW() - INTERVAL '90 days'
        AND event_time < NOW() - INTERVAL '30 days'
    ) AS avg_basket_prior_60d,
    COUNT(DISTINCT merchant_id) AS distinct_merchants_90d,
    COUNT(DISTINCT product_category) AS distinct_categories_90d
FROM purchase_events
WHERE event_time >= NOW() - INTERVAL '90 days'
  -- purchase_events carries no decision field; an approved amount implies approval
  AND approved_amount IS NOT NULL
GROUP BY user_id;
```
The comparison between avg_basket_last_30d and avg_basket_prior_60d gives you a trend signal. A user whose average basket has jumped 3x in the last month is asking for more credit than their established behavior would predict -- which may be fine, or may be an early deterioration signal.
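The trend itself is a derived ratio, and the null handling deserves the same care as on_time_rate. A sketch, with the "unknown, not zero" convention carried through:

```python
from typing import Optional

def basket_trend(avg_last_30d: Optional[float],
                 avg_prior_60d: Optional[float]) -> Optional[float]:
    """Ratio of recent to established basket size.

    None means 'not enough history to compute a trend' -- either window
    is empty. The model should treat that as unknown, not as a trend of
    1.0 (stable) or 0.0 (shrinking).
    """
    if avg_last_30d is None or avg_prior_60d is None or avg_prior_60d == 0:
        return None
    return avg_last_30d / avg_prior_60d

print(basket_trend(300.0, 100.0))  # 3.0  -- baskets tripled in the last month
print(basket_trend(120.0, None))   # None -- no prior-60d history to compare against
```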
Device Consistency
Device signals sit at the boundary between credit risk and fraud. In practice, the distinction matters less than the signal.
```sql
CREATE MATERIALIZED VIEW device_consistency AS
SELECT
    user_id,
    COUNT(DISTINCT device_id) AS distinct_devices_30d,
    COUNT(DISTINCT device_id) FILTER (
        WHERE event_time >= NOW() - INTERVAL '7 days'
    ) AS distinct_devices_7d,
    -- Heuristic: more distinct devices in the last 7d than in the prior 23d.
    -- An approximation of "a new device was introduced" -- it can miss a
    -- switch that replaces an old device one-for-one.
    COUNT(DISTINCT device_id) FILTER (
        WHERE event_time >= NOW() - INTERVAL '7 days'
    ) > COUNT(DISTINCT device_id) FILTER (
        WHERE event_time >= NOW() - INTERVAL '30 days'
        AND event_time < NOW() - INTERVAL '7 days'
    ) AS new_device_introduced
FROM purchase_events
WHERE event_time >= NOW() - INTERVAL '30 days'
GROUP BY user_id;
```
Serving Features at Checkout
The feature pipeline only delivers value if features are available at the moment of the underwriting decision. For BNPL, that moment is checkout, and the SLA is roughly 100ms from request to decision.
The architecture that makes this work:
- Streaming SQL continuously maintains the materialized views as new events arrive. The views reflect the state of the world within seconds of any new event.
- A feature serving layer (typically Postgres with a Redis cache layer in front) reads from the materialized views on a short poll or via change data capture. Features are pre-materialized and available for point-lookup by user_id.
- At checkout, the model serving layer does a single key lookup per user -- not a query, not a join, just a read -- against the feature store. This retrieval takes single-digit milliseconds.
- The risk model scores the pre-fetched feature vector and returns a decision.
The critical property here is that no computation happens at serving time. Every derived metric -- on-time rate, velocity counts, basket trends -- was computed incrementally as events arrived and stored ready to serve. The serving path is pure read.
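The serving path is short enough to sketch end to end. Here a plain dict stands in for the feature store (a real deployment would point-read Redis or Postgres); the feature names come from the views above, and the defaults encode the cold-start convention:

```python
from typing import Optional

# In-memory stand-in for the feature store. Keys are user_ids; values are
# the pre-materialized feature rows written by the streaming layer.
FEATURE_STORE = {
    "user_42": {
        "on_time_rate": 0.95, "applications_7d": 1,
        "avg_basket_last_30d": 120.0, "distinct_devices_7d": 1,
    },
}

# Cold-start defaults: counts default to 0 (we truly observed nothing),
# rates and averages default to None (unknown, not zero risk).
DEFAULTS = {"on_time_rate": None, "applications_7d": 0,
            "avg_basket_last_30d": None, "distinct_devices_7d": 0}

def fetch_features(user_id: str) -> dict:
    """Single key lookup; no computation, no joins on the serving path.
    A miss means a first-time applicant and falls through to defaults."""
    row = FEATURE_STORE.get(user_id)
    if row is None:
        return dict(DEFAULTS)
    return {k: row.get(k, default) for k, default in DEFAULTS.items()}

print(fetch_features("user_42")["on_time_rate"])   # 0.95
print(fetch_features("new_user")["on_time_rate"])  # None -- unknown user
```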
For users with no history (first-time applicants), you'll need a fallback: device-based signals, merchant context, or a conservative default policy. First-time applicants are handled differently from returning users in almost every BNPL model.
Handling Data from Multiple BNPL Providers
The application velocity signal is only as good as the breadth of your data. If you can only see applications at your own platform, a user who has applied at five other BNPL lenders will look clean to you.
This is the motivation for BNPL data-sharing consortiums. Several industry initiatives have emerged to allow participating lenders to share application and repayment events (with appropriate user consent and privacy controls). From an engineering perspective, consortium data arrives as an additional stream of events from external sources, conforming to the same schema as your internal events -- the lender_id field distinguishes them.
A few implementation considerations:
Consent and permissioning. Users must consent to data sharing. Your event schema needs to carry consent status, and your feature pipeline needs to filter events based on whether the user's consent covers the use case. This is not optional -- regulators in both the US and UK have been explicit about consent requirements for data sharing in financial services.
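One way this filtering can look in the pipeline: a per-event consent check before anything reaches a feature view. The consent_scopes field is an assumption about how consent status could ride along on each event, not a consortium standard:

```python
def usable_for_underwriting(event: dict) -> bool:
    """Only events whose consent scope covers underwriting may feed the
    feature pipeline. Missing consent metadata is treated as no consent --
    fail closed, never open."""
    return "underwriting" in event.get("consent_scopes", [])

events = [
    {"user_id": "u1", "consent_scopes": ["underwriting", "marketing"]},
    {"user_id": "u2", "consent_scopes": []},
]
usable = [e["user_id"] for e in events if usable_for_underwriting(e)]
print(usable)  # ['u1']
```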
Data freshness SLAs. Consortium data typically has higher latency than your own events. An event that happened at another lender may take minutes to hours to arrive in your stream. Your feature definitions should account for this by treating consortium data as inherently less fresh than first-party data. You may want separate features for "first-party repayment rate" and "consortium repayment rate."
Identity resolution. Matching a user across multiple lenders requires a common identifier -- typically phone number or email, hashed before transmission. The quality of your identity resolution directly affects the usefulness of consortium data. A user who registers at each lender with a different email address will look like multiple distinct users.
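The hashing step is simple, but the normalization before it is what determines match quality. A sketch -- SHA-256 is illustrative here, and in practice the consortium agrees on the exact normalization and hashing scheme:

```python
import hashlib

def hashed_identifier(raw: str) -> str:
    """Normalize, then hash. Trivial variations ('Alice@X.com' vs
    'alice@x.com ') must not fracture one user into multiple identities,
    so normalization happens before hashing, identically at every lender."""
    normalized = raw.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Same user, different capitalization and whitespace -> same identifier
print(hashed_identifier("Alice@Example.com ") == hashed_identifier("alice@example.com"))  # True
```

Normalization only gets you so far: as the text notes, a user who registers with genuinely different email addresses at each lender still looks like distinct users, which is why phone numbers are often hashed alongside emails.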
Regulatory Considerations
Behavioral underwriting with streaming features doesn't exempt you from adverse action requirements. In the US, if you deny a consumer credit based on any factors -- including behavioral features -- you must provide an adverse action notice identifying the principal reasons for the denial. The CFPB's Regulation B requires this, and it applies regardless of whether you used a traditional bureau score or a proprietary behavioral model.
This has direct implications for your feature design:
Explainability over opacity. A feature like on_time_rate or applications_7d can be explained to a consumer ("your application was denied because you have a high number of recent applications to other credit providers"). A dense embedding from a neural network cannot. Regulators and courts have increasingly taken the position that "the model said so" is not an adequate explanation.
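Interpretable features make the adverse-action mapping almost mechanical. A sketch, assuming the model can attribute a risk contribution to each named feature (the reason strings and scores below are illustrative, and real notices need legal review):

```python
# Map interpretable feature names (from the views above) to consumer-facing
# adverse-action reason text. Illustrative wording, not approved notice copy.
REASON_TEXT = {
    "applications_7d": "High number of recent credit applications",
    "on_time_rate": "History of late or missed payments",
    "avg_basket_last_30d": "Requested amount inconsistent with purchase history",
}

def principal_reasons(contributions: dict, top_n: int = 2) -> list:
    """Return reason text for the features that contributed most to the
    denial, largest positive risk contribution first."""
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return [REASON_TEXT[name] for name, _ in ranked[:top_n] if name in REASON_TEXT]

reasons = principal_reasons({"applications_7d": 0.41,
                             "on_time_rate": 0.22,
                             "avg_basket_last_30d": 0.05})
print(reasons[0])  # High number of recent credit applications
```

This is exactly the step a dense embedding cannot support: there is no name to rank and no sentence to emit.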
Avoid prohibited basis. Features that proxy for protected characteristics (race, national origin, sex, religion, etc.) are prohibited under the Equal Credit Opportunity Act. This includes features that are facially neutral but correlate with protected class membership. Device type, merchant category, and location features all require careful analysis before use. This is an area where legal and compliance review is non-negotiable.
Model monitoring. Streaming features that change continuously can cause model behavior to drift in ways that batch models don't. A feature that was uncorrelated with a protected characteristic when the model was trained may drift into correlation as the underlying population shifts. Ongoing monitoring of disparate impact across protected classes is required, not optional.
The FCA in the UK has similar requirements under the Consumer Duty framework, with additional emphasis on ensuring that BNPL products don't cause foreseeable harm to vulnerable customers.
FAQ
Does streaming SQL require a completely new data infrastructure?
Not necessarily. If you're already running Kafka (or a similar event stream), adding a streaming SQL layer like RisingWave sits between your existing Kafka topics and your feature store. You don't have to replace your existing batch jobs immediately -- you can run both in parallel and migrate feature by feature.
How do I handle users who haven't consented to data sharing?
Treat them as first-time applicants even if you have some internal history. The alternative -- using data for purposes the user didn't consent to -- creates regulatory and reputational risk that isn't worth the marginal model improvement.
What's a reasonable lookback window for repayment features?
12 months is the common choice for "established" features. Shorter windows (30-90 days) are useful for detecting recent changes in behavior. You may want both: a 12-month on-time rate for baseline risk, and a 30-day on-time rate that can flag recent deterioration. The two in combination are more predictive than either alone.
How do I handle the cold-start problem for new users?
A few approaches: (1) Use device-based features and merchant context as a proxy for risk. (2) Apply a conservative approval threshold for first-time applicants with a lower credit limit. (3) For markets where it's available and legal, use a bureau soft pull that doesn't affect the consumer's score. The cold-start problem doesn't have a perfect solution -- it's a fundamental tradeoff between inclusion and risk.
What about model drift?
Streaming features don't prevent model drift -- they make it visible faster. Because your features reflect the current state of the world in near-real-time, model performance metrics also reflect current reality. A sudden shift in approval rates, default rates, or feature distributions is visible within days rather than months. This is an advantage, but it requires that you have monitoring in place to catch and act on drift signals quickly.
Can behavioral features replace bureau data entirely?
In thin-file markets, they have to. In markets with good bureau coverage, the best models typically combine both. Bureau data provides historical context that goes back further than your own customer history. Behavioral data provides recency and specificity that bureau data can't match. They're complements, not substitutes -- though for a BNPL player entering a new market, building on behavioral data first and adding bureau data later is a practical path.
The fundamental insight behind real-time behavioral underwriting is that risk is a dynamic property, not a static one. A consumer's creditworthiness today is not the same as their creditworthiness last week, and certainly not the same as whatever score a bureau computed from month-old data. Streaming SQL makes it practical to compute the features that reflect that dynamic reality -- and to serve them fast enough to make a decision before the user gives up and walks away from checkout.
The engineering investment is real, but it's smaller than building a batch feature pipeline that you'll need to replace in two years anyway. Start with the three or four highest-signal features -- repayment on-time rate, application velocity, and basket trend -- and build the infrastructure that lets you add more without re-architecting. That's a foundation you can grow a model on.

