Real-Time Defect Detection in Manufacturing Lines

Defects caught at the point of origin cost a fraction of defects found downstream—or worse, by customers. By streaming inline inspection results through RisingWave, a PostgreSQL-compatible streaming database, manufacturers can detect emerging defect patterns within seconds, trigger machine holds before the next cycle, and trace defects to specific tool changes, material lots, or operator shifts without waiting for end-of-shift reports.

Why Real-Time Defect Detection Matters in Manufacturing

The cost of a defect scales dramatically with how far downstream it travels before detection. A defective part caught by the machining center's inline gauge costs seconds of cycle time to scrap. The same defect caught at final inspection costs rework labor, possible rescheduling, and customer delivery delay. A defect escaping to the customer triggers warranty claims, potential recalls, and reputation damage.

Traditional defect detection workflows create latency at every step:

  • Inline sensors generate data but write to local databases queried only by operators at scheduled intervals.
  • Defect alerts are threshold-based on single-point values, missing patterns like a gradual drift that crosses the spec limit only on part 47 out of 50.
  • Traceability is reconstructed after the fact by correlating inspection records with production logs—a time-consuming process when speed of containment matters.
  • Tool change and lot change events are logged in the MES but not correlated with inspection streams in real time.

Real-time defect detection with streaming SQL processes every inspection result as it is generated, correlates it with process state data, and provides traceability links immediately—not hours later.

How Streaming SQL Solves This

RisingWave ingests inspection results from vision systems, CMMs, gauging stations, and functional testers via Kafka. Simultaneously, it ingests process state events—tool changes, material lot changes, operator logins, machine parameter updates—from the MES. Materialized views join these streams to provide defect counts with full traceability context, updated continuously.

Capabilities this enables:

  • Rolling defect rate per line, machine, and defect category, updated with every inspection result
  • Process event correlation: defect spike views joined against tool change and lot change events to identify root cause candidates in real time
  • Defect clustering: detecting spatial or temporal patterns in defect occurrence
  • Automatic machine hold signals: sinks that publish hold commands to SCADA when defect thresholds are breached

Building the System

Step 1: Data Source

CREATE SOURCE inspection_results (
    inspection_id   VARCHAR,
    line_id         VARCHAR,
    machine_id      VARCHAR,
    station_id      VARCHAR,
    part_serial     VARCHAR,
    part_number     VARCHAR,
    shift_code      VARCHAR,
    inspect_time    TIMESTAMPTZ,
    result          VARCHAR,   -- 'PASS','FAIL','MARGINAL'
    defect_code     VARCHAR,   -- defect category
    defect_severity INT,       -- 1=minor, 2=major, 3=critical
    tool_id         VARCHAR,
    material_lot_id VARCHAR,
    operator_id     VARCHAR,
    measurement_value FLOAT,
    usl             FLOAT,
    lsl             FLOAT
)
WITH (
    connector = 'kafka',
    topic = 'manufacturing.inspection.results',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;

CREATE SOURCE process_events (
    event_id        VARCHAR,
    line_id         VARCHAR,
    machine_id      VARCHAR,
    event_time      TIMESTAMPTZ,
    event_type      VARCHAR,   -- 'TOOL_CHANGE','LOT_CHANGE','OPERATOR_LOGIN','PARAM_CHANGE'
    reference_id    VARCHAR,   -- new tool ID, new lot ID, etc.
    previous_value  VARCHAR,
    new_value       VARCHAR
)
WITH (
    connector = 'kafka',
    topic = 'manufacturing.process.events',
    properties.bootstrap.server = 'kafka:9092',
    scan.startup.mode = 'latest'
) FORMAT PLAIN ENCODE JSON;
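For reference, a message on the manufacturing.inspection.results topic matching the schema above might look like the following (all values are illustrative, not taken from a real line):

```json
{
  "inspection_id": "INSP-000123",
  "line_id": "LINE-3",
  "machine_id": "CNC-07",
  "station_id": "GAUGE-02",
  "part_serial": "SN-48291",
  "part_number": "PN-1104-B",
  "shift_code": "A",
  "inspect_time": "2024-05-14T08:31:22Z",
  "result": "FAIL",
  "defect_code": "DIM-OOT",
  "defect_severity": 2,
  "tool_id": "T-118",
  "material_lot_id": "LOT-9920",
  "operator_id": "OP-207",
  "measurement_value": 12.087,
  "usl": 12.05,
  "lsl": 11.95
}
```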

Step 2: Real-Time View

Compute a rolling 30-minute defect rate per line and machine, with breakdown by defect category:

CREATE MATERIALIZED VIEW defect_rate_rolling AS
SELECT
    line_id,
    machine_id,
    defect_code,
    window_start,
    window_end,
    COUNT(*)                                              AS total_inspections,
    COUNT(*) FILTER (WHERE result = 'FAIL')               AS fail_count,
    COUNT(*) FILTER (WHERE result = 'MARGINAL')           AS marginal_count,
    COUNT(*) FILTER (WHERE defect_severity = 3)           AS critical_defects,
    COUNT(*) FILTER (WHERE result = 'FAIL')::FLOAT /
        NULLIF(COUNT(*), 0)                               AS defect_rate,
    AVG(measurement_value)                                AS avg_measurement,
    STDDEV_POP(measurement_value)                         AS measurement_stddev,
    -- Track which tool and lot are producing defects
    MODE() WITHIN GROUP (ORDER BY tool_id)                AS dominant_tool,
    MODE() WITHIN GROUP (ORDER BY material_lot_id)        AS dominant_lot
FROM TUMBLE(
    inspection_results,
    inspect_time,
    INTERVAL '30 minutes'
)
GROUP BY line_id, machine_id, defect_code, window_start, window_end;
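The usl and lsl columns declared in the source can also drive a rolling process capability estimate alongside the defect rate. The view below is a hedged sketch: it applies the standard Cpk formula, min(USL − mean, mean − LSL) / 3σ, reuses the document's 30-minute tumbling windows and population standard deviation, and assumes the spec limits are constant within a window (so MAX() simply picks up that constant):

```sql
-- Rolling process capability (Cpk) per machine. Sketch only: assumes
-- usl/lsl do not change mid-window; NULLIF guards against zero variance.
CREATE MATERIALIZED VIEW process_capability_rolling AS
SELECT
    line_id,
    machine_id,
    window_start,
    window_end,
    AVG(measurement_value)        AS mean_value,
    STDDEV_POP(measurement_value) AS sigma,
    LEAST(
        MAX(usl) - AVG(measurement_value),
        AVG(measurement_value) - MAX(lsl)
    ) / NULLIF(3 * STDDEV_POP(measurement_value), 0) AS cpk
FROM TUMBLE(inspection_results, inspect_time, INTERVAL '30 minutes')
GROUP BY line_id, machine_id, window_start, window_end;
```

A falling Cpk often precedes a defect-rate spike, so this view can serve as an earlier warning signal than the FAIL counts alone.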

Correlate defect spikes with recent process events for root cause identification:

CREATE MATERIALIZED VIEW defect_process_correlation AS
SELECT
    d.line_id,
    d.machine_id,
    d.window_start,
    d.defect_rate,
    d.fail_count,
    d.dominant_tool,
    d.dominant_lot,
    p.event_type AS recent_process_change,
    p.reference_id AS changed_item,
    p.event_time AS change_time
FROM defect_rate_rolling d
JOIN process_events p
    ON d.machine_id = p.machine_id
    AND p.event_time BETWEEN d.window_start - INTERVAL '1 hour' AND d.window_end
WHERE d.defect_rate > 0.05;
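Because RisingWave speaks the PostgreSQL wire protocol, quality engineers can also query the correlation view ad hoc from any SQL client; the two-hour lookback and ordering below are illustrative choices, not part of the pipeline:

```sql
-- Most recent root-cause candidates: defect spikes on any machine,
-- together with the process change that preceded them.
SELECT machine_id, defect_rate, recent_process_change, changed_item, change_time
FROM defect_process_correlation
WHERE window_start > NOW() - INTERVAL '2 hours'
ORDER BY defect_rate DESC;
```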

Step 3: Alerts

Detect defect rate spikes and critical defect occurrences, sending machine hold signals to SCADA:

CREATE MATERIALIZED VIEW defect_alerts AS
SELECT
    line_id,
    machine_id,
    defect_code,
    window_start,
    window_end,
    defect_rate,
    fail_count,
    critical_defects,
    dominant_tool,
    dominant_lot,
    CASE
        WHEN critical_defects >= 3             THEN 'MACHINE_HOLD_REQUIRED'
        WHEN defect_rate >= 0.10               THEN 'HIGH_DEFECT_RATE'
        WHEN defect_rate >= 0.05               THEN 'DEFECT_RATE_WARNING'
        ELSE 'NORMAL'
    END AS alert_type
FROM defect_rate_rolling
WHERE defect_rate >= 0.05 OR critical_defects >= 3;
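Before wiring the alerts into SCADA, the same view can back an operator dashboard directly. A hedged example query for machines currently requiring a hold:

```sql
-- Machines in hold state right now, most recent window first.
SELECT machine_id, defect_code, defect_rate, critical_defects, dominant_tool
FROM defect_alerts
WHERE alert_type = 'MACHINE_HOLD_REQUIRED'
ORDER BY window_start DESC;
```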

CREATE SINK defect_alerts_sink
FROM defect_alerts
WITH (
    connector = 'kafka',
    topic = 'manufacturing.defects.alerts',
    properties.bootstrap.server = 'kafka:9092'
) FORMAT PLAIN ENCODE JSON;

Comparison Table

| Approach                       | Defect Detection Lag | Traceability | Root Cause            | Machine Hold |
|--------------------------------|----------------------|--------------|-----------------------|--------------|
| End-of-shift quality report    | 4–8 hours            | Post-hoc     | Manual                | Manual       |
| SPC charting software (batch)  | 15–60 minutes        | Limited      | Partial               | Manual       |
| Custom streaming pipeline      | Seconds              | Real-time    | Yes                   | Automated    |
| RisingWave                     | Seconds              | Real-time    | Automatic correlation | Automated    |

FAQ

Q: How does RisingWave handle the volume of inspection data from vision systems that inspect every part? RisingWave is designed for high-throughput streaming. Vision system inspection results published to Kafka at part-by-part rates (potentially hundreds per minute per line) are ingested and processed continuously. RisingWave scales horizontally to handle fleets of production lines simultaneously.

Q: Can I implement First-Time Quality (FTQ) tracking across multi-station lines? Yes. Track each part serial number across multiple inspection stations by joining inspection results on part_serial. A part that passes station 1 but fails station 3 would appear in a cross-station join that computes FTQ correctly, accounting for rework loops.
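As a hedged sketch of that idea: since a reworked part leaves a FAIL row behind for its serial number, FTQ per part can be computed with a plain aggregation over the inspection stream—no window functions needed. The view name and the zero-FAIL definition of FTQ are assumptions to adapt to your line:

```sql
-- First-Time Quality sketch: a part is FTQ only if it never recorded a
-- FAIL at any station. A part that failed and was later reworked to PASS
-- still carries its original FAIL row, so it correctly counts as non-FTQ.
CREATE MATERIALIZED VIEW ftq_by_part AS
SELECT
    part_serial,
    COUNT(DISTINCT station_id)              AS stations_visited,
    COUNT(*) FILTER (WHERE result = 'FAIL') AS fail_events,
    COUNT(*) FILTER (WHERE result = 'FAIL') = 0 AS first_time_quality
FROM inspection_results
GROUP BY part_serial;
```

An FTQ rate per line or shift then follows from a second aggregation over this view joined back to the inspection stream.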

Q: How do I prevent false machine holds from temporary measurement noise? Use a minimum count threshold in your alert logic—for example, require at least 10 inspections in the window before the defect rate triggers an alert. Adding a HAVING COUNT(*) >= 10 clause to the aggregation ensures the alert is based on a statistically meaningful sample.
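Concretely, the guard could look like this variant of the rolling view; the threshold of 10 is an assumption to tune per line speed and window length:

```sql
-- Defect-rate view that ignores statistically thin windows: a window with
-- fewer than 10 inspections never surfaces, so a single noisy reading
-- cannot trigger a machine hold on its own.
CREATE MATERIALIZED VIEW defect_rate_rolling_guarded AS
SELECT
    line_id,
    machine_id,
    window_start,
    window_end,
    COUNT(*)                                                  AS total_inspections,
    COUNT(*) FILTER (WHERE result = 'FAIL')::FLOAT / COUNT(*) AS defect_rate
FROM TUMBLE(inspection_results, inspect_time, INTERVAL '30 minutes')
GROUP BY line_id, machine_id, window_start, window_end
HAVING COUNT(*) >= 10;
```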

Key Takeaways

  • Real-time defect detection prevents downstream propagation of quality issues by detecting pattern emergence seconds after it begins, not hours later.
  • RisingWave's ability to join inspection result streams with process event streams (tool changes, lot changes) provides automatic root cause correlation without manual investigation.
  • Critical defect count thresholds triggering machine hold signals via Kafka enable closed-loop quality control without manual operator intervention in the alert path.
  • Streaming SQL expresses defect detection logic at the business level—defect rate thresholds, severity classifications—without requiring custom streaming code or state management.
