Case Study — Manufacturing

Document AI for Quality Assurance

A European industrial manufacturer replaced manual processing of supplier QA documents with document AI. 78% of documents are now auto-processed end-to-end across 30+ supplier templates, and the engagement paid back in 8 months.

The Setting

The client is a European industrial manufacturer of precision components for the automotive, aerospace and industrial-machinery sectors, with around 1,400 staff across three production sites. Incoming materials and components arrived with supplier QA documentation: certificates of conformity, material composition reports, dimensional inspection reports, and corrective-action records when prior shipments had deviations. Roughly 1,200 such documents arrived weekly, spread across 30+ supplier template formats ranging from clean structured PDFs to scanned legacy templates.

A team of five QA technicians manually processed each document: extracting key fields, comparing against the purchase-order specification, flagging deviations, and entering structured data into the ERP system. The process was the bottleneck for releasing components to production. The Director of Quality wanted to evaluate whether document-AI could absorb the routine cases and leave the technicians focused on actual exceptions.

What Made This Hard

Extreme document heterogeneity

The 30+ supplier templates shared no schema. Large suppliers sent clean structured PDFs, smaller suppliers sent scanned legacy templates, and some documents carried handwritten amendments. A generic OCR-plus-LLM approach risked silent extraction errors on the legacy templates.

Tight ERP integration

Extracted data had to be written into the existing ERP system (an older but central installation), which enforced strict validation rules on every field. A document that almost parsed had to fail cleanly rather than write partial data and corrupt the QA record.
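The fail-cleanly requirement can be illustrated with a minimal sketch. The field names, the rules, and the in-memory `staging` list are hypothetical stand-ins, since the case study does not describe the actual ERP schema or validation rules; the point is only that every field is checked before anything is written.

```python
from dataclasses import dataclass
from typing import Callable


def _is_float(value: str) -> bool:
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical validation rules; the real ERP rules are not given in the case study.
RULES: dict[str, Callable[[str], bool]] = {
    "po_number": lambda v: v.startswith("PO-") and v[3:].isdigit(),
    "heat_number": lambda v: len(v) > 0,
    "hardness_hrc": lambda v: _is_float(v) and 0.0 < float(v) < 80.0,
}


@dataclass
class StagingResult:
    accepted: bool
    errors: list[str]


def stage_document(fields: dict[str, str], staging: list[dict[str, str]]) -> StagingResult:
    """Validate every field first; write to staging only if all of them pass."""
    errors: list[str] = []
    for name, rule in RULES.items():
        if name not in fields:
            errors.append(f"{name}: missing")
        elif not rule(fields[name]):
            errors.append(f"{name}: invalid value {fields[name]!r}")
    if errors:
        # Reject the whole document; nothing is written, so no partial record exists.
        return StagingResult(False, errors)
    staging.append(dict(fields))
    return StagingResult(True, [])
```

A rejected document leaves `staging` untouched and returns the list of failing fields, which is the information a technician would need in the manual queue.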

In-house engineering capability

The manufacturer's IT team had strong ERP and integration experience but limited AI/ML exposure. The engagement model needed to transfer enough know-how that the team could maintain and extend the system after handover, rather than simply deliver a black box.

How the Methodology Was Applied

Total engagement duration: 16 weeks across Phases 1-4, counting the phase durations listed below (Phase 3 ran in parallel to Phase 2). Phase 4 oversight was extended to support the in-house team during build and through stabilization.

Phase 1 — Discovery (2 weeks)

Scope was the supplier QA document workflow specifically, excluding internal QA documents (different governance, different consumers). Baseline metric: documents processed per QA-technician-day (current state: ~48 documents/day). Risk classification: limited-risk under the EU AI Act framework. Notable scope decision in Discovery: corrective-action documents were excluded from initial scope because those triggered downstream contractual obligations and were judged too high-stakes for a first deployment.
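As a quick consistency check, the baseline of roughly 48 documents per technician-day follows from the volumes quoted earlier if a five-day working week is assumed. The five-day week is an assumption; the case study does not state it.

```python
WEEKLY_DOCUMENTS = 1200     # from the case study: roughly 1,200 documents per week
TECHNICIANS = 5             # from the case study: team of five QA technicians
WORKING_DAYS_PER_WEEK = 5   # assumption, not stated in the case study


def docs_per_technician_day(weekly_docs: int, technicians: int, days_per_week: int) -> float:
    """Average number of documents one technician handles per working day."""
    return weekly_docs / (technicians * days_per_week)


baseline = docs_per_technician_day(WEEKLY_DOCUMENTS, TECHNICIANS, WORKING_DAYS_PER_WEEK)
print(baseline)  # 48.0
```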

Phase 2 — Architecture (3 weeks)

Hybrid extraction pipeline: structured templates (about 12 of the 30) routed through a deterministic field extractor. Unstructured / legacy templates routed through document-AI (OCR-plus-LLM with structured output schema). Confidence scoring per field; documents with any field below a configurable confidence threshold routed to the manual review queue rather than written to ERP. Integration via the ERP's batch staging tables, with validation rules enforced at staging — meaning a failed validation rejected the entire document for manual handling.
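The routing logic described above might look roughly like the following sketch. The template identifiers, the `ExtractedField` type, and the 0.90 default threshold are illustrative assumptions; the case study says only that the threshold was configurable.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    STAGING = "erp_staging"
    MANUAL_REVIEW = "manual_review"


@dataclass(frozen=True)
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extractor


# Hypothetical identifiers for templates handled by the deterministic extractor.
STRUCTURED_TEMPLATES = {"supplier_a_coc_v2", "supplier_b_dimensional"}


def choose_extractor(template_id: str) -> str:
    """Structured templates go to the deterministic extractor, the rest to document AI."""
    return "deterministic" if template_id in STRUCTURED_TEMPLATES else "document_ai"


def route_document(fields: list[ExtractedField], threshold: float = 0.90) -> Route:
    """Send the document to manual review if any single field is below the threshold."""
    if not fields or any(f.confidence < threshold for f in fields):
        return Route.MANUAL_REVIEW
    return Route.STAGING
```

Routing on the weakest field rather than on an average is what keeps one doubtful value from reaching the ERP behind several confident ones.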

Phase 3 — Governance Design (2 weeks, parallel to Phase 2)

Twelve-control baseline applied. Notable choices: rollback was a per-supplier feature flag (so a specific supplier's documents could be reverted to manual processing without disabling the system globally if their template changed unexpectedly). Monitoring captured per-supplier and per-field extraction accuracy, with weekly drift reports. Named accountable owner: Director of Quality, with the CIO as escalation.
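A per-supplier rollback flag can be sketched in a few lines. The `SupplierFlags` class and the supplier identifiers are hypothetical; a production system would presumably persist the flags in a database or configuration service rather than in memory.

```python
class SupplierFlags:
    """Tracks, per supplier, whether documents are auto-processed or handled manually."""

    def __init__(self, onboarded_suppliers: list[str]) -> None:
        self._automated = {supplier: True for supplier in onboarded_suppliers}

    def is_automated(self, supplier_id: str) -> bool:
        # Suppliers that were never onboarded default to manual processing.
        return self._automated.get(supplier_id, False)

    def roll_back(self, supplier_id: str) -> None:
        """Revert one supplier to manual processing without touching the others."""
        if supplier_id in self._automated:
            self._automated[supplier_id] = False

    def re_enable(self, supplier_id: str) -> None:
        """Switch a supplier back to automation after its template is re-onboarded."""
        if supplier_id in self._automated:
            self._automated[supplier_id] = True


flags = SupplierFlags(["supplier_a", "supplier_b", "supplier_c"])
flags.roll_back("supplier_b")  # e.g. supplier_b changed its template unexpectedly
```

After the rollback, `supplier_a` and `supplier_c` continue through the automated pipeline while `supplier_b` documents go to the technicians.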

Phase 4 — Implementation Oversight (9 weeks)

Engineering was done entirely by the in-house team, with Slavin AI providing weekly architecture reviews, code-level feedback at governance gates, and pair-design sessions on the more complex extraction patterns. Two governance gates: gate 1 at week 4 (extraction accuracy on golden set), gate 2 at week 7 (ERP integration + rollback validation). The team explicitly chose to onboard suppliers in waves of 5-6 templates rather than all 30 at once; the first wave validated the pipeline architecture, subsequent waves were faster.

Measured Against the Baseline

Auto-processed end-to-end

78% of incoming documents processed without human intervention. The remaining 22% routed to the manual review queue (either low confidence on at least one field, or template not yet onboarded).

QA technician reallocation

From 5 FTE on document processing to 2 FTE handling the manual review queue and exceptional cases. The reallocated 3 FTE moved to supplier-quality improvement work, which the team had previously not had capacity for.

Field-level extraction accuracy

96-99% per supplier-template combination in steady state. Suppliers below 95% were reviewed for template changes and rolled back to manual processing pending re-onboarding.
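The 95% review rule could be computed from audited samples along these lines. This is a sketch under the assumption that accuracy is measured from human-verified spot checks recorded as `(supplier_id, field_name, is_correct)` tuples; the case study does not say how the accuracy figures were obtained.

```python
from collections import defaultdict
from typing import Iterable

REVIEW_THRESHOLD = 0.95  # from the case study: suppliers below 95% were reviewed

Sample = tuple[str, str, bool]  # (supplier_id, field_name, is_correct), hypothetical shape


def accuracy_by_supplier(samples: Iterable[Sample]) -> dict[str, float]:
    """Fraction of audited fields that were extracted correctly, per supplier."""
    totals: dict[str, int] = defaultdict(int)
    correct: dict[str, int] = defaultdict(int)
    for supplier_id, _field_name, is_correct in samples:
        totals[supplier_id] += 1
        correct[supplier_id] += int(is_correct)
    return {supplier: correct[supplier] / totals[supplier] for supplier in totals}


def suppliers_to_review(samples: Iterable[Sample], threshold: float = REVIEW_THRESHOLD) -> list[str]:
    """Suppliers whose field-level accuracy has dropped below the threshold."""
    scores = accuracy_by_supplier(samples)
    return sorted(supplier for supplier, score in scores.items() if score < threshold)
```

The same tuples could be grouped by `field_name` instead of `supplier_id` to produce the per-field view mentioned in the governance design.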

Payback period

8 months from go-live, calculated against the engagement cost plus internal engineering cost and ongoing infrastructure cost. Driven primarily by FTE reallocation and reduction in production hold time for routine inbound material.

Production hold time

Reduced from an average of 26 hours to under 4 hours for components with routine QA documentation. Hold time for components requiring manual QA review was unchanged.

Handover quality

Full operational handover at month 6. No ongoing Slavin AI engagement required for support. The team independently onboarded 4 additional supplier templates in the 6 months following handover.

What We Would Change

Onboarding waves should be smaller, faster, more frequent

Initial supplier-onboarding waves of 5-6 templates were judged too large; some templates within a wave were significantly more complex than others, slowing the entire wave. Future engagements: 2-3 templates per wave, more waves, faster cadence.

Corrective-action documents back in scope earlier

Excluding corrective-action documents from initial scope was the right call in Discovery. Six months in, however, the team had the confidence and the system maturity to handle them, and manual processing of corrective actions had become the new bottleneck. Future engagements: revisit the scope-exclusion list at month 4-6, not at month 12+.

Per-supplier feature flags were the standout architecture decision

The choice in Phase 2 to make rollback per-supplier rather than global proved load-bearing. Three suppliers changed templates unexpectedly in the first year; each was reverted to manual processing within hours of detection, with zero impact on the other 27 suppliers. Future engagements: per-source rollback granularity is the default, not a sophistication.

Other Engagements

Healthcare — HIPAA-Compliant Clinical Knowledge RAG

Regional healthcare provider, clinical-knowledge assistant, 42% reduction in time-to-answer with zero PHI leakage.

Financial Services — Compliance Officer Assistant

Mid-tier asset manager, AI assistant for compliance officers, 3.1x review throughput with audit-grade source attribution.

Methodology

The four-phase engagement model that drove this case study.

A Manufacturing AI Initiative in Mind?

An Architecture Review identifies the highest-leverage workflow, scope-exclusion decisions, and the integration constraints that will shape the build.