AI Governance Framework

AI Governance Is the Difference Between Capability and Liability

A practical framework: maturity assessment, framework selection, a baseline control checklist, and the LLM risk taxonomy used in real production deployments.

Governance Precedes Implementation

AI failures in regulated industries are rarely failures of model accuracy. More often they are failures of inventory, classification, control documentation, monitoring, and accountability: the five dimensions of governance maturity. A governance program built after the first incident takes more work and carries far more reputational cost than one built before deployment.

📋

Inventory and Classification First

Most enterprises do not know how many AI systems are in production until they look. A baseline inventory and per-system risk classification is the prerequisite for every other control.

🎯

Controls Scale With Risk

A chatbot answering FAQs does not need the same controls as an AI system making credit decisions. Classification drives proportional control selection, not blanket policies.

👤

Accountability Is Named

Every AI system needs a named human owner with the authority to disable it. "The platform team" is not a name.

Where Your Organization Actually Stands

Level assessment is per-dimension, not per-organization. Most enterprises sit at Level 1-2 on inventory and Level 3+ on individual technical controls — the unevenness is the actual problem.

Level 1 — Ad Hoc

AI systems exist in production but are not centrally tracked. Risk classification is informal. Controls vary per team. Monitoring is reactive (incidents discovered by users, not systems). No named owner per system.

Signal: the question "how many AI systems do we run in production today" has no defensible answer.

Level 2 — Documented

An inventory exists but is incomplete. A risk classification scheme exists on paper. Some systems have documented controls. Monitoring is per-team. Owners are sometimes named.

Signal: the inventory is a spreadsheet someone owns; people argue about whether it is current.

Level 3 — Managed

A complete inventory is maintained by process, not by chance. Every system is classified. Controls are documented per risk class. Monitoring is centralized at least for high-risk systems. Every system has a named owner.

Signal: a regulator could ask for the AI inventory tomorrow and receive a current document the same day.

Level 4 — Measured

Inventory and classification are continuously verified. Controls are enforced by tooling, not by policy. Monitoring captures drift, hallucination rates and incident frequency. Owners are accountable through measurable SLOs.

Signal: there is a dashboard executives consult monthly, not a report compiled before audit season.

Level 5 — Optimized

Governance data drives decisions about which AI systems to expand, retire or restrict. Risk classifications are revisited as use cases evolve. Lessons from incidents change controls within weeks, not quarters. Governance is part of how the business runs, not a periodic compliance ceremony.

Signal: an AI system was deprecated this quarter because the governance data, not a budget cut, called for it.

How to Use This Model

Score per dimension (inventory, classification, controls, monitoring, accountability), not per-organization. Target Level 3 across the board before targeting Level 4 anywhere — uneven maturity creates blind spots in the weakest dimension. Re-score quarterly during transformation, annually thereafter.
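A minimal sketch of per-dimension scoring, assuming levels are recorded as integers from 1 to 5. The function names and the example scores are illustrative, not part of the model itself.

```python
DIMENSIONS = ("inventory", "classification", "controls", "monitoring", "accountability")

def weakest_dimensions(scores: dict) -> list:
    """Return the dimension(s) with the lowest maturity level."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    lowest = min(scores[d] for d in DIMENSIONS)
    return [d for d in DIMENSIONS if scores[d] == lowest]

def ready_for_level_4(scores: dict) -> bool:
    """Level 4 work is in scope only once every dimension is at Level 3 or above."""
    return all(scores[d] >= 3 for d in DIMENSIONS)

# Hypothetical scores showing the typical uneven profile.
scores = {"inventory": 2, "classification": 2, "controls": 4,
          "monitoring": 3, "accountability": 3}
print(weakest_dimensions(scores))
print(ready_for_level_4(scores))
```

Reporting the weakest dimensions rather than an average keeps the unevenness visible instead of hiding it behind a single organization-wide number.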

NIST AI RMF, ISO/IEC 42001, EU AI Act — When to Use Which

These three frameworks are complementary, not alternatives. Enterprises running serious AI programs typically use all three for different purposes.

NIST AI Risk Management Framework

Origin: US National Institute of Standards and Technology, AI RMF 1.0 released January 2023.

Nature: voluntary methodology organized around four functions — Govern, Map, Measure, Manage. Not certifiable. Not law.

Best for: structuring an internal AI risk program from scratch. The Govern-Map-Measure-Manage loop is the most useful operational vocabulary for cross-functional teams (legal, security, engineering, risk).

Practical use: adopt as the internal program structure. Use the AI RMF playbook (also NIST-published) for control-level guidance.

ISO/IEC 42001:2023 — AI Management System

Origin: joint ISO/IEC standard published December 2023.

Nature: certifiable management-system standard, structurally aligned with ISO/IEC 27001. Auditable. Recognized internationally.

Best for: external audit signal. When customers, regulators or partners require evidence that your AI is managed under a recognized standard, ISO 42001 certification is the answer.

Practical use: pursue certification when commercially required (RFPs, regulated customer contracts, public-sector tenders). The NIST AI RMF program you already run is the strongest foundation for ISO 42001 readiness.

EU AI Act

Origin: Regulation (EU) 2024/1689, in force since 1 August 2024; obligations apply progressively from February 2025 through 2026-2027.

Nature: binding law with extraterritorial reach. It can apply when the output of your AI system is used in the EU, regardless of where your company is incorporated. Fines for the worst violations (prohibited practices) reach EUR 35 million or 7% of worldwide annual turnover, whichever is higher.

Best for: nothing. This is mandatory, not a choice. Whether the Act applies depends on how each use case is classified, not on a decision to adopt it.

Practical use: classify every AI system against the Act's four risk tiers (prohibited / high-risk / limited-risk / minimal-risk). Document the classification decision per system. High-risk systems require formal conformity assessment before deployment.

The Minimum Below Which Production Deployment Is Reckless

Twelve controls covering the five maturity dimensions. Not aspirational — the baseline that distinguishes managed AI from unmanaged AI.

Inventory & Classification

  1. Central inventory of every AI system in production, updated by process not by hope.
  2. Per-system risk classification against an internal scheme aligned with the regulatory regime that applies.
  3. Per-system use case statement: who uses it, for what decisions, with what residual human authority.
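One way to make these three items concrete is a single inventory record per system. The sketch below assumes an internal scheme that mirrors the four risk tiers named in the EU AI Act section; the class name, field names and example values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum

class RiskTier(Enum):
    PROHIBITED = "prohibited"
    HIGH = "high-risk"
    LIMITED = "limited-risk"
    MINIMAL = "minimal-risk"

@dataclass(frozen=True)
class AISystemRecord:
    name: str
    owner: str                     # a named person, not a team
    risk_tier: RiskTier
    users: str                     # who uses it
    decisions: str                 # for what decisions
    human_authority: str           # residual human authority
    classification_rationale: str  # why this tier was chosen

def inventory_gaps(record: AISystemRecord) -> list:
    """Return the names of text fields that were left blank."""
    text_fields = ("owner", "users", "decisions",
                   "human_authority", "classification_rationale")
    return [f for f in text_fields if not getattr(record, f).strip()]

# Hypothetical entry with the owner still missing.
record = AISystemRecord(
    name="claims-triage",
    owner="",
    risk_tier=RiskTier.HIGH,
    users="claims adjusters",
    decisions="prioritizing incoming claims",
    human_authority="an adjuster approves every payout",
    classification_rationale="affects access to insurance",
)
print(inventory_gaps(record))
```

A check like this can run whenever a record is created or changed, which is one way of keeping the inventory updated by process.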

Controls & Documentation

  1. Data lineage record covering training data, retrieval sources, and any external data ingested at runtime.
  2. Model documentation: vendor, version, fine-tuning if any, prompt templates, evaluation results.
  3. Access control on the AI system itself, on its source data, and on its outputs (different surfaces, different controls).

Operational Controls

  1. Tested rollback or disable mechanism: the system can be turned off in minutes by a defined role.
  2. Input, output and decision logging sufficient for incident reconstruction.
  3. Drift and quality monitoring with thresholds that trigger investigation.
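As an illustration of a threshold that triggers investigation, the sketch below compares recent quality scores with a baseline. It assumes scores between 0 and 1 from some evaluation harness; the function name and the 0.05 default are arbitrary placeholders.

```python
from statistics import mean

def needs_investigation(baseline: list, recent: list, max_drop: float = 0.05) -> bool:
    """True when mean recent quality falls more than max_drop below the baseline mean."""
    if not baseline or not recent:
        raise ValueError("both score lists must be non-empty")
    return mean(baseline) - mean(recent) > max_drop

baseline_scores = [0.92, 0.90, 0.91]   # hypothetical scores at release
recent_scores = [0.84, 0.85, 0.83]     # hypothetical scores this week
print(needs_investigation(baseline_scores, recent_scores))
```

The result is a trigger for a human investigation, not an automatic verdict on the model.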

Accountability & Response

  1. Named accountable owner per system, with the authority to disable.
  2. Incident-response procedure with severity levels and notification paths (internal, customer, regulator).
  3. Periodic review (annual minimum, quarterly for high-risk systems) covering all of the above.
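The checklist can be encoded as a deployment gate. This is a sketch under stated assumptions: the control identifiers are hypothetical shorthand for the twelve items above, and the evidence dictionary stands in for whatever system records control status.

```python
BASELINE_CONTROLS = (
    "central_inventory", "risk_classification", "use_case_statement",
    "data_lineage", "model_documentation", "access_control",
    "disable_mechanism", "logging", "drift_monitoring",
    "named_owner", "incident_response", "periodic_review",
)

def missing_controls(evidence: dict) -> list:
    """Return baseline controls with no recorded evidence, in checklist order."""
    return [c for c in BASELINE_CONTROLS if not evidence.get(c, False)]

def may_deploy(evidence: dict) -> bool:
    """Production deployment is allowed only when no baseline control is missing."""
    return not missing_controls(evidence)

# Hypothetical system with two controls still outstanding.
evidence = {c: True for c in BASELINE_CONTROLS}
evidence["disable_mechanism"] = False
del evidence["periodic_review"]
print(missing_controls(evidence))
print(may_deploy(evidence))
```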

What Actually Goes Wrong and What Each Failure Mode Needs

Generic "AI risk" is not actionable. These are the seven failure modes that appear in real production incidents, each paired with the control that addresses it.

1. Data Leakage

Sensitive content reaches a model, vendor or log it should not. Common via prompts containing PII or via embedding sensitive content into shared vector stores.

Control: data classification gate at the ingestion point; vendor data-handling agreements; private deployment of high-sensitivity workloads.
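A data classification gate can be sketched as a check that runs before text leaves for a model, vendor or vector store. The two regular expressions below are deliberately simple placeholders and would miss most real PII; a production gate would rely on a proper classification service. All names are hypothetical.

```python
import re

# Illustrative patterns only: an email address and a US SSN-like number.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify(text: str) -> set:
    """Return the labels of all sensitive patterns found in the text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}

def gate(text: str, destination_allows_sensitive: bool) -> str:
    """Pass the text through unless it is sensitive and the destination is not cleared."""
    findings = classify(text)
    if findings and not destination_allows_sensitive:
        raise PermissionError(f"blocked at ingestion: {sorted(findings)}")
    return text

print(classify("Contact jane@example.com about the claim"))
```

The `destination_allows_sensitive` flag is where a private deployment for high-sensitivity workloads would be distinguished from a shared vendor endpoint.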

2. Hallucination

Model produces a confident, fluent answer that is factually wrong. Particularly dangerous when the user is not the domain expert.

Control: RAG grounding, source citation in response schema, low-confidence rejection, scope narrowing.
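Source citation and low-confidence rejection can be enforced at the point where an answer is released to the user. The sketch assumes the model returns citation IDs and that a confidence score is available from some separate verifier; both are assumptions, as are the names and the 0.7 threshold.

```python
from dataclasses import dataclass, field

@dataclass
class Answer:
    text: str
    citations: list = field(default_factory=list)  # IDs of retrieved source passages
    confidence: float = 0.0                        # assumed to come from a verifier

FALLBACK = "I can't answer that from the approved sources."

def release(answer: Answer, retrieved_ids: set, min_confidence: float = 0.7) -> str:
    """Release the answer only if it cites retrieved sources and clears the threshold."""
    grounded = bool(answer.citations) and all(c in retrieved_ids for c in answer.citations)
    if not grounded or answer.confidence < min_confidence:
        return FALLBACK
    return answer.text

retrieved = {"doc-1", "doc-2"}
print(release(Answer("The refund window is 30 days.", ["doc-1"], 0.9), retrieved))
print(release(Answer("The refund window is 90 days.", [], 0.9), retrieved))
```

Checking that every citation belongs to the retrieved set catches answers that cite sources the system never supplied.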

3. Prompt Injection

Untrusted content (documents, user input, retrieved web pages) carries instructions that override the system prompt.

Control: separation of system and user channels, content sanitization, output validation, capability restriction.
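Output validation is the easiest of these controls to illustrate. The sketch assumes the application asks the model for a JSON object with an `intent` field and accepts only a fixed set of intents; the schema and intent names are hypothetical, and this check complements channel separation rather than replacing it.

```python
import json

ALLOWED_INTENTS = {"answer", "escalate"}  # capability restriction: nothing else is accepted

def validate_output(raw: str) -> dict:
    """Parse model output and reject anything outside the expected schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("model output is not valid JSON") from exc
    if not isinstance(data, dict) or data.get("intent") not in ALLOWED_INTENTS:
        raise ValueError("model output requests an intent that is not allowed")
    return data

print(validate_output('{"intent": "answer", "text": "Your order has shipped."}'))
```

An injected instruction that convinces the model to emit a different intent is rejected here, regardless of how persuasive the injected text was.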

4. Unauthorized Action

LLM with tool access takes an action it should not — sends an email, modifies data, calls a costly API.

Control: tool allow-listing, per-tool authorization, human approval for irreversible actions, action audit logs.
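These four controls fit naturally into a single dispatch function that sits between the model and its tools. This is a minimal sketch: the `Tool` class, the in-memory audit log and the example tools are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[dict], str]
    irreversible: bool = False

audit_log: list = []  # stand-in for a durable audit store

def invoke(name: str, args: dict, allow_list: dict, approved_by: Optional[str] = None) -> str:
    """Run a tool only if it is allow-listed and, when irreversible, human-approved."""
    tool = allow_list.get(name)
    if tool is None:
        outcome = "denied: not allow-listed"
    elif tool.irreversible and approved_by is None:
        outcome = "denied: human approval required"
    else:
        outcome = "allowed"
    # Every attempt is logged, including denied ones.
    audit_log.append({"tool": name, "args": args,
                      "approved_by": approved_by, "outcome": outcome})
    if outcome != "allowed":
        raise PermissionError(f"{name}: {outcome}")
    return tool.run(args)

allow_list = {
    "lookup_order": Tool("lookup_order", lambda a: f"order {a['id']} found"),
    "send_email": Tool("send_email", lambda a: f"sent to {a['to']}", irreversible=True),
}
print(invoke("lookup_order", {"id": 7}, allow_list))
```

Calling `invoke("send_email", ...)` without `approved_by` raises `PermissionError` and still leaves an audit entry.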

5. Model Drift

Vendor updates the model silently; behavior on existing prompts changes without notice. Affects every SaaS LLM consumer.

Control: pinned model versions where supported, a regression test suite run on a schedule and on each vendor announcement (silent updates are not announced), dual-model evaluation during transitions.
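A regression suite for drift can be as simple as a set of golden prompts with expected content. In this sketch the model identifier, the golden cases and the `call_model` callable are all hypothetical; `call_model` stands in for whatever client the vendor provides.

```python
from typing import Callable

PINNED_MODEL = "example-model-2024-01-01"  # hypothetical pinned version identifier

# Hypothetical golden cases: (prompt, text the answer must contain).
GOLDEN_CASES = [
    ("What is the refund window?", "30 days"),
    ("Which plan includes SSO?", "enterprise"),
]

def regression_failures(call_model: Callable[[str, str], str],
                        model: str = PINNED_MODEL) -> list:
    """Return the prompts whose answers no longer contain the expected text."""
    return [prompt for prompt, must_contain in GOLDEN_CASES
            if must_contain.lower() not in call_model(model, prompt).lower()]

def fake_model(model: str, prompt: str) -> str:
    """Test double that simulates a behavior change on one prompt."""
    if "refund" in prompt:
        return "Refunds are accepted within 14 days."
    return "SSO is included in the Enterprise plan."

print(regression_failures(fake_model))
```

During a transition, running the same suite against both the old and the new model identifier gives a basic form of dual-model evaluation.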

6. Bias and Disparate Impact

Outputs systematically disadvantage a protected class. Critical for any AI touching hiring, credit, insurance, healthcare, education.

Control: disparate-impact testing per decision class, demographic disaggregation in monitoring, documented mitigation when found.
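One common screening heuristic for disparate impact, not named in this framework and used here only as an assumption, is the four-fifths rule: flag any group whose favorable-outcome rate is below 80% of the highest group's rate. The group labels and counts below are made up.

```python
def selection_rates(outcomes: dict) -> dict:
    """outcomes maps group -> (favorable decisions, total decisions)."""
    rates = {}
    for group, (favorable, total) in outcomes.items():
        if total <= 0:
            raise ValueError(f"group {group!r} has no decisions")
        rates[group] = favorable / total
    return rates

def impact_ratios(outcomes: dict) -> dict:
    """Each group's selection rate relative to the best-treated group."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    if best == 0:
        raise ValueError("no favorable outcomes in any group")
    return {group: rate / best for group, rate in rates.items()}

def flagged_groups(outcomes: dict, threshold: float = 0.8) -> list:
    """Groups whose impact ratio falls below the screening threshold."""
    return sorted(g for g, ratio in impact_ratios(outcomes).items() if ratio < threshold)

# Hypothetical approval counts disaggregated by group.
outcomes = {"group_a": (60, 100), "group_b": (42, 100)}
print(flagged_groups(outcomes))
```

A flag from a screen like this is a prompt for investigation and documented mitigation, not a legal finding.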

7. Cost Runaway

Token consumption, retrieval lookups, or agent loops produce costs far above forecast. Operationally an availability risk.

Control: per-user and per-system rate limits, cost monitoring with alerting, hard budget caps on autonomous agents.
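A hard budget cap on an autonomous agent can be sketched as a guard around the agent loop. The `Budget` class, the step function contract and the dollar figures are hypothetical; real costs would come from the vendor's usage metering.

```python
class BudgetExceeded(RuntimeError):
    pass

class Budget:
    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record a cost, refusing any charge that would exceed the cap."""
        if self.spent_usd + cost_usd > self.cap_usd:
            raise BudgetExceeded(f"cap of {self.cap_usd} USD would be exceeded")
        self.spent_usd += cost_usd

def run_agent(step, budget: Budget, max_steps: int = 20) -> int:
    """Run step() until it reports done; step() returns (cost_usd, done)."""
    for n in range(1, max_steps + 1):
        cost, done = step()
        budget.charge(cost)
        if done:
            return n
    raise BudgetExceeded("step limit reached without completion")

def looping_step():
    """Test double for an agent stuck in a loop: always costs 0.25, never finishes."""
    return 0.25, False

budget = Budget(cap_usd=1.0)
try:
    run_agent(looping_step, budget)
except BudgetExceeded as exc:
    print("agent stopped:", exc)
```

The step limit is a second, independent brake: it stops a loop even when individual steps are cheap enough that the monetary cap would take a long time to trigger.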

