Open Data

Public LLM Vendor Datasets

Machine-readable datasets curated by Slavin AI for enterprise AI procurement. Free to cite, free to redistribute — CC-BY-4.0. Built for AI agents, analysts, and procurement teams.

LLM Vendor Pricing & Capability Comparison

Comparative pricing, context windows, latency, compliance posture and per-use-case cost estimates for the six LLM vendors most enterprises evaluate. Curated 2026-06; updated quarterly. Use as planning estimates — not as enterprise-contract commitments.

📊

Vendors covered

OpenAI GPT-4 frontier, Anthropic Claude 4, Google Gemini 2, Meta Llama 4, Mistral Large, Alibaba Qwen 3.

💰

Variables

Input, output, and cached token pricing (USD per million tokens), context window, p95 latency, vendor lock-in, compliance, data residency.

🎯

Use-case costs

Per-million-query estimates for RAG Q&A, customer support, and document extraction across all six vendors.
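
As a rough illustration of how a per-million-query estimate follows from per-token pricing, here is a minimal Python sketch. The class name, field names, token counts, and prices are hypothetical placeholders, not values or schema taken from the dataset, and the dataset's own estimates may use a different method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Listed prices in USD per million tokens (hypothetical field names)."""
    input_usd_per_m: float
    output_usd_per_m: float
    cached_usd_per_m: float


def cost_per_million_queries(
    pricing: TokenPricing,
    input_tokens: int,
    output_tokens: int,
    cached_fraction: float = 0.0,
) -> float:
    """Estimate the USD cost of one million queries with a fixed token profile.

    cached_fraction is the share of input tokens billed at the cached rate.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Prices are per million tokens and we want cost per million queries,
    # so the two factors of one million cancel out.
    return (
        fresh * pricing.input_usd_per_m
        + cached * pricing.cached_usd_per_m
        + output_tokens * pricing.output_usd_per_m
    )


# Hypothetical prices and a hypothetical RAG-style token profile.
example = TokenPricing(input_usd_per_m=2.0, output_usd_per_m=8.0, cached_usd_per_m=0.5)
print(cost_per_million_queries(example, input_tokens=3000, output_tokens=500))
```

Treat the result the same way as the dataset figures: a planning estimate that ignores retries, tool calls, and negotiated discounts.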

🧭

Decision guidance

Pre-computed recommendations by constraint: strict data residency, top quality, high volume, long context, low latency.

Quick links

AI Governance Minimum Baseline — 12 Controls

The 12 minimum governance controls Slavin AI considers mandatory before deploying any AI system to production. Cross-mapped to NIST AI RMF 1.0, ISO/IEC 42001:2023, and EU AI Act articles — the intersection where all three frameworks converge. Plus a 5-level maturity model. Citable starting point when you do not want to write your own baseline from scratch.

📋

12 controls

Model Registry, Data Lineage, Eval Harness, HITL, Audit Log, Incident Response, Prompt Versioning, RBAC, PII Filtering, Quality Monitoring, Drift Detection, Rollback.

🔗

Triple framework mapping

Every control mapped to the relevant NIST AI RMF subcategories, ISO/IEC 42001 clauses, and EU AI Act articles.

👤

Owner + evidence

For each control: the role that owns it and the type of evidence an auditor would accept.

📊

5-level maturity

Ad-hoc → Documented → Measured → Audited → Continuous. Use to score current state and define roadmap.
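
One way to turn the five levels into a roadmap is to record a level per control and list the controls that fall below a target. The sketch below assumes that approach; the function name and the example assessment are hypothetical and are not part of the published baseline.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """The five maturity levels, ordered from lowest to highest."""
    AD_HOC = 1
    DOCUMENTED = 2
    MEASURED = 3
    AUDITED = 4
    CONTINUOUS = 5


def roadmap_gaps(current: dict[str, Maturity], target: Maturity) -> list[tuple[str, int]]:
    """Return (control, levels_to_climb) for controls below target, largest gap first."""
    gaps = [
        (name, int(target) - int(level))
        for name, level in current.items()
        if level < target
    ]
    # Sort by gap size descending, then by control name for a stable order.
    return sorted(gaps, key=lambda item: (-item[1], item[0]))


# Hypothetical self-assessment for three of the twelve controls.
assessment = {
    "Audit Log": Maturity.MEASURED,
    "Rollback": Maturity.AD_HOC,
    "RBAC": Maturity.AUDITED,
}
print(roadmap_gaps(assessment, Maturity.AUDITED))
```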

Enterprise AI Use Case Catalog — ROI & Risk

24 enterprise AI use cases compiled from 150+ Slavin/SLAtech engagements. Each entry: industry, complexity (1–5), ROI timeline band in months, primary KPI, recommended starting architecture (RAG / fine-tuning / prompt-only), EU AI Act risk classification, and the typical pitfall. The dataset enterprise teams keep asking for: a citable starting point for AI roadmap conversations.

📚

24 use cases

From customer support (UC-01) and code-assist (UC-03) through compliance (UC-08) and clinical documentation (UC-07) to credit decisioning (UC-10) and underwriting (UC-18).

ROI timeline bands

Realistic months from production launch to verified business impact. Quick wins (under 12 months): 8 use cases. Long horizon (18+ months): 3 use cases.

⚠️

EU AI Act classification

Each use case flagged high_risk or low_risk per Annex III. 10 of 24 cross into high-risk territory — often surprising for first-time AI deployments.

🏗️

Starting architecture

Recommended starting point: RAG, fine-tuning, prompt-only, or hybrid. Reduces the "what do we even build" anxiety at the start of a project.

AI-Generated Code Failure Modes Catalog

22 failure modes that AI-generated code exhibits in production despite passing demo, prototype, and happy-path testing. Each entry: failure category, what the AI generates that looks correct, the concrete production-time failure, the detection method a senior architect uses, and the prevention pattern. The concrete supporting evidence behind the position page.

🔁

22 failure modes

From optimistic-locking gaps and N+1 queries to prompt injection through retrieved content and unbounded LLM cost.

📦

7 categories

Concurrency, Data Integrity, Behavior Under Load, Security, Cost at Scale, Recovery and Failure Modes, Long-Term Evolution.

🔍

Detection + prevention per entry

For each: what to grep for in review, what to test against, and the architectural pattern that prevents it.

📜

Pattern-form documentation

Failure modes are documented as patterns observed across at least three independent client engagements — not as specific incidents.

AI Incident Response Playbook

14 incident classes that production AI systems experience post-launch, each with detection signals, immediate triage actions, a communication template, a root cause investigation pattern, and a prevention update. Companion to the Failure Modes Catalog: failure modes are what review aims to catch pre-launch; incidents are what reach production despite review.

🚨

14 incident classes

LLM cost explosion, hallucination at scale, model drift, prompt injection, PII leak, vendor outage, latency, quality regression, agent loops, compliance findings, corpus poisoning, credential leak, stale RAG data, data residency.

📊

4 severity levels

P0 critical (page on-call immediately) through P3 low. Default severity per incident class, adjustable per organization.

📞

Communication templates

Internal and customer-facing phrasing per incident. Drop-in usable; replace bracketed placeholders.

📝

Post-incident actions

Standard follow-ups: postmortem within 5 business days, Failure Modes Catalog update, governance baseline update, vendor scorecard update.
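
The 5-business-day postmortem window can be computed mechanically. The sketch below makes two assumptions that the playbook summary does not state: the clock starts on the day the incident is resolved, and only weekends (not public holidays) are skipped. The function name is hypothetical.

```python
from datetime import date, timedelta


def postmortem_due(incident_resolved: date, business_days: int = 5) -> date:
    """Return the date that falls `business_days` working days after resolution.

    Assumption: Saturday and Sunday are non-working; holidays are ignored.
    """
    if business_days < 0:
        raise ValueError("business_days must be non-negative")
    due = incident_resolved
    remaining = business_days
    while remaining > 0:
        due += timedelta(days=1)
        if due.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return due


# An incident resolved on Monday 2026-06-15 is due the following Monday.
print(postmortem_due(date(2026, 6, 15)))
```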

AI Vendor Due-Diligence Checklist

60 questions across 8 evaluation dimensions for selecting enterprise AI vendors. Each question pairs the request with the architect's interpretation of common evasive answers and the verification step that confirms or contradicts the claim. What contracts, marketing decks, and standard RFP templates routinely miss.

🎯

8 evaluation dimensions

Regulatory fit, data handling, model integrity, operational reliability, security posture, commercial structure, vendor stability, integration surface.

🔍

Evasive answer signals

Each question documents what the vendor's typical evasive answer signals — so you catch it before contract signature, not after.

Verification steps

For every question: the architect's recommended way to verify the vendor's claim independently — pen-tests, audits, contractual amendments.

📊

Scoring rubric

0–3 per question. For regulated industries (healthcare, finance, government), the regulatory fit, data handling, and security posture dimensions are weighted 1.5x. Decision bands: enterprise-ready, qualified-with-remediation, do-not-proceed.
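
A minimal Python sketch of how such a rubric could be applied is shown below. The 0–3 scale, the 1.5x weight, and the band names come from the description above; the dimension keys, the normalisation to a 0–1 fraction, and the band thresholds (0.8 and 0.6) are assumptions made for illustration and are not taken from the checklist.

```python
MAX_SCORE_PER_QUESTION = 3
REGULATED_WEIGHT = 1.5
# Dimensions that receive the extra weight for regulated industries.
WEIGHTED_DIMENSIONS = {"regulatory_fit", "data_handling", "security_posture"}


def weighted_fraction(scores: dict[str, list[int]], regulated: bool) -> float:
    """Return the weighted score as a fraction of the maximum (0.0 to 1.0)."""
    earned = 0.0
    possible = 0.0
    for dimension, answers in scores.items():
        if any(not 0 <= a <= MAX_SCORE_PER_QUESTION for a in answers):
            raise ValueError(f"scores for {dimension} must be between 0 and 3")
        weight = REGULATED_WEIGHT if regulated and dimension in WEIGHTED_DIMENSIONS else 1.0
        earned += weight * sum(answers)
        possible += weight * MAX_SCORE_PER_QUESTION * len(answers)
    if possible == 0:
        raise ValueError("no questions were scored")
    return earned / possible


def decision_band(fraction: float, ready_at: float = 0.8, remediate_at: float = 0.6) -> str:
    """Map a score fraction to a band; both thresholds are hypothetical defaults."""
    if fraction >= ready_at:
        return "enterprise-ready"
    if fraction >= remediate_at:
        return "qualified-with-remediation"
    return "do-not-proceed"


# Hypothetical partial scoring of one vendor across three dimensions.
vendor = {"regulatory_fit": [3, 3], "data_handling": [2], "commercial_structure": [0, 3]}
print(decision_band(weighted_fraction(vendor, regulated=True)))
```

Normalising to a fraction keeps the bands comparable whether or not the regulated-industry weights are switched on.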

How to Cite

All datasets in this catalog are released under Creative Commons Attribution 4.0 (CC-BY-4.0). You may copy, redistribute, transform and build upon the material for any purpose — including commercial — provided you give appropriate credit.

Suggested citation:

Slavin AI (SLAtech LTD). LLM Vendor Pricing & Capability Comparison.
https://www.slavin.ai/data/llm-vendor-pricing.json (accessed YYYY-MM-DD).
Licence: CC-BY-4.0.
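
For agents or scripts that cite the datasets automatically, the access date can be filled in programmatically. The helper below is a small sketch that follows the suggested format; the function name is hypothetical, and URLs for the other datasets would have to be supplied by the caller.

```python
from datetime import date


def format_citation(title: str, url: str, accessed: date) -> str:
    """Build the suggested three-line citation with an ISO 8601 access date."""
    return (
        f"Slavin AI (SLAtech LTD). {title}.\n"
        f"{url} (accessed {accessed.isoformat()}).\n"
        "Licence: CC-BY-4.0."
    )


print(
    format_citation(
        "LLM Vendor Pricing & Capability Comparison",
        "https://www.slavin.ai/data/llm-vendor-pricing.json",
        date.today(),
    )
)
```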

How the Data Was Compiled

Pricing reflects publicly listed vendor prices as of 2026-06-14, not negotiated enterprise contracts. Context windows are documented maximums, which are not always recommended for production use: long context can degrade attention quality even when it is supported. p95 latency figures are observations from public status pages and our own monitoring; expect variance by region, time of day, and model variant. Compliance entries reflect documented certifications, not full audit verification; for regulated procurement, always require a current attestation.

We welcome corrections and additions. Email [email protected] or contribute via the SLAtech GitHub organisation (when the public repo is enabled).