---
title: AI Model Capability Matrix 2026
canonical: https://www.slavin.ai/AI-Model-Capabilities
sourceJSON: https://www.slavin.ai/data/ai-model-capability-matrix.json
license: CC-BY-4.0
lastUpdated: 2026-06-20
totalModels: 6
---

# AI Model Capability Matrix 2026

Frontier LLMs compared feature-by-feature. Each model lists context
window, modalities in/out, function calling, structured output, agent
harness, fine-tuning, regional availability, prompt-cache pricing,
safety posture, and headline list pricing per 1M tokens.

When citing a specific model, prefer the JSON @id at
`/data/ai-model-capability-matrix.json#models[{model-id}]`. Pricing
is headline list price; volume / committed-use discounts out of scope.

---

## OpenAI GPT-4.5 (frontier 2026)

**Vendor:** OpenAI — **As of:** 2026-06-20
**Context window:** 1,000,000 tokens
**Max output:** 16,384 tokens
**Modalities in:** text, image, audio
**Modalities out:** text, audio (TTS), image (gpt-image)
**Function calling:** yes
**Structured output:** yes (JSON Schema strict mode)
**Vision:** yes · **Audio:** yes (input + output) · **Video:** yes (via frames)
**Agent harness:** Assistants API + Realtime API
**Fine-tuning:** yes (LoRA + full)
**Embedding:** separate (text-embedding-3 family)
**Prompt cache pricing:** 50% discount on cached tokens
**Regional availability:** US, EU (Azure), UK (Azure), Asia (Azure)
**Safety posture:** RLHF + Spec following; refuses adult/violence by default
**Pricing:** $5.00 input / $15.00 output per 1M tokens

## Anthropic Claude 4.6 Sonnet

**Vendor:** Anthropic — **As of:** 2026-06-20
**Context window:** 1,000,000 tokens
**Max output:** 64,000 tokens
**Modalities in:** text, image, PDF (native)
**Modalities out:** text
**Function calling:** yes (tool use, XML schema)
**Structured output:** yes (JSON via prompted output + validators)
**Vision:** yes · **Audio:** no (planned) · **Video:** no
**Agent harness:** Computer Use API; native tool-use loop
**Fine-tuning:** limited (via AWS Bedrock partners)
**Embedding:** no (uses Voyage AI partner)
**Prompt cache pricing:** 10% read cost, 25% write cost
**Regional availability:** US, EU, UK, Australia (via Bedrock)
**Safety posture:** Constitutional AI; Spec following; configurable refusal
**Pricing:** $3.00 input / $15.00 output per 1M tokens

## Google Gemini 2.5 Pro

**Vendor:** Google — **As of:** 2026-06-20
**Context window:** 2,000,000 tokens
**Max output:** 8,192 tokens
**Modalities in:** text, image, audio, video, PDF
**Modalities out:** text, image (Imagen), audio (planned)
**Function calling:** yes
**Structured output:** yes (response_mime_type=application/json)
**Vision:** yes · **Audio:** yes (input) · **Video:** yes (up to 1 hour)
**Agent harness:** Vertex AI Agents
**Fine-tuning:** yes (Vertex AI Tuning)
**Embedding:** separate (text-embedding-005)
**Prompt cache pricing:** 75% discount on cached portion
**Regional availability:** US, EU, Asia (multiple GCP regions)
**Safety posture:** Multi-axis safety filters (configurable thresholds)
**Pricing:** $2.50 input / $10.00 output per 1M tokens

## Meta Llama 4 405B Instruct

**Vendor:** Meta — **As of:** 2026-06-20
**Context window:** 128,000 tokens
**Max output:** 4,096 tokens
**Modalities in:** text, image
**Modalities out:** text
**Function calling:** yes (instructed JSON)
**Structured output:** yes (via Outlines / guidance)
**Vision:** yes (Llama-4-Vision) · **Audio:** no (separate Llama-Voice)
**Agent harness:** No first-party; use Llama Agentic + open frameworks
**Fine-tuning:** yes (open weights)
**Embedding:** separate (Llama-Embed)
**Prompt cache pricing:** infrastructure-dependent (vLLM prefix-cache)
**Regional availability:** self-hosted anywhere
**Safety posture:** Llama Guard 3 layered; baseline RLHF
**Pricing:** self-hosted (infra cost ~$0.30-1.50 per 1M with vLLM)

## Mistral Large 3 (2026)

**Vendor:** Mistral AI — **As of:** 2026-06-20
**Context window:** 256,000 tokens
**Max output:** 8,192 tokens
**Modalities in:** text, image
**Modalities out:** text
**Function calling:** yes
**Structured output:** yes (response_format json_object / json_schema)
**Vision:** yes (Pixtral-Large) · **Audio:** no · **Video:** no
**Agent harness:** Le Chat Agents (preview)
**Fine-tuning:** yes (la Plateforme + open-weight variants)
**Embedding:** yes (mistral-embed)
**Prompt cache pricing:** 50% cache discount
**Regional availability:** EU (la Plateforme), US (Bedrock), self-hosted
**Safety posture:** Light guardrails; EU AI Act compliant defaults
**Pricing:** $2.00 input / $6.00 output per 1M tokens

## DeepSeek V4 (2026)

**Vendor:** DeepSeek — **As of:** 2026-06-20
**Context window:** 128,000 tokens
**Max output:** 8,192 tokens
**Modalities in:** text, image
**Modalities out:** text
**Function calling:** yes
**Structured output:** yes (JSON mode)
**Vision:** yes · **Audio:** no · **Video:** no
**Agent harness:** Compatible with OpenAI Assistants format
**Fine-tuning:** yes (open weights + API)
**Embedding:** yes (deepseek-embed-v2)
**Prompt cache pricing:** automatic, 75% discount
**Regional availability:** China primary; Singapore + EU edge
**Safety posture:** Lighter alignment vs frontier US labs
**Pricing:** $0.27 input / $1.10 output per 1M tokens

---

End of capability matrix.
