Version 2026-06 · CC-BY-4.0

AI Pre-Production Review Checklist

22 failure modes AI generators ship into production code, in 7 categories — with detection signals and mitigations.

How to use this checklist. Run it across any AI-assisted codebase before a production deploy. For each failure mode, walk through Detection on the actual code, not the spec. If Detection finds a hit, apply the Prevention pattern before launch, not as a follow-up. The catalog is compiled from 150+ engagements where AI-generated code passed initial review and then failed under production conditions. Each mode was observed in at least three independent client engagements before inclusion.
CAT-CONCUR

Concurrency

Failures that emerge under simultaneous operations the AI did not model.

FM-01

Optimistic locking absent

AI Generates
Read-modify-write code without version check or row-level lock. Passes single-threaded test.
Production Failure
Two concurrent writes overwrite each other silently. Lost update problem at moderate concurrency.
Detection
Code review for any read-modify-write on shared records. Load test with 10+ concurrent users on the same entity.
Prevention
Optimistic concurrency token (rowversion / ETag) or explicit pessimistic lock with clear timeout.
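A minimal sketch of the version-check approach, using Python's built-in sqlite3 module. The accounts table, its version column, and the StaleWriteError name are illustrative assumptions, not part of the checklist; the point is only that the UPDATE matches the row if and only if the version is still the one that was read.

```python
import sqlite3

# Hypothetical schema for illustration: integer cents plus a version counter.
SCHEMA = "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance_cents INTEGER, version INTEGER)"

class StaleWriteError(Exception):
    """Raised when the row changed between our read and our write."""

def read_account(conn, account_id):
    # Return the value together with the version it was read at.
    return conn.execute(
        "SELECT balance_cents, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()

def write_balance(conn, account_id, new_balance_cents, expected_version):
    # The WHERE clause is the concurrency check: a concurrent writer will
    # have bumped the version, so this UPDATE matches zero rows.
    cur = conn.execute(
        "UPDATE accounts SET balance_cents = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance_cents, account_id, expected_version),
    )
    conn.commit()
    if cur.rowcount == 0:
        raise StaleWriteError(f"account {account_id} was modified concurrently")
```

On StaleWriteError the caller would typically re-read and retry, or surface a conflict to the user.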
CAT-DATA

Data Integrity

Failures that corrupt or lose data under conditions outside happy-path.

FM-03

Missing transaction boundaries

AI Generates
Multi-step write sequence (order, payment, inventory) without explicit transaction.
Production Failure
Partial commit on crash mid-sequence leaves inconsistent state. Orders without inventory decrement.
Detection
Trace every multi-write operation. If failure between steps leaves bad state, transaction is missing.
Prevention
Explicit transaction scope around logical units. Outbox pattern for cross-service writes.
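As a sketch of an explicit transaction scope, the following uses sqlite3's connection context manager, which commits on success and rolls back on any exception. The orders and inventory tables and the place_order function are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical schema for illustration.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, sku TEXT, qty INTEGER);
CREATE TABLE inventory (sku TEXT PRIMARY KEY, on_hand INTEGER);
"""

def place_order(conn, sku, qty):
    # "with conn" is the transaction boundary: both writes commit together,
    # or neither does if anything inside the block raises.
    with conn:
        conn.execute("INSERT INTO orders (sku, qty) VALUES (?, ?)", (sku, qty))
        cur = conn.execute(
            "UPDATE inventory SET on_hand = on_hand - ? "
            "WHERE sku = ? AND on_hand >= ?",
            (qty, sku, qty),
        )
        if cur.rowcount == 0:
            # Raising here rolls back the order row inserted above.
            raise ValueError(f"insufficient stock for {sku}")
```

This covers a single database only. Writes that span services still need something like the outbox pattern named above.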
FM-12

Decimal precision loss in money math

AI Generates
Float / double for monetary amounts. Looks like a number type.
Production Failure
Cents disappear or appear over time. Reconciliation drift. Audit fail.
Detection
Any money field that is not Decimal / fixed-point is wrong. Code review rule.
Prevention
Decimal type everywhere for money. Database column with explicit precision. Unit-tested edge cases.
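A short illustration of the difference, using Python's decimal module. The helper names and the choice of ROUND_HALF_UP are assumptions for the example; the rounding mode in a real system is a business or regulatory decision.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def to_money(value: str) -> Decimal:
    # Build from a string so no binary floating-point error is carried in.
    return Decimal(value).quantize(CENT, rounding=ROUND_HALF_UP)

def line_total(unit_price: Decimal, qty: int, tax_rate: Decimal) -> Decimal:
    # Round once, at a defined point, with an explicit rounding mode.
    return (unit_price * qty * (1 + tax_rate)).quantize(CENT, rounding=ROUND_HALF_UP)

# The float pitfall this failure mode describes:
float_sum_is_exact = (0.1 + 0.2 == 0.3)                                     # False
decimal_sum_is_exact = (to_money("0.10") + to_money("0.20") == to_money("0.30"))  # True
```

A unit test for the edge cases would pin values such as the two comparisons above and at least one rounding boundary.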
FM-13

Timezone-unaware date handling

AI Generates
DateTime stored without timezone; client converts in JS arbitrarily.
Production Failure
Reports off by hours. Daily-rollup tasks miss data near midnight. Audit trail wrong.
Detection
Every datetime field — confirm UTC storage; every display — confirm explicit timezone conversion.
Prevention
Store UTC. Display in user-local at the edge. Never compare naive datetimes.
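One way to enforce the rule at the storage boundary, sketched with the standard datetime module. The function names are hypothetical, and the usage example assumes a fixed UTC-5 offset for simplicity; a real application would more likely use named zones (for example via zoneinfo) to handle daylight saving.

```python
from datetime import datetime, timezone, timedelta

def to_storage(dt: datetime) -> str:
    # Refuse naive datetimes instead of guessing which zone they were in.
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("naive datetime rejected; attach a timezone first")
    return dt.astimezone(timezone.utc).isoformat()

def to_display(stored: str, user_tz: timezone) -> datetime:
    # Convert at the edge, with the user's timezone passed in explicitly.
    return datetime.fromisoformat(stored).astimezone(user_tz)

# Usage: 23:30 local on Jan 15 is already Jan 16 in UTC, which is the
# case that breaks daily rollups when the zone is dropped.
user_tz = timezone(timedelta(hours=-5))
local = datetime(2026, 1, 15, 23, 30, tzinfo=user_tz)
stored = to_storage(local)
```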
FM-14

Cache without invalidation

AI Generates
Lookup-cache around a slow query. No invalidation logic on the writer side.
Production Failure
Stale data served to users hours after the change. “Why does the dashboard not update?”
Detection
Every cache must have a documented invalidation trigger. If not — flag.
Prevention
Bounded TTL + explicit invalidation on write. Stale-while-revalidate where freshness is loose.
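A minimal in-process sketch of a bounded TTL combined with explicit invalidation. TTLCache and its loader and clock parameters are illustrative names; a shared cache such as Redis would need the same two mechanisms but is outside this sketch.

```python
import time

class TTLCache:
    """Read-through cache with a bounded TTL and an explicit invalidation hook."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # function that runs the slow query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        value = self._loader(key)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        # The writer side calls this after every successful write to `key`.
        self._entries.pop(key, None)
```

The TTL is the safety net for writers that forget to invalidate; the invalidate call is what keeps data fresh in the normal case.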
FM-17

Schema migration without backfill

AI Generates
ALTER TABLE adding a non-null column with a default but no backfill plan for existing rows.
Production Failure
Long-running migration locks production table for hours. Or worse: silent constraint break.
Detection
Every schema migration on a non-trivial table needs a backfill plan reviewed in advance.
Prevention
Expand-contract pattern. Nullable column first, backfill, then enforce non-null.
FM-22

Time-window vulnerability in promotional code

AI Generates
Coupon redemption that checks remaining count then decrements in a separate step.
Production Failure
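Two concurrent redemptions both see remaining=1. Both succeed. The remaining count goes negative and the coupon is over-redeemed.
Detection
Confirm that the check and the decrement happen in one atomic operation. If they are separate steps, a race exists.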
Prevention
Atomic database operation (UPDATE ... WHERE remaining > 0). Or distributed lock with timeout.
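The atomic UPDATE above can be sketched as follows with sqlite3. The coupons table and redeem function are assumptions for illustration. The check and the decrement are a single statement, so there is no window between them.

```python
import sqlite3

# Hypothetical schema for illustration.
SCHEMA = "CREATE TABLE coupons (code TEXT PRIMARY KEY, remaining INTEGER NOT NULL)"

def redeem(conn, code):
    """Return True if a redemption was granted, False if none are left."""
    with conn:
        cur = conn.execute(
            "UPDATE coupons SET remaining = remaining - 1 "
            "WHERE code = ? AND remaining > 0",
            (code,),
        )
    # rowcount tells us whether our UPDATE was the one that took a slot.
    return cur.rowcount == 1
```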
CAT-LOAD

Behavior Under Load

Failures that manifest only at scale the AI did not test against.

FM-02

N+1 query pattern

AI Generates
ORM access in a loop that issues a separate query per item. Looks idiomatic.
Production Failure
Page load goes from 50ms to 5+ seconds as the collection grows. Database CPU spikes.
Detection
Profile real queries on a representative dataset. Watch for query count proportional to result size.
Prevention
Eager loading / join / batched query. Set a query-count budget per endpoint and assert in tests.
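A sketch of a batched query plus a query-count budget, using sqlite3 without an ORM. The books table, load_books_by_author, and QueryCounter are hypothetical names; the counter relies on sqlite3's set_trace_callback, and most ORMs offer a comparable hook.

```python
import sqlite3

def load_books_by_author(conn, author_ids):
    # One batched query for all authors instead of one query per author.
    if not author_ids:
        return {}
    # Only "?" placeholders are interpolated here, never user data.
    placeholders = ",".join("?" for _ in author_ids)
    rows = conn.execute(
        f"SELECT author_id, title FROM books "
        f"WHERE author_id IN ({placeholders}) ORDER BY id",
        list(author_ids),
    ).fetchall()
    by_author = {author_id: [] for author_id in author_ids}
    for author_id, title in rows:
        by_author[author_id].append(title)
    return by_author

class QueryCounter:
    """Counts SELECT statements on a connection so tests can assert a budget."""

    def __init__(self, conn):
        self.count = 0
        conn.set_trace_callback(self._on_statement)

    def _on_statement(self, sql):
        if sql.lstrip().upper().startswith("SELECT"):
            self.count += 1
```

In a test, attach a QueryCounter, call the endpoint's data-loading code, and assert that the count stays at or below the budget regardless of how many items are returned.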
FM-05

Unbounded resource allocation

AI Generates
items = readAll(); foreach item in items { ... }. No pagination, no cap.
Production Failure
Memory exhausted when dataset grows. OutOfMemoryException at 50K rows.
Detection
Identify every readAll-style call. Confirm result size is bounded by request or by paging.
Prevention
Streaming or pagination by default. Reject readAll on unbounded sources at code review.
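One possible shape for pagination by default is a generator that pages by key, so memory use is bounded by the page size rather than the table size. The events table and iter_events name are assumptions for illustration, and the sketch assumes positive integer ids.

```python
import sqlite3

def iter_events(conn, page_size=500):
    """Yield (id, payload) rows in id order, holding at most one page in memory."""
    last_id = 0  # assumes ids are positive integers
    while True:
        page = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            return
        yield from page
        # Keyset pagination: continue after the last id seen, no OFFSET scan.
        last_id = page[-1][0]
```

Callers iterate with a plain for loop, which keeps the call site as simple as the readAll version it replaces.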
FM-18

Sync work inside HTTP handler

AI Generates
Endpoint that does an external API call inline before responding.
Production Failure
p99 latency tracks the slowest vendor. Cascading failure when vendor slows.
Detection
Anything in a sync handler taking >100ms is a candidate for async / queue.
Prevention
Background job + status endpoint for slow work. Async pipeline for non-critical-path work.
CAT-SECURE

Security

Vulnerabilities the AI introduced because it does not threat-model your specific surface.

FM-07

SQL injection via string concatenation

AI Generates
Dynamic SQL with concatenated user input when parameterized query was awkward.
Production Failure
Trivial SQL injection. Data exfiltration or destruction by a malicious or fuzzed input.
Detection
Static analysis flag on string + sql. Code review for every dynamic query.
Prevention
Parameterized queries by default. Lint rule that flags string concatenation in query construction.
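For contrast, a small sketch of the concatenated and the parameterized form of the same lookup, using sqlite3. The users table and both function names are hypothetical; the unsafe variant is included only to show what a reviewer should flag.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Anti-pattern, shown only for contrast: user input becomes SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = '" + name + "'"
    ).fetchall()

def find_user(conn, name):
    # Parameterized: the driver passes `name` as a value, never as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

# A classic probe: with concatenation this turns the WHERE clause into a tautology.
INJECTION_PROBE = "x' OR '1'='1"
```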
FM-08

Authorization missing inside data access

AI Generates
Endpoint authenticates the user but the data query does not filter by ownership.
Production Failure
Authenticated user retrieves another tenant's records by guessing IDs. Cross-tenant data leak.
Detection
Every multi-tenant read must filter by tenant key in the query. Test with two users + IDOR probe.
Prevention
Row-level security in the database OR a tenant filter helper that wraps every query.
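A sketch of the tenant filter helper option, with sqlite3. TenantScopedInvoices and the invoices table are illustrative names. The idea is that application code never receives a raw connection, only a wrapper already bound to the caller's tenant.

```python
import sqlite3

# Hypothetical schema for illustration.
SCHEMA = "CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id TEXT, amount_cents INTEGER)"

class TenantScopedInvoices:
    """All invoice reads go through here, so the tenant filter cannot be forgotten."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id  # taken from the authenticated session

    def get(self, invoice_id):
        # Filtering by id AND tenant: guessing another tenant's id returns None.
        return self._conn.execute(
            "SELECT id, amount_cents FROM invoices WHERE id = ? AND tenant_id = ?",
            (invoice_id, self._tenant_id),
        ).fetchone()

    def list_all(self):
        return self._conn.execute(
            "SELECT id, amount_cents FROM invoices WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        ).fetchall()
```

The two-user IDOR probe from Detection maps directly onto a test: create one invoice per tenant and assert that each tenant's wrapper cannot fetch the other's id.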
FM-09

Token / API key in code

AI Generates
Hardcoded secret committed to repo while wiring an integration.
Production Failure
Public repo leaks key. Credential rotation required. Sometimes followed by bill shock.
Detection
Pre-commit hook scanning for high-entropy strings; periodic secret scan over history.
Prevention
Secret manager. Never accept a string literal that looks like a key in code review.
FM-20

Prompt injection through retrieved content

AI Generates
RAG handler treats retrieved chunks as trusted instructions to the LLM.
Production Failure
Malicious document poisons the response. AI assistant exfiltrates context or executes a tool unsafely.
Detection
Threat-model the corpus. If any document can be authored by untrusted parties, retrieval is an injection vector.
Prevention
Retrieved content as data, never as instruction. System prompt that forbids following retrieved instructions. Output validation.
CAT-COST

Cost at Scale

Patterns that are cheap at prototype scale and become unaffordable in production.

FM-10

Unbounded LLM context cost

AI Generates
RAG retrieval that always sends top-50 chunks to the LLM regardless of relevance.
Production Failure
Bill is 10-50x what was planned because most tokens are noise. Latency degrades too.
Detection
Measure tokens-per-query vs answered-with-citations rate. Anomalies in either are signal.
Prevention
Reranker before LLM, confidence threshold, top-k tuned per use case. Budget alarm on token spend.
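A sketch of the selection step that would sit between a reranker and the LLM call. The Chunk type, the 0-to-1 score scale, and the default thresholds are all assumptions for illustration; real values need tuning per use case, as the Prevention line says.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    text: str
    score: float  # relevance from a reranker; assumed higher is better
    tokens: int   # token count of `text`, computed upstream

def select_chunks(chunks, min_score=0.5, max_chunks=8, token_budget=2000):
    """Keep the best chunks that clear the threshold and fit the token budget."""
    selected, used = [], 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        if chunk.score < min_score or len(selected) >= max_chunks:
            break  # everything after this is less relevant still
        if used + chunk.tokens > token_budget:
            continue  # too large for the remaining budget; try smaller ones
        selected.append(chunk)
        used += chunk.tokens
    return selected
```

Logging the number of selected chunks and tokens per query gives the tokens-per-query signal that Detection asks for.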
FM-19

Missing rate limit on AI endpoint

AI Generates
Public AI endpoint with no per-user / per-IP throttle.
Production Failure
Abuse runs up the LLM bill. Single bad actor can exceed a month of budget in an hour.
Detection
Every LLM-backed endpoint must have a per-key rate limit. Alarm above threshold.
Prevention
Rate limiter with budget alerting. Tiered quotas. Authenticated-only AI endpoints by default.
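A minimal in-memory token-bucket sketch of a per-key limiter. TokenBucket and PerKeyLimiter are hypothetical names, and the sketch assumes a single process; a multi-instance deployment would need shared state, which is not shown.

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._updated = clock()

    def allow(self, cost=1.0):
        # Refill according to elapsed time, capped at capacity.
        now = self._clock()
        self._tokens = min(self._capacity, self._tokens + (now - self._updated) * self._refill)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False

class PerKeyLimiter:
    """One bucket per API key (or user id, or IP)."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self._args = (capacity, refill_per_second, clock)
        self._buckets = {}

    def allow(self, key, cost=1.0):
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(*self._args)
        return self._buckets[key].allow(cost)
```

Passing a cost proportional to expected token spend, rather than 1 per request, ties the limit more closely to the bill.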
CAT-RECOVER

Recovery and Failure Modes

Code that has no plan for partial failure, retry, or rollback.

FM-04

Idempotency missing on retry path

AI Generates
Webhook handler or job runner that assumes each message is delivered exactly once. No dedupe.
Production Failure
Network retry sends duplicate event. Customer is charged twice, email sent twice.
Detection
Ask: “if this runs twice with the same input, what happens?” Verify dedupe key exists.
Prevention
Idempotency key on every external-side-effect operation. Persist seen-keys for retention window.
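A sketch of a dedupe key persisted alongside the effect, using sqlite3. The table names and handle_payment_event are assumptions. Here the side effect is itself a database row, so the key and the effect commit atomically; for a true external call (payment API, email), the usual approach is to pass the same idempotency key to the provider, which this sketch does not show.

```python
import sqlite3

# Hypothetical schema for illustration.
SCHEMA = """
CREATE TABLE processed_events (event_id TEXT PRIMARY KEY);
CREATE TABLE charges (id INTEGER PRIMARY KEY, customer_id TEXT, amount_cents INTEGER);
"""

def handle_payment_event(conn, event_id, customer_id, amount_cents):
    """Return True if the event was applied, False if it was a duplicate delivery."""
    try:
        with conn:
            # The primary key on event_id makes a second delivery fail here,
            # before any effect is recorded.
            conn.execute("INSERT INTO processed_events (event_id) VALUES (?)", (event_id,))
            conn.execute(
                "INSERT INTO charges (customer_id, amount_cents) VALUES (?, ?)",
                (customer_id, amount_cents),
            )
    except sqlite3.IntegrityError:
        return False
    return True
```

A retention job would delete processed_events rows older than the sender's maximum retry window.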
FM-06

Timeout-less external call

AI Generates
HttpClient.GetAsync(url) with no timeout. Looks clean.
Production Failure
Vendor outage hangs every dependent request. Thread pool exhausted; whole service down.
Detection
Grep for HTTP clients, message queues, DB calls without explicit timeouts.
Prevention
Explicit timeout on every IO call. Circuit breaker for repeated failures. Bulkhead pool isolation.
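The checklist's example is .NET's HttpClient; the same rule is sketched here in Python with the standard http.client module. The function name and the 2-second default are assumptions. Note that this timeout bounds each blocking socket operation (connect, each read), not the total duration of the call.

```python
import http.client

DEFAULT_TIMEOUT_S = 2.0  # illustrative default; tune per dependency

def get_with_timeout(host, port, path, timeout_s=DEFAULT_TIMEOUT_S):
    """GET over plain HTTP; raises OSError (including timeouts) instead of hanging."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        # Always release the socket, including on timeout.
        conn.close()
```

A circuit breaker would wrap this call and stop issuing requests after repeated timeouts; that part is left out of the sketch.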
FM-11

Missing dead-letter handling

AI Generates
Message handler that retries on failure forever.
Production Failure
Poison message blocks the queue. Backlog grows; processing freezes.
Detection
Every retry policy needs a give-up condition and a destination for the give-up.
Prevention
Bounded retries, dead-letter queue, alerting on dead-letter count. Manual review path.
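A broker-independent sketch of bounded retries with a dead-letter destination. The function and parameter names are illustrative, the dead-letter store is a plain list, and backoff between attempts is omitted to keep the example short.

```python
def process_with_dead_letter(messages, handler, max_attempts=3):
    """Run handler on each message; return the list of dead-lettered messages."""
    dead_letters = []
    for message in messages:
        last_error = None
        for _attempt in range(max_attempts):
            try:
                handler(message)
                break  # success: move on to the next message
            except Exception as exc:
                last_error = exc  # real code would back off before retrying
        else:
            # Give-up condition reached: park the message with its error so
            # the rest of the queue keeps moving and a human can review it.
            dead_letters.append((message, repr(last_error)))
    return dead_letters
```

Alerting on the length of the dead-letter store is what turns this from silent data loss into a reviewable backlog.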
FM-21

No rollback path for AI feature

AI Generates
Replaces a deterministic computation with an LLM call. No fallback.
Production Failure
LLM vendor outage takes the feature down. There is no degraded mode.
Detection
For every AI feature: what does the system do when the AI is unavailable? If “nothing” — fix.
Prevention
Feature flag + deterministic fallback. Graceful degradation. Health probes on the AI dependency.
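A sketch of a feature flag with a deterministic fallback, using a hypothetical ticket-priority feature. The keyword rules, label set, and function names are assumptions; llm_classify stands in for whatever client wraps the vendor call.

```python
VALID_PRIORITIES = {"low", "normal", "urgent"}

def rule_based_priority(text):
    # Deterministic fallback: crude, but always available.
    lowered = text.lower()
    if "outage" in lowered or "down" in lowered:
        return "urgent"
    return "normal"

def classify_priority(text, llm_classify, ai_enabled=True):
    """Return (priority, source) where source is 'ai' or 'fallback'."""
    if ai_enabled:  # the feature flag: flip to False to bypass the vendor
        try:
            label = llm_classify(text)
            if label in VALID_PRIORITIES:  # validate output before trusting it
                return label, "ai"
        except Exception:
            pass  # real code would log and emit a metric here
    return rule_based_priority(text), "fallback"
```

Returning the source alongside the result lets dashboards show how often the system is running in degraded mode.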
CAT-EVOLVE

Long-Term Evolution

Code that is correct now and will be a refactoring blocker in 18 months.

FM-15

Cross-cutting logging coupled to business code

AI Generates
Log lines threaded through business methods, mixed with returns.
Production Failure
Refactoring drops critical observability silently. Incident response degrades.
Detection
Audit the logging code to confirm it is implemented as a cross-cutting concern, not as inline copy-paste.
Prevention
Structured logging via middleware / aspect. Logging contract per layer, enforced in review.
FM-16

Tight coupling to LLM vendor

AI Generates
Direct vendor SDK calls scattered through business code.
Production Failure
Vendor price hike or deprecation forces touching every call site. Migration takes a quarter.
Detection
Grep for vendor SDK names. If they appear in business code, abstraction is missing.
Prevention
LLM gateway with versioned prompts and a stable internal interface. Vendor swap = one config change.
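A minimal sketch of what such a gateway's internal interface might look like. All names here (CompletionBackend, LLMGateway, EchoBackend, the "summarize@v1" prompt id) are hypothetical; EchoBackend stands in for a real vendor adapter so the example runs without any SDK.

```python
from typing import Protocol

class CompletionBackend(Protocol):
    """The only surface business code depends on."""
    def complete(self, prompt: str) -> str: ...

class LLMGateway:
    def __init__(self, backends: dict, active: str, prompts: dict):
        self._backends = backends  # name -> CompletionBackend
        self._active = active      # chosen by configuration
        self._prompts = prompts    # versioned ids, e.g. "summarize@v1"

    def run(self, prompt_id: str, **variables) -> str:
        prompt = self._prompts[prompt_id].format(**variables)
        return self._backends[self._active].complete(prompt)

class EchoBackend:
    """Stand-in adapter; a real one would wrap a vendor SDK call here."""
    def __init__(self, name):
        self.name = name

    def complete(self, prompt):
        return f"[{self.name}] {prompt}"
```

With this shape, the grep in Detection should find vendor SDK names only inside adapter classes, never in business code.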

Coverage Summary

Concurrency: 1
Data Integrity: 6
Behavior Under Load: 3
Security: 4
Cost at Scale: 2
Recovery and Failure Modes: 4
Long-Term Evolution: 2
Total: 22