Version 2026-06 · CC-BY-4.0
AI Pre-Production Review Checklist
22 failure modes AI generators ship into production code, in 7 categories — with detection signals and mitigations.
How to use this checklist. Run it across any AI-assisted codebase before production deploy. For each failure mode, walk through Detection on the actual code, not the spec. If Detection finds a hit, apply the Prevention pattern before launch — not as a follow-up. The catalog is compiled from 150+ engagements where AI-generated code passed initial review and then failed under production conditions. Each mode has been observed in at least three independent client engagements before inclusion.
CAT-CONCUR · Concurrency
Failures that emerge under simultaneous operations the AI did not model.
FM-01
Optimistic locking absent
- AI Generates
- Read-modify-write code without version check or row-level lock. Passes single-threaded test.
- Production Failure
- Two concurrent writes overwrite each other silently. Lost update problem at moderate concurrency.
- Detection
- Code review for any read-modify-write on shared records. Load test with 10+ concurrent users on the same entity.
- Prevention
- Optimistic concurrency token (rowversion / ETag) or explicit pessimistic lock with clear timeout.
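A version-checked write can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the `accounts` table, its `version` column, and both function names are assumptions made for the example, not part of any specific codebase.

```python
import sqlite3

def read_account(conn, account_id):
    """Return (balance, version) as currently stored."""
    return conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()

def write_if_unchanged(conn, account_id, new_balance, expected_version):
    """Write only if the version still matches what the caller read."""
    with conn:  # commits on success, rolls back if the body raises
        cur = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, expected_version),
        )
    # rowcount == 0 means a concurrent writer got there first:
    # the caller should re-read and retry (or surface a conflict).
    return cur.rowcount == 1
```

A caller that gets `False` back has lost the race and should re-read before retrying, rather than overwriting the other writer's change.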
CAT-DATA · Data Integrity
Failures that corrupt or lose data under conditions outside happy-path.
FM-03
Missing transaction boundaries
- AI Generates
- Multi-step write sequence (order, payment, inventory) without explicit transaction.
- Production Failure
- Partial commit on crash mid-sequence leaves inconsistent state. Orders without inventory decrement.
- Detection
- Trace every multi-write operation. If failure between steps leaves bad state, transaction is missing.
- Prevention
- Explicit transaction scope around logical units. Outbox pattern for cross-service writes.
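As a rough sketch of an explicit transaction scope, the example below wraps an order insert and an inventory decrement in one unit using sqlite3. The `orders` and `inventory` tables and the `place_order` name are hypothetical; the outbox pattern for cross-service writes is not shown.

```python
import sqlite3

def place_order(conn, item_id, qty):
    """Insert the order and decrement stock as one atomic unit."""
    # `with conn` commits if the block finishes and rolls back if it raises,
    # so a failure after the INSERT cannot leave an order without a decrement.
    with conn:
        conn.execute(
            "INSERT INTO orders (item_id, qty) VALUES (?, ?)", (item_id, qty)
        )
        cur = conn.execute(
            "UPDATE inventory SET stock = stock - ? "
            "WHERE item_id = ? AND stock >= ?",
            (qty, item_id, qty),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient stock")  # triggers rollback
```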
FM-12
Decimal precision loss in money math
- AI Generates
- Float / double for monetary amounts. Looks like a number type.
- Production Failure
- Cents disappear or appear over time. Reconciliation drift. Audit fail.
- Detection
- Any money field that is not Decimal / fixed-point is wrong. Code review rule.
- Prevention
- Decimal type everywhere for money. Database column with explicit precision. Unit-tested edge cases.
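The drift is easy to reproduce, and a fixed-point version is short. The sketch below uses Python's `decimal` module; the `line_total` helper and the choice of half-up rounding to the cent are assumptions for illustration, since the right rounding rule depends on the jurisdiction and the ledger.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def line_total(unit_price: str, qty: int, tax_rate: str) -> Decimal:
    """Compute price * qty plus tax, rounding the tax to whole cents."""
    subtotal = Decimal(unit_price) * qty          # exact decimal arithmetic
    tax = (subtotal * Decimal(tax_rate)).quantize(CENT, rounding=ROUND_HALF_UP)
    return subtotal + tax

# Binary floats cannot represent most decimal fractions exactly.
float_ok = (0.1 + 0.2 == 0.3)                                            # False
decimal_ok = (Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))      # True
```

Amounts are passed as strings so that no float ever enters the calculation; constructing `Decimal` from a float would carry the binary error along.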
FM-13
Timezone-unaware date handling
- AI Generates
- DateTime stored without timezone; client converts in JS arbitrarily.
- Production Failure
- Reports off by hours. Daily-rollup tasks miss data near midnight. Audit trail wrong.
- Detection
- Every datetime field — confirm UTC storage; every display — confirm explicit timezone conversion.
- Prevention
- Store UTC. Display in user-local at the edge. Never compare naive datetimes.
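One way to enforce this at the storage boundary is to reject naive datetimes outright. The helper names below are hypothetical, and the example uses a fixed UTC-5 offset only to stay self-contained; production code would more likely use named zones.

```python
from datetime import datetime, timedelta, timezone, tzinfo

def to_utc(dt: datetime) -> datetime:
    """Normalize an aware datetime to UTC; refuse naive ones."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("naive datetime; attach a timezone before storing")
    return dt.astimezone(timezone.utc)

def for_display(stored: datetime, user_tz: tzinfo) -> datetime:
    """Convert a stored UTC value to the user's zone at the edge."""
    return to_utc(stored).astimezone(user_tz)

# 23:30 local time at UTC-5 falls on the next calendar day in UTC,
# which is exactly where daily rollups tend to miss data.
utc_minus_5 = timezone(timedelta(hours=-5))
local = datetime(2026, 6, 1, 23, 30, tzinfo=utc_minus_5)
stored = to_utc(local)
```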
FM-14
Cache without invalidation
- AI Generates
- Lookup-cache around a slow query. No invalidation logic on the writer side.
- Production Failure
- Stale data served to users hours after the change. “Why does the dashboard not update?”
- Detection
- Every cache must have a documented invalidation trigger. If not — flag.
- Prevention
- Bounded TTL + explicit invalidation on write. Stale-while-revalidate where freshness is loose.
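A bounded TTL combined with writer-side invalidation might look like the sketch below. `TTLCache`, its `loader` callback, and the injectable `clock` are illustrative names; stale-while-revalidate is not covered here.

```python
import time

class TTLCache:
    """Read-through cache with a bounded TTL and explicit invalidation."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader        # function that runs the slow query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]          # fresh hit
        value = self._loader(key)    # miss or expired: reload
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Call this from every write path that changes `key`."""
        self._entries.pop(key, None)
```

The TTL bounds how stale an entry can get if a writer forgets to invalidate; the `invalidate` call is what makes the common case fresh.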
FM-17
Schema migration without backfill
- AI Generates
- ALTER TABLE adding a non-null column with a default but no backfill plan for existing rows.
- Production Failure
- Long-running migration locks production table for hours. Or worse: silent constraint break.
- Detection
- Every schema migration on a non-trivial table needs a backfill plan reviewed in advance.
- Prevention
- Expand-contract pattern. Nullable column first, backfill, then enforce non-null.
FM-22
Time-window vulnerability in promotional code
- AI Generates
- Coupon redemption that checks remaining count then decrements in a separate step.
- Production Failure
- Two concurrent redemptions both see remaining=1. Both succeed. The remaining count goes negative and the promotion is over-redeemed.
- Detection
- Confirm the check and the decrement happen in one atomic operation; if they are separate steps, the race exists.
- Prevention
- Atomic database operation (UPDATE ... WHERE remaining > 0). Or distributed lock with timeout.
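The atomic form of the check-and-decrement can be sketched in a few lines with sqlite3. The `coupons` table and the `redeem` function are assumed names for the example.

```python
import sqlite3

def redeem(conn, code):
    """Decrement only if a redemption is still available, in one statement."""
    with conn:
        cur = conn.execute(
            "UPDATE coupons SET remaining = remaining - 1 "
            "WHERE code = ? AND remaining > 0",
            (code,),
        )
    # The database evaluates the condition and applies the decrement
    # together, so two callers cannot both consume the last redemption.
    return cur.rowcount == 1
```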
CAT-LOAD · Behavior Under Load
Failures that manifest only at scale the AI did not test against.
FM-02
N+1 query pattern
- AI Generates
- ORM access in a loop that issues a separate query per item. Looks idiomatic.
- Production Failure
- Page load goes from 50ms to 5+ seconds as the collection grows. Database CPU spikes.
- Detection
- Profile real queries on a representative dataset. Watch for query count proportional to result size.
- Prevention
- Eager loading / join / batched query. Set a query-count budget per endpoint and assert in tests.
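A query-count budget can be asserted without an ORM. The sketch below fetches authors and their books in a single JOIN and counts executed statements via sqlite3's trace callback; the schema and both function names are assumptions, and keying the result by author name is a simplification that presumes names are unique.

```python
import sqlite3

def authors_with_books(conn):
    """One JOIN instead of one books query per author."""
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "LEFT JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    result = {}
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:        # authors with no books still appear
            result[name].append(title)
    return result

def count_queries(conn, fn):
    """Run fn(conn) and return (value, number of SQL statements executed)."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        value = fn(conn)
    finally:
        conn.set_trace_callback(None)
    return value, len(statements)
```

In a test suite, asserting that the count stays constant as the fixture grows is what catches an N+1 regression.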
FM-05
Unbounded resource allocation
- AI Generates
- list = readAll(); foreach (item in list) ... with no pagination and no cap.
- Production Failure
- Memory exhausted when dataset grows. OutOfMemoryException at 50K rows.
- Detection
- Identify every readAll-style call. Confirm result size is bounded by request or by paging.
- Prevention
- Streaming or pagination by default. Reject readAll on unbounded sources at code review.
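One bounded alternative to a readAll-style call is keyset pagination behind a generator, so memory use is capped by the page size. The `events` table and `iter_events` name are hypothetical, and the sketch assumes positive integer ids.

```python
import sqlite3

def iter_events(conn, page_size=1000):
    """Yield rows page by page; at most page_size rows are held at once."""
    last_id = 0
    while True:
        page = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not page:
            return
        yield from page
        last_id = page[-1][0]   # resume after the last id seen
```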
FM-18
Sync work inside HTTP handler
- AI Generates
- Endpoint that does an external API call inline before responding.
- Production Failure
- p99 latency tracks the slowest vendor. Cascading failure when vendor slows.
- Detection
- Anything in a sync handler taking >100ms is a candidate for async / queue.
- Prevention
- Background job + status endpoint for slow work. Async pipeline for non-critical-path work.
CAT-SECURE · Security
Vulnerabilities the AI introduced because it does not threat-model your specific surface.
FM-07
SQL injection via string concatenation
- AI Generates
- Dynamic SQL with concatenated user input when parameterized query was awkward.
- Production Failure
- Trivial SQL injection. Data exfiltration or destruction by a malicious or fuzzed input.
- Detection
- Static-analysis rule that flags string concatenation into SQL. Code review for every dynamic query.
- Prevention
- Parameterized queries by default. Lint rule that flags string concatenation in query construction.
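For reference, the parameterized form is shown below with sqlite3, where `?` placeholders keep user input out of the SQL text. The `users` table and `find_user` name are assumptions for the example.

```python
import sqlite3

def find_user(conn, email):
    """Look up a user; the driver binds `email` as data, never as SQL."""
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()

# A classic injection payload is treated as a literal string to match,
# so it simply finds no rows.
INJECTION_PROBE = "' OR '1'='1"
```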
FM-08
Authorization missing inside data access
- AI Generates
- Endpoint authenticates the user but the data query does not filter by ownership.
- Production Failure
- Authenticated user retrieves another tenant's records by guessing IDs. Cross-tenant data leak.
- Detection
- Every multi-tenant read must filter by tenant key in the query. Test with two users + IDOR probe.
- Prevention
- Row-level security in the database OR a tenant filter helper that wraps every query.
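The tenant-filter-helper option could be sketched as a small wrapper that is the only way business code reaches the table. `TenantScope` and the `invoices` schema are illustrative assumptions; database-level row-level security is the stronger alternative and is not shown.

```python
import sqlite3

class TenantScope:
    """Data-access wrapper that adds the tenant filter to every query."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id   # taken from the authenticated session

    def get_invoice(self, invoice_id):
        # Filtering by tenant_id in the query itself means a guessed id
        # from another tenant returns None instead of a record.
        return self._conn.execute(
            "SELECT id, amount_cents FROM invoices "
            "WHERE id = ? AND tenant_id = ?",
            (invoice_id, self._tenant_id),
        ).fetchone()
```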
FM-09
Token / API key in code
- AI Generates
- Hardcoded secret committed to repo while wiring an integration.
- Production Failure
- Public repo leaks key. Credential rotation required. Sometimes followed by bill shock.
- Detection
- Pre-commit hook scanning for high-entropy strings; periodic secret scan over history.
- Prevention
- Secret manager. Never accept a string literal that looks like a key in code review.
FM-20
Prompt injection through retrieved content
- AI Generates
- RAG handler treats retrieved chunks as trusted instructions to the LLM.
- Production Failure
- Malicious document poisons the response. AI assistant exfiltrates context or executes a tool unsafely.
- Detection
- Threat-model the corpus. If any document can be authored by untrusted parties, retrieval is an injection vector.
- Prevention
- Retrieved content as data, never as instruction. System prompt that forbids following retrieved instructions. Output validation.
CAT-COST · Cost at Scale
Patterns that are cheap at prototype scale and become unaffordable in production.
FM-10
Unbounded LLM context cost
- AI Generates
- RAG retrieval that always sends top-50 chunks to the LLM regardless of relevance.
- Production Failure
- Bill is 10-50x what was planned because most tokens are noise. Latency degrades too.
- Detection
- Measure tokens-per-query vs answered-with-citations rate. Anomalies in either are signal.
- Prevention
- Reranker before LLM, confidence threshold, top-k tuned per use case. Budget alarm on token spend.
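The selection step that follows a reranker can be sketched as a pure function: apply a score threshold, a top-k cap, and a token budget. The function name, default values, and the whitespace-based token estimate are all assumptions for illustration; a real system would use its model's tokenizer and tuned thresholds.

```python
def select_chunks(scored_chunks, min_score=0.5, top_k=5, token_budget=2000):
    """Pick the highest-scoring chunks that clear the threshold and fit the budget.

    scored_chunks: iterable of (score, text) pairs, e.g. reranker output.
    """
    selected, used = [], 0
    for score, text in sorted(scored_chunks, key=lambda c: c[0], reverse=True):
        if score < min_score or len(selected) >= top_k:
            break                        # everything after this is weaker
        cost = len(text.split())         # crude token estimate (assumption)
        if used + cost > token_budget:
            continue                     # skip chunks that would bust the budget
        selected.append(text)
        used += cost
    return selected
```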
FM-19
Missing rate limit on AI endpoint
- AI Generates
- Public AI endpoint with no per-user / per-IP throttle.
- Production Failure
- Abuse runs up the LLM bill. Single bad actor can exceed a month of budget in an hour.
- Detection
- Every LLM-backed endpoint must have a per-key rate limit. Alarm above threshold.
- Prevention
- Rate limiter with budget alerting. Tiered quotas. Authenticated-only AI endpoints by default.
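A per-key throttle is often implemented as a token bucket. The in-memory sketch below is a single-process illustration with assumed names (`TokenBucket`, `allow`); a multi-instance deployment would need shared state, and budget alerting is not shown.

```python
import time

class TokenBucket:
    """Per-key token bucket: `capacity` burst, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self._capacity = capacity
        self._rate = refill_per_second
        self._clock = clock              # injectable so tests can control time
        self._state = {}                 # key -> (tokens, last_seen)

    def allow(self, key):
        now = self._clock()
        tokens, last = self._state.get(key, (self._capacity, now))
        # Refill for the elapsed time, never above capacity.
        tokens = min(self._capacity, tokens + (now - last) * self._rate)
        if tokens < 1:
            self._state[key] = (tokens, now)
            return False                 # caller should respond with HTTP 429
        self._state[key] = (tokens - 1, now)
        return True
```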
CAT-RECOVER · Recovery and Failure Modes
Code that has no plan for partial failure, retry, or rollback.
FM-04
Idempotency missing on retry path
- AI Generates
- Webhook handler or job runner that processes message once, no dedupe.
- Production Failure
- Network retry sends duplicate event. Customer is charged twice, email sent twice.
- Detection
- Ask: “if this runs twice with the same input, what happens?” Verify dedupe key exists.
- Prevention
- Idempotency key on every external-side-effect operation. Persist seen-keys for retention window.
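A persisted dedupe key can be sketched with a unique column and `INSERT OR IGNORE`. The `processed_events` table and `handle_once` name are assumptions; note that an external side effect (a charge, an email) is not itself covered by the database transaction, so the downstream call should ideally accept the same idempotency key.

```python
import sqlite3

def handle_once(conn, idempotency_key, side_effect):
    """Run side_effect only the first time this key is seen."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_events (key) VALUES (?)",
            (idempotency_key,),
        )
        if cur.rowcount == 0:
            return False      # duplicate delivery: already processed
        # If side_effect raises, the key insert is rolled back,
        # so a later retry is allowed to process the event.
        side_effect()
    return True
```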
FM-06
Timeout-less external call
- AI Generates
- HttpClient.GetAsync(url) with no timeout. Looks clean.
- Production Failure
- Vendor outage hangs every dependent request. Thread pool exhausted; whole service down.
- Detection
- Grep for HTTP clients, message queues, DB calls without explicit timeouts.
- Prevention
- Explicit timeout on every IO call. Circuit breaker for repeated failures. Bulkhead pool isolation.
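The circuit-breaker part of this pattern can be sketched as a small wrapper around any call that may time out. The class and parameter names are hypothetical, and the sketch is single-threaded; it complements, rather than replaces, an explicit timeout on the wrapped call.

```python
import time

class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is skipped."""

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self._max = max_failures
        self._reset = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset:
                raise CircuitOpenError("circuit open; failing fast")
            self._opened_at = None   # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._max:
                self._opened_at = self._clock()
            raise
        self._failures = 0           # any success closes the circuit
        return result
```

Failing fast while the circuit is open keeps request threads from piling up behind a vendor that is already known to be down.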
FM-11
Missing dead-letter handling
- AI Generates
- Message handler that retries on failure forever.
- Production Failure
- Poison message blocks the queue. Backlog grows; processing freezes.
- Detection
- Every retry policy needs a give-up condition and a destination for the give-up.
- Prevention
- Bounded retries, dead-letter queue, alerting on dead-letter count. Manual review path.
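A give-up condition and a give-up destination can be sketched in a few lines. The function below is an in-memory illustration with assumed names; a real broker would typically provide its own dead-letter queue, and backoff between attempts is omitted.

```python
def process_with_dlq(messages, handler, max_attempts=3):
    """Try each message up to max_attempts times, then dead-letter it."""
    dead_letters = []
    for msg in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(msg)
                break                                # processed; next message
            except Exception as exc:
                if attempt == max_attempts:
                    # Give up: park the message with its last error so the
                    # queue keeps moving and a human can review it later.
                    dead_letters.append((msg, repr(exc)))
    return dead_letters
```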
FM-21
No rollback path for AI feature
- AI Generates
- Replaces a deterministic computation with an LLM call. No fallback.
- Production Failure
- LLM vendor outage takes the feature down. There is no degraded mode.
- Detection
- For every AI feature: what does the system do when the AI is unavailable? If “nothing” — fix.
- Prevention
- Feature flag + deterministic fallback. Graceful degradation. Health probes on the AI dependency.
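A feature flag with a deterministic fallback might be structured like the sketch below. `classify_ticket`, `keyword_classify`, and the keyword rule are invented for illustration; `llm_classify` stands in for whatever vendor call the feature uses.

```python
def keyword_classify(text):
    """Deterministic fallback: a crude keyword rule (illustrative only)."""
    return "billing" if "invoice" in text.lower() else "general"

def classify_ticket(text, llm_classify, ai_enabled=True):
    """Return (label, source); degrade to the rule when the AI path fails."""
    if ai_enabled:                       # feature flag: off means skip the LLM
        try:
            return llm_classify(text), "ai"
        except Exception:
            # Vendor outage, timeout, or malformed output: degrade, do not fail.
            pass
    return keyword_classify(text), "fallback"
```

Returning the source alongside the label lets monitoring track how often the system is running in degraded mode.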
CAT-EVOLVE · Long-Term Evolution
Code that is correct now and will be a refactoring blocker in 18 months.
FM-15
Cross-cutting logging coupled to business code
- AI Generates
- Log lines threaded through business methods, mixed with returns.
- Production Failure
- Refactoring drops critical observability silently. Incident response degrades.
- Detection
- Audit the logging code to confirm it is handled as a cross-cutting concern, not copy-pasted inline.
- Prevention
- Structured logging via middleware / aspect. Logging contract per layer, enforced in review.
FM-16
Tight coupling to LLM vendor
- AI Generates
- Direct vendor SDK calls scattered through business code.
- Production Failure
- Vendor price hike or deprecation forces touching every call site. Migration takes a quarter.
- Detection
- Grep for vendor SDK names. If they appear in business code, abstraction is missing.
- Prevention
- LLM gateway with versioned prompts and a stable internal interface. Vendor swap = one config change.
Coverage Summary
Concurrency: 1
Data Integrity: 6
Behavior Under Load: 3
Security: 4
Cost at Scale: 2
Recovery and Failure Modes: 4
Long-Term Evolution: 2
Total: 22