Practical Lens 25: Error codes are identity signals
AI crawlers infer reliability from HTTP behavior. If reference pages intermittently return 403/404/500 (or soft 404), identity evidence becomes unstable and summaries drift.
What this lens means
AI crawlers treat HTTP behavior as a reliability signal. When key reference pages fail intermittently (403/404/500, or a soft 404), crawlers cannot assemble a stable evidence set about who you are, so different tools end up summarizing the same site differently.
Why this happens
- AI crawlers discover and re-validate pages over time; intermittent failures create inconsistent evidence sets.
- Soft 404 pages look like normal HTML but communicate “not found” semantics, which reduces confidence.
- WAF rules, bot challenges, and misconfigured redirects often affect crawlers differently than browsers.
What this usually indicates
- Intermittent failures: the same URL sometimes returns 200 and sometimes 403/5xx.
- Soft 404 patterns: 200 OK responses that contain “not found” pages or empty shells.
- Uneven bot access: different crawlers see different status codes for the same URL.
- Crawl waste: crawlers spend budget on errors instead of reference pages.
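The first pattern above — the same URL alternating between 200 and errors — is easy to classify once you have a few repeated fetches of each URL. A minimal sketch (the function name and labels are my own, not part of this lens):

```python
def classify_fetches(statuses):
    """Classify repeated fetches of one URL by the status codes observed.

    statuses: list of int HTTP status codes from repeated requests.
    Returns "stable-ok", "stable-error", or "intermittent".
    """
    ok = statuses.count(200)
    if ok == len(statuses):
        return "stable-ok"
    if ok == 0:
        return "stable-error"
    # A mix of 200 and non-200 for the same URL is the unstable-evidence case.
    return "intermittent"

# Example: the same URL answered 200 three times, then a 403.
print(classify_fetches([200, 200, 200, 403]))  # intermittent
```

Anything classified as "intermittent" is the case this lens warns about: the crawler's evidence set changes depending on when it happened to visit.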
What to verify (evidence-only)
- Do homepage/about/services/contact return 200 consistently (repeat the test multiple times)?
- Do crawler user agents receive the same status codes as a normal browser UA?
- Do error pages return correct status codes (a real 404 for not found, not a 200)?
- Do redirects resolve cleanly (301/302 → 200) without loops or long chains?
- Do the same URLs behave consistently across language variants and subpaths?
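The second check above — whether crawler user agents see the same status codes as a browser — reduces to comparing per-UA results for each URL. A sketch of that comparison, assuming you have already collected the status codes (URLs and UA labels below are illustrative):

```python
def find_uneven_access(results):
    """Given {url: {user_agent_label: status_code}}, return only the URLs
    where different user agents saw different status codes."""
    return {
        url: by_ua
        for url, by_ua in results.items()
        if len(set(by_ua.values())) > 1  # more than one distinct status
    }

observed = {
    "https://example.com/about": {"browser": 200, "crawler": 403},
    "https://example.com/":      {"browser": 200, "crawler": 200},
}
print(find_uneven_access(observed))
# {'https://example.com/about': {'browser': 200, 'crawler': 403}}
```

Any URL in the output is evidence of uneven bot access, usually a WAF rule or bot challenge firing for the crawler but not the browser.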
Frequently Asked Questions
Why do HTTP errors affect AI identity?
Because crawlers can only use what they reliably fetch. If reference pages fail, the crawler's evidence set becomes incomplete or inconsistent.
What is a soft 404?
A page that returns 200 OK but is effectively a "not found" page (or empty shell). It confuses discovery and reduces confidence.
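Soft 404s can be flagged with a simple heuristic: a 200 response whose body reads like a "not found" page, or is a near-empty shell. A rough sketch — the phrase list and length threshold are illustrative guesses, not a standard:

```python
import re

# Phrases that commonly appear on "not found" pages (illustrative, not exhaustive).
NOT_FOUND_PATTERNS = re.compile(
    r"page not found|doesn't exist|no longer available", re.IGNORECASE
)

def looks_like_soft_404(status_code, html_body):
    """Heuristic soft-404 detector: True for a 200 response whose body
    signals "not found" or is an almost-empty shell."""
    if status_code != 200:
        return False  # a real error status is not a soft 404
    text = html_body.strip()
    if len(text) < 200:  # near-empty shell; threshold is a guess
        return True
    return bool(NOT_FOUND_PATTERNS.search(text))
```

Run it over your reference pages: any True result is a URL telling crawlers "this exists" while the content says the opposite.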
What's the fastest way to spot uneven access?
Compare status codes for the same URL using a normal UA and a crawler UA (e.g., Googlebot). If they differ, you have inconsistent crawl access.
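That comparison can be scripted with the standard library. A sketch, assuming a site you control (the UA strings are examples; `fetch` is injectable so the logic can be exercised without a network call):

```python
import urllib.error
import urllib.request

def status_for(url, user_agent, fetch=None):
    """Return the HTTP status code seen when requesting `url` with the
    given User-Agent string. HTTP errors (403/404/5xx) are returned as
    their status code rather than raised."""
    if fetch is None:
        def fetch(req):
            try:
                with urllib.request.urlopen(req, timeout=10) as resp:
                    return resp.status
            except urllib.error.HTTPError as err:
                return err.code
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return fetch(req)

# Example usage (live network calls; URL and UA strings are illustrative):
# status_for("https://example.com/", "Mozilla/5.0")
# status_for("https://example.com/", "Googlebot/2.1 (+http://www.google.com/bot.html)")
```

If the two calls return different codes for the same URL, you have the uneven crawl access described above. Note that verifying a request truly comes from Googlebot requires a reverse-DNS check; spoofing the UA string only tests your own server's rules.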