The story: When ChatGPT didn't know who we were

How we discovered our website was invisible to AI systems, and what we did about it in two weeks.

December 2025: The moment we realized we had a problem

We run VerisAI.eu, a company that helps businesses become verifiable to AI systems. Ironic, then, that when we asked ChatGPT "Who is VerisAI.eu?" in December 2025, it hallucinated.

Not a small mistake. A complete fabrication. Wrong company description, wrong services, wrong everything.

We had a professional website. Clean design. Good content. But to AI crawlers, we might as well have been invisible.

The diagnosis

We ran a technical audit on our own site. Here's what we found:

  • Zero structured data. No JSON-LD. AI systems had to guess our identity from text fragments.
  • No sitemap. Crawlers had to discover pages by following links and hoping they didn't miss anything.
  • No canonical tags. Multiple URL variants (www vs non-www, trailing slashes) created ambiguity about which was "official."
  • Minimal metadata. When other sites referenced us, there was no consistent Open Graph or Twitter Card data.

Our site worked fine for humans. For machines? It was a black box.

Week 1: Building machine-readable identity

Days 1-3: The identity foundation

We started with the core problem: AI systems need explicit identity signals, not inferred ones.

We wrote 45 lines of JSON-LD (JavaScript Object Notation for Linked Data) declaring:

  • Our legal name: Belvo s.r.o.
  • Our trading name: VerisAI.eu
  • Our official URL: https://verisai.eu/
  • Our logo URL (stable asset path)
  • Contact point: support@verisai.eu with language support
  • External identifiers: Company registry number (IČO: 25659944), LinkedIn profiles of founders

Why this matters: AI systems can now verify "Is this the real VerisAI.eu?" by cross-referencing our JSON-LD against company registries and LinkedIn. No more guessing from text.
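To make the shape concrete, here is a trimmed sketch of that kind of Organization JSON-LD. The legal name, trading name, URL, support email, and registry number come from the list above; the logo path, language list, and LinkedIn URL are placeholders, not our actual values. In the page itself, this sits inside a `<script type="application/ld+json">` tag in the `<head>`:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "VerisAI.eu",
  "legalName": "Belvo s.r.o.",
  "url": "https://verisai.eu/",
  "logo": "https://verisai.eu/assets/logo.png",
  "contactPoint": {
    "@type": "ContactPoint",
    "email": "support@verisai.eu",
    "contactType": "customer support",
    "availableLanguage": ["en", "cs"]
  },
  "identifier": {
    "@type": "PropertyValue",
    "propertyID": "IČO",
    "value": "25659944"
  },
  "sameAs": [
    "https://www.linkedin.com/company/example-profile"
  ]
}
```

The sameAs and identifier fields are what let a machine cross-reference us against the company registry and LinkedIn rather than guessing from prose.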

How we verified it:

  • Validated JSON-LD with Schema.org validator
  • Checked with Google Rich Results Test
  • Manually inspected the structured data in page source

Days 4-5: Making pages discoverable

Second problem: AI crawlers need to know which pages exist and which URL variant is official.

What we built:

  • robots.txt (8 lines): Declares crawl permissions and points crawlers to the sitemap location
  • sitemap.xml (15 lines): Lists every core page with last-modified dates
  • Canonical tags (~2 lines per page): Declares the authoritative URL for each page
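A robots.txt at this size is just a handful of directives. A sketch of the shape (the sitemap URL matches our domain; the rest is generic boilerplate, not our exact file):

```
User-agent: *
Allow: /

Sitemap: https://verisai.eu/sitemap.xml
```

The Sitemap line is the piece that matters most here: it tells any crawler exactly where the page inventory lives.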

The before state: Crawlers found pages randomly by following links. They didn't know if they'd seen everything. URL variants (https://verisai.eu vs https://www.verisai.eu vs https://verisai.eu/) created confusion about which was "official."

The after state: 100% of our pages are explicitly listed. Every page declares its canonical URL. No ambiguity, no missed content.
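The sitemap itself follows the standard sitemaps.org XML protocol. A minimal sketch, with placeholder page paths and dates rather than our real inventory:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://verisai.eu/</loc>
    <lastmod>2025-12-15</lastmod>
  </url>
  <url>
    <loc>https://verisai.eu/services</loc>
    <lastmod>2025-12-15</lastmod>
  </url>
</urlset>
```

Every loc entry uses the canonical URL variant, so the sitemap and the per-page canonical tags tell crawlers the same story.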

How we verified it:

  • Direct HTTP fetch of robots.txt and sitemap.xml (both returned 200)
  • Checked canonical tags in page source
  • Verified every URL listed in the sitemap returned HTTP 200, and that intentionally removed pages returned a clean 404 rather than a soft 200

Week 2: SEO metadata and final verification

Days 1-3: Metadata for search and social

Third layer: When people share our links or when search engines index us, what do they see?

We added 35 lines per page:

  • Open Graph tags (12 lines): Title, description, URL, image, type for Facebook/LinkedIn
  • Twitter Card tags (8 lines): Card type, title, description, image for Twitter/X
  • Standard meta tags (15 lines): Title, description, viewport, charset, robots directive

Why this matters for AI: When AI systems check how others reference us (backlinks, social shares, citations), they now see consistent metadata everywhere. No conflicting descriptions.
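A sketch of what that head section looks like on one page. The tag structure is the real pattern; the titles, descriptions, and image path are placeholders, not our production values. Note the canonical tag from Week 1 sits alongside the metadata, so every layer points at the same official URL:

```html
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>VerisAI.eu | Example page title</title>
  <meta name="description" content="Placeholder page description.">
  <meta name="robots" content="index, follow">
  <link rel="canonical" href="https://verisai.eu/">

  <!-- Open Graph tags for Facebook/LinkedIn previews -->
  <meta property="og:type" content="website">
  <meta property="og:title" content="VerisAI.eu | Example page title">
  <meta property="og:description" content="Placeholder page description.">
  <meta property="og:url" content="https://verisai.eu/">
  <meta property="og:image" content="https://verisai.eu/assets/og-image.png">

  <!-- Twitter Card tags for Twitter/X previews -->
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:title" content="VerisAI.eu | Example page title">
  <meta name="twitter:description" content="Placeholder page description.">
  <meta name="twitter:image" content="https://verisai.eu/assets/og-image.png">
</head>
```

Keeping og:url and the canonical href identical on every page is what makes references to us consistent wherever the link is shared.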

How we verified it:

  • Facebook Sharing Debugger - checked preview appearance
  • Twitter Card Validator - verified card rendering
  • Manual inspection of meta tags in page source

Days 4-5: The final test

We ran a complete verification:

  • Crawled our own site with curl - every core page returned HTTP 200
  • Validated all JSON-LD with multiple validators - no errors
  • Checked canonical consistency - every page declared correct authoritative URL
  • Tested social sharing - previews showed correct branding and description
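The crawl test above boils down to: read sitemap.xml, extract every loc URL, fetch each one, and confirm HTTP 200. A minimal sketch of the parsing half in Python, with a sample sitemap inlined so the snippet is self-contained (a real run would fetch the live sitemap and then request each extracted URL; the /services path is a placeholder):

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def extract_sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> URL listed in a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc")]

# Inline sample standing in for a fetched sitemap.xml.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://verisai.eu/</loc></url>
  <url><loc>https://verisai.eu/services</loc></url>
</urlset>"""

urls = extract_sitemap_urls(sample)
print(urls)  # → ['https://verisai.eu/', 'https://verisai.eu/services']
```

Each extracted URL would then be fetched (curl, or Python's urllib) and checked for a 200 status, which is exactly the loop our verification pass ran.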

Then we asked ChatGPT again: "Who is VerisAI.eu?"

This time, it got it right. It could verify our identity. It described our services accurately. It referenced our official URL.

The difference? 45 lines of JSON-LD, 25 lines of crawl infrastructure, and 35 lines of metadata per page.

What changed: The numbers

Before

Structured identity data: 0 lines
Pages in sitemap: 0
Canonical tags: 0
Verifiable external identifiers: 0
ChatGPT accuracy: Hallucination

After

Structured identity data: 45 lines JSON-LD
Pages in sitemap: 100% coverage
Canonical tags: Every page
Verifiable external identifiers: 3
ChatGPT accuracy: Verified correct

Breaking down the layers

Layer 1: AI Bot Readiness (70 lines total)

This is what makes us verifiable to AI systems:

  • JSON-LD structured data: 45 lines
  • robots.txt: 8 lines
  • sitemap.xml: 15 lines
  • Canonical tags: ~2 lines per page

Purpose: Machine-readable identity and complete discoverability

Layer 2: SEO Foundation (35 lines per page)

This is for search engines and social platforms:

  • Open Graph tags: 12 lines
  • Twitter Card tags: 8 lines
  • Standard meta tags: 15 lines

Purpose: Consistent metadata for indexing and sharing

Layer 3: Frontend (the rest)

This is for human visitors:

  • Semantic HTML structure
  • CSS styling and responsive design
  • Accessibility features

Purpose: User experience and visual presentation

Critical insight: These are three separate audiences. AI readiness (70 lines) ≠ SEO (35 lines/page) ≠ UX (frontend). Conflating them creates confusion.

Why we chose plain HTML + CSS

Our site is static HTML with CSS. No JavaScript rendering. Here's why that mattered for this project:

  • Deterministic content: Everything is in the initial HTML. No "wait for JavaScript to render" uncertainty.
  • Lower crawl risk: If a bot doesn't execute JavaScript (some don't), it still sees all content.
  • Fewer failure modes: Less complexity means fewer things that can break fetchability.

Important caveat: Plain HTML doesn't eliminate the need for identity controls. You still need JSON-LD, canonical tags, and sitemap regardless of your tech stack. Static HTML just reduces one source of crawler uncertainty.

The complete timeline

Week 1, Day 1

Discovered ChatGPT hallucination. Ran technical audit. Diagnosed missing identity signals.

Week 1, Days 2-3

Created 45 lines of JSON-LD declaring legal identity, external identifiers, and organizational relationships. Validated with Schema.org and Google validators.

Week 1, Days 4-5

Built crawl infrastructure: robots.txt (8 lines), sitemap.xml (15 lines), canonical tags on every page. Verified 100% page discoverability.

Week 2, Days 1-3

Added 35 lines per page of Open Graph, Twitter Card, and standard meta tags. Validated with social platform debuggers.

Week 2, Days 4-5

Ran complete verification: crawl test, JSON-LD validation, canonical consistency check, social preview test. Re-tested ChatGPT - verified correct response.

Week 3

Documented implementation. Wrote this case study.

What we proved

Verifiable changes:

  • ChatGPT went from hallucination to verified accuracy
  • All pages became discoverable via sitemap (0% → 100%)
  • Identity became machine-verifiable via JSON-LD (0 → 45 lines)
  • URL authority became unambiguous via canonical tags (0 → 100%)
  • Social/search metadata became consistent (0 → 35 lines per page)

How we verified each claim:

  • ChatGPT accuracy: Direct testing before and after
  • Page discoverability: Sitemap validation and crawl tests
  • Identity verification: Schema.org validator, Google Rich Results Test
  • Canonical consistency: Manual inspection across all pages
  • Metadata validation: Facebook Sharing Debugger, Twitter Card Validator

What this case study is NOT

  • Not a traffic claim: We don't track or disclose analytics. No visitor numbers.
  • Not a ranking claim: We don't promise SEO improvements or search position changes.
  • Not a revenue claim: We don't measure or disclose business outcomes.
  • Not a universal template: Your site may need different controls based on your tech stack and identity complexity.

What this IS: A documented journey from "AI can't verify us" to "AI can verify us" with specific technical changes, verification methods, and no speculation.

Key takeaways

1. AI systems need explicit signals, not inferred ones. Good human-readable content isn't enough. You need machine-readable identity (JSON-LD).

2. Discoverability requires infrastructure. Sitemap and robots.txt aren't optional. They're how crawlers know what exists.

3. URL ambiguity breaks AI verification. Canonical tags eliminate the "which URL is official?" problem.

4. The layers are separate. AI readiness (70 lines) is different from SEO (35 lines/page) is different from UX (frontend). Each serves a different audience.

5. Verification is everything. We tested every claim. ChatGPT before/after, validator results, crawl tests. No speculation.