Agent Ready

How Agent Ready scores a site

68 checks across four categories, mapped to the Vercel Agent Readability Spec and the llmstxt.org standard. Every check is open and reproducible.

What does Agent Ready measure?

Agent Ready is an independent validator for the Vercel Agent Readability Spec and the llmstxt.org standard. The score reports how well your site exposes itself to AI agents and LLM-based clients — the same way a Lighthouse score reports how well your site performs for human users. We fetch your URL once, fan out to the discovery files and well-known endpoints AI agents probe, and grade the result.

How is the score calculated?

Two scores are reported: an overall agent readability score (0–100) and an llms.txt sub-score (0–100).

The overall score is a simple percentage: count of passing checks divided by total checks, rounded. Warns and fails both count against you; pass is the only state that earns credit. Checks marked unreliable by the JS-rendering check (P23) are excluded entirely so a single architectural choice doesn’t penalise four dependent checks at once.
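The rule above can be sketched in a few lines. This is a minimal illustration, not the project's scorer: the `CheckResult` shape, the `unreliable` flag name, and the choice to return 0 when nothing is scorable are all assumptions.

```typescript
// Hypothetical result shape; field names are assumptions for illustration.
type Status = "pass" | "warn" | "fail";

interface CheckResult {
  id: string;
  status: Status;
  unreliable?: boolean; // set when P23 flags a JS-rendering dependency
}

// Overall score: passing checks divided by scored checks, as a rounded percentage.
// Unreliable checks are removed from both numerator and denominator.
function overallScore(results: CheckResult[]): number {
  const scored = results.filter((r) => !r.unreliable);
  if (scored.length === 0) return 0; // assumption: nothing to score yields 0
  const passing = scored.filter((r) => r.status === "pass").length;
  return Math.round((passing / scored.length) * 100);
}
```

With two passes, one warn, and one fail, the sketch returns 50; adding a failing check flagged as unreliable leaves the result at 50, because that check never enters the count.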

Rating bands are derived from the overall score:

| Score | Rating | Meaning |
| --- | --- | --- |
| 90–100 | excellent | Ready for AI citation; all critical surfaces present. |
| 70–89 | good | Discoverable, but a few extractability gaps. |
| 50–69 | fair | Partial coverage; multiple required surfaces missing. |
| 0–49 | needs improvement | Not yet AI-readable; start with llms.txt and AGENTS.md. |
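The bands map directly to a threshold function. The name `ratingFor` is hypothetical; the thresholds are the ones listed above.

```typescript
type Rating = "excellent" | "good" | "fair" | "needs improvement";

// Map a 0-100 overall score to its rating band.
function ratingFor(score: number): Rating {
  if (score >= 90) return "excellent";
  if (score >= 70) return "good";
  if (score >= 50) return "fair";
  return "needs improvement";
}
```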

Why is the llms.txt sub-score weighted differently?

The llmstxt.org spec treats some properties as foundational and others as optional. We mirror that with weights: structural checks (file accessible, H1 present, valid markdown) count 3×, content checks count 1×, and the optional llms-full.txt presence check counts 0.5×. The overall score is unweighted because every check on the Vercel spec is equally normative; the llmstxt.org spec is explicitly layered.
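One way to picture the weighting uses the weights listed for L1–L10. The aggregation formula (earned weight divided by total weight, rounded) and the rule that only a pass earns weight are assumptions that mirror the overall score; the text does not spell them out.

```typescript
interface WeightedCheck {
  id: string;
  weight: number;
}

// Weights as listed for the L-series: structural 3, content 1, llms-full.txt 0.5.
const LLMS_TXT_CHECKS: WeightedCheck[] = [
  { id: "L1", weight: 3 }, { id: "L2", weight: 3 }, { id: "L3", weight: 3 },
  { id: "L4", weight: 1 }, { id: "L5", weight: 1 }, { id: "L6", weight: 1 },
  { id: "L7", weight: 1 }, { id: "L8", weight: 1 }, { id: "L9", weight: 1 },
  { id: "L10", weight: 0.5 },
];

// passedIds: IDs of checks that returned "pass".
// Assumption: warn and fail earn no weight.
function llmsTxtScore(passedIds: string[]): number {
  const total = LLMS_TXT_CHECKS.reduce((sum, c) => sum + c.weight, 0);
  const earned = LLMS_TXT_CHECKS
    .filter((c) => passedIds.indexOf(c.id) !== -1)
    .reduce((sum, c) => sum + c.weight, 0);
  return Math.round((earned / total) * 100);
}
```

Under these assumptions, missing only llms-full.txt (L10) gives 97, while missing only L1 gives 81, which is the asymmetry the weights are meant to produce.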

How is the MCP server score calculated?

The MCP server scanner is a separate tool with its own 0–100 score. It connects to a live MCP endpoint and grades the tools, resources, and prompts it advertises against MCP best practices. It is independent of the site checks above and never affects a site’s Vercel score: the M-series below is not part of the 68-check site registry.

Like the llms.txt sub-score it is weighted: tool quality (M3–M7) carries the most points, because tool definitions are what an agent reasons over. Checks that don’t apply to a given server — e.g. resource quality on a tools-only server — are excluded from both the numerator and the denominator, so a server is graded on what it actually offers, not penalised for scope. Authentication (M11) is informational only. The same rating bands apply (90+ excellent, 70+ good, 50+ fair).

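| ID | Check | Weight |
| --- | --- | --- |
| M1 | Handshake | 10 |
| M2 | Server metadata | 15 |
| M3 | Tool descriptions | 12 |
| M4 | Parameter descriptions | 10 |
| M5 | Output schemas | 8 |
| M6 | Tool annotations | 6 |
| M7 | Naming conventions | 4 |
| M8 | Resources | 8 |
| M9 | Prompts | 6 |
| M10 | Capability honesty | 10 |
| M11 | Authentication | informational |
| M12 | MCP Apps (UI) | 5 |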
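A rough sketch of how the exclusions interact with the weights follows. The status names, the all-or-nothing credit for a pass, and treating a missing status as not applicable are assumptions; the scanner may award partial credit.

```typescript
type McpStatus = "pass" | "warn" | "fail" | "not-applicable";

// Weights from the table above. M11 is omitted because it is informational only.
const MCP_WEIGHTS: { id: string; weight: number }[] = [
  { id: "M1", weight: 10 }, { id: "M2", weight: 15 }, { id: "M3", weight: 12 },
  { id: "M4", weight: 10 }, { id: "M5", weight: 8 }, { id: "M6", weight: 6 },
  { id: "M7", weight: 4 }, { id: "M8", weight: 8 }, { id: "M9", weight: 6 },
  { id: "M10", weight: 10 }, { id: "M12", weight: 5 },
];

// Non-applicable checks leave both numerator and denominator.
function mcpScore(statuses: { [id: string]: McpStatus | undefined }): number {
  let earned = 0;
  let applicable = 0;
  for (const check of MCP_WEIGHTS) {
    const status = statuses[check.id];
    // Assumption: a check with no recorded status is treated as not applicable.
    if (status === undefined || status === "not-applicable") continue;
    applicable += check.weight;
    if (status === "pass") earned += check.weight; // assumption: no partial credit
  }
  return applicable === 0 ? 0 : Math.round((earned / applicable) * 100);
}
```

For a tools-only server that fails only M4, with M8, M9, and M12 not applicable, the sketch scores 65 of 75 applicable points, which rounds to 87.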

What does each check cover?

68 checks across four categories. Site checks run once per scan; page checks run against every fetched URL; llms.txt and protocol checks run conditionally on file presence.

Site checks (S1–S15) — 15 checks

| ID | Check |
| --- | --- |
| S1 | llms.txt exists |
| S2 | llms.txt Content-Type |
| S3 | llms.txt not empty |
| S4 | llms.txt URL format |
| S5 | robots.txt — AI bots allowed |
| S6 | robots.txt — /llms.txt not blocked |
| S7 | robots.txt exists |
| S8 | sitemap.xml valid |
| S9 | sitemap.xml has lastmod |
| S10 | sitemap.md exists |
| S11 | sitemap.md has headings + links |
| S12 | AGENTS.md exists |
| S13 | AGENTS.md has required sections |
| S14 | HTTPS |
| S15 | Root OpenAPI spec |

Page checks (P1–P23) — 23 checks

| ID | Check |
| --- | --- |
| P1 | HTTP 200 |
| P2 | Redirect chain |
| P3 | Content-Type header |
| P4 | x-robots-tag |
| P5 | Canonical link |
| P6 | Meta description |
| P7 | og:title |
| P8 | og:description |
| P9 | HTML lang attribute |
| P10 | JSON-LD present |
| P11 | JSON-LD has required fields |
| P12 | Section headings |
| P13 | Text-to-HTML ratio |
| P14 | Glossary link |
| P15 | Markdown mirror exists |
| P16 | Markdown frontmatter |
| P17 | Alternate link (markdown) |
| P18 | Link header in markdown |
| P19 | Content negotiation |
| P20 | Sitemap section in markdown |
| P21 | Code block language tags |
| P22 | API schema link |
| P23 | JS rendering dependency |

llms.txt checks (L1–L10) — 10 checks

| ID | Check | Weight |
| --- | --- | --- |
| L1 | File accessible | 3× |
| L2 | H1 present | 3× |
| L3 | Valid markdown | 3× |
| L4 | Blockquote summary | 1× |
| L5 | H2 file-list sections | 1× |
| L6 | Link format correct | 1× |
| L7 | Links are accessible | 1× |
| L8 | Optional section used correctly | 1× |
| L9 | Content-Type: text/plain | 1× |
| L10 | llms-full.txt available | 0.5× |

Protocol checks (C1–C20) — 20 checks

Protocol checks follow a discover-then-validate pattern: when the relevant endpoint is absent, we drop the check rather than failing it, so a marketing site isn’t scored against agent manifests it has no reason to ship.

| ID | Check |
| --- | --- |
| C1 | MCP Server Card exists |
| C2 | MCP Server Card fields |
| C3 | MCP OAuth Protected Resource metadata |
| C4 | A2A Agent Card exists |
| C5 | A2A Agent Card fields |
| C6 | Wildcard agents.json |
| C7 | agent-permissions.json |
| C8 | UCP profile (/.well-known/ucp) |
| C9 | UCP OAuth Authorization Server metadata |
| C10 | x402 Payment Required response |
| C11 | x402 accepts entries |
| C12 | NLWeb endpoint |
| C13 | API Catalog (RFC 9727) |
| C14 | Web Bot Auth directory |
| C15 | Agent Skills Discovery |
| C16 | Content parity (no cloaking) |
| C17 | Agent-driven UI (A2UI) |
| C18 | MPP Payment challenge |
| C19 | MPP challenge params |
| C20 | AP2 payment protocol support |
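The discover-then-validate behaviour for protocol checks can be modelled as a function that returns no result when the endpoint is absent. The `Probe` shape and the handling of statuses other than 200 and 404 are assumptions; a real runner would fill the probe from an HTTP request.

```typescript
// Hypothetical probe outcome for a well-known endpoint.
interface Probe {
  status: number | null; // null models a timeout or network error
  body: string;
}

// null means the check is dropped and never appears in the scan results.
type Outcome = { id: string; status: "pass" | "fail" } | null;

// Validate only what was discovered: an absent endpoint produces no result.
function discoverThenValidate(
  id: string,
  probe: Probe,
  validate: (body: string) => boolean,
): Outcome {
  if (probe.status === null || probe.status === 404) return null;
  // Assumption: any other non-200 response counts as a failure.
  const ok = probe.status === 200 && validate(probe.body);
  return { id, status: ok ? "pass" : "fail" };
}
```

Returning `null` rather than a third status keeps dropped checks out of the denominator without the scorer needing a special case.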

How do we handle JavaScript-rendered pages?

P23 detects pages where the static HTML response lacks data that only appears after client-side rendering, for example a missing H1, empty body text, or JSON-LD that is injected after hydration. When P23 fires, the runner marks the dependent checks (P10, P11, P12, P14) as unreliable and the scorer excludes them from both numerator and denominator. Without this, a single SPA architecture choice would compound into four extra failures across otherwise unrelated checks.
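The marking step might look like the sketch below. The `PageResult` shape is hypothetical, and treating any non-pass P23 status as "fired" is an assumption; the dependent IDs are the four named in the text.

```typescript
interface PageResult {
  id: string;
  status: "pass" | "warn" | "fail";
  unreliable?: boolean;
}

// Checks that read content which may only exist after client-side rendering.
const P23_DEPENDENTS = ["P10", "P11", "P12", "P14"];

// Returns a copy of the results with dependents flagged when P23 fires.
function markUnreliable(results: PageResult[]): PageResult[] {
  // Assumption: P23 "fires" when its own status is anything other than "pass".
  const fired = results.some((r) => r.id === "P23" && r.status !== "pass");
  if (!fired) return results;
  return results.map((r) =>
    P23_DEPENDENTS.indexOf(r.id) !== -1 ? { ...r, unreliable: true } : r,
  );
}
```

A scorer can then filter on the `unreliable` flag before counting passes, so the exclusion lives in one place instead of inside each dependent check.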

How often do we refresh the spec mapping?

The Vercel Agent Readability Spec is published on the Vercel Knowledge Base; we track it and fold in new check IDs as they ship. The llmstxt.org spec changes less frequently — the structure has been stable since the late-2024 proposal. Each scan reports the spec version it was scored against, surfaced in the JSON response from the public API.

Where is the source for each check?

Every check is implemented as a single function in src/lib/checks/{site,page,llmstxt,protocol}/. The naming convention is {id}-{slug}.ts (e.g. p11-json-ld-fields.ts). Each file exports a check definition; the registry collects them into a single array that the runner iterates. If you want to see exactly what we’re asserting, read the source — one check per file.
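To give a feel for that layout, here is a hypothetical check definition and registry. The source states only that each file exports a check definition and that a registry collects them into one array, so every type, field name, the `html-lang` slug, and the synchronous `run` signature are assumptions; the real checks may well be async.

```typescript
// Hypothetical shapes, for illustration only.
type Status = "pass" | "warn" | "fail";

interface CheckContext {
  html: string; // static HTML of the fetched page
}

interface CheckDefinition {
  id: string;
  slug: string;
  run: (ctx: CheckContext) => Status;
}

// What a file for P9 (HTML lang attribute) might export.
const p9HtmlLang: CheckDefinition = {
  id: "P9",
  slug: "html-lang",
  run: (ctx) => (/<html[^>]*\slang=/i.test(ctx.html) ? "pass" : "fail"),
};

// File name following the {id}-{slug}.ts convention.
function fileNameFor(check: { id: string; slug: string }): string {
  return check.id.toLowerCase() + "-" + check.slug + ".ts";
}

// The registry is a flat array that the runner iterates.
const registry: CheckDefinition[] = [p9HtmlLang];

function runAll(ctx: CheckContext): { id: string; status: Status }[] {
  return registry.map((check) => ({ id: check.id, status: check.run(ctx) }));
}
```

Keeping the registry as a plain array means adding a check is a matter of adding one file and one array entry.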

Frequently asked questions

How is the agent readability score calculated?
Each check returns pass, warn, or fail. The score is the percentage of passing checks across all categories: site-wide checks (run once per scan) plus per-page checks for every URL we fetch. Warnings count toward the failing side. Checks marked unreliable by P23 — JS-rendered pages where the static HTML lacks the data — are excluded from both numerator and denominator so a single architectural choice doesn't penalise four downstream checks.
Which spec does Agent Ready implement?
The Vercel Agent Readability Spec drives the S- (site), P- (page), and C- (protocol) check series. The L- series implements the llmstxt.org specification for llms.txt files. Both specs are tracked at their canonical locations and updates are folded in when published.
Why does my score change between scans even though I didn't change anything?
Three causes. First, content negotiation (P19) depends on your origin honouring Accept headers, and CDN caches can serve stale variants. Second, AI-bot crawl checks (S5) depend on what's currently in robots.txt for the rotating list of bot user-agents. Third, the protocol checks (C1–C20) follow a discover-then-validate pattern: if your /.well-known endpoint times out, the check drops rather than failing. Re-run the scan; transient network errors are common.
Why does the llms.txt score differ from the overall score?
The overall score is pass/total across all 68 checks weighted equally. The llms.txt sub-score uses category weights from llmstxt.org: structural checks (file accessible, H1 present, valid markdown) count 3×; content checks count 1×; the llms-full.txt presence check counts 0.5×. Re-weighting matters here because the structural checks are foundational — a file that doesn't parse is worth less than one with the optional companion missing.
What's the difference between the page checks and the site checks?
Site checks (S1–S15) run once per scan against your root URL — they cover discovery files (llms.txt, robots.txt, sitemap.xml, AGENTS.md), HTTPS, and the OpenAPI spec probe. Page checks (P1–P23) run against every URL we fetch — they cover HTTP semantics, metadata, JSON-LD, markdown mirrors, and the static-render guarantee. A typical scan emits 15 + (23 × pages-fetched) check results.
Does the MCP server score affect my site's Vercel score?
No. The MCP server scanner (/mcp-server-scanner) is a separate tool that connects to a live MCP endpoint and grades its tools, resources, and prompts on its own 0–100 scale (the M1–M12 series). It is not one of the 68 site checks and never contributes to a site's Vercel agent-readability score. Its score is weighted toward tool quality, excludes checks that don't apply to a given server, and treats authentication as informational only.
Why are some checks not in my scan results?
Protocol checks (C1–C20) follow a discover-then-validate pattern: if your /.well-known endpoint returns 404, we drop the result rather than failing it. The same applies to the x402 probe — if the path doesn't return 402 Payment Required, neither C10 nor C11 appears. This keeps validator-only scans (e.g. an MCP server card check on a static marketing site) from showing irrelevant failures.