Guide · 8 min read
The 10-characteristic checklist LLMs use to decide what to cite.
Aleyda Solis published the canonical 10-point framework for AI-citation readiness. Here's every characteristic, the signal behind it, and how to audit your own pages — adapted from how we apply it across ~30 internal fleet sites.
Why this matters
When ChatGPT, Claude, Perplexity, or Gemini answer a question in your niche, they choose what to cite from a population of thousands of candidate pages. The selection isn't random, and it isn't a pure backlink count. The major LLM-search surfaces converge on roughly the same 10 properties: properties that say "this page is a trustworthy, extractable, coherent source."
Aleyda Solis crystallized this into the 10-characteristic checklist. We use it as one of the two scoring layers in CitationDesk's GEO Readiness dimension. Every page we ship clears at least 7 of 10; the case-study pages clear 9.
The 10 characteristics
- Accessible — the content is in first-paint HTML, not JS-gated. GPTBot and PerplexityBot do not execute JavaScript. If your hero copy renders only client-side (for example, in a React app with no server-side rendering), those crawlers see nothing.
- Useful — the page contains a fact, definition, dataset, framework, or synthesis that's genuinely informative — not template-driven filler. LLMs aggressively demote thin programmatic content.
- Recognizable — the site has a coherent brand identity: same name everywhere, Organization schema in JSON-LD, consistent visual logo, sameAs links to public profiles. LLMs build entity graphs from coherence.
- Extractable — section headings are quote-ready sentences, not marketing fluff ("Why Choose Us" is bad; "Sourdough hydration is calculated by dividing water weight by flour weight" is good). DefinedTerm schema marks key concepts.
- Consistent — voice, tone, and claims don't contradict across pages. If your /about says one thing and your /pricing says another, the LLM's confidence drops.
- Corroborated — facts on your page appear in at least three places: your own site plus independent sources such as Reddit (an organic comment, not spam), LinkedIn, and a Wikipedia citation where possible. LLMs upgrade entities they see corroborated.
- Credible — E-E-A-T signals: author byline + Person schema, /about page with operator identity, outbound citations to authoritative primary sources (gov, peer-reviewed, NCHFP, SEC, etc.).
- Differentiated — the page expresses an explicit POV in a /learn or methodology section. "Our view:" or "Why we think X" sentences signal a citable source, not a content aggregator.
- Fresh — datePublished + dateModified in schema, a visible "Last verified [date]" near the content, and an actual refresh cadence (LLMs deprioritize stale pages once a competitor publishes a fresher equivalent).
- Transactable — a reader who arrives via the citation can do something: live pricing, working contact form, free tool, email capture. LLMs increasingly weight "does this citation lead to a useful action?"
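To make the Extractable item concrete, here is a minimal sketch of a "quote-ready heading" check. The list of generic headings, the linking-verb list, and the six-word threshold are all illustrative assumptions, not the rules any LLM or audit tool is known to use.

```python
import re

# Hypothetical examples of marketing-style headings; extend for your own site.
GENERIC_HEADINGS = {"why choose us", "our services", "about us", "learn more", "get started"}

# Assumed proxy for "states a claim": the heading contains a linking verb.
LINKING_VERBS = {"is", "are", "was", "were", "means", "equals", "requires"}


def is_quote_ready(heading: str, min_words: int = 6) -> bool:
    """Rough heuristic: does the heading read like a standalone factual sentence?"""
    text = heading.strip().rstrip(".").lower()
    if text in GENERIC_HEADINGS:
        return False
    words = re.findall(r"[a-z0-9'-]+", text)
    has_verb = any(word in LINKING_VERBS for word in words)
    # Long enough to carry a claim, and phrased as a statement.
    return len(words) >= min_words and has_verb
```

A check like this only flags candidates for review. A human still has to confirm that the heading states something true and specific.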
How we audit each one
Our Free Citation Readiness Score runs heuristics for all 10. Some are objective (sameAs count, Schema.org type presence, first-paragraph word count). Some are subjective and we mark them as "assumed" until the paid product's server-side crawler can sample multiple pages of the same site and analyse semantic consistency via LLM evaluation.
Specifically:
- Accessible — we check body word count from the no-JS HTML response.
- Useful — heuristic: word count + presence of distinct fact-shaped phrases.
- Recognizable — we look for Organization JSON-LD.
- Extractable — we count H2 headings and verify quote-ready phrasing patterns.
- Consistent — single-URL audit can't verify; Pro / Team tiers sample multiple pages.
- Corroborated — we count sameAs hints in JSON-LD blocks as a starting proxy.
- Credible — we look for Person schema.
- Differentiated — single-URL audit assumes pass; multi-page sampling adds POV analysis.
- Fresh — we look for article:published_time meta or <time datetime> elements.
- Transactable — we look for any link or form containing contact, pricing, signup, or subscribe.
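Several of the objective checks above can be approximated with the Python standard library alone. The sketch below is an illustration under stated assumptions, not the scoring tool's actual implementation: the class name PageSignals, the function audit, and the returned keys are hypothetical, and the word count covers all text outside script and style tags rather than the body only.

```python
import json
from html.parser import HTMLParser

ACTION_WORDS = ("contact", "pricing", "signup", "subscribe")


class PageSignals(HTMLParser):
    """Collects a few objective signals from raw (no-JS) HTML."""

    def __init__(self):
        super().__init__()
        self.words = 0            # text outside <script>/<style>
        self.jsonld_raw = []      # raw text of each JSON-LD block
        self.has_time = False
        self.has_action = False
        self._skip = None         # script/style tag currently open
        self._in_jsonld = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("script", "style"):
            self._skip = tag
            self._in_jsonld = a.get("type") == "application/ld+json"
            if self._in_jsonld:
                self.jsonld_raw.append("")
        elif tag == "time" and a.get("datetime"):
            self.has_time = True
        elif tag == "meta" and a.get("property") == "article:published_time":
            self.has_time = True
        elif tag in ("a", "form"):
            target = (a.get("href") or a.get("action") or "").lower()
            if any(word in target for word in ACTION_WORDS):
                self.has_action = True

    def handle_endtag(self, tag):
        if tag == self._skip:
            self._skip = None
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.jsonld_raw[-1] += data
        elif self._skip is None:
            self.words += len(data.split())


def iter_nodes(obj):
    """Yield every dict nested in a parsed JSON-LD value (lists, @graph, etc.)."""
    if isinstance(obj, dict):
        yield obj
        for value in obj.values():
            yield from iter_nodes(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from iter_nodes(item)


def audit(html: str) -> dict:
    parser = PageSignals()
    parser.feed(html)
    parser.close()
    types, same_as = set(), 0
    for raw in parser.jsonld_raw:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore malformed blocks
        for node in iter_nodes(data):
            t = node.get("@type")
            types.update(t if isinstance(t, list) else [t] if t else [])
            s = node.get("sameAs", [])
            same_as += len(s) if isinstance(s, list) else 1
    return {
        "accessible_word_count": parser.words,
        "recognizable": "Organization" in types,
        "credible": "Person" in types,
        "corroborated_sameas_count": same_as,
        "fresh": parser.has_time,
        "transactable": parser.has_action,
    }
```

Feed it the HTML exactly as a non-rendering crawler would receive it (a plain HTTP GET, no headless browser). Otherwise the Accessible check measures the wrong thing.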
The four most common gaps
Across ~200 sites we've audited so far, four characteristics fail more often than the others:
- Recognizable (no Organization schema) — most static-reference sites ship Article schema only and skip Organization. The fix is one JSON-LD block in <head>. Five minutes.
- Credible (no Person schema) — content shipped without an author byline + sameAs. LLMs don't know who wrote it, so they don't weight it as expert content.
- Corroborated (no Wikipedia / Reddit / LinkedIn presence) — solo operators ship the site but never plant the corroboration signals elsewhere. We talk about that in our other guides.
- Fresh (no dateModified updates) — pages published once and never re-saved. LLMs use freshness as a tiebreak when picking between similar candidates.
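For the first two gaps, the JSON-LD itself is small. The sketch below generates Organization and Person blocks with Python's json module. Every name, URL, and profile link is a placeholder assumption to be replaced with your own, and jsonld_script is a hypothetical helper, not part of any library.

```python
import json


def jsonld_script(data: dict) -> str:
    """Wrap a JSON-LD dict in the script tag that belongs in <head>."""
    # Escape "</" so a stray "</script>" inside a string cannot end the tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values throughout: swap in your real brand and profiles.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
    "logo": "https://example.com/logo.png",
    "sameAs": [
        "https://www.linkedin.com/company/example-co",
        "https://github.com/example-co",
    ],
}

person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Author",
    "url": "https://example.com/about",
    "worksFor": {"@type": "Organization", "name": "Example Co"},
    "sameAs": [
        "https://www.linkedin.com/in/jane-author",
        "https://github.com/jane-author",
    ],
}

head_html = "\n".join(jsonld_script(block) for block in (organization, person))
```

Generating the blocks from one shared dict per entity, rather than hand-editing each template, helps keep the name and sameAs values identical on every page, which is the coherence the Recognizable characteristic asks for.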
What to ship this week
If you have one hour this week, ship these three things in order:
- Add Organization schema + Person schema in your global <head>. Include sameAs with at least 2 public profiles per entity.
- Rewrite the first 100 words of your most-visited page to lead with a quote-ready fact, definition, or statistic.
- Add dateModified rendering on every Article-type page, ideally updating monthly.
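The dateModified step can be sketched as one function that emits both the schema fragment and the visible label from the same date, so the two never drift apart. The function name article_freshness and the exact markup of the label are assumptions for illustration.

```python
import json
from datetime import date


def article_freshness(published: date, modified: date) -> tuple[str, str]:
    """Return (Article JSON-LD text, visible 'Last verified' HTML) for one page."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    iso_modified = modified.isoformat()
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "datePublished": published.isoformat(),
        "dateModified": iso_modified,
    }
    # The same ISO date feeds the machine-readable attribute and the visible text.
    label = f'<p>Last verified <time datetime="{iso_modified}">{iso_modified}</time></p>'
    return json.dumps(schema, indent=2), label
```

Only bump the modified date when the content was actually re-checked. A date that changes on every build without a content review is the opposite of the refresh cadence described above.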
Then run the Free Citation Readiness Score on your most-visited page and verify the GEO Readiness dimension moved from ~0.4 to ~0.7.
Score your own site against this guide.
The free Citation Readiness Score runs every signal from this guide against any URL. ~90 seconds, no signup.