CitationDesk

Guide · 8 min read

The 10-characteristic checklist LLMs use to decide what to cite.

Aleyda Solis published the canonical 10-point framework for AI-citation readiness. Here's every characteristic, the signal behind it, and how to audit your own pages — adapted from how we apply it across ~30 internal fleet sites.

Why this matters

When ChatGPT, Claude, Perplexity, or Gemini answer a question in your niche, they choose what to cite from a population of thousands of candidate pages. The selection isn't random, and it isn't driven purely by backlink count. The major LLM-search surfaces converge on roughly the same 10 properties: ones that say "this page is a trustworthy, extractable, coherent source".

Aleyda Solis crystallized this into the 10-characteristic checklist. We use it as one of the two scoring layers in CitationDesk's GEO Readiness dimension. Every page we ship clears at least 7 of 10; the case-study pages clear 9.

The 10 characteristics

  1. Accessible — the content is in first-paint HTML, not JS-gated. GPTBot and PerplexityBot do not execute JavaScript. If your hero copy is rendered only client-side (for example, by a React app that ships an empty HTML shell), those crawlers see nothing.
  2. Useful — the page contains a fact, definition, dataset, framework, or synthesis that's genuinely informative — not template-driven filler. LLMs aggressively demote thin programmatic content.
  3. Recognizable — the site has a coherent brand identity: same name everywhere, Organization schema in JSON-LD, consistent visual logo, sameAs links to public profiles. LLMs build entity graphs from coherence.
  4. Extractable — section headings are quote-ready sentences, not marketing fluff ("Why Choose Us" is bad; "Sourdough hydration is calculated by dividing water weight by flour weight" is good). DefinedTerm schema marks key concepts.
  5. Consistent — voice, tone, and claims don't contradict across pages. If your /about says one thing and your /pricing says another, the LLM's confidence drops.
  6. Corroborated — facts on your page appear in at least three places, ideally your site plus independent sources: Reddit (an organic comment, not spam), LinkedIn, and a Wikipedia citation where possible. LLMs upgrade entities they see corroborated.
  7. Credible — E-E-A-T signals: author byline + Person schema, /about page with operator identity, outbound citations to authoritative primary sources (gov, peer-reviewed, NCHFP, SEC, etc.).
  8. Differentiated — the page expresses an explicit POV in a /learn or methodology section. "Our view:" or "Why we think X" sentences signal a citable source, not a content aggregator.
  9. Fresh — datePublished + dateModified in schema, visible "Last verified [date]" near the content, and an actual refresh cadence (LLMs deprioritize stale pages once a competitor publishes a fresher equivalent).
  10. Transactable — a reader who arrives via the citation can do something: live pricing, working contact form, free tool, email capture. LLMs increasingly weight "does this citation lead to a useful action?"
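
To make characteristic 1 concrete, here is a minimal Python sketch of a no-JS visibility check. It is an illustration under our own assumptions, not the code behind any crawler or behind our scoring tool: VisibleTextParser and first_paint_word_count are hypothetical names, and the sketch only approximates what a non-rendering crawler reads by counting words that sit outside script, style, and template elements.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text found outside <script>, <style>, and <template>."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def first_paint_word_count(html: str) -> int:
    """Count whitespace-separated words present in the raw HTML response."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


# Hypothetical pages: an empty client-rendered shell vs. server-rendered copy.
client_shell = (
    '<html><body><div id="root"></div>'
    '<script>window.__DATA__ = {"copy": "Hello world"}</script></body></html>'
)
server_rendered = (
    "<html><body><h1>Sourdough hydration</h1>"
    "<p>Hydration is water weight divided by flour weight.</p></body></html>"
)

print(first_paint_word_count(client_shell))
print(first_paint_word_count(server_rendered))
```

The shell page reports no words because its only copy lives inside a script element, which is the situation described in characteristic 1. Any pass/fail threshold on top of this count would be your own choice.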

How we audit each one

Our Free Citation Readiness Score runs heuristics for all 10. Some are objective (sameAs count, Schema.org type presence, first-paragraph word count). Some are subjective and we mark them as "assumed" until the paid product's server-side crawler can sample multiple pages of the same site and analyse semantic consistency via LLM evaluation.

Specifically:

  • Accessible — we check body word count from the no-JS HTML response.
  • Useful — heuristic: word count + presence of distinct fact-shaped phrases.
  • Recognizable — we look for Organization JSON-LD.
  • Extractable — we count H2 headings and verify quote-ready phrasing patterns.
  • Consistent — single-URL audit can't verify; Pro / Team tiers sample multiple pages.
  • Corroborated — we count sameAs hints in JSON-LD blocks as a starting proxy.
  • Credible — we look for Person schema.
  • Differentiated — single-URL audit assumes pass; multi-page sampling adds POV analysis.
  • Fresh — we look for article:published_time meta or <time datetime> elements.
  • Transactable — we look for any link or form containing contact, pricing, signup, or subscribe.
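
Several of the objective checks above can be sketched in a few dozen lines of Python. This is a hedged illustration, not our production heuristics: SignalParser, walk, and audit are hypothetical names, the keyword match is applied to href and action attributes only (an assumption about what "containing" means), and the subjective characteristics are left out entirely.

```python
import json
import re
from html.parser import HTMLParser

ACTION_WORDS = re.compile(r"contact|pricing|signup|subscribe", re.IGNORECASE)


class SignalParser(HTMLParser):
    """Gathers JSON-LD blocks, date signals, and action links from raw HTML."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.jsonld_blocks = []
        self.has_date_signal = False
        self.has_action = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "script" and (a.get("type") or "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []
        elif tag == "meta" and a.get("property") == "article:published_time":
            self.has_date_signal = True
        elif tag == "time" and a.get("datetime"):
            self.has_date_signal = True
        elif tag in ("a", "form"):
            target = a.get("href") or a.get("action") or ""
            if ACTION_WORDS.search(target):
                self.has_action = True

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.jsonld_blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is simply ignored in this sketch


def walk(node):
    """Yield every dict nested anywhere inside a parsed JSON-LD value."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def audit(html: str) -> dict:
    parser = SignalParser()
    parser.feed(html)
    parser.close()
    types, same_as_count = set(), 0
    for block in parser.jsonld_blocks:
        for obj in walk(block):
            declared = obj.get("@type", [])
            types.update([declared] if isinstance(declared, str) else declared)
            links = obj.get("sameAs", [])
            same_as_count += len([links] if isinstance(links, str) else links)
    return {
        "recognizable": "Organization" in types,  # Organization JSON-LD present
        "credible": "Person" in types,            # Person schema present
        "same_as_count": same_as_count,           # proxy for Corroborated
        "fresh": parser.has_date_signal,          # published_time meta or <time>
        "transactable": parser.has_action,        # contact/pricing/signup link
    }
```

Calling audit on the raw, no-JS HTML of a page returns one flag or count per signal. How those map to a score is deliberately left out, since it depends on weights this guide does not specify.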

The four most common gaps

Across ~200 sites we've audited so far, four characteristics fail more often than the others:

  1. Recognizable (no Organization schema) — most static-reference sites ship Article schema only and skip Organization. The fix is one JSON-LD block in <head>. Five minutes.
  2. Credible (no Person schema) — content shipped without an author byline + sameAs. LLMs don't know who wrote it, so they don't weight it as expert content.
  3. Corroborated (no Wikipedia / Reddit / LinkedIn presence) — solo operators ship the site but never plant the corroboration signals elsewhere. We talk about that in our other guides.
  4. Fresh (no dateModified updates) — pages published once and never re-saved. LLMs use freshness as a tiebreak when picking between similar candidates.
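
For the first two gaps, the fix is mostly a matter of emitting the right JSON-LD. A minimal sketch in Python, assuming you build your head markup from a script or template step: jsonld_script is a hypothetical helper, and every name and URL below (Example Co, Jane Doe, example.com, the profile links) is a placeholder to replace with your own.

```python
import json


def jsonld_script(data: dict) -> str:
    """Serialize a schema.org object into a <script> tag for the page <head>."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder entity data: swap in your real name, URLs, and profiles.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#organization",
    "name": "Example Co",
    "url": "https://example.com/",
    "logo": "https://example.com/logo.png",
    "sameAs": [
        "https://www.linkedin.com/company/example-co",
        "https://github.com/example-co",
    ],
}

author = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",
    "url": "https://example.com/about",
    "worksFor": {"@id": "https://example.com/#organization"},
    "sameAs": [
        "https://www.linkedin.com/in/jane-doe-example",
        "https://github.com/jane-doe-example",
    ],
}

print(jsonld_script(organization))
print(jsonld_script(author))
```

Linking the Person to the Organization through a shared @id is one common way to keep the two entities connected. It is a design choice here, not a requirement of the checklist.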

What to ship this week

If you have one hour this week, ship these three things in order:

  1. Add Organization schema + Person schema in your global <head>. Include sameAs with at least 2 public profiles per entity.
  2. Rewrite the first 100 words of your most-visited page to lead with a quote-ready fact, definition, or statistic.
  3. Add dateModified rendering on every Article-type page, ideally updating monthly.
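
Step 3 can be sketched the same way. The Python below is a rough illustration rather than a prescribed implementation: article_freshness is a hypothetical helper, the headline and dates are made up, and it assumes you know each page's real publish and last-verified dates. It returns the Article JSON-LD body and a visible "Last verified" line that uses a time element, so the schema and the on-page date stay in sync.

```python
import json
from datetime import date


def article_freshness(headline: str, published: date, modified: date):
    """Return (Article JSON-LD string, visible 'Last verified' HTML line)."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    schema = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }
    # Month name comes from strftime, so it follows the process locale.
    readable = f"{modified.strftime('%B')} {modified.day}, {modified.year}"
    visible = (
        f'<p>Last verified <time datetime="{modified.isoformat()}">'
        f"{readable}</time></p>"
    )
    return json.dumps(schema, indent=2), visible


# Made-up example values.
schema_json, visible_line = article_freshness(
    "How sourdough hydration is calculated",
    published=date(2024, 1, 15),
    modified=date(2024, 6, 3),
)
print(schema_json)
print(visible_line)
```

Only bump the modified date when the content has actually been re-checked. A date that changes without the page changing is the opposite of the refresh cadence described in characteristic 9.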

Then run the Free Citation Readiness Score on your most-visited page and verify the GEO Readiness dimension moved from ~0.4 to ~0.7.

Score your own site against this guide.

The free Citation Readiness Score runs every signal from this guide against any URL. ~90 seconds, no signup.