A brand citation audit measures how often your domain is named inside the answers that ChatGPT, Claude, and Perplexity hand to users — and where each citation surfaces inside the response. With answer engines now resolving the high-intent query without a click, citation is the new ranking, and the 100-point scorecard below is the framework we run on every brand engagement.
The shift is not subtle. For an expanding band of informational and recommendation queries, the user reads the model's answer and does not click through. Traditional rank tracking measures something that no longer happens. The signal that matters now is whether your brand appears in the answer text, whether the model cites your domain as a source, and how prominently it does so relative to competitors. That measurement requires a different instrument than the SEO tool you renew every January.
What follows is the production framework. Ten query archetypes — the shapes a real customer's question takes in front of an LLM — each scored on ten checks. The audit produces a single 0-100 number for the brand, a per-archetype breakdown that shows where citation is strong and where it leaks, and a remediation roadmap ranked by citation-probability lift. Source baselines come from a 500-brand cross-vertical sample we maintain internally; treat the specific numbers as indicative rather than universal, and re-derive against your own corpus before committing to any single target.
- 01. LLM citations replace traditional rankings for high-intent queries. Track citation rate by query archetype as the primary visibility metric. A position-one organic ranking is worth little when the model's answer never produces the click.
- 02. Freshness signals are weighted heavily. Modified-time accuracy and a quarterly refresh cadence appear to matter more than for classic SEO. Stale content gets demoted in answer synthesis even when the page itself still ranks organically.
- 03. Content depth wins over keyword density. Original data points, named methodologies, and proprietary numbers cite at roughly two to three times the rate of restated commodity content in our sample. Cite-bait beats keyword stuffing.
- 04. Schema markup lifts citation probability. Article + Organisation + Person consistency across the page family appears to correlate with brand-anchored citation. The schema does not generate the citation; it qualifies the page for inclusion.
- 05. Track citations per query archetype, not per query. Individual query results are noisy across runs. Patterns within an archetype are the signal. Audit by archetype, refresh quarterly, watch the trend line — not the daily fluctuation.
01 — Why Citation
Answer engines replace the click — citation is the new ranking.
For a growing share of the queries that used to drive organic traffic, the search experience has changed shape. The user asks ChatGPT, Claude, or Perplexity a question; the model returns a synthesised answer; the user reads it and moves on. The blue link beneath the answer is increasingly decorative. For high-intent informational queries — comparisons, definitions, recommendations, quick how-tos — that resolution-without-click pattern has become the default rather than the exception.
The implication for marketing teams is structural, not tactical. Rank tracking measures the order of links beneath an answer the user never scrolled to. CTR optimisation assumes a click that increasingly does not happen. The metric that actually correlates with downstream brand exposure and consideration is whether the brand is named inside the answer — and that is a different measurement layer from anything classic SEO tooling exposes.
Citation visibility behaves like search visibility used to behave in 2008 — a measurable, optimisable surface where the brands running the playbook earliest accrue a compounding advantage. The playbook is not Answer Engine Optimisation in the marketing-deck sense; it is a systematic audit of the surfaces, signals, and content shapes that make a domain citable. For the broader methodology context, our answer-engine optimisation guide covers the wider playbook; this post is the audit framework we run inside it.
None of this means classic SEO is obsolete. The underlying signals that earn citations — domain authority, structured data, content depth, freshness — are largely the signals that earn organic rank. The audit framework below is best understood as a re-weighting of those signals around the citation outcome, not as a replacement stack. Teams that treat the two as competing investments consistently underperform teams that treat them as one stack with two scoreboards.
02 — Ten Archetypes
Definitional, comparative, recommendation — and seven more.
The 100-point scorecard partitions the citation problem into ten query archetypes. An archetype is the shape of a real user question — not a literal query string, but the pattern that a cluster of related queries share. The audit runs a representative sample of queries inside each archetype against each target LLM, and the archetype score (0-10) summarises how often the brand appears in the answer text across that sample.
Splitting the problem this way matters because the surfaces and signals that drive citation differ by archetype. A definitional query rewards crisp glossary content with structured data. A recommendation query rewards listicle-style comparisons with opinionated picks. A buying-decision query rewards detailed specifications with original benchmark numbers. A single citation-rate average across all queries obscures exactly the information a remediation roadmap needs.
Definitional queries
“What is …”, “Define …”
Crisp single-paragraph answers, glossary entries, definitional schema. Citations cluster on Wikipedia, Investopedia-style references, and the brand domain when a glossary page is structured cleanly. The fastest archetype to improve on.
Reward: clean glossary
Comparative queries
“X vs Y”, “Best of …”
Tabular comparisons, head-to-head feature matrices, named benchmarks. Citations reward genuine comparative depth — half-page comparison tables citing original measurements outperform marketing-page boilerplate by a wide margin.
Reward: tables + benchmarks
Recommendation queries
“Recommend …”, “What should I …”
Listicle-style, opinionated picks with stated criteria. Models reward sources that justify a recommendation rather than list options. Editorial voice and named methodology earn citations here disproportionately.
Reward: opinion + criteria
Buying-decision queries
“Is X worth it”, “Does X support Y”
Detailed specifications, original benchmark numbers, real-world pricing context. Citations cluster on review sites and the brand domain when product pages publish full specs in machine-readable shape (FAQ schema, Product schema, structured pricing).
Reward: specs + pricing
How-to queries
“How to …”, “Steps to …”
Numbered step content, HowTo schema, embedded code snippets where relevant. Citation rate correlates strongly with answer-shape match — a numbered list with concise steps cites at multiples of an essay-style how-to.
Reward: step structure
Troubleshooting queries
“Why is X happening”, “Fix Y”
Symptom-cause-fix structure, error-code references, support documentation. Stack Overflow and Reddit dominate this archetype for technical domains; brand documentation cites when it ranks for the exact symptom string with a clean H2 hierarchy.
Reward: symptom anchors
Pricing / cost queries
“How much does X cost”
Published pricing pages with current numbers, comparison tables across tiers, cost-per-unit context. Models reward pages that publish current pricing in clean tabular form; demo-only sales pages cite at much lower rates regardless of domain authority.
Reward: published numbers
Trend / forecast queries
“Outlook for …”, “Will X …”
Named methodologies, dated forecasts, original numbers tied to a publication date. The freshness signal binds hardest here — a year-old forecast is often discarded by the model entirely, even when the source domain is otherwise authoritative.
Reward: dated forecasts
Brand-context queries
“Tell me about X”, “What is X known for”
Organisation schema, Wikipedia-style brand summaries, third-party coverage. Citation rate here correlates with off-domain signal — analyst coverage, press, podcast appearances — more than on-domain content. The audit measures the gap between brand description coherence and third-party reinforcement.
Reward: third-party signal
Hyperlocal / contextual queries
“In <city>”, “Near me”, language-localised
Localised landing pages with named geography, language-tagged content, hreflang consistency. Hyperlocal citation lifts dramatically when the page targets the specific phrase the user would have asked rather than a generic service page tagged by geography.
Reward: phrase-precise pages
Each archetype is scored 0-10 across the ten checks introduced in sections 03 to 06 (surfaces, freshness, schema and authority, content depth) plus tracking and benchmarking signals from sections 07 and 08. The full scorecard is the matrix product: ten archetypes × ten checks = the 100 audit points. Brands rarely score evenly across archetypes; a typical pattern is strong on definitional and brand-context queries while weak on comparative and pricing — the audit's job is to make those gaps visible.
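To make the roll-up mechanical, here is a minimal sketch of how per-check scores combine into the archetype and brand numbers. The check names and the object shape are placeholders rather than the canonical list; the arithmetic is the point: ten checks per archetype summed to 0-10, then ten archetypes summed to 0-100.
// scripts/scorecard-rollup.mjs — minimal sketch of the 10 × 10 roll-up (check names are placeholders)
// Each check is scored 0 or 1; an archetype sums to 0-10; the brand sums to 0-100.
const archetypeScore = (checks) =>
  Object.values(checks).reduce((sum, passed) => sum + (passed ? 1 : 0), 0);

const brandScore = (audit) =>
  Object.values(audit).reduce((sum, checks) => sum + archetypeScore(checks), 0);

// Illustrative shape only; a real audit carries ten checks for each of the ten archetypes.
const audit = {
  definitional: { surfacePresence: true, surfaceShape: true, freshness: false },
  comparative: { surfacePresence: false, surfaceShape: false, freshness: true },
};
console.log(brandScore(audit)); // 3 with this toy object; a fully populated audit tops out at 100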
03 — Citation Surfaces
What gets picked up and what doesn't.
A citation surface is the kind of content artefact the model actually quotes, paraphrases, or names as a source. Surfaces are not all equal — and the surfaces that earn citations are not always the surfaces brands invest in most heavily. The audit scores each archetype against the surfaces the brand operates, and against the surfaces the citing model demonstrably prefers for that archetype.
The pattern across our sample is consistent. Five surfaces earn the bulk of citations in answer engines: structured glossary or definition pages, comparison and listicle content, original research and data posts, well-maintained documentation, and third-party coverage on authoritative outlets. Surfaces that consistently under-earn relative to their domain authority: marketing landing pages, gated whitepapers, podcast pages without transcripts, sales-led product pages without specifications.
Glossary, comparison, research, docs, third-party
Definitional pages with clean schema, comparison tables, original data posts with named methodology, well-structured product documentation, and third-party analyst or press coverage are the five surfaces that drive most of the citations we observe. They share a common shape — they answer the question rather than market the brand.
Optimise these first
Marketing pages, gated PDFs, podcast pages
Generic marketing landing pages, gated whitepapers behind email forms, podcast pages without machine-readable transcripts, and sales-led product pages without specifications consistently under-cite relative to their domain authority. The model cannot quote what it cannot read, and it deprioritises pages that read like sales rather than information.
Reshape or deprioritise
Status pages, changelogs, public roadmaps
Often-overlooked surfaces that punch above their domain weight for buying-decision and troubleshooting queries. A public changelog with dated releases and clear scope cites better than the equivalent marketing announcement. Open status pages cite reliably on uptime and reliability queries.
Underused, high leverage
Wikipedia, analyst sites, podcast transcripts
The audit must measure the brand's presence on surfaces the brand does not own — Wikipedia coverage, analyst-publication mention rate, podcast and panel appearances with public transcripts. For brand-context and trend archetypes, off-domain surfaces drive a majority of the citation outcome.
Audit, then place
One non-obvious pattern: the model's preference between two functionally equivalent pages on the same domain frequently turns on shape, not signal. A glossary page with one focused paragraph, a definitional schema block, and a short FAQ cites at a noticeably higher rate than the same definition embedded in a 2,000-word essay that mentions twenty other concepts. The audit's surface check therefore scores both presence and shape — does the page exist, and is it structured for a one-paragraph quote?
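As an illustration of that shape, here is a hedged sketch of the structured-data block a quotable glossary page might carry: a DefinedTerm plus a short FAQPage, expressed as the JavaScript object you would serialise into a JSON-LD script tag. The term, question, and URL are invented placeholders; the point is one focused definition the model can lift whole.
// Illustrative JSON-LD payload for a definitional page (placeholder term and URL)
const definitionalSchema = {
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "DefinedTerm",
      name: "Brand citation audit",
      description:
        "A structured measurement of how often a brand is named or cited inside LLM-generated answers, scored per query archetype.",
      url: "https://your-brand.com/glossary/brand-citation-audit",
    },
    {
      "@type": "FAQPage",
      mainEntity: [
        {
          "@type": "Question",
          name: "How often should a brand citation audit run?",
          acceptedAnswer: {
            "@type": "Answer",
            text: "Quarterly for the standing audit; weekly only during active remediation windows.",
          },
        },
      ],
    },
  ],
};
// Serialise into a <script type="application/ld+json"> tag at build time.
console.log(JSON.stringify(definitionalSchema, null, 2));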
"The model cannot quote what it cannot read, and it deprioritises pages that read like sales rather than information. Reshape the surface before chasing more links."— Our reading of the 500-brand baseline
04 — Freshness
Modified-time signals, quarterly refresh, model knowledge-cutoff awareness.
Freshness behaves differently in answer engines than in classic organic search. A page can hold its organic ranking for years on stable intent while losing citation share inside answer engines as its content drifts past relevance. The audit treats freshness as a first-class signal across all ten archetypes, with two distinct sub-signals: the page-level modified-time accuracy, and the content's alignment with the citing model's knowledge cutoff.
Modified-time accuracy is the lower-effort half. A page that ships a current article:modified_time and a visible "Updated date" in the body, and whose content actually reflects that update date, signals freshness to both the crawl and the model. A page whose modified-time updates every sitemap refresh while the content remains static is read as noise. The audit checks both — the metadata and the substantive update.
Knowledge-cutoff alignment is the harder half. Each citing model has a training cutoff, with a retrieval layer on top that pulls current content for queries where the model recognises the cutoff matters. For trend and forecast archetypes especially, content dated after the model's training cutoff cites disproportionately — but only when the date is machine-readable and consistent across the page, the schema, and the URL pattern.
Quarterly cycle
Material refresh — not date-bump — on a 90-day cycle across the top citation-earning pages. Pricing tables, comparison content, trend forecasts, and methodology pages benefit most from quarterly attention.
Top-cite pages
Date integrity
The modified-time in the meta, the visible date in the body, the schema datePublished/dateModified, and the actual content update all match. Any drift between these signals reads as noise to the model.
Audit the four dates
Currency tagging
Content explicitly tagged with the year and the data vintage (“As of Q2 2026”) earns citation lift on trend and forecast archetypes versus the same content without an explicit currency marker.
Trend / forecast lift
A practical note on the refresh cadence number. Ninety days is the cadence that produced the strongest citation lift in our 500-brand sample for the pages most exposed to freshness signal — pricing, comparison, trend, and methodology. It is not a universal target. Reference content with stable intent (definitions, historical facts) does not benefit from a 90-day refresh, and aggressive re-dating of such pages can backfire if the content has not materially changed. The audit applies the cadence target only to the archetypes where freshness is a binding constraint.
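The date-integrity check described above is easy to automate for the three machine-readable dates; the fourth, whether the content was substantively updated, stays a human judgement. Here is a minimal, regex-based sketch. The meta attribute ordering, the JSON-LD shape, and the visible "Updated" pattern are assumptions you would adapt to your own templates.
// scripts/check-date-integrity.mjs — minimal sketch of the three machine-readable date checks
import { readFile } from "node:fs/promises";

function extractDates(html) {
  // 1. article:modified_time meta tag (assumes the property attribute precedes content)
  const meta = html.match(/property="article:modified_time"[^>]*content="([^"]+)"/)?.[1];
  // 2. dateModified inside any JSON-LD block (assumes a flat object, not an @graph)
  const schema = [...html.matchAll(/<script type="application\/ld\+json">([\s\S]*?)<\/script>/g)]
    .map((m) => { try { return JSON.parse(m[1]).dateModified; } catch { return undefined; } })
    .find(Boolean);
  // 3. visible "Updated …" date in the body (assumes an ISO date near the word "Updated")
  const visible = html.match(/Updated[^0-9]{0,20}(\d{4}-\d{2}-\d{2})/)?.[1];
  return { meta, schema, visible };
}

const html = await readFile(process.argv[2], "utf8");
const { meta, schema, visible } = extractDates(html);
const day = (d) => (d ? d.slice(0, 10) : null);
const days = [day(meta), day(schema), day(visible)];
const consistent = days.every(Boolean) && new Set(days).size === 1;
console.log({ meta, schema, visible, consistent });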
05 — Schema + Authority
Structured data, author signals, brand-anchor consistency.
Schema markup does not generate citations. What it does is qualify a page for inclusion in the model's candidate set — and on ambiguous queries, qualify the brand entity as the right anchor for a class of citations. The audit scores three schema clusters for every archetype: page-level schema correctness, author and organisation consistency, and brand-entity coherence across the wider web.
Page-level correctness is table stakes. Article with author, datePublished, dateModified; FAQPage on FAQ-shaped content; Product with full pricing and specification properties on commerce pages; HowTo on how-to content with named steps. None of this is novel; the audit verifies that it is consistently and correctly applied across the page family, because patchy coverage is read as noise.
Author and organisation consistency is the higher-leverage half. An Author entity with a consistent canonical URL, a verifiable real-world identity, and cross-page reuse across articles earns citation lift on recommendation, trend, and brand-context archetypes specifically. An Organization entity with consistent name, logo, sameAs references to verified social and Wikipedia presences, and a stable canonical URL anchors the brand for brand-context queries — and for the cluster of comparison queries that name the brand by category.
Page-level correctness
Article, FAQPage, Product, and HowTo schema applied consistently across the page family. Validate with the Schema.org validator and the Rich Results Test. Necessary but not sufficient — without it the page is often disqualified from the candidate set.
Table stakes
Author + Organisation consistency
Author entity with verifiable real-world identity, canonical URL, cross-page reuse. Organisation entity with stable name, sameAs to verified socials, Wikipedia link where applicable. Anchors brand-context citation specifically.
High leverage
Brand-anchor coherence
Brand identity reads the same way across owned and off-domain surfaces — Wikipedia, Crunchbase, analyst databases, podcast bios, panel descriptions. The model reconciles the brand entity from these signals; coherence improves recognition, drift reduces it.
Compounding asset
Schema theatre
Schema applied to pages where it does not match the content — FAQPage or HowTo markup bolted onto plain essays, Product schema on category pages. The Rich Results Test passes; the model deprioritises pages where schema and content disagree. Worse than no schema at all.
Avoid
The brand-anchor coherence check is the most underweighted by brands in our sample. A brand can ship perfect on-page schema and still under-cite for brand-context queries because the cross-surface picture is incoherent — different name spellings on Crunchbase versus the website, different category descriptors on LinkedIn versus the Organisation schema, no Wikipedia presence, inconsistent sameAs references. The model reconciles, but the reconciliation costs the brand entity weight in the citation decision. The remediation is administrative rather than technical, and it compounds: every quarter the brand entity reads more coherently, the citation rate on brand-anchored archetypes improves.
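For reference, here is a hedged sketch of the Organisation entity described above, again as the JavaScript object you would serialise into JSON-LD. The names, URLs, and sameAs targets are placeholders; the property set (consistent name, logo, url, and sameAs to verified profiles) is the part that matters for brand-anchor coherence.
// Illustrative Organization JSON-LD payload (placeholder brand, URLs, and profiles)
const organizationSchema = {
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://your-brand.com/#organization",
  name: "YourBrand", // same spelling everywhere: site, Crunchbase, LinkedIn
  url: "https://your-brand.com/",
  logo: "https://your-brand.com/assets/logo.png",
  sameAs: [
    "https://www.linkedin.com/company/your-brand",
    "https://www.crunchbase.com/organization/your-brand",
    "https://en.wikipedia.org/wiki/YourBrand", // only if the article actually exists
  ],
};
// Reuse the same @id from every Article's publisher field so the entity stays anchored.
console.log(JSON.stringify(organizationSchema, null, 2));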
06 — Content Depth
Data points, named methodologies, original numbers.
The single largest controllable lever in citation rate is content depth — specifically, the presence of original data, named methodologies, and proprietary numbers that other sources do not already carry. Across the 500-brand sample, pages with at least one citable original number cite at roughly two to three times the rate of pages that restate commodity content, holding domain authority constant.
The mechanism is intuitive in retrospect. Answer engines synthesise from multiple sources; when a brand publishes a number that no other source has, the model cannot synthesise without attributing it. A named methodology functions similarly — the model cites the method by name when explaining the result, anchoring the brand to the concept. The audit therefore scores each archetype on the density and citability of original artefacts on the brand's relevant pages.
Original data
Survey, benchmark, internal dataset
One citable number that no other source carries. A benchmark from your own customer base, a survey result, an internal performance measurement. The model cites the source by name to attribute the number — that is the citation.
≈ 2-3× citation lift
Named methodology
Frameworks, scorecards, taxonomies
A method with a name (and a definition page on your domain) that the model has to reference when explaining the technique. The 100-point scorecard in this post is itself an example — naming a method creates a citable anchor.
Long-tail citation
Proprietary numbers
Pricing, performance, scale
Numbers tied to your operation — current pricing, performance benchmarks, scale figures — published in machine-readable form. Buying-decision and pricing archetypes cite these disproportionately when the page is structured for direct quotation.
Buying-decision lift
Content depth is not synonymous with word count. A 400-word piece with a single original number and a clean methodology citation consistently out-cites a 3,000-word piece that restates the same commodity arguments other sources already cover. The audit's depth check counts citable artefacts per page, not paragraphs — and an artefact is a number, methodology, or named framework that the model must attribute rather than synthesise.
For teams without an obvious source of original data, the practical move is to instrument the work you already do. Internal benchmarks, customer-base distributions, deployment metrics, engagement data — all of it ordinary operating data that becomes citable once published with the methodology attached. This is the same pattern that powers a working agentic SEO crawler: extraction first, then publish the result with the methodology named.
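One way to make proprietary numbers machine-readable, the shape the pricing and buying-decision archetypes reward, is Product plus Offer structured data alongside the visible pricing table. A hedged sketch, again as a JavaScript object, with placeholder product names and prices:
// Illustrative Product + Offer JSON-LD payload (placeholder product, tiers, and prices)
const pricingSchema = {
  "@context": "https://schema.org",
  "@type": "Product",
  name: "YourBrand Platform",
  description: "Citation tracking and audit tooling for answer engines.",
  offers: [
    {
      "@type": "Offer",
      name: "Starter",
      price: "99.00",
      priceCurrency: "USD",
      priceValidUntil: "2026-12-31", // re-validate on the quarterly refresh cycle
      url: "https://your-brand.com/pricing#starter",
    },
    {
      "@type": "Offer",
      name: "Growth",
      price: "499.00",
      priceCurrency: "USD",
      priceValidUntil: "2026-12-31",
      url: "https://your-brand.com/pricing#growth",
    },
  ],
};
console.log(JSON.stringify(pricingSchema, null, 2));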
07 — Tracking
Measuring citations across ChatGPT, Claude, Perplexity APIs.
The audit is only as good as the measurement infrastructure beneath it. Citation tracking at the scale this scorecard implies — ten archetypes, twenty to fifty representative queries per archetype, three citing models, quarterly cadence — is a programmatic workload, not a manual check. The pattern we run is a small Node script that issues a representative query set against each model's API, parses each response for brand mentions and source citations, and writes the results to a per-archetype scorecard.
// scripts/track-citations.mjs — citation tracking entrypoint
import { readFile, writeFile, mkdir } from "node:fs/promises";

const ARCHETYPES = JSON.parse(
  await readFile("./audit/archetypes.json", "utf8"),
);
// Each archetype: { id, queries: string[] }

const MODELS = [
  { id: "chatgpt", fetch: askChatGpt },
  { id: "claude", fetch: askClaude },
  { id: "perplexity", fetch: askPerplexity },
];

const BRAND_DOMAIN = "your-brand.com";
const BRAND_NAMES = ["YourBrand", "Your Brand"]; // alias list

const scorecard = {};
for (const arch of ARCHETYPES) {
  scorecard[arch.id] = {};
  for (const model of MODELS) {
    const results = await Promise.all(
      arch.queries.map((q) => model.fetch(q)),
    );
    scorecard[arch.id][model.id] = {
      total: results.length,
      cited_by_domain: results.filter((r) =>
        r.sources.some((s) => s.includes(BRAND_DOMAIN))).length,
      cited_by_name: results.filter((r) =>
        BRAND_NAMES.some((n) => r.answer.includes(n))).length,
      raw: results, // persist for delta analysis
    };
  }
}

await mkdir("./audit/runs", { recursive: true });
const stamp = new Date().toISOString().slice(0, 10);
await writeFile(
  `./audit/runs/${stamp}.json`,
  JSON.stringify(scorecard, null, 2),
);

// ----- Model adapters (illustrative) -----
async function askChatGpt(query) { /* OpenAI Responses API + web tool */ }
async function askClaude(query) { /* Anthropic Messages + web search */ }
async function askPerplexity(query) { /* Perplexity online API */ }
A few production notes on the tracking layer. Aliases matter — the brand name spelling, common misspellings, the legal entity name, and product names should all count as a brand mention. Source URL parsing is model-specific and changes between API versions; treat the adapter functions as moving targets and version the parsing logic alongside the model SDK version. Query sampling should be stable run-to-run so that quarterly deltas measure citation change rather than query drift; archive the query list with each scorecard run.
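On the alias point, a word-boundary regex over the alias list is usually a more reliable mention check than the exact-substring match in the sketch above. A minimal version, with a placeholder alias list:
// Hedged sketch: escape each alias, then match case-insensitively on word boundaries.
const buildAliasMatcher = (aliases) => {
  const escaped = aliases.map((a) => a.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  return new RegExp(`\\b(?:${escaped.join("|")})\\b`, "i");
};

// Drop-in replacement for the cited_by_name check in track-citations.mjs:
const matcher = buildAliasMatcher(["YourBrand", "Your Brand", "Your Brand Ltd"]);
console.log(matcher.test("According to YourBrand's 2026 benchmark …")); // true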
Individual query results are noisy across runs — the same model asked the same question one hour apart can return different citations, especially when retrieval is in the loop. The signal lives in the pattern across the archetype, not the individual answer. The scorecard therefore reports cite rate per archetype per model (with confidence intervals on a meaningfully sized query sample), and the quarterly trend line on that rate. Single results are diagnostic; the trend is the metric.
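A minimal sketch of that aggregation, reading one run file written by the tracking script above and reporting the domain-citation rate per archetype and model with a standard Wilson score interval (the run-file shape follows the sketch above; nothing here is model-specific):
// scripts/cite-rate.mjs — per-archetype cite rate with a 95% Wilson interval
import { readFile } from "node:fs/promises";

function wilson(cited, total, z = 1.96) {
  if (total === 0) return { rate: 0, low: 0, high: 0 };
  const p = cited / total;
  const denom = 1 + (z * z) / total;
  const centre = p + (z * z) / (2 * total);
  const margin = z * Math.sqrt((p * (1 - p)) / total + (z * z) / (4 * total * total));
  return { rate: p, low: (centre - margin) / denom, high: (centre + margin) / denom };
}

const run = JSON.parse(await readFile(process.argv[2], "utf8")); // e.g. ./audit/runs/2026-04-01.json
for (const [archetype, models] of Object.entries(run)) {
  for (const [model, r] of Object.entries(models)) {
    const { rate, low, high } = wilson(r.cited_by_domain, r.total);
    console.log(
      `${archetype} · ${model}: ${(rate * 100).toFixed(0)}% ` +
        `(95% CI ${(low * 100).toFixed(0)}-${(high * 100).toFixed(0)}%)`,
    );
  }
}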
"Individual queries are noisy. The pattern across an archetype is the signal. Audit the archetype, track the trend, ignore the single-query whipsaw."— Our citation-tracking methodology
The cost of running the tracking layer is modest at quarterly cadence — a few hundred queries per model per run, a handful of dollars in API spend per audit. The cost of running it weekly is high and the signal-to-noise ratio drops accordingly; weekly cadence makes sense only during active remediation sprints when you are testing a specific hypothesis (e.g. did the schema rebuild lift definitional cite rate within four weeks). Default to quarterly for the standing audit and switch to weekly only during change windows.
08 — Baseline Numbers
From a 500-brand audit.
The chart below summarises the citation-rate distribution we observe across the 500-brand sample, segmented by archetype. Citation rate is the percentage of representative queries within an archetype that surface the brand domain or brand name in the model's answer, averaged across ChatGPT, Claude, and Perplexity. The numbers are indicative — your vertical, brand scale, and content investment level will shift the baseline — but the ordering across archetypes is consistent enough across the sample to be operationally useful as a benchmark.
Average citation rate by archetype · 500-brand sample
Source: Digital Applied 500-brand citation baseline · Q2 2026
Two patterns are worth pulling out. First, brand-context cites the highest on average — the model is willing to name the brand when the user already named it — but the gap between top-quartile and bottom-quartile brands inside this archetype is the widest of any single archetype, almost entirely explained by Wikipedia presence and Organisation-schema coherence. Second, trend and pricing archetypes cite the lowest on average and have the highest ceiling, because few brands invest in the freshness and machine-readability signals these archetypes reward. A modest programme of dated forecasts and published pricing tables tends to produce outsized citation gains over a single audit cycle.
For new programmes, we recommend scoring the brand against this baseline within the first audit, picking the two or three archetypes with the largest gap between current score and benchmark, and concentrating remediation there for one full quarter. Spreading remediation thinly across all ten archetypes in the first quarter consistently underperforms the focused approach in our sample.
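The prioritisation step is mechanical once the current scores and the benchmark are both on hand. A sketch; benchmark values come from your own baseline file rather than being hard-coded here:
// Pick the two or three archetypes with the largest gap to benchmark.
const remediationTargets = (current, benchmark, n = 3) =>
  Object.keys(benchmark)
    .map((archetype) => ({ archetype, gap: benchmark[archetype] - (current[archetype] ?? 0) }))
    .sort((a, b) => b.gap - a.gap)
    .slice(0, n);

// Usage: remediationTargets(currentScores, baselineScores) returns the top-n gaps to focus the quarter on.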
LLM citations are the new ranking — measure them like one.
The shift from rank to citation is structural. For a growing share of the queries that drive revenue, the user reads the model's answer and never reaches the link beneath it. Treating that as a minor tooling problem misses the change — the metric, the measurement infrastructure, and the content investments all need to re-anchor around whether the brand is in the answer text, not whether the brand ranks beneath it.
The 100-point scorecard is the production framework we run on every engagement. Ten archetypes by ten checks, scored quarterly, mapped to a remediation roadmap ranked by citation-probability lift. The single highest-leverage move in our sample is reliably the same — publish one original, citable number per quarter per archetype — but the audit's job is to surface the brand-specific sequence, not the universal one.
The compounding move is the cadence. A single audit is diagnostic; quarterly audits become a control system. Citation share is the scoreboard; the archetypes are the dimensions; the surfaces, schema, freshness, and depth are the levers. Brands that adopt the cadence early bank a measurable lead before the rest of the market re-points its measurement stack at the surface that now matters.