The Short Answer

Unlinked brand mentions correlate with AI citations roughly three times as strongly as backlinks do. The result is everywhere. The mechanism is almost nowhere. LLMs tokenize text and learn entity associations from sentence-level co-occurrence patterns. A hyperlink is invisible inside a tokenized sentence. An unlinked descriptor like "Anshul Rana, an AI SEO consultant in India" is the entire signal. Once you see the mechanism, the budget shift becomes obvious. Traditional SEO got the trust architecture half-right by proxying authority through links. AI search inherited the other half by reading the underlying conversation directly.

The Ahrefs December 2025 study of 75,000 brands found that unlinked brand mentions correlate with AI citation visibility at 0.664. Backlinks correlate at 0.218. That is a roughly 3x gap and it has become the most-cited statistic in off-page strategy right now.

Almost nobody has explained why.

The takes you see on LinkedIn assert the result ("mentions are the new links") and pivot straight to tactics. That skips the only question that matters: what is the actual mechanism? If you do not understand the mechanism, the tactics you ship will be approximations of the right thing, and you will get approximate results. The mechanism is at the tokenization layer, it is straightforward to explain, and once you see it you cannot unsee the budget mistake most SEO programs are still making.

The data, briefly

Before the mechanism, the evidence that the mechanism explains. Four numbers from the recent off-page research, none of which the industry has fully metabolised.

0.664: Spearman correlation between unlinked brand web mentions and AI citation appearance across 75,000 brands (Ahrefs, December 2025)
0.218: Correlation between backlinks and AI citation appearance in the same dataset (Ahrefs, December 2025)
82%: Share of AI-cited links coming from earned media across 1M+ analyzed citations (Muck Rack, 2026)
80%: Brands affected by the Mention-Source Divide: cited as source, recommended as a different brand (SEMrush, 2026)

The single most-shared takeaway is the 3:1 mention-to-link advantage. The most under-discussed is the Mention-Source Divide. Brands are getting their content extracted and used in AI answers but the recommendation in the same answer goes to a competitor. That gap is not a content quality problem. It is a mention density problem. And the mechanism underneath explains why.
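
The first two figures above are Spearman correlations, which compare rankings rather than raw values. For readers who want to see what that measures, here is a minimal sketch in plain Python. The brand numbers are hypothetical toy data, not the Ahrefs dataset, and the function names are my own.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical toy data for six brands (illustration only).
mentions = [5, 40, 12, 300, 75, 9]
backlinks = [900, 20, 450, 60, 30, 700]
ai_citations = [1, 6, 2, 30, 11, 3]

print(round(spearman(mentions, ai_citations), 3))
print(round(spearman(backlinks, ai_citations), 3))
```

Because only the ordering matters, a brand with 300 mentions and a brand with 3,000 contribute the same thing if both sit at the top of the ranking. A high value means the two rankings line up, nothing more.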

How LLMs actually process text

Language models process text by breaking it into tokens, the subword units that get converted into the numerical embeddings the model trains on. The training process learns relationships between tokens based on how often they appear near each other across trillions of examples. Two things follow from this that are non-obvious for anyone whose mental model of search is still anchored in the link graph era.

First, the structure that survives tokenization is linguistic, not architectural. Plain words and the order they appear in are signal. HTML tags, link targets, JSON-LD schema blocks, structured metadata, and anchor attributes are reduced to text-as-text or stripped entirely during preprocessing. A controlled February 2026 experiment by Mark Williams-Cook confirmed this for the schema case: both ChatGPT and Perplexity extracted information from fabricated, invalid JSON-LD blocks. The model was not parsing the schema as structured data. It was reading the script tag as plain text.

Second, the same logic applies to hyperlinks. A backlink is HTML markup pointing one URL at another. It carries zero meaning inside the tokenized sentence stream the model actually trains on. The anchor text contributes a word or two if you are lucky. The link target contributes nothing. Whatever authority signal Google's classical algorithm derived from the existence of that hyperlink is invisible to the LLM.

A hyperlink to your site is invisible to a tokenizer. An unlinked sentence about your brand is the entire signal. This is not a metaphor. It is the literal mechanics of how language models learn. Every off-page tactic that does not account for this is optimising for a layer the model never sees.

Why an unlinked descriptor is the entire signal

Consider the sentence: "Anshul Rana, an AI SEO consultant in India, has helped over a thousand sites optimize for AI search." Every word in that sentence trains an entity embedding. The model now knows my name is associated with "AI SEO consultant," with "India," with "AI search," and with a competence claim. Four signals from one sentence. None require a hyperlink. The same sentence on a hundred third-party domains builds a dense, retrievable entity that an LLM can confidently surface when asked about AI SEO consultants in India.

Now compare that to a backlink to anshulrana.in with the anchor text "AI SEO consultant" buried in a generic paragraph that does not describe me by name. The link looks valuable in a backlink audit. The training-data view of that placement is close to zero. The model sees "AI SEO consultant" as a generic phrase appearing near unrelated text. There is no entity binding, no descriptor density, no resolvable association. The backlink earned a slot in your Ahrefs report. It did not earn a slot in the model's weights.

This is why the 3:1 correlation gap exists in the data. Not because hyperlinks are broken, but because the signal they were proxying for is now being read directly by the system, and the proxy is no longer required.

The four mechanisms that turn mentions into citations

The work LLMs do during training collapses into four mechanisms, each one independent of the link graph and each one explainable in a paragraph.

01. Tokenization: Subword Units, Plain Text

LLMs convert text into tokens before training. HTML attributes, anchor metadata, and link graph structure get stripped or treated as plain text. The sentence "Vercel hosts the fastest Next.js sites" is dense signal. The link markup <a href="vercel.com">Vercel</a> resolves to the word "Vercel" alone, with nothing extra carried by the markup itself. The link is invisible. The sentence around it is the entire input.

Example: A press release mentions your brand twice and links once. The model reads both mentions identically. The link does not amplify the second one.
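
The press release example can be sketched with Python's standard html.parser module. This is a simplified stand-in for text extraction, assuming a pipeline that keeps only visible text. Real preprocessing pipelines differ and are not public in detail. The second sentence of the sample release is hypothetical.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect text nodes only; tags and their attributes are discarded."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_markup(html):
    """Return the visible text of an HTML fragment with whitespace normalized."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())

# One linked mention and one unlinked mention (second sentence is hypothetical).
press_release = (
    '<p><a href="https://vercel.com">Vercel</a> hosts the fastest Next.js sites. '
    "Teams pick Vercel for preview deployments.</p>"
)

text = strip_markup(press_release)
print(text)
print(text.count("Vercel"))
```

After extraction, nothing in the text distinguishes the linked mention from the unlinked one. The href value is gone, and both mentions are the same word in a sentence.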

02. Co-occurrence: Statistical Association Learning

The model learns that two tokens appearing together frequently across the training corpus carry a strong probabilistic association. "Vercel" and "Next.js" appear together so often that asking "best Next.js hosting" surfaces Vercel almost automatically. A single backlink to vercel.com contributes nothing to that density. Two hundred sentences across two hundred third-party domains do. Co-occurrence is the math behind why mention density beats link density, and why concentrated PR placements lose to distributed editorial mentions.

Example: A brand that appears once in the Wall Street Journal and never elsewhere is invisible to an LLM. A brand that appears in 50 niche newsletters with descriptor sentences is retrievable.
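
To make co-occurrence concrete, here is a rough sketch that counts which words share a sentence with a brand name. It is a counting toy, not how a model is actually trained. The stopword list, the tokenizing regex, and two of the four sample sentences are my own assumptions.

```python
import re
from collections import Counter

# Lowercase words, keeping inner dots so "next.js" stays one token.
TOKEN = re.compile(r"[a-z0-9]+(?:\.[a-z0-9]+)*")
# Small hypothetical stopword list, just enough for this example.
STOPWORDS = {"the", "a", "an", "for", "on", "is", "to", "of", "and", "our", "we"}

def cooccurrence_counts(sentences, entity):
    """Count, per word, how many sentences contain both the word and the entity."""
    entity = entity.lower()
    counts = Counter()
    for sentence in sentences:
        tokens = set(TOKEN.findall(sentence.lower()))
        if entity in tokens:
            counts.update(tokens - {entity} - STOPWORDS)
    return counts

corpus = [
    "Vercel hosts the fastest Next.js sites.",
    "We moved our Next.js app to Vercel last spring.",   # hypothetical
    "Vercel is the gold standard for Next.js hosting.",
    "The conference keynote covered edge rendering.",    # hypothetical, no brand
]

counts = cooccurrence_counts(corpus, "Vercel")
print(counts.most_common(3))
```

Each additional third-party sentence that pairs the brand with the category term raises the count. A link added to any of these sentences changes nothing in the tally.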

03. Entity Disambiguation: Contextual Binding

Mentions resolve which entity the model means. "Anshul Rana, AI SEO consultant" is unambiguous. "Anshul" alone could be hundreds of people. The descriptors in the sentence pin the entity to a specific node in the model's representation space. Repeat that descriptor binding across 200 third-party sentences and you have a strong, retrievable entity that the model can surface confidently. Backlinks do not disambiguate. They point at a URL. The model is not looking at URLs during retrieval. It is looking at entities.

Example: "Anshul Rana from India" appearing repeatedly across podcast notes, expert roundups, and conference recaps binds the entity. A backlink from one high-DR site does not.

04. Sentiment Encoding: Affective Signal

"Vercel is the gold standard for Next.js hosting" and "Vercel pricing is opaque and hostile to small teams" both train associations, but with opposite polarity. The model encodes sentiment alongside association. This is why agencies talking about "share of voice" in AI search have a half-formed metric. Share of voice without sentiment is vanity. A negative-mention-heavy footprint is worse than a smaller, neutral-to-positive one. Reputation work is now upstream of off-page strategy, not adjacent to it.

Example: A G2 review thread full of complaints trains an association just as readily as a glowing case study does, only with the opposite polarity. The model surfaces both. Sentiment shapes how the brand is positioned in the AI answer.

The correlation data, formalised

The Ahrefs December 2025 study tested multiple brand properties against AI citation appearance across ChatGPT, Google AI Mode, and AI Overviews. The full ranking explains the mechanism cleanly: the metrics closest to "linguistic proximity in third-party text" correlate strongest. The metrics closest to "link graph artefacts" correlate weakest.

Off-Page Signal Correlation with AI Citation Visibility

| Signal Type | Mechanism | Correlation |
| --- | --- | --- |
| YouTube Mentions | Spoken co-occurrence | 0.737 |
| Unlinked Web Mentions | Text co-occurrence | 0.664 |
| Branded Anchor Text | Hybrid (text + link) | 0.527 |
| Brand Search Volume | Demand signal | 0.334 |
| Backlinks | Link graph artefact | 0.218 |

The pattern is consistent with the mechanism. YouTube mentions correlate highest because video transcripts produce some of the highest-density brand descriptor text in the training corpus. Unlinked web mentions come next because they are pure sentence-level signal. Branded anchor text scores higher than raw backlinks because it carries linguistic content (the anchor phrase) alongside the link, and the text portion contributes to tokenization. Raw backlinks rank lowest because they are the purest link graph artefact, the one with the least linguistic content surviving into the tokenized stream.
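
The "roughly 3:1" shorthand is simply the ratio of two coefficients from this ranking. A short sketch of the arithmetic, using the five correlation values reported in the study (the function name is my own):

```python
# Correlation values as reported in the Ahrefs December 2025 ranking.
correlations = {
    "YouTube Mentions": 0.737,
    "Unlinked Web Mentions": 0.664,
    "Branded Anchor Text": 0.527,
    "Brand Search Volume": 0.334,
    "Backlinks": 0.218,
}

def ratio_to_baseline(values, baseline_key):
    """Express every coefficient as a multiple of the baseline coefficient."""
    baseline = values[baseline_key]
    return {name: value / baseline for name, value in values.items()}

ratios = ratio_to_baseline(correlations, "Backlinks")
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<24}{correlations[name]:.3f}  {ratio:.2f}x")
```

Unlinked web mentions come out at about 3.05 times the backlink coefficient, and YouTube mentions at about 3.38 times.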

Why traditional SEO got the trust architecture half-right

Google's PageRank algorithm proposed in 1998 that authority is a property of how a document is talked about across the network, not what the document says about itself. That insight was correct and remains correct. The implementation used hyperlinks as the proxy for "being talked about," and for two decades the proxy worked because the web's text and link structures were tightly correlated. Pages that were widely linked were also widely discussed in the prose around those links.

The mechanism the SEO industry inherited from that was "build links." The mechanism Google was actually rewarding was distributed editorial discussion. Links were the trail. The conversation was the signal. The conflation worked well enough that nobody had to disentangle it for two decades.

AI search inherited the underlying signal directly. LLMs train on the conversation itself, every sentence, every clause, every co-occurrence across the web text corpus. The trail (the hyperlink) is no longer needed because the model has the underlying discussion in its training data. This is why mention density outperforms link density in citation correlation studies. It is not that links stopped working. It is that links were always a proxy for something else, and AI search reads the something else directly.

PageRank proposed that authority is what others say about you, not what you say about yourself. Links were the implementation. Sentences were the signal. AI search collapsed the difference by reading the sentences directly.

The off-page audit framework

Stop auditing off-page work by backlink count. Start auditing by mention density, descriptor quality, and sentiment polarity. Three questions for every brand-tracking workflow.

01. How many third-party sentences mention your brand with a descriptor?

A descriptor sentence is one that names your brand and binds it to a category, location, or capability. "Vercel, the deployment platform for Next.js." "Anshul Rana, an AI SEO consultant in India." Count these across your last 12 months of earned coverage. If the number is under 100, you have a co-occurrence density problem before you have anything else. This is the metric the link graph never measured.
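
A first-pass count can be automated. The sketch below is a deliberately narrow heuristic: it only catches the appositive pattern "Brand, a/an/the ...", so it will miss forms like "Brand is the ...". The function names and two of the sample sentences are hypothetical. Treat the output as a floor, not a final number.

```python
import re

def is_descriptor_sentence(sentence, brand):
    """True if the brand is directly followed by an appositive such as ', the ...'."""
    pattern = re.compile(
        r"\b" + re.escape(brand) + r",\s+(?:a|an|the)\s+\w+",
        re.IGNORECASE,
    )
    return bool(pattern.search(sentence))

def count_descriptor_sentences(sentences, brand):
    """Count sentences that match the appositive descriptor heuristic."""
    return sum(is_descriptor_sentence(s, brand) for s in sentences)

coverage = [
    "Vercel, the deployment platform for Next.js.",
    "Anshul Rana, an AI SEO consultant in India.",
    "We deployed the demo on Vercel.",                                # hypothetical, no descriptor
    "Reviewers compared Vercel, a hosting platform, with two rivals.",  # hypothetical
]

vercel_count = count_descriptor_sentences(coverage, "Vercel")
print(vercel_count)
print(vercel_count < 100)  # the under-100 density flag described above
```

Run it over a 12-month export from whatever mention-tracking tool you use, then spot-check the misses by hand.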

02. How diverse is the source footprint?

Two hundred descriptor sentences across two hundred domains is dramatically stronger than two hundred sentences across ten domains. Co-occurrence learning rewards distribution. Map your top 20 mention sources and check the spread. If 80 percent of your descriptor density sits on five domains, you are a concentrated entity that the model will surface narrowly. Broaden the footprint into niche publications, podcasts, expert roundups, community platforms, and conference recaps.
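
The spread check reduces to one number: the share of mentions sitting on your top few domains. A minimal sketch, assuming you have a list of URLs where descriptor sentences appear. The URLs and function names here are hypothetical, and the toy list is small, so the example uses top_n=1 instead of five.

```python
from collections import Counter
from urllib.parse import urlparse

def domain_of(url):
    """Return the lowercase host of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def top_domain_share(mention_urls, top_n=5):
    """Fraction of all mentions that sit on the top_n most frequent domains."""
    domains = Counter(domain_of(url) for url in mention_urls)
    total = sum(domains.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in domains.most_common(top_n))
    return top / total

# Hypothetical footprint: eight mentions on one site, two spread elsewhere.
urls = [f"https://www.bigsite.example/post-{i}" for i in range(8)] + [
    "https://newsletter-one.example/issue-12",
    "https://podcast-two.example/episode-4",
]

share = top_domain_share(urls, top_n=1)
print(share)
print(share >= 0.80)  # concentration flag at the 80 percent level
```

With real data, keep the default top_n=5 to match the 80-percent-on-five-domains check described above.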

03. What is the sentiment polarity of those mentions?

Run sentiment analysis across your tracked mentions. Calculate the net polarity. A footprint that is 40 percent positive, 40 percent neutral, 20 percent negative trains a different entity association than one that is 70 percent positive. If negative mentions are concentrated in high-density domains (review platforms, forums, comment threads), they will dominate the model's representation of your brand. Reputation work is now part of off-page work. Treat them as one budget line.
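
Once each mention carries a label, net polarity is simple arithmetic. The sketch below assumes the labels already exist, from whichever sentiment tool you trust, and defines net polarity as positive share minus negative share. That definition is my working assumption, not an industry standard. The 70/20/10 split in the second footprint is hypothetical.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")

def polarity_breakdown(labels):
    """Return the share of each label plus net polarity (positive minus negative)."""
    if not labels:
        raise ValueError("need at least one labelled mention")
    counts = Counter(labels)
    total = len(labels)
    shares = {label: counts[label] / total for label in LABELS}
    shares["net"] = shares["positive"] - shares["negative"]
    return shares

# The 40/40/20 footprint described above.
mixed = polarity_breakdown(["positive"] * 40 + ["neutral"] * 40 + ["negative"] * 20)
# A 70 percent positive footprint; the 20/10 remainder is a hypothetical split.
strong = polarity_breakdown(["positive"] * 70 + ["neutral"] * 20 + ["negative"] * 10)

print(round(mixed["net"], 2), round(strong["net"], 2))
```

To see whether negative mentions cluster on review platforms or forums, run the same breakdown per domain.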

The pages that win citations in AI search earned the entity binding through sentence-level signal across many sources. The brands that win recommendations earned the same plus a positive sentiment footprint. The brands that lose, even when their content gets used as source, lost the mention density race upstream.

What this changes for off-page strategy

Three practical reframings that follow directly from the mechanism.

Stop optimising for the link, start optimising for the sentence around the link. The hyperlink is incidental. The descriptor sentence is the signal. A backlink from a top-tier publication in a sentence that reads "Anshul Rana, an AI SEO consultant from India who specialises in entity optimisation" is dramatically more valuable than ten generic anchor-text backlinks placed inside paragraphs that do not name the brand or its category. Brief PR teams to write the sentence, not just earn the link. Treat descriptor copy as the deliverable. The link is the receipt.

Co-occurrence is a distributed game. One sentence in one publication does not build entity association. Two hundred sentences across two hundred third-party domains do. The shape of the program shifts: fewer high-effort campaigns aimed at tier-one placements, more distributed mention work across podcast appearances, expert quotes in roundups, niche newsletters, Reddit threads, conference recap posts, and industry roundtables. The brands winning AI citations in 2026 look more like distributed entities than concentrated press releases.

Sentiment is a budget line. If 60 percent of your brand mentions read neutrally and 20 percent read negatively, you have a sentiment problem upstream of your link problem. AI engines encode that polarity. Reputation work belongs in the off-page strategy doc now, not in a separate PR or customer-success deck. Audit review platforms, address negative concentration, fund response and remediation across the long tail of comment threads.

The unified signal stack

Here is the part the industry split keeps getting wrong. Traditional SEO and AI search are not separate disciplines with separate signal stacks. They share a stack. The classical SEO industry built one set of tools (link graphs, anchor text analysis, domain authority scores) to approximate a deeper signal it could not directly observe. AI search observes the deeper signal directly because LLMs read the prose, not just the citations of the prose.

The implication is that AEO, GEO, and traditional SEO converge on the same set of recommendations: build distributed editorial mention density, write extractable content with clean entity binding, earn brand-defining descriptor sentences across third-party sources, and treat schema as a signal to Google's pipeline rather than as a direct LLM parser input. The display surfaces differ. The underlying work is the same.

The industry split between "SEO" and "AI SEO" is a fiction created by people selling separate retainers. The signal stack is one stack. Your strategy should be too. Mentions are not the new links. They were always the underlying signal. Links were just the proxy that worked until a system came along that could read the prose directly.

Frequently Asked Questions

Do backlinks still matter for AI search?
Yes, for classic search ranking and referral traffic. No, as a direct input into LLM training or AI citation logic. The Ahrefs December 2025 study of 75,000 brands found backlinks correlated with AI citation appearance at 0.218, while unlinked brand mentions correlated at 0.664. Backlinks remain useful, but the marginal off-page dollar buys roughly 3x more AI visibility when spent on earned editorial mentions than on link placements.

How do LLMs learn which brands to associate with a topic?
LLMs convert text into tokens during training, then learn statistical associations between tokens based on how often they co-occur across trillions of examples. A brand name appearing near descriptor words across many sources builds a dense entity embedding. Hyperlink markup is stripped or treated as plain text during preprocessing, so it carries no special weight inside the tokenized stream. The sentence around a mention is the signal. The link itself is not.

Is a linked mention worth more than an unlinked one?
From the LLM training perspective, they are functionally equivalent. The model reads the surrounding sentence, not the anchor tag. For Google's broader pipeline, a linked mention adds a small ranking and Knowledge Graph signal. For ChatGPT, Perplexity, and Gemini citation behavior, unlinked editorial mentions perform as well as linked ones, often better when they include rich descriptor context.

How many brand mentions does it take to earn AI citations?
There is no single threshold, but the Ahrefs and Muck Rack data points to density and diversity, not raw volume. Roughly 82 percent of AI-cited links come from earned media, and brands that appear in distributed editorial mentions across many third-party domains consistently outperform brands with concentrated mentions on a few large sites. Target a wide footprint of contextual mentions across podcasts, expert roundups, niche publications, and community platforms rather than a small number of high-effort placements.

Does the sentiment of a mention matter?
Yes. LLMs encode sentiment alongside association during training. A brand name that consistently appears in positive editorial contexts builds a different association than one mentioned with neutral or negative framing. This is why share of voice without sentiment analysis is a misleading metric, and why reputation work has moved from a separate PR concern into core off-page strategy in 2026.

Are traditional SEO and AI search optimization converging?
Yes. The signals that drove classical ranking, including entity authority and distributed editorial discussion, were always proxied through hyperlinks. AI engines read the underlying discussion directly through tokenization and co-occurrence learning. The result is one unified signal stack: clean entity binding, distributed mention density, brand-defining descriptor sentences, and extractable on-page content. AEO, GEO, and SEO converge on the same recommendations because they are reading the same underlying signal.

Related Reading on Brand Signals and AI Citations

For the practical tooling layer that surfaces these mentions and tracks the descriptor footprint, see Brand Mention Tools: An SEO Guide. For the playbook on earning the descriptor sentences in the first place, see How to Get Your Brand Mentioned by ChatGPT. For the broader format-and-intent framework these off-page signals feed into, see Search Intent Beats Word Count: The New SEO Reality. For the foundational discipline distinction, see SEO vs AEO vs GEO: What Is the Difference and Which One Do You Need?.

Sources and Further Reading

Primary Reference Links

  1. Ahrefs: How to Audit Brand Mentions for AI Search Visibility
  2. Soar: Backlinks vs Brand Mentions, the 2026 AI Visibility Playbook (Ahrefs Data Synthesis)
  3. Machine Relations: Brand Mentions vs Backlinks for AI Visibility, What the Data Shows
  4. RankScience: The Mention-Source Divide in AI Search
  5. Ziptie: FAQ Schema and LLM Tokenization Behaviour (Williams-Cook Experiment)
  6. Google Search Central: Structured Data Documentation
  7. LinkedIn: Anshul Rana on AI SEO and Entity Optimization

Working Together

If you want this audit framework run on your brand (mention density mapping, descriptor sentence inventory, source diversity scoring, and a 90-day mention earning roadmap), that is exactly the engagement I take on. You can reach me on Upwork, connect on LinkedIn, or visit The Digital Geek for agency-level engagements covering AEO audits, GEO strategy, and entity optimisation at scale.

Anshul Rana
SEO, AEO & GEO Specialist | Top Rated Plus on Upwork
I am an SEO, AEO, and GEO specialist with 8+ years of experience helping businesses get found on Google and AI search platforms like ChatGPT, Claude, Gemini, and Perplexity. I hold the Top Rated Plus badge on Upwork (top 3% of freelancers) with a 100% Job Success Score, and I have worked with 1,000+ websites across India, Australia, the US, and the UK. I specialize in technical SEO, answer engine optimization, generative engine optimization, schema markup, and local SEO.