PageRank’s genius was, structurally, that it was a proxy. The system could not directly observe whether a webpage was trustworthy (there is no sensor for trustworthiness), so it measured the thing it could measure: who linked to whom, and how authoritative the linkers were. Links correlated with trust well enough, for long enough, to anchor an entire industry. Two decades of SEO craft accumulated on the assumption that the link graph was the trust graph, because for practical purposes the conflation kept working.[1]
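To make "who linked to whom, and how authoritative the linkers were" concrete, here is a minimal sketch of the link-counting idea as a power iteration. It is an illustration, not a production ranking system: the function name `pagerank`, the `links` dictionary shape, and the even redistribution of rank from pages with no outgoing links are all assumptions made for this example. It uses the normalised variant in which ranks sum to 1, and the damping value of 0.85 is the figure usually quoted from the 1998 paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` (hypothetical input shape) maps each page to the list of
    pages it links to. Returns a dict of page -> rank, summing to ~1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outlinks spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_graph = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
    for page, score in sorted(pagerank(toy_graph).items(), key=lambda kv: -kv[1]):
        print(f"{page}: {score:.3f}")
```

Note what the input contains: edges and nothing else. Nothing in `links` records what the linking page said about its target, which is the gap the rest of this essay is about.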
Approximations drift. A language model does not need the link proxy because it has access to a strictly richer signal set:
- Co-occurrence patterns. How often does claim X appear near entity Y across the corpus? The model learns the association directly, with no link required as the carrier.
- Citation context. When source A discusses source B, what does A say? Linking with skepticism, linking with endorsement, linking with summary — the link’s neighbourhood text carries far more signal than the link itself does, and the model reads the neighbourhood as well as the link.
- Entity consistency. Does the same claim get attributed to the same entity across many independent sources? The convergence is itself a trust signal — and one that PageRank’s edge-weighting could not directly represent.
- Repetition by trusted sources. A claim repeated across many sources the model already trusts becomes part of the model’s prior. No link in this chain is doing the work that PageRank’s link did; the statistical weight of independent repetition is.
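Two of the signals above, co-occurrence and entity consistency, can be sketched in a few lines. This is a toy illustration under loud assumptions: a language model learns these associations statistically during training rather than by counting substrings, and the names `cooccurrence_counts`, `sources_agreeing`, the sentence-level window, and the sample corpus are all hypothetical. The point is only that neither function ever looks at a hyperlink.

```python
import re
from collections import Counter

# Naive sentence splitter: break after ., ! or ? followed by whitespace.
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def cooccurrence_counts(corpus, target, trusted_entities):
    """Count sentences in which `target` appears alongside each trusted entity.

    `corpus` (hypothetical shape) maps a source name to its plain text.
    """
    counts = Counter()
    for text in corpus.values():
        for sentence in SENTENCE_BREAK.split(text):
            lowered = sentence.lower()
            if target.lower() not in lowered:
                continue
            for entity in trusted_entities:
                if entity.lower() in lowered:
                    counts[entity] += 1
    return counts


def sources_agreeing(corpus, entity, claim_phrase):
    """List the sources that attach `claim_phrase` to `entity` in one sentence."""
    return sorted(
        source
        for source, text in corpus.items()
        if any(
            entity.lower() in s.lower() and claim_phrase.lower() in s.lower()
            for s in SENTENCE_BREAK.split(text)
        )
    )


if __name__ == "__main__":
    # Invented sample text; none of it contains a link.
    corpus = {
        "site-a.example": "Acme Labs published the benchmark. Reuters cited Acme Labs in its coverage.",
        "site-b.example": "According to Reuters, Acme Labs published the benchmark last year.",
        "site-c.example": "The benchmark was discussed widely. Nobody named its author.",
    }
    print(cooccurrence_counts(corpus, "Acme Labs", ["Reuters", "Nature"]))
    print(sources_agreeing(corpus, "Acme Labs", "published the benchmark"))
```

The first function approximates "how often does X appear near Y"; the second approximates "do independent sources attribute the same claim to the same entity". Substring matching is a deliberate simplification here, and a real pipeline would need proper entity resolution before either count meant much.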
The link graph and the trust graph, in other words, were always different objects. They were close enough to conflate for the two decades when PageRank was the only large-scale trust signal available. They are diverging now, because the model has access to signals the link-counting era could not even represent.
What this changes operationally
Three consequences are observable in the audit data, and each inverts a piece of received link-building wisdom.
Unstructured mentions can outperform structured links. Across the visibility corpus, a brand mention beside a trusted entity, with no hyperlink at all, sometimes produced more citation lift than a dofollow link from a moderately authoritative domain. The model reads the neighbourhood, not the anchor tag. A paragraph in the New York Times that says “the SEO practitioner Gilad Sasson has argued for years…” delivers entity-graph signal that does not depend on the paragraph linking to my site. The link-equity-only mental model can’t see that signal at all.
Toxic-link cleanup yields diminishing returns. The Penguin-era discipline of disavowing manipulated links was a defensive necessity when link-counting was load-bearing; in a substrate where links are one signal among many, the returns to disavow work decay sharply. Most of the audits we run now find that the marginal hour spent on disavow work would have been better spent on entity-disambiguation infrastructure or on producing one well-sourced claim.[2]
Reputation work outperforms link building for entity-grade visibility. If the trust graph is built from co-occurrence with trusted entities, the operational target is to be discussed near trusted entities, not to be linked from them. Industry analysis pieces that cite you by name, podcast appearances on shows the model has indexed, conference talks on stages with trusted speakers — these accumulate entity-graph weight that link-building campaigns cannot.
What this preserves
Nothing in the above means link-building craft becomes worthless. Two things it preserves:
- Crawlability prerequisites. A page that cannot be fetched cannot be indexed, embedded, retrieved, or cited. The technical-SEO hygiene that link-era practitioners built is the prerequisite layer for everything downstream. It does not stop mattering; it stops being the ceiling.
- High-authority editorial citations. A genuine citation from a high-authority publication still moves the needle, partly because the link still helps at retrieval and partly because the editorial process that produces such a citation usually generates the surrounding-context tokens the trust graph actually reads. The pursuit of those citations was always the legitimate end of “link building”; it remains correct now, with the understanding that the linked-from page does most of the work, not the link itself.
The thing to internalise
PageRank’s proxy held for two decades because no better signal was available at scale. A better signal is now available at scale. The mental model that treats links as the unit of trust will systematically misallocate effort in the current substrate. The mental model that treats entity neighbourhoods as the unit of trust will pull the practice toward the work that actually compounds: being legible as a specific identity, being repeatedly discussed by sources the model trusts, being the named author of claims that are specific enough to be reproduced.
The link graph approximated the trust graph well enough to build an empire. It will not approximate the next one. Plan accordingly.
References
- Brin, S., & Page, L. (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks and ISDN Systems, 30(1–7), 107–117. — The PageRank paper; explicit that the link signal is a proxy for importance, not a direct measure of trust.
- Kleinberg, J. M. (1999). Authoritative Sources in a Hyperlinked Environment (HITS). Journal of the ACM, 46(5), 604–632. — The parallel link-authority algorithm of the era; useful historical anchor for the proxy thinking that link-era SEO inherited.
- Singhal, A. (2012). Introducing the Knowledge Graph: things, not strings. Google Official Blog, May 16, 2012. — The substrate's first explicit signal beyond the link graph; the entity-graph era begins here.
- Vrandečić, D., & Krötzsch, M. (2014). Wikidata: A Free Collaborative Knowledgebase. Communications of the ACM, 57(10), 78–85. — The structured-entity layer that LLMs are demonstrably trained against — one of the signals replacing the link proxy.
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL 2019. — Co-occurrence-as-signal at scale; the architectural source of the trust signals that no longer require links to travel.
- Sasson, G. (2026). Statement-level visibility, or: why ranking a page no longer matters. Algoholic, Vol. III, Essay 04. — The visibility framework where these signal-replacement effects are measured directly.
- Sasson, G. (2026). Entity disambiguation, for humans who share a name. Algoholic, Vol. III, Essay 06. — The structural infrastructure that thickens the entity neighbourhood when links can't do the job.
Footnotes
[1] Brin & Page in the 1998 paper are explicit that PageRank is a proxy for “the importance of web pages,” and import “the citation analysis of academic literature” as the analogue. The paper does not claim links measure trust; it claims links measure importance, and importance correlates with trust. The conflation happened in the industry, not in the original work.
[2] This is not a recommendation to stop disavow work entirely. It is a recommendation to budget it correctly. If you are spending more time on link cleanup than on entity infrastructure, the budget is upside-down for the current substrate.
