The working archive · since 1999
Forty essays, twenty-six years, one continuous thread.
From AltaVista keyword density to LLM citation behaviour. The full archive of published research, in reverse chronological order. The work itself is the point.
Generative retrieval & statement-level visibility.
The substrate shifted. We rebuild the framework from the ground up — claim-level scoring, LLM citation behaviour, applied measurement.
Statement-level visibility, or: why ranking a page no longer matters.
The unit of competition has shifted. LLM-driven retrieval doesn't surface URLs — it surfaces claims. We propose a measurement framework.
A taxonomy of LLM citation behaviour across 14 frontier models.
What gets cited, what gets paraphrased, what disappears. A controlled audit across GPT-4o, Claude, Gemini, and 11 others.
GEO is not SEO with prompts. A position paper.
The framing of GEO as 'prompt-optimized SEO' mistakes the surface for the substrate. Generative engine optimization is a distinct discipline.
Ranking ≠ retrieval ≠ generation. A decomposition.
Three operations, often conflated. We separate them with notation, examples, and applied measurement.
How LLMs read right-to-left: retrieval in Hebrew and Arabic.
Legacy crawlers tokenised RTL text differently from the way transformers embed it. The gap is now a visibility risk, and an opportunity.
Entity disambiguation, for humans who share a name.
A model that cannot tell you apart from a stranger will average you together. The fix is structural, not editorial.
Chunking is the new pagination.
How your document is split into retrieval chunks now determines whether its claims survive with their context intact.
Reproducibility as a ranking signal.
A claim that reappears across runs and paraphrases is treated as more trustworthy. Consistency is now optimisable.
Against the prompt as product.
Selling 'prompt packs' as a GEO strategy confuses the input you control with the system you don't.
Measurement, not advice.
The industry has no shortage of opinion about AI search. It has a severe shortage of controlled measurement. This archive picks a side.
The transformer years.
BERT, MUM, helpful-content. The four-year interregnum when search learned to read — and most practitioners did not notice the shift had happened.
Panda, Penguin, BERT: a field guide to twenty years of correction.
Each major update corrected a specific exploit. Read in sequence, they trace a single trajectory — toward meaning.
The link graph is not the trust graph anymore.
PageRank approximated trust with links because links were the only signal at scale. Models have other signals now.
Cannabis, YMYL, and the hardest vertical in search.
Regulated health-adjacent verticals are where every ranking system shows its real priorities. What works there generalises.
GA4 and the quiet end of the session.
The event model replaced the session model for a reason that matters more in the AI era than anyone expected.
The ten-year warm-up.
Pre-transformer search. Panda, Penguin, manual penalties, the link-graph era. Where the apprenticeship happened.
Notes from the first Hebrew SEO panel, SMX Israel 2012.
Jerusalem, January 2012. The only Hebrew-language session on the agenda — and what it got right about the decade ahead.
What optimising for AltaVista taught me about LLMs.
Before PageRank swallowed the index, ranking was about presence and proximity. Some of those instincts are suddenly useful again.