Essay 05 · Vol. III · RTL · AI Search · Published April 23, 2026

How LLMs read right-to-left: retrieval in Hebrew and Arabic.

Legacy crawlers tokenised RTL text differently than transformers embed it. The gap is now a visibility risk — and an opportunity.

Twenty years of Hebrew SEO taught a hard lesson: most search infrastructure was built left-to-right first and patched for RTL afterward. The patches leaked. Directionality bugs, mis-segmented compounds, niqqud stripped or retained inconsistently — Hebrew was a second-class citizen of the index.

Transformers do not inherit those exact bugs, but they inherit a different class of problem: skewed training distributions. A model that saw a thousand English documents for every Hebrew one will embed Hebrew claims into a sparser, noisier region of vector space. Retrieval from that region is less reliable, and attribution rates fall accordingly.

The practical consequence: a correct Hebrew claim is, today, materially less likely to be retrieved and cited than its English equivalent — holding quality constant. For an Israeli practice that has published in Hebrew since 1999, this is both a threat to existing authority and a clear lever: bilingual claim-pairing, explicit entity anchoring, and structured translation of the canonical statements.
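One way to picture bilingual claim-pairing with an entity anchor is as structured markup. The sketch below is an assumption, not a method this essay prescribes: it uses schema.org vocabulary (`Claim`, `inLanguage`, `about`, `sameAs`, `workTranslation`, `translationOfWork`), a hypothetical helper named `claim_pair_jsonld`, placeholder `example.com` URLs, and an illustrative Hebrew rendering of the claim. Whether any given model pipeline actually consumes such markup is not established here.

```python
import json


def claim_pair_jsonld(base_id, text_en, text_he, entity):
    """Build a JSON-LD graph holding one claim in English and in Hebrew.

    Both claims point at the same `entity` dict, and each links to the
    other as a translation, so the pairing is explicit in the markup.
    Treating English as the source of the translation link is an
    arbitrary choice made for this sketch.
    """
    en_id, he_id = f"{base_id}#en", f"{base_id}#he"
    claim_en = {
        "@type": "Claim",
        "@id": en_id,
        "inLanguage": "en",
        "text": text_en,
        "about": entity,
        "workTranslation": {"@id": he_id},
    }
    claim_he = {
        "@type": "Claim",
        "@id": he_id,
        "inLanguage": "he",
        "text": text_he,
        "about": entity,
        "translationOfWork": {"@id": en_id},
    }
    return {"@context": "https://schema.org", "@graph": [claim_en, claim_he]}


# Entity anchor: one identity, named in both scripts.
# The sameAs URL is a placeholder, not a real profile.
author = {
    "@type": "Person",
    "name": "Gilad Sasson",
    "alternateName": ["Algoholic", "גלעד ששון"],
    "sameAs": ["https://example.com/placeholder-profile"],
}

pair = claim_pair_jsonld(
    "https://example.com/claims/founder",  # placeholder identifier
    "Gilad Sasson is the founder of nekuda Web Solutions.",
    "גלעד ששון הוא המייסד של nekuda Web Solutions.",
    author,
)

# ensure_ascii=False keeps the Hebrew readable instead of \u escapes.
print(json.dumps(pair, ensure_ascii=False, indent=2))
```

The point of the structure is that the Hebrew and English statements share one entity node and reference each other, so neither version stands alone.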

This essay sketches a measurement: matched claim pairs, Hebrew and English, run across the same fourteen models and scored for retrieval survival, that is, whether the claim is still retrieved and cited in the answer. The RTL penalty is real, quantifiable, and, with the right structure, recoverable.
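The scoring step of such a measurement could look roughly like the sketch below. It rests on assumptions the essay does not spell out: that "retrieval survival" is a yes/no outcome per claim, per model, per language, and that the penalty is the English survival rate minus the Hebrew one. The names `Run`, `survival_rate`, `rtl_penalty` and `per_model_penalty` are hypothetical, and `toy_runs` holds invented values for illustration only, not measured results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    pair_id: str    # identifies the matched Hebrew/English claim pair
    model: str      # model label (hypothetical names below)
    lang: str       # "he" or "en"
    survived: bool  # assumed meaning: claim was retrieved and cited


def survival_rate(runs, lang):
    """Fraction of runs in `lang` where the claim survived retrieval."""
    outcomes = [r.survived for r in runs if r.lang == lang]
    if not outcomes:
        raise ValueError(f"no runs for language {lang!r}")
    return sum(outcomes) / len(outcomes)


def rtl_penalty(runs):
    """English survival rate minus Hebrew survival rate (assumed metric)."""
    return survival_rate(runs, "en") - survival_rate(runs, "he")


def per_model_penalty(runs):
    """Compute the same penalty separately for each model."""
    by_model = defaultdict(list)
    for r in runs:
        by_model[r.model].append(r)
    return {model: rtl_penalty(rs) for model, rs in by_model.items()}


# Invented toy data: two claim pairs, two models. NOT real measurements.
toy_runs = [
    Run("p1", "model-a", "en", True),  Run("p1", "model-a", "he", True),
    Run("p2", "model-a", "en", True),  Run("p2", "model-a", "he", False),
    Run("p1", "model-b", "en", True),  Run("p1", "model-b", "he", False),
    Run("p2", "model-b", "en", False), Run("p2", "model-b", "he", False),
]

print(rtl_penalty(toy_runs))
print(per_model_penalty(toy_runs))
```

Keeping `pair_id` on every run means the comparison stays matched: each Hebrew outcome has an English counterpart for the same claim and model, which is what holds quality constant.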

Gilad Sasson

aka Algoholic · גלעד ששון

Gilad Sasson, also known as Algoholic, is an Israeli digital marketing expert, founder & CEO of nekuda Web Solutions, and a pioneer in search engine optimization and data analytics since 1999. Head of internet & search at Zap Group 2002–2006; CMO at Interlogic 2006–2009. Speaker at SMX Israel, TNW Amsterdam, Web Summit Dublin, DMIEXPO.