How AI is Decoding the World's Lost Ancient Languages
Machine learning algorithms and 3D imaging are translating ancient cuneiform and unrolling carbonized scrolls, unlocking millennia of lost human history.
By Factlen Editorial Team
- Computational Archaeologists
- Focus on scaling up translation and using AI to process massive datasets of ancient texts.
- Classical Papyrologists
- Value the preservation of artifacts and the ability to read previously destroyed texts without physical intervention.
- Linguistic Purists
- Caution against AI hallucinations and emphasize that machine translation misses cultural context.
What's not represented
- Indigenous communities whose ancestral languages are being studied
- Museum curators managing the physical artifacts
Why this matters
Hundreds of thousands of ancient documents detailing the laws, philosophies, and daily lives of early human civilizations have sat unread for centuries due to a lack of human translators. By using artificial intelligence to decode these texts in seconds, researchers are unlocking a massive new window into human history that was previously thought lost to time.
Key points
- Museums hold over 500,000 untranslated cuneiform tablets, far exceeding the capacity of human experts.
- A new AI model uses Neural Machine Translation to convert Akkadian cuneiform into English in seconds, scoring about 37 on the BLEU4 translation-quality metric.
- The Vesuvius Challenge is using 3D X-rays and AI to read carbonized Roman scrolls without physically unrolling them.
- In 2025, researchers successfully used AI to discover the title of a lost philosophical text inside a sealed Herculaneum scroll.
- Algorithms are now being trained to map the mathematical structure of completely undeciphered scripts like Minoan Linear A.
The sheer volume of untranslated history is staggering. Museums and universities worldwide hold more than half a million clay tablets inscribed with cuneiform, the wedge-shaped writing system of ancient Mesopotamia. Yet, because the Akkadian language has not been spoken for 2,000 years, only a few hundred experts globally can read them.[3][4]
This creates a profound human bottleneck. The vast majority of these documents—detailing the political, economic, and daily lives of early civilizations—remain locked away, untranslated and inaccessible. But a new wave of artificial intelligence is beginning to clear the backlog, acting as a digital Rosetta Stone for antiquity and unlocking millennia of lost human history.[1][3][6]
The breakthrough relies on Neural Machine Translation (NMT), the same underlying technology that powers modern translation apps. A team of computer scientists and archaeologists from Tel Aviv University and Ariel University recently trained an AI model on a massive dataset of digitized cuneiform texts from the Open Richly Annotated Cuneiform Corpus.[1][4]
Translating Akkadian is notoriously difficult because its cuneiform script is polyvalent: a single sign can have several different readings depending on its function in a sentence. Traditionally, scholars first transliterate the cuneiform into Latin script based on phonetics, and then translate that transliteration into English.[4]
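To see why polyvalence makes the first step so laborious, consider a toy lookup. The sketch below is a hypothetical illustration, not the researchers' method: the sign table is a tiny, simplified sample of real sign values, and the name `candidate_readings` is invented for the example. It simply counts how many transliterations a three-sign sequence could have before context narrows them down.

```python
from itertools import product

# Simplified sample of sign values (an assumption for illustration);
# real sign lists contain hundreds of signs with many more readings.
SIGN_READINGS = {
    "AN": ["an", "DINGIR"],        # syllable "an" or the logogram for "god"
    "UD": ["ud", "tam", "UTU"],    # two syllables or the logogram for "sun"
    "KUR": ["kur", "mat", "šad"],  # three syllabic values
}

def candidate_readings(signs):
    """Return every possible transliteration of a sequence of sign names."""
    options = [SIGN_READINGS[sign] for sign in signs]
    return ["-".join(combo) for combo in product(*options)]

readings = candidate_readings(["AN", "UD", "KUR"])
print(len(readings))   # 2 * 3 * 3 combinations
print(readings[:3])
```

Even this three-sign sequence has 18 possible readings, which is why a human scholar, or a trained model, has to rely on context to pick the right one.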

The Israeli research team taught their AI to perform both tasks. The model learned to translate into English from both the Latin transliterations and directly from the Unicode representations of the cuneiform symbols themselves. The results were evaluated using the Bilingual Evaluation Understudy metric computed over word sequences of up to four words (BLEU4), a standard tool for measuring machine translation quality.[1][3]
The AI achieved a BLEU4 score of 37.47 when translating from transliterations, and 36.52 when translating directly from cuneiform. In the world of machine translation, a score around 37 is considered highly effective for an early-stage model, providing a translation that is entirely usable as a first-pass understanding of the text.[1][3]
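For readers curious about what a BLEU4 score measures, here is a minimal sketch, assuming a single reference translation and no smoothing. It is not the evaluation code used in the study, and published scores are normally computed over a whole test set rather than one sentence. The function names `ngrams` and `bleu4` and the example sentences are invented for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(candidate, reference):
    """Sentence-level BLEU-4 on a 0-100 scale (one reference, no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each n-gram's count by how often it occurs in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # The brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the four precisions, scaled to 0-100.
    return 100 * bp * math.exp(sum(log_precisions) / 4)

reference = "the king built a temple for the god of the city"
candidate = "the king built a temple for the city god"
print(round(bleu4(candidate, reference), 2))
```

A candidate identical to the reference scores 100, and one that shares no four-word sequence with it scores 0 in this unsmoothed version, so the score mainly rewards long runs of matching words.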
However, the system is not flawless. The AI occasionally produces "hallucinations"—sentences that are syntactically perfect in English but completely misrepresent the original Akkadian meaning. Because of this, researchers advocate for a "human-in-the-loop" approach, where the AI rapidly processes thousands of tablets to find relevant texts, which human experts then refine for cultural nuance.[1][4]

While NMT is translating readable tablets, a different branch of AI is tackling ancient texts that cannot even be opened. In 79 AD, the eruption of Mount Vesuvius buried the Roman town of Herculaneum in volcanic ash, carbonizing an entire library of papyrus scrolls.[2][5]
For centuries, these scrolls presented an impossible puzzle. The papyrus is so fragile that any physical attempt to unroll it causes the carbonized layers to crumble into dust. To solve this, a global initiative called the Vesuvius Challenge was launched, offering substantial cash prizes to anyone who could read the scrolls using non-invasive technology.[2][5]
The process begins at a particle accelerator, such as the Diamond Light Source in the UK, which blasts the scrolls with powerful X-rays to create microscopic 3D scans of their internal structure. The challenge then shifts to software engineers, who use AI autosegmentation algorithms to digitally peel apart the crumpled layers of the 3D scan.[2][5]
Because the ancient Romans used carbon-based ink, it does not show up brightly on an X-ray like metal-based ink would. Instead, machine learning models are trained to detect the microscopic textural changes on the papyrus where the ink was applied. In 2023, a 21-year-old engineering student named Luke Farritor became the first to successfully train an AI to spot these faint Greek letters, winning the challenge's $40,000 First Letters Prize. He later shared the $700,000 grand prize as part of a three-person team.[5]
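The idea of learning a texture cue from labeled examples can be shown with a deliberately tiny stand-in. The real ink-detection models are neural networks trained on 3D scan data. The sketch below is only an assumption-laden toy: it uses one hand-made "roughness" feature and a single learned threshold, and every number and function name in it is invented for illustration.

```python
def roughness(patch):
    """Mean absolute difference between horizontally adjacent values."""
    diffs = [abs(row[i + 1] - row[i]) for row in patch for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def train_threshold(ink_patches, bare_patches):
    """Place the decision threshold midway between the two class means."""
    ink_mean = sum(map(roughness, ink_patches)) / len(ink_patches)
    bare_mean = sum(map(roughness, bare_patches)) / len(bare_patches)
    # Also record which class turned out rougher in the training data.
    return (ink_mean + bare_mean) / 2, ink_mean > bare_mean

def looks_like_ink(patch, threshold, ink_is_rougher):
    """Classify a patch by which side of the threshold its roughness falls on."""
    return (roughness(patch) > threshold) == ink_is_rougher

# Invented scan intensities: small patches labeled by hand as bare or inked.
bare = [[[10, 11, 10], [11, 10, 11]], [[12, 12, 11], [11, 12, 12]]]
ink = [[[10, 14, 9], [15, 10, 14]], [[9, 13, 8], [14, 9, 13]]]

threshold, ink_is_rougher = train_threshold(ink, bare)
print(looks_like_ink([[10, 15, 10], [14, 9, 14]], threshold, ink_is_rougher))
print(looks_like_ink([[10, 10, 10], [10, 11, 10]], threshold, ink_is_rougher))
```

The point of the toy is the workflow rather than the feature: a person labels a few regions, the program learns what separates them, and the learned rule is then applied to regions nobody has labeled.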

The technology is advancing at a staggering pace. In early 2025, the Bodleian Libraries and the Vesuvius Challenge announced a historic milestone: researchers had successfully generated an image of the inside of a scroll and identified its title. The text, "On Vices," was written by the Greek philosopher Philodemus, offering new insights into ancient ethical guidance.[2]
Beyond translating known languages and unrolling fragile scrolls, AI is also being deployed against the ultimate linguistic puzzle: completely undeciphered scripts. Scripts like Minoan Linear A and that of the Indus Valley Civilization have baffled linguists for decades because there is no bilingual text to serve as a translation key.[7]
To crack these codes, researchers at MIT and Google have developed algorithms that analyze the structural sparsity and distributional similarity of the unknown symbols. By mapping how frequently certain characters appear together, the AI can compare the mathematical structure of the mystery language to the structures of known, related languages.[7]
This method has already proven successful on scripts we do understand. When tested on Linear B, a script recording an early form of Greek that was deciphered by humans in the 1950s, the algorithm was able to automatically match its words to their Greek equivalents. It achieved similar success with Ugaritic, a 3,000-year-old language written in a cuneiform alphabet.[7]
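Why would co-occurrence statistics help at all when the symbols themselves are unreadable? A simple sketch shows the intuition. This is not the MIT and Google algorithm, which is far more sophisticated. It is a minimal stand-in, assuming the unknown script is just a known language with every symbol swapped for another. The corpora, language names, and function names are all invented for illustration.

```python
from collections import Counter

def bigram_profile(text):
    """Sorted relative frequencies of adjacent symbol pairs, ignoring identity."""
    counts = Counter(zip(text, text[1:]))
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)

def profile_distance(p, q):
    """Sum of absolute differences, padding the shorter profile with zeros."""
    size = max(len(p), len(q))
    p = p + [0.0] * (size - len(p))
    q = q + [0.0] * (size - len(q))
    return sum(abs(a - b) for a, b in zip(p, q))

# Invented toy corpora standing in for two known languages.
known = {
    "language_a": "banana bandana banana",
    "language_b": "xyzzy plugh xyzzy quux",
}

# The "unknown script": language_a with each symbol replaced by a digit.
unknown = known["language_a"].translate(str.maketrans("ban d", "12345"))

unknown_profile = bigram_profile(unknown)
best = min(known, key=lambda name: profile_distance(unknown_profile,
                                                    bigram_profile(known[name])))
print(unknown)
print(best)
```

Because swapping symbols does not change how often pairs of symbols occur together, the disguised text keeps the statistical fingerprint of its source. Real undeciphered scripts are much harder, since the underlying language may differ from every known candidate.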

Applying this to isolated languages like the Indus Valley script remains a monumental challenge, as the AI must find a statistical match without knowing which language family the script belongs to. Yet, the sheer computational power of modern machine learning offers the first real hope of breaking these glyphs.[6][7]
The implications of these technologies extend far beyond academic curiosity. Language is the ultimate artifact of human consciousness. When a writing system is deciphered, an entire civilization speaks again, allowing us to discover ancient peoples through their own laws, poetry, and daily correspondence.[6][7]
As AI models grow more sophisticated and 3D imaging becomes more precise, the bottleneck of historical translation is rapidly disappearing. We are entering a new Renaissance of classical antiquity, one driven not by the discovery of new ruins, but by the silicon and code that can finally read the ruins we already have.[6]
How we got here
1950s
Architect Michael Ventris manually deciphers Linear B, an early form of Greek.
2020
Researchers publish initial papers on using AI to transliterate Akkadian cuneiform into Latin script.
2023
The Vesuvius Challenge is launched to decode carbonized scrolls using 3D scans and machine learning.
2023
An AI model successfully translates Akkadian cuneiform directly to English using Neural Machine Translation.
2025
Vesuvius Challenge researchers discover the title 'On Vices' inside a sealed Herculaneum scroll.
Viewpoints in depth
Computational Archaeologists
Focus on scaling up translation and using AI to process massive datasets of ancient texts.
This camp views the sheer volume of untranslated artifacts as a data problem that only machine learning can solve. By deploying Neural Machine Translation and autosegmentation algorithms, they argue we can clear a centuries-long backlog of historical documents in a matter of years. They prioritize speed, broad pattern recognition, and the digitization of global archives over perfect initial accuracy.
Classical Papyrologists
Value the preservation of artifacts and the ability to read previously destroyed texts without physical intervention.
For these scholars, the primary triumph of AI is conservation. They have spent decades watching fragile carbonized scrolls crumble into dust during physical unrolling attempts. The ability to use particle accelerators and AI ink-detection to read a text while leaving the physical artifact completely sealed is seen as a holy grail for preserving the fragile remnants of antiquity.
Linguistic Purists
Caution against AI hallucinations and emphasize that machine translation misses cultural context.
While acknowledging the utility of AI as a first-pass tool, this camp warns against relying too heavily on machine outputs. They point out that cuneiform signs are highly polyvalent and that ancient languages are deeply tied to cultural context that algorithms cannot comprehend. They advocate for a strict "human-in-the-loop" approach, ensuring that AI-generated translations are rigorously vetted by human experts to prevent historical inaccuracies.
What we don't know
- Whether AI can successfully decipher isolated languages like the Indus Valley script that have no known cognates.
- The full contents of the hundreds of Herculaneum scrolls that remain unread.
- How to completely eliminate 'hallucinations' in neural machine translation of dead languages.
Key terms
- Cuneiform
- A logo-syllabic writing system used in the ancient Middle East, characterized by wedge-shaped marks pressed into clay tablets.
- Neural Machine Translation (NMT)
- An approach to translation that uses an artificial neural network to predict the likelihood of a sequence of words, rather than translating word-for-word.
- Autosegmentation
- An AI technique used to digitally separate the microscopic, crumpled layers of a 3D-scanned object, like a rolled papyrus scroll.
- BLEU4 Score
- A metric from 0 to 100 used to evaluate the quality of machine-translated text by comparing it to high-quality human translations.
- Linear A
- An undeciphered writing system used by the ancient Minoan civilization of Crete, which currently has no known translation.
Frequently asked
Will AI replace human archaeologists and linguists?
No. Researchers emphasize a "human-in-the-loop" model where AI provides a rapid first-pass translation, which human experts then refine for cultural and historical nuance.
How does AI read a scroll that is still rolled up?
Scientists use a particle accelerator to create a high-resolution 3D X-ray of the scroll. AI algorithms then digitally separate the layers and detect the subtle textural traces left where carbon ink was applied.
Can AI translate a language if we have no bilingual text?
Yes, but it is difficult. AI looks for structural patterns and compares them to known cognate languages, a technique that has automatically reproduced the human decipherments of Ugaritic and Linear B.
Sources
[1] PNAS Nexus (Computational Archaeologists): Translating Akkadian to English with neural machine translation
[2] University of Oxford (Classical Papyrologists): Inside of Herculaneum scroll seen for the first time in almost 2,000 years
[3] The Times of Israel (Linguistic Purists): Groundbreaking AI project translates 5,000-year-old cuneiform at push of a button
[4] Big Think (Computational Archaeologists): AI translates 5,000-year-old cuneiform tablets instantly
[5] R&D World (Computational Archaeologists): How an engineer used AI to decode some of history's oldest sealed scrolls
[6] Factlen Editorial Team (Computational Archaeologists): Synthesis by Factlen editorial team
[7] Mind Matters (Linguistic Purists): Can AI help us decipher lost languages?