How AI is Decoding the Ancient World
Artificial intelligence is acting as a digital Rosetta Stone, allowing researchers to read carbonized scrolls, shattered stone inscriptions, and eroded clay tablets that were previously thought lost to history.
By Factlen Editorial Team
- Digital Classicists: Believe AI is the key to unlocking massive archives of unread historical texts at an unprecedented scale.
- Traditional Historians: Emphasize that AI must remain an assistive tool guided by human expertise to prevent historical hallucinations.
- Heritage Advocates: Argue that AI models trained on human history must remain open-source and accessible to all.
What's not represented
- Indigenous communities whose ancestral languages are being digitized
- Museum curators managing the physical conservation of the artifacts
Why this matters
For centuries, our understanding of the ancient world has been limited to the small fraction of texts that survived intact. By using AI to read carbonized scrolls, shattered stones, and eroded clay tablets, researchers could double the known corpus of classical literature and substantially revise our picture of the ancient world.
Key points
- AI is enabling historians to read ancient texts that were previously considered permanently lost or unreadable.
- The Vesuvius Challenge uses machine learning and X-ray scans to virtually unwrap carbonized Roman scrolls.
- DeepMind's Ithaca and Aeneas models can predict missing text on shattered Greek and Latin inscriptions.
- New diffusion models are standardizing the highly variable characters of 3,000-year-old cuneiform tablets.
- Historians caution that AI must be used collaboratively to prevent the generation of fictitious historical data.
- The digital heritage community is pushing to keep these powerful AI models open-source and freely accessible.
For centuries, the ancient world has been defined as much by its silences as by its surviving voices. Millions of words written on papyrus, carved into marble, or pressed into wet clay have been lost to fires, volcanic eruptions, and the slow grind of time. Even when artifacts survive, they are often unreadable. Stone inscriptions are shattered into fragmented puzzles, and clay tablets are worn down by millennia of erosion. Papyrus scrolls, carbonized by volcanic ash, turn to dust at the slightest human touch. For generations, historians have accepted that the vast majority of these texts would remain forever locked away, their stories and secrets permanently erased from the human record.[1]
That assumption is now breaking down. In recent years, artificial intelligence has begun to act as a kind of digital Rosetta Stone, changing how historians work with antiquity. By combining high-resolution three-dimensional imaging, advanced computer vision, and generative machine learning, researchers are peering inside sealed scrolls and predicting the missing words on shattered monuments. This is not merely a technological novelty; it represents a real shift in the humanities, transforming epigraphy and archaeology from disciplines defined by scarcity into fields suddenly grappling with an influx of new data.[1]
The most dramatic proving ground for this technology lies in the ashes of Mount Vesuvius. When the volcano erupted in AD 79, it buried the Roman town of Herculaneum, including a grand villa containing an extensive library of papyrus scrolls. Rediscovered in the eighteenth century, the roughly 800 surviving scrolls were completely carbonized by the intense heat of the pyroclastic flow. Early attempts to unroll them physically were disastrous, destroying the fragile artifacts and leaving the surviving cache strictly off-limits to further physical manipulation.[2]
The breakthrough came via the Vesuvius Challenge, a global competition that crowdsourced the problem of "virtual unwrapping." Researchers first used X-ray computed tomography to create massive, terabyte-sized three-dimensional scans of the rolled papyrus. Because the ancient Romans used carbon-based ink, the writing was virtually indistinguishable from the carbonized papyrus in standard X-rays. However, machine learning algorithms were trained to detect microscopic textural changes on the papyrus surface left by the ink, successfully revealing hidden Greek letters without ever exposing the scroll to the open air.[2]
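The idea of learning to tell inked papyrus from bare papyrus by texture can be illustrated with a deliberately tiny sketch. It is not the Vesuvius Challenge pipeline, which works on large three-dimensional scans with far more capable models; the "roughness" feature, the nearest-centroid classifier, and all the numbers below are made up for illustration.

```python
from statistics import mean

def roughness(patch):
    """Mean absolute difference between horizontally adjacent values in a patch."""
    return mean(abs(a - b) for row in patch for a, b in zip(row, row[1:]))

def fit_centroids(patches, labels):
    """'Train' by averaging the roughness feature separately for each label."""
    return {
        label: mean(roughness(p) for p, l in zip(patches, labels) if l == label)
        for label in set(labels)
    }

def predict(patch, centroids):
    """Assign the label whose average roughness is closest to this patch's."""
    r = roughness(patch)
    return min(centroids, key=lambda label: abs(centroids[label] - r))

# Synthetic 2x3 "scan" patches (invented density values, not real CT data).
train_patches = [
    [[10, 10, 11], [10, 11, 10]],  # bare papyrus: smooth
    [[12, 12, 12], [11, 12, 12]],  # bare papyrus: smooth
    [[10, 14, 9], [15, 9, 14]],    # inked: rougher texture
    [[9, 13, 10], [14, 10, 13]],   # inked: rougher texture
]
train_labels = ["bare", "bare", "ink", "ink"]

centroids = fit_centroids(train_patches, train_labels)
print(predict([[10, 11, 10], [11, 10, 11]], centroids))  # a smooth patch
print(predict([[8, 13, 9], [13, 8, 12]], centroids))     # a rough patch
```

The point of the toy is only that a labeled set of examples lets a program separate two classes that look identical in overall brightness but differ in fine texture.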

The project has rapidly accelerated from reading isolated characters to uncovering substantial passages of lost literature. In May 2025, the Vesuvius Challenge awarded a $60,000 prize to researchers who successfully identified the title of a still-sealed scroll—a text on vices by the Epicurean philosopher Philodemus. The community is now automating the segmentation process, which involves identifying and separating the individual layers of papyrus inside the three-dimensional scan. Because some scrolls would be more than a dozen meters long if unrolled, manual segmentation was previously a massive bottleneck.[2]
By deploying advanced autosegmentation algorithms, the project aims to scale the technology from reading isolated paragraphs to decoding entire scrolls. The implications of this technological leap are staggering for classicists. If the remaining Herculaneum scrolls can be read in their entirety, the known corpus of literature from classical antiquity could double. Furthermore, it opens the door to scanning and reading other carbonized artifacts that may still be buried in the unexcavated portions of the Herculaneum villa, sparking hopes of finding lost works by foundational authors.[2]
While the Vesuvius Challenge tackles fragile papyrus, other AI models are solving the problem of shattered stone. The field of epigraphy—the study of ancient inscriptions—relies on texts that have often been broken, weathered, or moved from their original historical contexts. To address this, researchers at Google DeepMind, in collaboration with several universities, developed a deep neural network named Ithaca. Trained on a dataset of over 63,000 ancient Greek inscriptions, Ithaca learned to recognize the complex patterns, syntax, and standardized formulas of classical antiquity.[5]
When presented with a damaged inscription, Ithaca generates a ranked list of the most probable restorations for the missing characters. Working alone, the AI achieves a 62 percent accuracy rate in restoring damaged texts. However, the system was designed for collaboration, not replacement. When historians use Ithaca as an assistive tool, their accuracy rises from 25 percent to 72 percent. Furthermore, the model can date texts to within 30 years of their creation and identify their original geographic location with 71 percent accuracy, helping to resolve long-standing debates over the origins of specific artifacts.[5]
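To make "a ranked list of probable missing characters" concrete, here is a minimal sketch that ranks candidates for a single missing letter using character-pair counts. It is not Ithaca, which is a deep neural network trained on tens of thousands of Greek inscriptions; the three-line English corpus, the `?` gap marker, and the function names are stand-ins invented for this example.

```python
from collections import Counter

# Hypothetical stand-in corpus; a real system learns from thousands of inscriptions.
CORPUS = [
    "the council and the people decided",
    "the people honoured the council",
    "the council and the people honoured",
]
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def train_bigrams(texts):
    """Count every adjacent character pair, with ^ and $ marking text boundaries."""
    counts = Counter()
    for text in texts:
        padded = f"^{text}$"
        counts.update(zip(padded, padded[1:]))
    return counts

def rank_restorations(damaged, counts, top_k=3):
    """Rank letters for the single gap marked '?' by how well they fit both neighbors."""
    i = damaged.index("?")
    padded = f"^{damaged}$"
    left, right = padded[i], padded[i + 2]  # neighbors of the gap in the padded string
    # Add-one smoothing so unseen pairs get a small, nonzero score.
    scores = {
        ch: (counts[(left, ch)] + 1) * (counts[(ch, right)] + 1)
        for ch in ALPHABET
    }
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(ch, score / total) for ch, score in ranked[:top_k]]

counts = train_bigrams(CORPUS)
for letter, share in rank_restorations("the pe?ple decided", counts):
    print(letter, round(share, 3))
```

The output is a short list of candidates with relative scores rather than a single answer, which is the shape of result a historian can accept, reject, or weigh against the physical evidence.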

Building on the foundational success of Ithaca, DeepMind expanded its historical AI portfolio by introducing a new generative AI model named Aeneas in July 2025. Designed specifically for Latin inscriptions, Aeneas represents a significant leap in capability because it can process multimodal inputs, analyzing both text and images simultaneously. The model searches for syntactical parallels across thousands of historical records to restore gaps in the text, setting a new state-of-the-art benchmark in the field of digital epigraphy.[3]
While initially trained for Latin, Aeneas was built with a flexible architecture that can be adapted to other ancient languages, scripts, and media, ranging from fragile papyri to ancient coinage. This multimodal approach allows historians to cross-reference visual damage on an artifact with linguistic probabilities, creating a highly robust system for contextualizing ancient Roman history. It bridges the gap between the physical reality of the damaged artifact and the theoretical reconstruction of its original text, offering a more holistic view of the past.[3]
Moving further back in time, artificial intelligence is also untangling the profound complexities of ancient Mesopotamia. Cuneiform, one of humanity's oldest writing systems, consists of wedge-shaped marks pressed into wet clay. The script uses more than 1,000 unique characters whose appearances vary wildly depending on the era, the region, and the individual scribe's handwriting. This immense variability has created a massive academic bottleneck, leaving vast amounts of historical data inaccessible to modern scholars who simply do not have the time to manually decipher every variation.[4][8]
An estimated 500,000 cuneiform tablets currently sit in museum archives worldwide, but only a tiny fraction have ever been translated due to the sheer labor and specialized knowledge required. In 2025, researchers from Cornell University and Tel Aviv University introduced "ProtoSnap," a system utilizing diffusion models to automate the deciphering process. The AI computationally overlays photographs of tablets with known character prototypes, aligning them until they digitally snap into place, effectively standardizing the highly variable script for rapid translation.[4][8]
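The "snap into place" idea can be pictured with a much simpler technique: sliding a known character shape over an image and keeping the position where it overlaps best. This brute-force template matching is an analogy only, not the ProtoSnap method, which relies on diffusion models; the grids and function names below are invented for illustration.

```python
def overlap(image, template, dy, dx):
    """Count template stroke pixels that land on image stroke pixels at this offset."""
    return sum(
        1
        for y, row in enumerate(template)
        for x, value in enumerate(row)
        if value and image[y + dy][x + dx]
    )

def snap(image, template):
    """Try every offset and return (dy, dx, score) for the best-overlapping one."""
    rows, cols = len(image), len(image[0])
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for dy in range(rows - t_rows + 1):
        for dx in range(cols - t_cols + 1):
            score = overlap(image, template, dy, dx)
            if best is None or score > best[2]:
                best = (dy, dx, score)
    return best

# A made-up 3x3 prototype of a wedge-like sign (1 = stroke, 0 = background).
prototype = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]

# A made-up 6x7 "photograph" containing the same sign two rows down, three columns across.
tablet = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

print(snap(tablet, prototype))
```

Real tablets are harder because the same sign is stretched, rotated, and partly worn away, so a rigid template rarely fits; that variability is the gap the learned approach described above is meant to close.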

The application of computer vision to ancient languages is expanding across multiple civilizations. At the SIGGRAPH 2025 conference, a multidisciplinary team showcased a system that automatically interprets Ancient Egyptian hieroglyphs. By combining natural language processing with advanced image recognition, the system provides researchers and educators with instant transliterations and translations of digitized artifacts. This dramatically lowers the barrier to entry for studying Egyptian texts, allowing non-specialists and students to engage directly with primary sources without needing decades of specialized linguistic training.[6]
Beyond the ancient world, artificial intelligence is also cracking more recent historical secrets. The "Decrypt" project at the University of Oslo utilizes machine learning to transcribe and decipher centuries-old encrypted letters, such as those written by Mary, Queen of Scots. By training models to recognize historical cipher systems, researchers are uncovering intelligence reports, conspiracies, and diplomatic correspondence that have remained unread for hundreds of years. This proves that AI's utility in historical research spans the entirety of recorded human history, from ancient clay to Renaissance espionage.[7]

Despite these monumental leaps, traditional historians emphasize the absolute necessity of human oversight in the digital humanities. Large language models are inherently prone to hallucination, and in the context of ancient history, an AI might confidently generate a plausible but entirely fictitious historical claim. Active participation from native linguists, epigraphers, and domain experts remains essential to ensure cultural fidelity and prevent the distortion of the historical record by overzealous algorithms. The technology is a powerful assistant, but it cannot replace the nuanced contextual understanding of a trained historian.[1]
To prevent these powerful tools from being locked behind corporate paywalls, a strong open-source ethos has taken root in the digital heritage community. The code for both the Vesuvius Challenge and DeepMind's models has been made freely available to the public, allowing independent researchers to build upon the foundational work. By democratizing access to these algorithms, the community ensures that the resurrection of the ancient world remains a collaborative, global endeavor, promising to rewrite our understanding of human history for generations to come.[2][3]
How we got here
AD 79
Mount Vesuvius erupts, burying the library of Herculaneum and carbonizing its scrolls.
1750
The Herculaneum scrolls are rediscovered by farmworkers digging a well.
2022
DeepMind introduces Ithaca, an AI model capable of restoring and dating ancient Greek inscriptions.
2023
The Vesuvius Challenge is launched to crowdsource the virtual unwrapping of the carbonized scrolls.
2024
The Vesuvius Challenge Grand Prize is awarded after researchers successfully read multiple passages from a scroll.
May 2025
Researchers win a $60,000 prize for using AI to identify the title of a sealed Herculaneum scroll.
July 2025
DeepMind introduces Aeneas, a multimodal AI model for contextualizing Latin inscriptions.
Viewpoints in depth
Digital Classicists
Believe AI is the key to unlocking massive archives of unread historical texts at an unprecedented scale.
For researchers focused on the digital humanities, the primary obstacle in ancient history is no longer discovery, but processing. With hundreds of thousands of cuneiform tablets and thousands of shattered inscriptions sitting in museum basements, human scholars simply do not have the lifespans required to translate them all. This camp views AI as a necessary force multiplier. By automating the tedious work of character recognition, segmentation, and syntax prediction, AI frees historians to focus on high-level analysis and synthesis, potentially doubling the known corpus of classical literature within a single generation.
Traditional Historians
Emphasize that AI must remain an assistive tool guided by human expertise to prevent historical hallucinations.
Traditional epigraphers and historians acknowledge the utility of AI but warn against treating its outputs as absolute truth. Because large language models are designed to predict the most statistically probable next word, they are inherently prone to hallucination. In a historical context, an AI might confidently fill in a missing gap on a stone tablet with a plausible but entirely fictitious name or event. This camp argues that AI should only be used as a collaborative tool, generating hypotheses that must be rigorously vetted by human experts who understand the cultural, political, and physical context of the artifact.
Heritage Advocates
Argue that AI models trained on human history must remain open-source and accessible to all.
As AI becomes the primary lens through which we decode the past, cultural heritage advocates are deeply concerned about who controls that lens. If the models capable of reading ancient texts are locked behind the proprietary APIs of major tech companies, the heritage of humanity effectively becomes privatized. This camp champions the open-source ethos seen in projects like the Vesuvius Challenge, arguing that the algorithms, training data, and resulting translations must remain freely accessible to researchers, educators, and the global public to ensure that history belongs to everyone.
What we don't know
- Whether the remaining unread Herculaneum scrolls contain lost masterpieces from foundational classical authors.
- How accurately AI models can interpret the nuanced cultural idioms and slang of ancient languages.
- Whether the physical excavation of the remaining Herculaneum villa will be permitted to uncover more scrolls.
Key terms
- Epigraphy: The study and interpretation of ancient inscriptions on hard surfaces like stone or metal.
- Cuneiform: One of the oldest known writing systems, characterized by wedge-shaped marks on clay tablets, used in ancient Mesopotamia.
- Virtual Unwrapping: A technique using 3D X-ray computed tomography and software to digitally flatten and read a rolled or folded document without physically touching it.
- Diffusion Model: A type of generative artificial intelligence that learns to construct data by reversing a process of added digital noise, used here to standardize ancient characters.
- Autosegmentation: The automated process of identifying and separating the individual layers of a rolled or folded object within a 3D scan.
Frequently asked
What is the Vesuvius Challenge?
A machine learning competition designed to read carbonized scrolls from the ancient Roman city of Herculaneum without physically opening them.
How does AI read missing text on stone?
Models like DeepMind's Ithaca analyze the surviving letters and use patterns from tens of thousands of other inscriptions to predict the most mathematically probable missing words.
Can AI translate cuneiform?
Yes, new diffusion models can identify and standardize the highly variable wedge-shaped characters found on 3,000-year-old clay tablets, drastically speeding up translation.
What is a hallucination in AI history?
When an AI model confidently generates a plausible but entirely fictitious historical claim or translation, highlighting the need for human oversight.
Sources
[1] Factlen Editorial Team (Heritage Advocates). Synthesis by Factlen editorial team.
[2] Vesuvius Challenge (Digital Classicists). Resurrect an ancient library from the ashes of a volcano.
[3] Google DeepMind (Digital Classicists). Google DeepMind's new AI model helps historians interpret ancient texts.
[4] Cornell University (Digital Classicists). AI Deciphers Ancient Tablets.
[5] Nature (Traditional Historians). Restoring and attributing ancient texts using deep neural networks.
[6] SIGGRAPH (Digital Classicists). Automatic Interpretation of Ancient Egyptian Texts for Education and Research.
[7] University of Oslo (Traditional Historians). AI assists researchers in decoding old secret letters.
[8] Discover Magazine (Heritage Advocates). AI Deciphers Ancient Tablets.