Factlen Explainer · Bioacoustics · Scientific Explainer · Jun 24, 2026, 9:24 PM · 5 min read

How AI is Decoding Animal Communication and Rewriting the Rules of Biology

Advanced machine learning models are deciphering the complex languages of whales, elephants, and birds, transforming our understanding of non-human intelligence and potentially reshaping environmental law.

By Factlen Editorial Team


Computational Ethologists 40% · Conservation Technologists 30% · Legal & Rights Advocates 30%

Computational Ethologists: Focus on using foundation models to discover shared structural axes of communication across all species.
Conservation Technologists: Prioritize the deployment of AI acoustic sensors to monitor ecosystem health and protect endangered populations in real time.
Legal & Rights Advocates: Argue that scientific proof of animal language must translate into expanded legal personhood and environmental protections.

What's not represented

· Indigenous Knowledge Keepers
· Industrial Ocean Operators (Shipping/Mining)

Why this matters

Proving that animals possess complex, culturally transmitted language fundamentally alters humanity's relationship with nature. Beyond the profound philosophical shift, this scientific breakthrough provides a powerful new legal tool to protect endangered species and their habitats from industrial destruction.

Key points

AI foundation models are now being used to decode the communication of whales, elephants, and birds.
Historically, up to 97% of bioacoustic data was discarded due to background noise, a problem AI source separation has solved.
Project CETI has discovered evidence of a phonetic alphabet and combinatorial grammar in sperm whale codas.
The new Retrieval-Augmented Bioacoustics (RAB) framework prevents AI hallucinations by linking insights to verified call libraries.
Legal scholars argue that proving animals possess language could grant them new legal rights against noise pollution and habitat destruction.

97%: Bioacoustic data previously discarded due to noise
50+: Scientists collaborating on Project CETI
967 meters: Depth of sperm whale dives recorded by bio-loggers

Humanity has long assumed it holds a monopoly on complex language. For decades, field biologists have recorded the clicks of sperm whales, the rumbles of elephants, and the calls of beluga whales, only to be overwhelmed by the sheer volume of data.[7]

Historically, researchers had to discard up to 97 percent of their audio recordings because the signals were buried under the ambient noise of wind, waves, and human industry. The inability to isolate individual voices meant that the deeper structures of animal communication remained locked away.[1]

But in 2026, the same artificial intelligence architectures that power human-language chatbots are being pointed at the natural world. The result is a rapidly accelerating field known as AI bioacoustics, which is fundamentally shifting how we understand the diverse intelligences sharing our planet.[7]

Before AI can decode animal communication, it needs pristine data. This requirement has triggered a renaissance in ecological hardware, moving the field away from rudimentary recorders toward sophisticated, machine-learning-ready sensors.[6]

Figure: How foundation models process raw ecological audio into actionable scientific insights.

At the Harvard John A. Paulson School of Engineering and Applied Sciences, researchers have developed advanced "bio-loggers" specifically designed to feed high-fidelity data directly into machine learning algorithms.[3]

These non-invasive, suction-cup devices attach to the skin of sperm whales, riding along on dives that plunge nearly 1,000 meters below the surface. Equipped with synchronized, high-bandwidth hydrophones, the loggers capture not just the audio of whale "codas"—rhythmic clicking patterns—but also the exact depth, temperature, and physical orientation of the animal when it speaks.[3]

Meanwhile, organizations are building what they describe as a global nervous system of smart microphones. Deployed across rainforests, agricultural fields, and coral reefs, these autonomous acoustic sensors continuously monitor the heartbeat of ecosystems.[6]

By capturing the dawn chorus of birds or the ultrasonic echolocation of bats, the hardware provides a continuous, real-time stream of raw ecological data, allowing conservationists to measure biodiversity without ever setting foot in the habitat.[6]

Figure: AI source separation has solved the 'cocktail party problem,' rescuing vast amounts of previously unusable acoustic data.

The true breakthrough, however, lies in how this massive influx of data is processed. The Earth Species Project (ESP), a non-profit dedicated to decoding non-human communication, has pioneered the use of foundation models for biology.[1]


Their flagship system, NatureLM-audio, is a large audio-language model trained on a vast dataset that includes human speech, music, and millions of animal vocalizations. By learning the underlying structural axes of sound, NatureLM-audio can generalize across the Tree of Life.[1]

It can detect and classify the calls of species it has never encountered before, with accuracy well above random chance. ESP has also cleared one of the field's most significant hurdles, the "cocktail party problem": the challenge of isolating a single voice in a noisy environment.[1]

Using AI-driven source separation, researchers can now untangle the overlapping calls of a beluga whale pod, mapping distinct vocal repertoires and identifying regional dialects that were previously indistinguishable to the human ear.[1]
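To make the idea of separating one voice from a mixture concrete, here is a deliberately simplified sketch. It is not ESP's method: real source separation uses learned models, while this toy uses a fixed frequency mask and assumes the "call" and the "noise" occupy different frequency bands. The signals, the bin numbers, and the function names are all invented for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform (fine for tiny toy signals)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def band_mask(signal, low_bin, high_bin):
    """Keep only DFT bins in [low_bin, high_bin] plus their mirrored bins."""
    n = len(signal)
    spectrum = dft(signal)
    masked = [
        value if (low_bin <= k <= high_bin or low_bin <= n - k <= high_bin) else 0
        for k, value in enumerate(spectrum)
    ]
    return inverse_dft(masked)


N = 64
# Hypothetical animal call: a pure tone at frequency bin 10.
call = [math.sin(2 * math.pi * 10 * t / N) for t in range(N)]
# Hypothetical low-frequency engine hum at frequency bin 2.
noise = [0.8 * math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
mixture = [c + e for c, e in zip(call, noise)]

# Keep the band around the call and discard everything else.
recovered = band_mask(mixture, 8, 12)
max_error = max(abs(r - c) for r, c in zip(recovered, call))
print(f"largest difference from the clean call: {max_error:.2e}")
```

The mask works here only because the two toy signals never overlap in frequency. Overlapping calls from several animals of the same species share the same bands, which is why that case calls for learned separation models rather than a fixed filter.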

Figure: Autonomous smart microphones act as a global nervous system, continuously monitoring the health of remote ecosystems.

Perhaps the most ambitious application of this technology is Project CETI (Cetacean Translation Initiative). Backed by a team of more than 50 scientists spanning linguistics, cryptography, robotics, and marine biology, CETI is focused entirely on the sperm whale.[2]

Sperm whales communicate using codas, whose highly structured, rhythmic click patterns make them a strong candidate for machine learning analysis. CETI’s AI models have already yielded groundbreaking discoveries, including evidence of a phonetic alphabet within sperm whale communication.[2]
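As a rough illustration of what a "rhythmic clicking pattern" looks like as data, the sketch below turns a list of click timestamps into a few simple features. The article does not describe CETI's actual feature set, so the function name, the chosen features, and the timestamps are assumptions made for this example only.

```python
def coda_features(click_times):
    """Describe one coda by click count, total duration, and normalized rhythm.

    click_times: click timestamps in seconds, in ascending order (toy data).
    """
    if len(click_times) < 2:
        raise ValueError("a coda needs at least two clicks")
    # Gaps between consecutive clicks.
    intervals = [later - earlier for earlier, later in zip(click_times, click_times[1:])]
    duration = click_times[-1] - click_times[0]
    # Dividing by the duration removes tempo, leaving only the rhythm shape.
    rhythm = [round(interval / duration, 3) for interval in intervals]
    return {"clicks": len(click_times), "duration": duration, "rhythm": rhythm}


# Two invented codas: the same pattern produced at different speeds.
slow = coda_features([0.0, 0.2, 0.4, 0.6, 1.0])
fast = coda_features([0.0, 0.1, 0.2, 0.3, 0.5])

print(slow["rhythm"] == fast["rhythm"])   # same rhythm shape
print(slow["duration"], fast["duration"])  # different overall tempo
```

Separating rhythm from tempo in this way is one plausible route to the kind of reusable building blocks that a combinatorial system would need, though whether real codas decompose this cleanly is a question for the researchers' data.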

The algorithms have identified combinatorial structures—essentially, the building blocks of grammar—and have tracked how these signaling systems are culturally transmitted across generations by matriarchs. In one remarkable instance, CETI’s synchronized recordings captured the complex, coordinated vocalizations of a dozen female whales during the birth of a calf.[2]

As AI generates hypotheses about animal language, the scientific community is demanding transparency to prevent hallucinations in ethological research. To address this, researchers recently introduced Retrieval-Augmented Bioacoustics (RAB) at the NeurIPS conference.[4]

Figure: The scale of the AI bioacoustics revolution.

Unlike standard generative models that might invent a plausible-sounding but false conclusion, RAB links every AI-generated insight directly back to a verified acoustic embedding from a call library. This evidence-guided generation allows biologists to trust the AI's outputs, turning the technology from a black-box oracle into a rigorous, citable research assistant.[4]
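The evidence-linking idea can be sketched in a few lines. This is a minimal illustration, not the RAB implementation, whose internals the article does not describe. It assumes calls are already represented as embedding vectors; the library entries, the labels, the similarity threshold, and every name in the code are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryCall:
    call_id: str      # identifier of a verified recording (invented here)
    label: str        # expert-assigned description of the call
    embedding: tuple  # acoustic embedding vector (toy 3-dimensional values)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def grounded_insight(query_embedding, library, min_similarity=0.9):
    """Return a claim together with its evidence, or None if nothing is close enough."""
    best = max(library, key=lambda call: cosine(query_embedding, call.embedding))
    score = cosine(query_embedding, best.embedding)
    if score < min_similarity:
        return None  # refuse to make an unsupported claim
    return {"claim": best.label, "evidence": best.call_id, "similarity": round(score, 3)}


LIBRARY = [
    LibraryCall("call-001", "contact call", (0.9, 0.1, 0.0)),
    LibraryCall("call-002", "alarm call", (0.0, 0.2, 0.9)),
]

supported = grounded_insight((0.8, 0.2, 0.1), LIBRARY)
unsupported = grounded_insight((0.1, 0.9, 0.1), LIBRARY)
print(supported)
print(unsupported)
```

The key design choice is that the function either returns a claim with the identifier of the recording that supports it, or returns nothing at all. A reviewer can then pull up the cited recording and check the claim directly.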

The implications of decoding animal communication extend far beyond biology; they are poised to disrupt the legal landscape. The More-Than-Human Life (MOTH) Program at NYU Law has partnered with Project CETI to explore the legal rights of animals capable of language.[5]

In a landmark framework, legal scholars and biologists argued that if cetaceans possess culturally transmitted language, they may qualify for expanded legal protections. This could empower advocates to block industrial activities like deep-sea mining and seismic blasting, which create devastating underwater noise pollution that disrupts animal communication.[5]

Ultimately, the rise of AI bioacoustics represents a profound shift in humanity's relationship with the natural world. The algorithms initially designed to automate human tasks are now serving as universal translators, revealing the rich, hidden inner lives of the creatures around us, and proving that the Earth has been speaking all along.[7]

How we got here

2020: Project CETI is founded to apply machine learning to sperm whale communication.
2024: Researchers publish breakthroughs identifying elephant names and beluga whale dialects using AI.
April 2025: NYU Law's MOTH Program publishes a framework on the legal impact of AI-assisted animal communication studies.
December 2025: Harvard SEAS details the deployment of advanced, machine-learning-ready bio-loggers on sperm whales.
Mid-2026: The NeurIPS conference highlights Retrieval-Augmented Bioacoustics (RAB) to ensure AI models remain scientifically grounded.

Viewpoints in depth

Computational Ethologists

Focus on using foundation models to discover shared structural axes of communication across all species.

Researchers at the Earth Species Project and Project CETI argue that by training large audio-language models on massive, cross-taxa datasets, AI can identify universal patterns in how life communicates. They emphasize that modern machine learning allows science to move beyond narrow, species-specific classification and instead map the fundamental building blocks of language, whether it originates from a beluga whale, a crow, or a human.

Conservation Technologists

Prioritize the deployment of AI acoustic sensors to monitor ecosystem health and protect endangered populations in real time.

For hardware developers and conservationists, the immediate value of AI bioacoustics lies in actionable intelligence. By deploying smart microphones and advanced bio-loggers, they can track the presence, stress levels, and movement of elusive species without human interference. This camp argues that real-time acoustic monitoring is the most scalable way to measure biodiversity and enforce protections in remote habitats facing the threat of mass extinction.

Legal & Rights Advocates

Argue that scientific proof of animal language must translate into expanded legal personhood and environmental protections.

Legal scholars, such as those in the NYU Law MOTH Program, view AI bioacoustics as a catalyst for systemic legal reform. They contend that if AI proves cetaceans and other animals possess culturally transmitted language and complex social synchronization, the law can no longer treat them merely as property or resources. This perspective pushes for new legal frameworks that grant communicative species the right to be free from human-generated noise pollution and habitat destruction.

What we don't know

Whether the structural elements of animal communication map directly to human concepts of grammar and syntax.
How industrial ocean operators will legally respond to lawsuits based on AI-translated animal distress calls.
The ethical boundaries and potential ecological consequences of conducting two-way AI playback experiments with wild populations.

Key terms

Bioacoustics: The scientific study of the production, transmission, and reception of sounds by animals.
Bio-logger: A non-invasive sensor tag attached to an animal to record audio, movement, and environmental data.
NatureLM-audio: A large audio-language model trained on human speech, music, and animal sounds to detect communication patterns across species.
Retrieval-Augmented Bioacoustics (RAB): An AI framework that links generated insights back to specific, verified animal call recordings to ensure scientific accuracy.
Codas: Rhythmic patterns of clicks used by sperm whales to communicate and coordinate social behavior.

Frequently asked

Can we use AI to talk back to the animals?

Currently, the focus is mainly on listening and decoding. Some playback experiments exist, but scientists are highly cautious about the ethical implications of attempting two-way communication.

How does AI separate animal sounds from ocean noise?

AI models use 'source separation' algorithms—similar to those that isolate human voices in a crowded room—to filter out wind, waves, and ship engines, solving the bioacoustic 'cocktail party problem.'

Why are sperm whales a primary focus for this research?

Sperm whales communicate using 'codas,' which are highly structured, rhythmic clicking patterns. This mathematical structure makes their communication an ideal candidate for machine learning translation.

Sources

[1] Earth Species Project (Computational Ethologists). NatureLM-audio and the AI-First Approach to Animal Communication.
[2] Project CETI (Computational Ethologists). Decoding Sperm Whale Communication with Machine Learning.
[3] Harvard SEAS (Conservation Technologists). SETI/CETI Tricorder Tech: Tapping Into Whale Talk.
[4] NeurIPS (Computational Ethologists). Retrieval-Augmented Bioacoustics: Evidence-Guided Generation for Animal Communication.
[5] NYU Law MOTH Program (Legal & Rights Advocates). What if We Understood what Animals are Saying? The Legal Impact of AI-assisted Studies.
[6] Synature (Conservation Technologists). Building a Global Nervous System of Smart Microphones.
[7] Factlen Editorial Team. Synthesis by Factlen editorial team.
