Factlen Explainer · Bioacoustics · Jun 15, 2026, 6:30 AM · 7 min read

How AI and Bioacoustics Are Decoding the Sounds of the Wild

Artificial intelligence is transforming wildlife conservation by analyzing millions of hours of environmental audio to track endangered species, detect threats, and even begin decoding animal communication.

By Factlen Editorial Team

Conservation Technologists 40% · Field Ecologists 35% · Bioacoustics Researchers 25%

Conservation Technologists: Focus on scaling AI models and hardware to process global biodiversity data.
Field Ecologists: View AI as a tool that must be paired with physical conservation interventions.
Bioacoustics Researchers: Focus on decoding the nuances of animal communication and culture.

What's not represented

· Indigenous communities whose traditional ecological knowledge is rarely integrated into algorithmic models.
· Policymakers who must translate acoustic data into actionable environmental regulations.

Why this matters

As the global biodiversity crisis accelerates, traditional methods of tracking wildlife are proving too slow and invasive. AI-powered acoustic monitoring gives scientists a scalable, real-time tool to measure ecosystem health, stop poaching, and protect endangered species before they disappear.

Key points

Passive Acoustic Monitoring (PAM) uses remote microphones to record continuous environmental audio without disturbing wildlife.
Manually processing terabytes of acoustic data at scale was previously beyond the reach of human researchers.
Deep neural networks can now convert audio into spectrograms to instantly identify the unique calls of thousands of species.
AI has enabled breakthroughs in decoding complex animal communication, such as identifying a phonetic alphabet in sperm whale clicks.
Significant challenges remain, including filtering out overlapping background noise and powering remote devices in extreme environments.

Nearly 50x: Increase in speed detecting Hawaiian honeycreepers using AI
156: Distinct sperm whale codas identified by Project CETI
95,996: Bat calls detected in legacy data by Australia's ARISA model

The natural world is a symphony of information, but human ears miss the vast majority of it. As the global biodiversity crisis accelerates, scientists are racing to catalog and protect vulnerable ecosystems before they disappear. Traditional methods of tracking wildlife have long relied on physical observation, a process that is inherently slow, expensive, and limited by human endurance. Today, researchers are turning to a radically different approach to monitor the health of the planet: they are simply listening to it. By combining remote recording technology with the immense processing power of artificial intelligence, conservationists are unlocking a new era of ecological understanding.[7]

For decades, field ecologists have struggled with the limitations of visual surveys. Counting animals by sight is heavily biased toward diurnal species that live in open habitats. In dense tropical rainforests, deep ocean trenches, or nocturnal environments, visual tracking becomes nearly impossible. Elusive or critically endangered animals often alter their behavior in the presence of humans, meaning that the very act of observing them can skew the data. To build an accurate picture of an ecosystem, scientists needed a non-invasive method that could operate continuously without disturbing the habitat.[1][6]

The solution has emerged in the form of Passive Acoustic Monitoring (PAM). Researchers deploy small, autonomous recording devices—microphones in forests and hydrophones underwater—that capture continuous, high-fidelity audio of the surrounding environment. These devices record the entire soundscape, capturing everything from the high-frequency echolocation of bats and the low rumbles of elephants to the hum of distant traffic and the rustle of wind through the canopy. Because they are cheap and unobtrusive, PAM devices can be left in the field for months, providing an unprecedented, uninterrupted stream of ecological data.[1][4]

However, this technological leap created a massive new bottleneck: data processing. The problem for modern conservationists is no longer gathering information, but making sense of it quickly enough to be useful. A single large-scale monitoring project can generate hundreds of terabytes of audio in a single year. As researchers at the UK's Bat Conservation Trust have noted, manually listening to and categorizing just one season of their acoustic data would take decades of continuous human effort. The sheer volume of recordings rendered traditional analysis impossible.[4]
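To give a rough sense of scale, the sketch below estimates how much uncompressed audio a recorder network might produce in a year, and how long one person would need to listen to all of it. The recorder count, sample rate, bit depth, and listening schedule are illustrative assumptions, not figures from any project named in this article.

```python
# Back-of-the-envelope estimate of PAM data volume.
# Every input value below is an illustrative assumption.

def bytes_per_hour(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Size of one hour of uncompressed PCM audio in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * 3600


def yearly_terabytes(recorders: int, hours_per_day: float,
                     sample_rate_hz: int, bit_depth: int) -> float:
    """Total uncompressed audio per year, in terabytes (10**12 bytes)."""
    hours = recorders * hours_per_day * 365
    return hours * bytes_per_hour(sample_rate_hz, bit_depth) / 1e12


def listening_years(recorders: int, hours_per_day: float,
                    listener_hours_per_day: float = 8.0) -> float:
    """Years one person would need to hear every recorded hour once."""
    recorded_hours = recorders * hours_per_day * 365
    return recorded_hours / (listener_hours_per_day * 365)


# Hypothetical network: 200 recorders running around the clock
# at 48 kHz, 16-bit mono.
tb = yearly_terabytes(recorders=200, hours_per_day=24,
                      sample_rate_hz=48_000, bit_depth=16)
years = listening_years(recorders=200, hours_per_day=24)
print(f"{tb:.0f} TB per year, {years:.0f} listener-years")
```

Under these assumed settings the network lands in the hundreds-of-terabytes range, and a single listener working eight-hour days would need centuries to hear it all once.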

How artificial intelligence converts raw environmental audio into actionable conservation data.

This is where artificial intelligence steps in, fundamentally transforming bioacoustics from a niche academic pursuit into a scalable conservation tool. Machine learning models, specifically deep neural networks, are trained to process these massive audio files autonomously. The AI first converts the raw audio into visual representations of sound called spectrograms. By analyzing these complex visual patterns, the algorithms can isolate and identify the unique acoustic signatures of thousands of different species, filtering out background noise to pinpoint exactly who is calling, and when.[1][6]
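The audio-to-spectrogram step can be sketched with a short-time Fourier transform: the recording is cut into short overlapping frames, and each frame is converted into a row of frequency magnitudes. The version below is a minimal illustration in standard-library Python, using a deliberately slow, naive transform and a synthetic tone in place of a real recording. The frame length, hop size, and sample rate are assumptions, and production systems would typically use an optimized signal-processing library instead.

```python
import cmath
import math


def hann(n: int) -> list[float]:
    """Hann window, which tapers each frame to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame: list[float]) -> list[float]:
    """Magnitudes of the non-negative frequency bins of one frame (naive DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples: list[float], frame_len: int = 64,
                hop: int = 32) -> list[list[float]]:
    """Return one row of frequency magnitudes per overlapping time frame."""
    window = hann(frame_len)
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        rows.append(dft_magnitudes(frame))
    return rows


# Synthetic "call": a 2 kHz tone sampled at 8 kHz (assumed values).
sample_rate = 8_000
tone_hz = 2_000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(signal)
# Find the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames; strongest frequency in first frame:", peak_hz, "Hz")
```

Each row of `spec` is one time slice, and stacking the rows gives the time-by-frequency image that a neural network can then treat much like a picture.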

Major technology companies are now dedicating significant resources to this field. Google DeepMind recently released Perch 2.0, an open-source AI model specifically designed to help conservationists analyze bioacoustic data. Trained on massive, publicly available datasets encompassing birds, mammals, amphibians, and even anthropogenic noise, the model can disentangle complex acoustic scenes. Crucially, Perch 2.0 is capable of adapting to new environments, allowing researchers to build highly accurate sound classifiers from just a single audio sample in under an hour, bypassing the need for massive labeled datasets.[3]
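One common way to build a classifier from very few labelled clips is to leave the pretrained model frozen, take its embedding vector for each clip, and compare new clips against a per-label average. The sketch below illustrates that general idea with nearest-prototype matching by cosine similarity. It is not Perch's API or its documented method, and the three-dimensional vectors are hand-written stand-ins for real model embeddings.

```python
import math

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: list[Vector]) -> Vector:
    """Element-wise mean of the example embeddings for one label."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def fit(examples: dict[str, list[Vector]]) -> dict[str, Vector]:
    """Build one prototype vector per label from a handful of examples."""
    return {label: centroid(vecs) for label, vecs in examples.items()}


def predict(prototypes: dict[str, Vector], query: Vector) -> tuple[str, float]:
    """Return the best-matching label and its similarity score."""
    scored = ((label, cosine(proto, query)) for label, proto in prototypes.items())
    return max(scored, key=lambda pair: pair[1])


# Toy embeddings standing in for the output of a frozen pretrained
# audio model (hypothetical values, not real model output).
labelled = {
    "target_species": [[0.9, 0.1, 0.0]],  # a single labelled clip
    "background": [[0.1, 0.8, 0.3], [0.0, 0.9, 0.2]],
}
prototypes = fit(labelled)
label, score = predict(prototypes, [0.8, 0.2, 0.1])
print(label, round(score, 2))
```

Because no network weights are retrained, adding a new species in this scheme only requires embedding a few example clips, which is one reason such approaches can be fast to set up.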

The real-world impact of these AI models is already being felt across the globe. Biologists at the University of Hawaii used Perch to locate the calls of critically endangered honeycreepers nearly 50 times faster than their traditional methods allowed, enabling rapid intervention against avian malaria. Similarly, in Australia, the Arthur Rylah Institute developed the ARISA model to scan years of legacy audio data. The AI identified 95,996 bat calls and discovered new populations of the elusive Plains-wanderer and the endangered Sloane's Froglet in areas where they were previously thought to be locally extinct.[3][5]

Machine learning models can process massive acoustic datasets far faster than human researchers.

Beyond simply identifying the presence of a species, artificial intelligence is helping scientists decode the complex nuances of animal communication. Project CETI (Cetacean Translation Initiative) represents one of the most ambitious bioacoustics efforts to date. Operating off the coast of Dominica, this interdisciplinary team of marine biologists, roboticists, and linguists is using advanced natural language processing to analyze the rhythmic clicks—known as codas—used by sperm whales to communicate in the deep ocean. By treating these vocalizations as a complex dataset, the project aims to provide the first-ever blueprint of another animal's language.[2]

By feeding vast amounts of underwater audio into sophisticated machine learning models, CETI researchers have made groundbreaking discoveries about whale culture. The AI has identified a phonetic alphabet consisting of 156 distinct codas and their basic components. Furthermore, the algorithms can distinguish between different whale dialects with over 95 percent accuracy. This level of granular analysis is opening the door to understanding intergenerational knowledge transfer and social structures within whale pods, and it suggests that these animals possess a highly structured form of communication.[2]

Project CETI is using advanced natural language processing to decode the rhythmic clicks of sperm whales.

Bioacoustics is also proving invaluable for measuring broader ecosystem health and the impact of human activity. At the National University of Singapore, researchers are using AI to compare the soundscapes of different tropical forests. By training models to differentiate between wildlife vocalizations and anthropogenic noise, such as traffic or construction, scientists can precisely measure how human disturbance alters animal activity patterns. This data is crucial for evaluating the effectiveness of restored microforests and ensuring that urban planning maintains vital ecological corridors.[6]

In addition to long-term monitoring, AI-driven acoustic systems are being deployed for real-time conservation interventions. Advanced algorithms can be trained to recognize the specific acoustic signatures of illegal activities, such as the whine of a chainsaw in a protected forest or the blast of a poacher's gunshot on a savanna. When these sounds are detected, the system can instantly transmit an alert to local park rangers, allowing them to intercept threats before irreversible damage is done to the habitat or its inhabitants.[7]
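The alerting step can be sketched as a simple rule applied to a stream of classifier scores: raise an alert only when a threat label stays above a threshold for several windows in a row, so that a single noisy spike does not page a ranger. Everything here is hypothetical, including the label names, the threshold, the window counts, and the `Alert` record. A deployed system would also need the upstream classifier and a transmission channel.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Alert:
    label: str
    window_index: int  # index of the window that confirmed the detection
    score: float


def detect_threats(scores: Iterable[dict[str, float]],
                   threshold: float = 0.8,
                   min_consecutive: int = 3) -> Iterator[Alert]:
    """Yield an alert once a label stays above threshold for several windows.

    `scores` is a stream of per-window classifier outputs, e.g.
    {"chainsaw": 0.91, "gunshot": 0.02}. Requiring consecutive hits
    suppresses one-off spikes from rain, wind, or overlapping calls.
    """
    streak: dict[str, int] = {}
    for i, window in enumerate(scores):
        for label, score in window.items():
            if score >= threshold:
                streak[label] = streak.get(label, 0) + 1
                # Fire exactly once, when the streak first reaches the minimum.
                if streak[label] == min_consecutive:
                    yield Alert(label, i, score)
            else:
                streak[label] = 0


# Hypothetical score stream: one isolated spike, then a sustained chainsaw.
stream = [
    {"chainsaw": 0.85, "gunshot": 0.1},
    {"chainsaw": 0.20, "gunshot": 0.1},
    {"chainsaw": 0.90, "gunshot": 0.1},
    {"chainsaw": 0.92, "gunshot": 0.1},
    {"chainsaw": 0.95, "gunshot": 0.1},
    {"chainsaw": 0.97, "gunshot": 0.1},
]
alerts = list(detect_threats(stream))
print(alerts)
```

The consecutive-window rule suits sustained sounds such as a chainsaw. A brief, impulsive sound such as a gunshot would likely need `min_consecutive=1` or a different rule altogether.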

Despite these remarkable breakthroughs, the integration of AI into wildlife conservation is not without significant challenges. Researchers frequently grapple with what they call the messy world of environmental audio. Overlapping animal calls, heavy rain, wind interference, and the complex distortion of sound underwater can easily confuse algorithms that perform perfectly in a controlled laboratory setting. Training AI to reliably filter out this chaotic background noise remains one of the most pressing technical hurdles in the field.[3][4]

Furthermore, the rapid advancement of this technology has exposed a critical skills gap within the scientific community. The European BioacAI consortium, a major research initiative, has highlighted that it is exceedingly rare to find professionals who possess the necessary combination of expertise in zoology, acoustics, and machine learning. Building effective, end-to-end bioacoustic monitoring systems requires a full-stack understanding of both the ecological context and the underlying code, prompting universities to develop entirely new interdisciplinary training programs.[4]

Spectrograms allow AI algorithms to visually identify the unique acoustic signatures of thousands of species.

Hardware limitations also constrain the widespread deployment of these systems. Powering autonomous recorders in remote jungles or deep ocean trenches for months at a time is a logistical nightmare. Processing massive audio files locally on the device—known as edge computing—drains batteries rapidly, while transmitting terabytes of raw data over satellite networks is prohibitively expensive. Engineers are currently racing to develop ultra-low-power microchips that can run complex AI recognition algorithms directly on the microphone without exhausting the power supply.[4]
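The trade-off can be made concrete with a simple energy and bandwidth budget. The sketch below compares battery life for a recorder that only records, one that runs a classifier continuously, and one that wakes the classifier for a fraction of the time, then compares the daily size of raw audio with that of compact detection messages. Every number is an invented placeholder chosen to show the arithmetic, not a measurement of any real device.

```python
def battery_days(battery_wh: float, loads: dict[str, tuple[float, float]]) -> float:
    """Days of runtime, given loads as name -> (power in mW, duty cycle 0..1)."""
    average_mw = sum(power * duty for power, duty in loads.values())
    return battery_wh * 1000 / average_mw / 24


BATTERY_WH = 100.0  # assumed battery capacity

# Assumed power draws: 50 mW to record, 400 mW while the classifier runs.
record_only = {"recorder": (50.0, 1.0)}
always_on = {"recorder": (50.0, 1.0), "classifier": (400.0, 1.0)}
duty_cycled = {"recorder": (50.0, 1.0), "classifier": (400.0, 0.1)}

for name, loads in [("record only", record_only),
                    ("classifier always on", always_on),
                    ("classifier 10% of the time", duty_cycled)]:
    print(f"{name}: {battery_days(BATTERY_WH, loads):.1f} days")

# Daily uplink size: raw audio versus compact detection messages.
RAW_BYTES_PER_DAY = 48_000 * 2 * 86_400  # 48 kHz, 16-bit mono, uncompressed
DETECTION_BYTES_PER_DAY = 500 * 64       # 500 detections of 64 bytes (assumed)
ratio = RAW_BYTES_PER_DAY // DETECTION_BYTES_PER_DAY
print(f"raw audio is {ratio:,}x larger than the detection messages")
```

Under these placeholder figures, on-device inference cuts the uplink by several orders of magnitude but shortens battery life sharply unless the classifier is duty-cycled, which is the tension that low-power chips aim to ease.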

Nevertheless, the fusion of artificial intelligence and bioacoustics represents a profound paradigm shift in how humanity interacts with the natural world. By giving ecosystems a voice that we can finally process and understand at a global scale, technology is providing a critical new tool to monitor biodiversity. As these AI models become more sophisticated and accessible, they offer a powerful beacon of hope, ensuring that conservationists can listen to, learn from, and ultimately protect the planet's most vulnerable species before they fall silent.[7]

How we got here

Early 2000s: Passive acoustic monitoring begins gaining traction, but researchers are overwhelmed by the sheer volume of audio data.
2018: Advances in deep learning and neural networks allow computers to accurately classify animal calls using visual spectrograms.
2020: Project CETI is launched to apply advanced natural language processing to decode the communication of sperm whales.
2025: Google DeepMind releases Perch 2.0, an open-source AI model capable of identifying thousands of species in messy acoustic environments.

Viewpoints in depth

Conservation Technologists

Researchers focused on scaling AI models and hardware to process global biodiversity data.

For technologists and computer scientists, the primary challenge of the biodiversity crisis is a data processing bottleneck. They argue that the natural world produces too much information for human analysts to manually categorize. By open-sourcing models like Perch 2.0 and developing ultra-low-power edge computing devices, this camp believes that artificial intelligence is the only viable mechanism to monitor global ecosystems in real time. Their focus is on improving algorithmic accuracy, reducing battery consumption in remote hardware, and building massive, shared databases of acoustic signatures.

Field Ecologists

Biologists who view AI as a powerful tool but emphasize the need for physical conservation interventions.

While field ecologists welcome the unprecedented data provided by bioacoustics, they caution against viewing artificial intelligence as a panacea for the extinction crisis. This camp emphasizes that knowing an endangered species exists in a specific forest does not automatically protect it from habitat destruction or climate change. They argue that AI monitoring must be tightly coupled with on-the-ground policy enforcement, anti-poaching patrols, and community-led conservation efforts. For these scientists, the technology is only as useful as the physical interventions it enables.

Bioacoustics Researchers

Scientists focused on decoding the nuances of animal communication and intergenerational culture.

For researchers involved in projects like CETI, the value of AI extends far beyond simply counting animal populations. This camp is focused on the qualitative aspects of bioacoustics—using machine learning to decode the actual meaning, dialects, and social structures embedded within animal vocalizations. By identifying phonetic alphabets in whale codas or stress markers in bird calls, they argue that AI can reveal the rich inner lives and intergenerational cultures of animals, fundamentally shifting how humanity relates to the natural world.

What we don't know

How effectively AI models trained in one specific forest or ocean can generalize to entirely different, untested ecosystems.
Whether the discovery of complex animal dialects and alphabets will eventually lead to true two-way interspecies communication.
How to sustainably power millions of remote acoustic sensors globally without creating a new stream of electronic waste.

Key terms

Bioacoustics: The scientific study of sound production, transmission, and reception in animals, used to monitor wildlife populations and behavior.
Passive Acoustic Monitoring (PAM): The use of autonomous recording devices left in the field to capture environmental sounds over long periods without human interference.
Spectrogram: A visual representation of the spectrum of frequencies of a sound as it varies with time, which AI models use to identify specific animal calls.
Coda: A distinct pattern of rhythmic clicks used by sperm whales to communicate with one another in the deep ocean.
Edge Computing: Processing data locally on the device where it is collected (like a remote microphone) rather than sending it to a centralized cloud server, saving transmission bandwidth.

Frequently asked

What is passive acoustic monitoring (PAM)?

PAM is a non-invasive method of studying wildlife by placing remote recording devices in habitats to capture continuous environmental audio, known as soundscapes, without disturbing the animals.

How does AI identify different animal sounds?

Artificial intelligence converts raw audio recordings into visual graphs called spectrograms. Deep neural networks then analyze these visual patterns to match them against known acoustic signatures of specific species.

Can artificial intelligence translate whale language?

While full translation is still in its infancy, AI has helped researchers at Project CETI identify a 'phonetic alphabet' of 156 distinct codas, the rhythmic click patterns used by sperm whales, and can distinguish between different whale dialects.

What are the main limitations of bioacoustics?

Current challenges include filtering out overlapping background noise, the massive battery and data storage requirements for remote recording devices, and a lack of labeled audio data for rare species.

Sources

[1] MDPI (Bioacoustics Researchers): A Methodological Literature Review of Acoustic Wildlife Monitoring Using Artificial Intelligence Tools
[2] Project CETI (Bioacoustics Researchers): Listen to the whales: Applying advanced machine learning to translate sperm whale communication
[3] Google DeepMind (Conservation Technologists): How AI is helping advance the science of bioacoustics to save endangered species
[4] Horizon Magazine (Conservation Technologists): AI listens in to help protect wildlife
[5] Arthur Rylah Institute (Field Ecologists): Wildlife call recognition using artificial intelligence
[6] National University of Singapore (Field Ecologists): Listening to the rainforest: NUS researcher uses AI to monitor biodiversity through sound
[7] Factlen Editorial Team: Synthesis by Factlen editorial team
