Factlen Explainer · Biomedical AI · Evidence Pack · Jun 29, 2026, 11:40 PM · 5 min read

AI Model 'MAMMAL' Outperforms AlphaFold 3, Signaling a New Era for Multimodal Drug Discovery

IBM Research has released MAMMAL, a multimodal foundation model trained on 2 billion biological samples that achieves state-of-the-art results on nine of eleven drug discovery benchmarks. By integrating proteins, antibodies, small molecules, and gene expression data, the open-source model outperformed AlphaFold 3 in specific antibody-binding tests, offering a new unified tool for biomedical research.

By Factlen Editorial Team

Computational Biologists 35% · Open-Source Advocates 25% · Structural Biologists 20% · Clinical Translators 20%

Computational Biologists: Argue that unified, multi-modal architectures are the future of drug discovery, moving beyond single-task models.
Open-Source Advocates: Emphasize the importance of releasing model weights and code publicly to democratize biotech research.
Structural Biologists: Maintain that while multimodal models excel at classification, dedicated structural models like AlphaFold remain essential for understanding physical mechanisms.
Clinical Translators: Focus on the reality that all computational predictions must ultimately survive rigorous wet-lab validation and human trials.

What's not represented

· Regulatory Agencies
· Patients Awaiting Novel Therapies

Why this matters

Drug discovery is notoriously slow and expensive, in part because biological data, from proteins to gene expression, is highly fragmented. An open-source AI model that understands multiple biological 'languages' at once could help researchers accelerate the early stages of developing new medicines and treatments.

Key points

IBM Research introduced MAMMAL, a multimodal AI model for drug discovery.
The model integrates proteins, antibodies, small molecules, and gene expression data.
MAMMAL achieved state-of-the-art results on 9 out of 11 drug discovery benchmarks.
It outperformed AlphaFold 3 in specific antibody-antigen binding classification tasks.
The model weights and codebase are open-source, lowering the barrier for biotech startups.

2 billion: Biological samples in pre-training data
458 million: Model parameters
9 of 11: Drug discovery benchmarks where MAMMAL achieved state-of-the-art results
5 of 7: Antigen targets where MAMMAL outperformed AlphaFold 3 in binding prediction

The introduction of AlphaFold revolutionized structural biology by predicting the 3D shapes of proteins from their amino acid sequences. However, drug discovery is not merely a structural problem—it is a complex, multi-modal puzzle. A promising therapy must interact with biological targets, affect cellular pathways, avoid toxicity, and perform safely in human biology.[1][7]

To bridge these disparate domains, researchers at IBM have introduced MAMMAL (Molecular Aligned Multi-Modal Architecture and Language), a biomedical foundation model designed to treat diverse biological inputs as parts of a unified computational language. Published in the Nature portfolio journal npj Drug Discovery, MAMMAL represents a shift from specialized, single-task AI tools to a versatile, cross-modal architecture.[1][3]

The evidence supporting MAMMAL’s capabilities is anchored in its massive scale and diverse training data. The model was pre-trained on approximately 2 billion biological samples. This dataset spans four distinct modalities: protein sequences, antibody sequences, small-molecule representations, and gene expression profiles.[2][3][4][5]

By integrating these modalities, MAMMAL operates differently than large language models designed for human text. Instead of conversational prompts, researchers use a structured syntax to input molecular strings, amino acid sequences, or transcriptomic lab tests, allowing the model to learn the complex relationships across different biological domains.[2][3]

The MAMMAL architecture integrates four distinct biological modalities into a single foundation model.

The primary claim of the research is MAMMAL’s performance across a suite of standard drug discovery benchmarks. Evaluated on 11 diverse downstream tasks that span multiple stages of the pharmaceutical pipeline, the model achieved state-of-the-art (SOTA) results on nine of them, while remaining highly competitive on the remaining two.[1][3]

These benchmarks are not purely academic exercises; they represent critical hurdles in drug development. For instance, on the molecular property benchmarks ClinTox (clinical toxicity) and BBBP (blood-brain barrier penetration), MAMMAL achieved area under the receiver operating characteristic curve (AUROC) scores of 0.986 and 0.937, respectively. These scores represent a measurable improvement over previous leading models like MoLFormer.[1]
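
For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The labels and scores are toy values chosen for illustration, not data from ClinTox, BBBP, or the paper, and the function name `auroc` is our own.

```python
def auroc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = toxic, 0 = non-toxic) and model scores; illustrative only.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
print(round(toy_auroc, 3))  # 8 of 9 positive-negative pairs are ordered correctly: 0.889
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect one, which is why values such as 0.986 indicate that the model almost always ranks positives above negatives on that benchmark.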

MAMMAL achieved state-of-the-art results on 9 out of 11 evaluated drug discovery benchmarks.

The most heavily scrutinized claim in the evidence pack is MAMMAL’s performance relative to Google DeepMind’s AlphaFold 3. In a specific antibody-antigen binding benchmark, fine-tuned MAMMAL prediction scores were compared against AlphaFold 3’s confidence scores, which served as a proxy for binding likelihood.[1][5]

The data revealed that MAMMAL outperformed AlphaFold 3 in binding classification for five out of seven tested antigen targets. On larger, structurally complex targets like CD206 and VWF, MAMMAL demonstrated superior discriminative ability.[1][5]
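
A "five of seven" headline is a count of per-target wins, so it helps to see how such a tally is formed. The sketch below compares two sets of per-target scores and lists the targets where the first is strictly higher. The target names and numbers are hypothetical placeholders, not the paper's results, and `compare_per_target` is an illustrative helper rather than part of any released codebase.

```python
def compare_per_target(model_a, model_b):
    """Return (targets where model_a scores strictly higher, number of shared targets)."""
    shared = sorted(set(model_a) & set(model_b))
    a_wins = [target for target in shared if model_a[target] > model_b[target]]
    return a_wins, len(shared)


# Hypothetical per-target AUROCs; placeholder numbers, not results from the paper.
sequence_model = {"target_1": 0.81, "target_2": 0.66, "target_3": 0.74}
structure_proxy = {"target_1": 0.70, "target_2": 0.72, "target_3": 0.69}

wins, total = compare_per_target(sequence_model, structure_proxy)
print(f"{len(wins)} of {total} targets: {wins}")  # 2 of 3 targets: ['target_1', 'target_3']
```

A win count alone says nothing about the size of each margin or about which kinds of targets favor which model, so the per-target breakdown matters when interpreting the comparison.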

However, the researchers and independent analysts are careful to contextualize this finding. This result does not suggest that MAMMAL is universally superior to AlphaFold 3. AlphaFold 3 was explicitly designed for structural prediction and maintains an advantage on smaller targets where precise physical geometry is the primary driver of binding.[1]

Instead, the evidence indicates that for specific classification tasks, where binding likelihood depends heavily on sequence context and cross-modal interactions, a modality-aligned foundation model can outperform a purely structural system. The consensus among analysts is that the two models are complementary tools rather than direct replacements.[1][7]

A significant strength of the MAMMAL project is its commitment to transparency and reproducibility. Unlike many commercial biomedical AI breakthroughs that rely on proprietary internal evaluations, IBM Research has made the 458-million-parameter model publicly available.[1][4]

The pretrained model weights are hosted on Hugging Face, and the fine-tuning codebase is accessible via the BiomedSciAI GitHub organization. This open-infrastructure approach allows independent researchers to reproduce the benchmark results, apply the model to proprietary datasets, and independently verify the AlphaFold 3 comparisons.[1][4]
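
As a rough illustration of what "hosted on Hugging Face" means in practice, the sketch below fetches the repository files with the general-purpose `huggingface_hub` client. The repository name is the one listed in source [4]. The helper names `download_weights` and `repo_url` are our own, and this only downloads files: running the model requires the project's own code from the BiomedSciAI GitHub organization, which is not shown here.

```python
REPO_ID = "ibm/biomed.omics.bl.sm.ma-ted-458m"  # repository name as listed in source [4]


def repo_url(repo_id=REPO_ID):
    """Browser URL of the repository on the Hugging Face Hub."""
    return f"https://huggingface.co/{repo_id}"


def download_weights(repo_id=REPO_ID, cache_dir=None):
    """Download every file in the repository and return the local folder path."""
    # Imported lazily so this file can be read and tested without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, cache_dir=cache_dir)


if __name__ == "__main__":
    print(repo_url())
    # local_dir = download_weights()  # uncomment to fetch the files (needs network access)
```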

For AI-native biotech startups and academic labs, this open access fundamentally alters the barrier to entry. Small teams working on antibody design or cancer drug response prediction can now leverage a massive foundation model without the prohibitive computational cost of pre-training from scratch.[1][2]
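
A back-of-the-envelope calculation suggests why a model of this size is within reach of small teams. The sketch below estimates the memory needed just to hold 458 million parameters, assuming 4 bytes per parameter in 32-bit precision and 2 bytes in 16-bit precision. These are our assumptions about storage format, not figures from the paper.

```python
PARAMS = 458_000_000  # parameter count reported for the released model

# Assumed storage sizes per parameter; the released checkpoint's format is not stated here.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(n_params, dtype):
    """Memory needed to hold the weights alone, in gigabytes (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(dtype, round(weight_memory_gb(PARAMS, dtype), 2), "GB")
# float32 1.83 GB
# float16 0.92 GB
```

These figures cover the weights only. Fine-tuning needs additional memory for gradients, optimizer state, and activations, so actual requirements are higher, but they remain far below the cost of pre-training on billions of samples.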

Despite the strong benchmark performance, the evidence pack carries clear limitations and uncertainties. The most critical caveat is that computational benchmarks, no matter how rigorous, do not eliminate the need for physical experimentation.[1][7]

Open-source foundation models allow smaller biotech teams to accelerate early-stage drug discovery without massive computational budgets.

As one industry analysis noted, IBM Research's MAMMAL is not a miracle cure machine. The model can predict toxicity or binding affinity with high statistical accuracy, but drug candidates still fail in clinical trials at notoriously high rates because human biology is vastly more complex than any training dataset.[1]

Furthermore, while MAMMAL integrates four major modalities, it does not yet capture the entirety of a living system's dynamic environment, such as real-time metabolic changes or complex immune system cascades. The predictions remain probabilistic hypotheses that require rigorous wet-lab validation.[2][7]

Looking forward, the introduction of MAMMAL signals a maturation in the field of AI-driven pharmacology. The bottleneck in drug discovery is increasingly not a lack of data, but the fragmentation of that data across isolated computational silos.[1]

By demonstrating that a single, unified architecture can process small molecules, proteins, antibodies, and gene expression data within one model, MAMMAL provides a blueprint for the next generation of biomedical research. It moves the industry one step closer to an integrated computational ecosystem in which the language of biology can be translated into viable therapeutics more quickly.[3][5]

How we got here

2018–2020
DeepMind introduces AlphaFold 1 and 2; AlphaFold 2 is widely credited with a breakthrough on the 50-year-old protein folding problem.
May 2024
AlphaFold 3 is released, expanding structural predictions to DNA, RNA, and small molecules.
May 2026
IBM Research publishes the MAMMAL paper in npj Drug Discovery, introducing a cross-modal foundation model.
June 2026
MAMMAL's open-source weights and codebase gain traction among biotech startups for fine-tuning specific drug discovery tasks.

Viewpoints in depth

Computational Biologists

Argue that unified, multi-modal architectures are the future of drug discovery.

Researchers in this camp emphasize that biological systems do not operate in isolation. A drug must bind to a protein, alter a cellular pathway, and avoid toxic side effects simultaneously. By treating these diverse inputs as a single computational language, multi-modal models like MAMMAL represent a necessary evolution from single-task AI tools, allowing for more holistic predictions early in the discovery pipeline.

Open-Source Advocates

Emphasize the importance of releasing model weights and code publicly to democratize biotech research.

This perspective highlights the growing divide between proprietary AI systems and open scientific research. Advocates argue that by releasing the 458-million-parameter model on Hugging Face, IBM is enabling smaller biotech startups and academic labs to innovate without needing the massive computational budgets required to pre-train foundation models from scratch.

Structural Biologists

Maintain that dedicated structural models like AlphaFold remain essential for understanding physical mechanisms.

While acknowledging MAMMAL's impressive classification benchmarks, structural biologists caution against viewing it as a replacement for 3D modeling. They argue that understanding the exact physical geometry of how a molecule binds to a target—which AlphaFold 3 excels at—is still crucial for rational drug design and optimizing therapies for specific physical interactions.

Clinical Translators

Focus on the reality that all computational predictions must ultimately survive rigorous wet-lab validation.

Professionals focused on clinical trials and regulatory approval maintain a skeptical optimism. They point out that while AI can drastically narrow down the pipeline of candidate molecules, it cannot simulate the full complexity of a living human body. High benchmark scores do not guarantee clinical efficacy, and the true test of models like MAMMAL will be their success rate in producing FDA-approved therapies.

What we don't know

How MAMMAL's predictions will ultimately translate to success rates in late-stage human clinical trials.
Whether the model can accurately predict complex, cascading immune system reactions to novel biologics.
How quickly regulatory agencies will adapt to evaluating drug candidates generated by multi-modal AI systems.

Key terms

Foundation Model: A large-scale AI model trained on a vast quantity of unlabeled data that can be adapted (fine-tuned) for a wide range of specific tasks.
Multimodal AI: An artificial intelligence system capable of processing and integrating multiple different types of data simultaneously.
AlphaFold 3: An AI system developed by Google DeepMind and Isomorphic Labs that predicts the 3D structure of proteins and their complexes with other molecules, including DNA, RNA, and small molecules.
Wet-Lab Validation: The process of testing computational predictions in a physical laboratory using actual biological samples and chemicals.

Frequently asked

Does MAMMAL replace AlphaFold 3?

No. MAMMAL and AlphaFold 3 are complementary tools. AlphaFold 3 excels at predicting physical 3D structures, while MAMMAL is designed to analyze cross-modal relationships and sequence context.

What makes MAMMAL different from previous models?

Earlier models typically focused on one type of data, like only proteins or only small molecules. MAMMAL integrates four different biological modalities into a single computational language.

Is MAMMAL available to the public?

Yes. IBM Research has released the pre-trained model weights on Hugging Face and the fine-tuning code on GitHub for researchers to use.

Will this model cure diseases on its own?

No. While it significantly accelerates the early stages of drug discovery, all AI-generated candidates still require extensive physical testing and clinical trials.

Sources

[1] Startup Fortune (Open-Source Advocates). IBM Research's MAMMAL AI Model Outperforms AlphaFold 3 in Drug Discovery Benchmarks.
[2] Cure Cancer With AI (Clinical Translators). A Plain-English Guide to IBM's Biomedical Foundation Model.
[3] arXiv (Computational Biologists). MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language.
[4] Hugging Face (Open-Source Advocates). ibm/biomed.omics.bl.sm.ma-ted-458m.
[5] ResearchGate (Computational Biologists). MAMMAL: A foundation model for cross-modal learning in drug discovery.
[6] DVNX (Structural Biologists). MAMMAL multi-modal AI model outperforms AlphaFold 3 in drug discovery benchmarks.
[7] Factlen Editorial Team (Computational Biologists). Synthesis by Factlen editorial team.
