Robot Dexterity · Explainer · Jun 12, 2026, 10:06 AM · 4 min read

NVIDIA and RLWRLD Launch DexBench, an Open-Source Standard for Measuring Robot Dexterity

A new open-source benchmark aims to give the robotics industry a universal yardstick for measuring how well humanoid robots use their hands, bridging the gap between simulation and factory floors.

By Factlen Editorial Team

Physical AI Developers 40% · Open-Source Advocates 35% · Industrial Integrators 25%

Physical AI Developers: Companies building the foundation models and hardware, advocating for standardized benchmarks to prove their systems' capabilities.
Open-Source Advocates: Researchers and platforms pushing to democratize robotics by making training data and simulation tools freely available.
Industrial Integrators: Enterprise buyers and logistics operators who need reliable, vendor-agnostic metrics before deploying robots at scale.

What's not represented

· Labor unions concerned about the rapid deployment of highly dexterous robots in manufacturing.

Why this matters

As robots move from research labs to real-world factories, the ability to accurately measure and share how they use their hands will shape how quickly they can take over complex, fine-motor tasks. An open-source standard can also reduce vendor lock-in, allowing companies to mix and match robotic hardware while using the same training data.

Key points

RLWRLD and NVIDIA have launched DexBench, an open-source benchmark for evaluating humanoid robot dexterity.
The framework tests 18 atomic tasks across five core domains, including spatial precision and grasp diversity.
DexBench integrates directly with NVIDIA's Isaac Lab to help bridge the gap between virtual simulations and physical deployment.
The cost of collecting robot training data has plummeted 60% since 2024, accelerating open-source development.
Standardized metrics allow enterprise buyers to objectively compare robotic hardware from different vendors.

18: Atomic tasks in DexBench
5: Core dexterity domains evaluated
$23B: Humanoid investment in 2026
−60%: Drop in data collection costs since 2024

In 2026, the robotics industry has largely solved the problem of getting machines to walk. Humanoid robots can navigate factory floors, climb stairs, and maintain their balance when shoved. Yet, despite a record $23 billion poured into humanoid development this year, the industry's most advanced machines still struggle with a task most toddlers master: picking up an irregularly shaped object without dropping or crushing it.[2]

Dexterous manipulation—the ability to perform fine motor skills like precision assembly, sorting, and packaging—has emerged as the decisive frontier in physical artificial intelligence. But progress has been bottlenecked by a fundamental measurement problem. Until now, comparing the dexterity of one robot to another has been an "apples-to-oranges" exercise, with vendors using incompatible metrics and proprietary testing environments to claim superiority.[1][4][5]

That fragmentation is beginning to end. On Tuesday, Seoul-based physical AI startup RLWRLD and computing giant NVIDIA launched DexBench, a universal, open-source benchmark and data standard designed to evaluate humanoid robot dexterity. The initiative aims to give researchers, manufacturers, and enterprise buyers a shared yardstick for skills that have historically been measured ad hoc.[1][2][4]

DexBench establishes a rigorous evaluation framework built around five core domains: grasp diversity, spatial precision, temporal precision, contact precision, and context awareness. These domains are tested across 18 specific "atomic tasks" drawn directly from real-world industrial environments, ranging from opening cabinet doors to pouring liquids and manipulating delicate components.[1][2][3]

The DexBench framework evaluates robot hands across five core domains of dexterity.
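
One way to picture how per-task results could roll up into the five domains is the minimal sketch below. It is an illustration only, not DexBench's actual data format or API: the five domain names come from the article, while the `TaskResult` type, the one-domain-per-task mapping, the task names, and the trial counts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# The five domain names are taken from the article; everything else is assumed.
DOMAINS = (
    "grasp_diversity",
    "spatial_precision",
    "temporal_precision",
    "contact_precision",
    "context_awareness",
)


@dataclass(frozen=True)
class TaskResult:
    task: str       # hypothetical atomic-task name
    domain: str     # assumed: each task maps to exactly one domain
    successes: int  # number of successful trials
    trials: int     # number of attempted trials


def domain_scores(results):
    """Return the pooled success rate (successes / trials) per domain."""
    wins = defaultdict(int)
    total = defaultdict(int)
    for r in results:
        if r.domain not in DOMAINS:
            raise ValueError(f"unknown domain: {r.domain}")
        if r.trials <= 0 or not 0 <= r.successes <= r.trials:
            raise ValueError(f"invalid counts for task: {r.task}")
        wins[r.domain] += r.successes
        total[r.domain] += r.trials
    # Domains with no recorded trials are left out rather than scored as zero.
    return {d: wins[d] / total[d] for d in DOMAINS if total[d] > 0}


# Made-up numbers, purely for illustration.
results = [
    TaskResult("open_cabinet_door", "spatial_precision", 18, 20),
    TaskResult("pour_liquid", "temporal_precision", 12, 20),
    TaskResult("handle_delicate_component", "contact_precision", 9, 20),
    TaskResult("insert_connector", "spatial_precision", 14, 20),
]
scores = domain_scores(results)
```

Pooling trials within a domain, rather than averaging per-task rates, is a design choice made here for simplicity; a real benchmark may weight tasks differently.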

"Without a shared language for measuring and reproducing the precise movements of a robot hand, the commercial potential of dexterity AI remains constrained," said Junghee Ryu, CEO of RLWRLD. By establishing a common data standard, the initiative attempts to move the industry beyond isolated model development and toward a unified infrastructure.[1][4][5]

A critical feature of DexBench is its deep integration with NVIDIA's open Isaac Lab and Isaac Lab-Arena frameworks. This dual-validation setup allows developers to run the exact same evaluation suite in a virtual simulation and on physical hardware. Bridging this "sim-to-real gap" is essential, as models that perform flawlessly in controlled digital environments often fail when confronted with the unpredictable lighting, friction, and physics of the real world.[1][2][3][4]
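
The sim-to-real gap described above can be expressed as a simple difference once the same tasks have been scored in both settings. The sketch below assumes each task has a success rate between 0 and 1 from simulation and from physical trials; the function name, task names, and numbers are hypothetical and do not reflect Isaac Lab's or DexBench's actual interfaces.

```python
def sim_to_real_gap(sim_success, real_success):
    """Per-task gap: simulated success rate minus physical success rate."""
    # Only tasks evaluated in both settings can be compared.
    mismatched = set(sim_success) ^ set(real_success)
    if mismatched:
        raise ValueError(f"tasks not evaluated in both settings: {sorted(mismatched)}")
    return {task: sim_success[task] - real_success[task] for task in sim_success}


# Made-up success rates for two tasks the article mentions.
sim = {"open_cabinet_door": 0.95, "pour_liquid": 0.90}
real = {"open_cabinet_door": 0.85, "pour_liquid": 0.60}

gaps = sim_to_real_gap(sim, real)
worst_task = max(gaps, key=gaps.get)  # task where simulation overstates the most
```

A large positive gap on a task suggests the simulation is missing something that matters in the real world, such as friction or lighting variation.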

The push for standardization reflects a broader, rapid maturation of the open-source robotics ecosystem in 2026. Over the past year, the underlying economics of robot training have shifted dramatically. The cost of collecting high-quality teleoperation data—where human operators guide robots to demonstrate tasks—plummeted by 60% between 2024 and late 2025, dropping to roughly $118 per hour.[6][8]

The cost of collecting high-quality robot training data has fallen sharply, accelerating open-source development.
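
As a quick check on the scale of that decline, the short calculation below uses the two hourly rates cited in this article's timeline: $340 in early 2024 and $118 more recently. The helper name `percent_drop` is just an illustrative choice.

```python
def percent_drop(old, new):
    """Percentage decline from an old value to a new one."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return (old - new) / old * 100


# Hourly teleoperation data costs cited in the article's timeline.
drop = percent_drop(340, 118)
print(f"{drop:.1f}% decline")
```

Taking those two figures at face value, the implied decline is about 65%, in the same range as the roughly 60% cited above; the difference may come down to rounding or to a different baseline period.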

This collapse in data costs has fueled a surge in open-source development. Platforms like Hugging Face's LeRobot and datasets from institutions like the Silicon Valley Robotics Center have democratized access to sophisticated training tools. A capable robotic manipulation model that previously required enterprise-level compute and proprietary infrastructure can now be fine-tuned on a mid-range workstation using publicly available data.[6][8]

The open-source momentum is global. At the recent International Conference on Robotics and Automation (ICRA) in Vienna, Chinese robotics firm AGIBOT released its own full-stack open-source toolchain, including the AGIBOT WORLD dataset and Genie Sim 3.0 evaluation environment. Like DexBench, AGIBOT's initiative focuses on standardizing metrics and providing comparable results across simulation and physical testing.[7]

For enterprise buyers—ranging from automotive manufacturers to logistics giants—these open standards are not just academic exercises; they are commercial necessities. As robots transition from research novelties to deployed workforce assets, operators need reliable ways to compare hardware from different vendors. An open benchmark allows a logistics manager to objectively evaluate whether a sub-$10,000 robotic arm from a new startup can perform a sorting task as reliably as a legacy industrial system.[2][5][8]

Standardized benchmarks allow enterprise buyers to confidently deploy robots alongside human workers.

Furthermore, shared data standards mean that training data is less tightly locked to a specific piece of hardware. If a company spends thousands of hours training a robot to assemble a circuit board, open-source compatibility is meant to let the resulting "policy" be transferred to a different robot model with minimal friction. This portability is driving the adoption of Vision-Language-Action (VLA) models, which now power 40% of new robotic deployments.[2][8]

Despite the rapid progress in software and standardization, significant challenges remain. The performance gap between closed commercial pilots and reproducible public benchmarks is still wide, particularly for long-horizon tasks that require multiple sequential steps. While pick-and-place operations are largely solved, maintaining reliability over extended, complex workflows in chaotic environments continues to test the limits of current foundation models.[6][7]

Ultimately, initiatives like DexBench and AGIBOT's open toolchains signal that the humanoid robotics industry is entering its deployment phase. By replacing proprietary claims with verifiable, open-source yardsticks, the sector is building the necessary infrastructure to scale physical AI out of the laboratory and into the global supply chain.[3][4][5][7]

How we got here

Early 2024
Teleoperation data collection costs average $340 per hour, limiting open-source dataset growth.
April 2025
Hugging Face acquires Pollen Robotics, signaling a major push into open-source hardware.
March 2026
The cost of high-quality teleoperation data falls to $118 per hour, fueling a surge in open-source model training.
June 2026
RLWRLD and NVIDIA launch DexBench to standardize dexterity evaluation across the industry.

Viewpoints in depth

Physical AI Developers

For startups like RLWRLD and giants like NVIDIA, the lack of a universal benchmark has been a commercial bottleneck. When every robotics company uses its own proprietary tests to claim its robot has the 'best' hands, enterprise buyers remain skeptical. By creating an open standard like DexBench, these developers hope to replace marketing claims with verifiable data. They argue that integrating these benchmarks directly into simulation engines like Isaac Lab will drastically accelerate the training loop, allowing the entire industry to iterate on hand designs and control policies faster.

Open-Source Advocates

The open-source community views standardized benchmarks as the key to breaking the monopoly of heavily funded tech giants. Organizations like the Silicon Valley Robotics Center and platforms like Hugging Face argue that robotics should follow the trajectory of software engineering, where shared libraries and open datasets drive collective progress. They emphasize that standardizing how data is formatted and measured allows a breakthrough in one university lab to be instantly tested and adopted by developers worldwide, preventing duplicated effort and lowering the barrier to entry for new startups.

Industrial Integrators

For the companies actually buying and deploying these robots—such as automotive OEMs and warehouse operators—the priority is reliability and interoperability. Industrial integrators are frustrated by 'vendor lock-in,' where a robot trained for a specific task cannot be easily replaced by a competitor's machine without starting the training process from scratch. They view open standards as a critical insurance policy. If dexterity metrics and data formats are standardized, buyers can confidently mix and match hardware from different manufacturers, knowing the software policies will transfer seamlessly.

What we don't know

Whether legacy industrial robotics companies will adopt open-source benchmarks or stick to proprietary metrics.
How quickly models trained on DexBench tasks in simulation will adapt to the unpredictable edge cases of live factory floors.

Key terms

Dexterous Manipulation: The ability of a robot to perform fine-grained, complex tasks with its hands, such as precision assembly or sorting.
Sim-to-Real Gap: The difference in performance when an AI model trained in a virtual simulation is deployed on a physical robot in the real world.
Vision-Language-Action (VLA) Model: An AI architecture that processes visual inputs and text commands to directly output physical movements for a robot.
Teleoperation Data: Training data collected by having a human remotely control a robot to demonstrate how a task should be performed.

Frequently asked

Why is robot dexterity so difficult to achieve?

While robots excel at rigid, repetitive motions, human-like hands must adapt to unpredictable shapes, weights, and textures in real time, which is computationally complex.

What exactly does DexBench measure?

It evaluates five domains—grasp diversity, spatial precision, temporal precision, contact precision, and context awareness—across 18 specific industrial tasks.

How does this help open-source robotics?

By providing a shared data standard and evaluation framework, researchers worldwide can compare models accurately and share training data without hardware lock-in.

Sources

[1] PR Newswire (Physical AI Developers): RLWRLD Launches DexBench Initiative in Collaboration with NVIDIA
[2] The Robotics Media (Physical AI Developers): RLWRLD, NVIDIA Launch DexBench
[3] News1 (Industrial Integrators): RealWorld (RLWRLD) joins forces with NVIDIA to standardize humanoid hands
[4] Robotics and Automation News (Industrial Integrators): RLWRLD, Nvidia launch initiative to develop next-generation industry standards
[5] Brief Glance (Physical AI Developers): The New Gold Standard: How NVIDIA and a Startup Are Defining Robot Dexterity
[6] Tech Times (Open-Source Advocates): The Open Source Robot Learning Stack Becomes Production-Grade
[7] The Robot Report (Open-Source Advocates): AGIBOT releases full-stack toolchain for robot validation
[8] Silicon Valley Robotics Center (Open-Source Advocates): SVRC Robot Benchmarks 2026
