NVIDIA Releases Physical AI-Focused Open-Source AI Models

NVIDIA has released Cosmos 3, a new open-source foundation model for physical AI that combines physical reasoning, world generation and action generation within a single architecture.

The company is making Cosmos 3 Nano (8B parameters) and Cosmos 3 Super (32B parameters) available, along with training scripts, deployment tools, model checkpoints, and six synthetic datasets for robotics, autonomous driving, warehouse automation, and other physical AI applications.

The release consolidates capabilities that were previously spread across separate Cosmos models. Cosmos 3 uses a Mixture-of-Transformers (MoT) architecture built around two interconnected components: a vision-language ‘Reasoner’ tower and a diffusion-based ‘Generator’ tower.

The Reasoner processes multimodal inputs, including text, images, videos, audio and actions, to understand physical environments, while the Generator produces future observations and action sequences conditioned on that understanding.

According to NVIDIA, the architecture is designed to eliminate the need for orchestration across multiple models and inference pipelines.

The Reasoner can operate independently for perception and analysis tasks, while generation workloads activate both towers, allowing the model to combine scene understanding with predictive world modelling and action generation.
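The split between the two towers can be illustrated with a minimal sketch. This is not NVIDIA's API: the task names and the `active_towers` helper are hypothetical, chosen only to show the routing rule described above, where reasoning-only tasks use the Reasoner and generation tasks use both towers.

```python
from typing import Tuple

# Hypothetical task labels for illustration; Cosmos 3 does not necessarily
# expose tasks under these names.
REASONING_TASKS = {"perception", "video_reasoning"}
GENERATION_TASKS = {"text_to_image", "video_prediction", "action_generation"}


def active_towers(task: str) -> Tuple[str, ...]:
    """Return which towers a task would activate under the described design."""
    if task in REASONING_TASKS:
        # Perception and analysis run on the Reasoner alone.
        return ("reasoner",)
    if task in GENERATION_TASKS:
        # Generation conditions the Generator on the Reasoner's understanding.
        return ("reasoner", "generator")
    raise ValueError(f"unknown task: {task}")
```

Calling `active_towers("video_prediction")` in this sketch returns both towers, while `active_towers("perception")` returns only the Reasoner.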

The company is releasing two model sizes targeting different deployment environments. Cosmos 3 Nano, with 8 billion parameters, is designed for workstation-class systems, including NVIDIA RTX PRO 6000 GPUs, and for real-time robotics applications.

Cosmos 3 Super, at 32 billion parameters, targets datacenter deployments on Hopper and Blackwell GPU platforms for synthetic data generation and large-scale physical reasoning workloads. It supports multiple input-output combinations, covering text-to-image generation, video prediction, video reasoning, action-conditioned world modelling and robot policy learning.

NVIDIA positions the model as a foundation for applications including robotic manipulation, autonomous driving systems, warehouse monitoring, smart spaces and embodied AI agents.

Alongside the models, NVIDIA is open-sourcing six synthetic datasets through Hugging Face. The datasets cover embodied robot scenes, physical interaction simulations, spatial reasoning tasks, digital human environments, autonomous driving scenarios and warehouse operations.

Some datasets include physics annotations, such as object velocities, centre-of-mass displacements, and semantic segmentation labels, intended for post-training and evaluation workflows.

NVIDIA is also releasing post-training recipes covering supervised fine-tuning and action-oriented training. The workflows allow developers to adapt Cosmos 3 to domain-specific datasets and use cases, including forward-dynamics prediction, inverse-dynamics modelling, and policy generation for robotics systems.
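To make the distinction between forward and inverse dynamics concrete, here is a toy sketch using a one-dimensional point mass. It is a hand-written physics example, not a learned model and not part of NVIDIA's recipes; the function names, the state layout (position, velocity) and the fixed time step are all assumptions made for illustration.

```python
from typing import Tuple

State = Tuple[float, float]  # (position, velocity), an assumed toy state


def forward_dynamics(state: State, action: float, dt: float) -> State:
    """Predict the next state from the current state and an action.

    The action is treated as an acceleration (semi-implicit Euler step).
    """
    position, velocity = state
    next_velocity = velocity + action * dt
    next_position = position + next_velocity * dt
    return (next_position, next_velocity)


def inverse_dynamics(state: State, next_state: State, dt: float) -> float:
    """Recover the action that explains a transition between two states."""
    _, velocity = state
    _, next_velocity = next_state
    return (next_velocity - velocity) / dt
```

In this toy setting, forward dynamics answers "what happens if I take this action?", and inverse dynamics answers "which action produced this change?". A policy would go one step further and choose the action given only the current state and a goal.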

For deployment, Cosmos 3 is available through NVIDIA NIM microservices. The initial release includes the Cosmos 3 Reasoner NIM, while a Generator NIM is planned.

NVIDIA also introduced Cosmos Human Evaluation (HUE), an open-source benchmark framework that evaluates generated videos using binary fact-verification questions across semantic alignment, physical laws, geometric reasoning and visual integrity.

The framework is intended to provide a more granular assessment of physical AI video generation systems than existing leaderboard-based evaluations.
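As a rough illustration of how binary fact-verification answers could be rolled up per category, consider the sketch below. The article does not describe HUE's actual scoring procedure, so the per-category pass rate used here is an assumption, and `category_pass_rates` is a hypothetical helper rather than part of the framework.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# The four evaluation areas named for the benchmark.
CATEGORIES = (
    "semantic alignment",
    "physical laws",
    "geometric reasoning",
    "visual integrity",
)


def category_pass_rates(answers: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Aggregate (category, passed) pairs into a pass rate per category.

    Assumed scoring rule: fraction of yes/no questions answered 'yes'.
    Categories with no questions are omitted from the result.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in answers:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[category] += 1
        passes[category] += int(passed)
    return {c: passes[c] / totals[c] for c in CATEGORIES if totals[c]}
```

Reporting one rate per category, rather than a single blended score, is what would make such an assessment more granular than a leaderboard rank.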

According to NVIDIA, Cosmos 3 leads its parameter classes on VANTAGE-Bench and ranks at or near the top of several public physical AI and video-generation benchmarks, including PAI-Bench, R-Bench, Physics-IQ and RoboLab.

The company also said Cosmos 3 currently ranks as the leading open-source model on selected image and video-generation leaderboards tracked by Artificial Analysis.
