NVIDIA Unveils Nemotron 3 Open Models to Power Multi-Agent AI Systems

NVIDIA on Monday announced the NVIDIA Nemotron 3 family of open models, datasets and libraries aimed at supporting the development of transparent and efficient multi-agent AI systems across industries.

The Nemotron 3 lineup includes Nano, Super and Ultra models built on a hybrid latent mixture-of-experts (MoE) architecture, which NVIDIA says is designed to reduce inference costs, limit context drift and improve coordination among multiple AI agents.

“Open innovation is the foundation of AI progress,” NVIDIA founder and CEO Jensen Huang said. “With Nemotron, we’re transforming advanced AI into an open platform that gives developers the transparency and efficiency they need to build agentic systems at scale.”

Of the three models, only Nemotron 3 Nano is available immediately. It is a 30-billion-parameter model that activates up to 3 billion parameters per token and is optimised for low-cost inference use cases such as software debugging, summarisation and AI assistants. NVIDIA said the model delivers up to four times higher token throughput than Nemotron 2 Nano and reduces reasoning token generation by up to 60%.

It is available on Hugging Face and through inference providers such as Baseten, DeepInfra, Fireworks, FriendliAI, OpenRouter and Together AI. The model is also offered as an NVIDIA NIM microservice for deployment on NVIDIA-accelerated infrastructure.

Nemotron 3 Nano will also be available on AWS via Amazon Bedrock and supported on multiple cloud platforms in the coming months.

Nemotron 3 Super, a roughly 100-billion-parameter model, is designed for multi-agent applications requiring low latency, while Nemotron 3 Ultra, with about 500 billion parameters, is intended for deep reasoning and long-horizon planning tasks.

Both Super and Ultra use NVIDIA’s 4-bit NVFP4 training format on Blackwell GPUs to reduce memory requirements. These models are expected to be available in the first half of 2026.
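
To give a sense of scale for the memory claim, the back-of-the-envelope sketch below estimates the space needed to hold model weights alone at 4 bits per parameter versus 16 bits. The parameter counts are the approximate figures quoted above; the 16-bit baseline and the function name `weight_memory_gb` are assumptions made for illustration. The estimate ignores the per-block scaling factors that 4-bit formats carry, as well as activations, optimiser state and caches, so real requirements will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for weights only, in gigabytes (10^9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


# Approximate parameter counts taken from the article.
SUPER_PARAMS = 100e9
ULTRA_PARAMS = 500e9

for name, params in [("Super", SUPER_PARAMS), ("Ultra", ULTRA_PARAMS)]:
    four_bit = weight_memory_gb(params, 4)
    sixteen_bit = weight_memory_gb(params, 16)  # assumed baseline for comparison
    print(f"{name}: ~{four_bit:.0f} GB at 4 bits vs ~{sixteen_bit:.0f} GB at 16 bits")
```

Under these simplifying assumptions, storing weights at 4 bits takes a quarter of the space of a 16-bit representation.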

The launch comes as companies move beyond single AI chatbots toward collaborative agent-based systems, where multiple models work together on complex workflows.

According to NVIDIA, Nemotron 3 allows developers to route tasks between frontier proprietary models and open Nemotron models within the same workflow to balance reasoning capability and cost efficiency.
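
The routing idea can be illustrated with a minimal sketch. This is not NVIDIA's or any vendor's implementation: the model identifiers, the `needs_deep_reasoning` flag and the prompt-size threshold are all hypothetical, chosen only to show how a workflow might send routine work to a cheaper open model and reserve a proprietary model for harder tasks.

```python
from dataclasses import dataclass

# Hypothetical model identifiers, for illustration only.
OPEN_MODEL = "open-nemotron-model"
PROPRIETARY_MODEL = "frontier-proprietary-model"


@dataclass(frozen=True)
class Route:
    model: str
    reason: str


def route_task(
    prompt: str,
    needs_deep_reasoning: bool,
    max_open_prompt_chars: int = 8000,  # assumed budget, not a real limit
) -> Route:
    """Pick a model for a task using two simple, assumed heuristics."""
    if needs_deep_reasoning:
        return Route(PROPRIETARY_MODEL, "task flagged as needing deep reasoning")
    if len(prompt) > max_open_prompt_chars:
        return Route(PROPRIETARY_MODEL, "prompt exceeds the assumed size budget")
    return Route(OPEN_MODEL, "routine task, prefer lower cost")


if __name__ == "__main__":
    print(route_task("Summarise this build log.", needs_deep_reasoning=False))
    print(route_task("Plan a multi-step migration.", needs_deep_reasoning=True))
```

A production router would more likely rely on a learned classifier or measured cost and quality data than on fixed rules like these.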

NVIDIA said the Nemotron 3 family also aligns with its sovereign AI strategy, allowing governments and enterprises to deploy models tailored to local data, regulations and policy requirements. Organisations across Europe and South Korea are among those adopting the open models, the company said.

Several enterprise customers and partners, including Accenture, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys and Zoom, are integrating Nemotron models into AI workflows spanning manufacturing, cybersecurity, software development and communications.

Perplexity CEO Aravind Srinivas said the company is using Nemotron within its agent routing system to optimise performance. “We can direct workloads to fine-tuned open models like Nemotron 3 Ultra or use proprietary models when tasks require it,” he said.

Alongside the models, NVIDIA released three trillion tokens of pretraining, post-training and reinforcement learning datasets, including an Agentic Safety Dataset for evaluating multi-agent systems. The company also open-sourced NeMo Gym, NeMo RL and NeMo Evaluator to support training, customisation and evaluation of agentic AI.
