What is the Nemotron family?

NVIDIA\u2019s family of open-weight LLMs. The 2026 lineup includes Llama 3.1 Nemotron Ultra 253B (flagship reasoning), Llama 3.3 Nemotron Super 49B (mid-size), Llama 3.2 Nemotron Mini 4B (edge), and Llama 3 Nemotron Nano 8B (small). Most are finetunes of Llama base models with NVIDIA reasoning, instruction, and tool-use training added.

What is NVIDIA Cosmos?

A family of world foundation models released January 2025 for robotics and autonomous systems. Cosmos generates physically realistic video from text or image prompts and reasons over physical-world dynamics. Used by robotics labs (1X, Agility, Skild AI, Physical Intelligence) for synthetic data generation and policy training.

Is NV-Embed-v2 the best embedding model?

On the MTEB v2 English leaderboard among open-weight embedders, NV-Embed-v2 is at the top with approximately 72.3 percent average. However, the CC-BY-NC licence restricts commercial deployment. For commercial use, BGE-M3 (MIT) or Qwen3-Embedding-8B (Tongyi Qianwen) are the dominant alternatives.

Can I deploy Nemotron models commercially?

Most Nemotron models use the NVIDIA Open Model Licence which permits commercial use subject to compliance with the underlying Llama Community Licence (where applicable). The licence is permissive for typical commercial deployment but has some restrictions on competitive AI service use.

Why does NVIDIA release open-weight models?

Strategic tie-in to GPU consumption. NVIDIA models are optimised for NIM microservices, TensorRT-LLM serving, and Triton Inference Server, creating a tight integration between open-weight model release and NVIDIA GPU and software stack utilisation. The open releases position NVIDIA as the default infrastructure for the open-weight AI ecosystem.

NVIDIA Nemotron and Cosmos Open Releases 2026

NVIDIA ships open-weight models as a software-and-services adjacency to the GPU business. The 2026 NVIDIA open-weight catalogue includes the Nemotron family (Nano, Mini, Super, Ultra variants), the Cosmos world foundation models for robotics, the Llama-Nemotron specialised finetune family, Eagle vision-language models, plus dozens of specialised models for synthetic data, agentic workflows, and ASR. This page consolidates the catalogue.

Key Findings

Llama 3.1 Nemotron Ultra 253B is NVIDIA\u2019s flagship 2026 open-weight model, a 253B-parameter dense reasoning model derived from Llama 3.1 with strong AIME and GPQA scores.
Llama 3.3 Nemotron Super 49B is NVIDIA\u2019s most-deployed mid-size open model with strong instruction following and tool use, competing directly with Qwen 32B and Llama 3.1 70B.
NVIDIA Cosmos (released January 2025) is a family of world foundation models for robotics and autonomous systems, generating physically realistic video from text or image prompts plus reasoning over physical-world dynamics.
Eagle 2 and Eagle 2.5 are NVIDIA\u2019s vision-language model families covering general VLM and document understanding workloads with open weights.
The NV-Embed-v2 embedding model holds the top position on the MTEB v2 leaderboard among open-weight English embedders; combined with related Nemotron variants for data generation, NVIDIA has a complete open AI primitive stack.

NVIDIA Open-Weight Model Catalogue (May 2026)

Family	Variant	Capability	License
Nemotron 4 / Nemotron 5	340B base / 15B / various	General-purpose	NVIDIA Open Model Licence
Llama 3.1 Nemotron Ultra 253B	~253B	Reasoning, agentic	NVIDIA Open Model Licence + Llama 3.1 Community
Llama 3.3 Nemotron Super 49B	~49B	General-purpose mid-size	NVIDIA Open Model Licence + Llama 3.3 Community
Llama 3.2 Nemotron Mini 4B	~4B	Edge-deploy text	NVIDIA Open Model Licence + Llama 3.2 Community
Llama 3 Nemotron Nano 8B	~8B	General-purpose small	NVIDIA Open Model Licence + Llama 3 Community
Cosmos Predict 1.0 (14B / 7B)	~14B / ~7B	World model video generation	NVIDIA Open Model Licence
Cosmos Reason 1.0	~7B	Physical-world reasoning	NVIDIA Open Model Licence
Cosmos Transfer 1.0	~7B	Video-to-video transformation	NVIDIA Open Model Licence
Eagle 2 (varies)	~varies	Vision-language	NVIDIA Open Model Licence
NV-Embed-v2	~7B	Embedding	CC-BY-NC
Canary 1B	~1B	ASR English-focused	CC-BY-4.0
Parakeet TDT 1.1B	~1.1B	ASR streaming	CC-BY-4.0
NVLM 1.0 72B	~72B	Vision-language	CC-BY-NC + Research
Nemotron-CC pretraining dataset	~6.3T tokens	Open pretraining data	Multi-licence (curated open)

Cosmos Detail

Cosmos Variant	Function
Cosmos Predict	Text or image to video; world-model future-state prediction
Cosmos Reason	Reasoning over physical-world inputs including video
Cosmos Transfer	Video-to-video transformation with control conditioning
Cosmos Tokenizer	Visual tokenisation for downstream training
Cosmos Curator	Data curation for robotics training

Cosmos is positioned for robotics, autonomous vehicles, and embodied AI use cases where physically-realistic world simulation matters more than generic video aesthetic quality.

Strategic Context

Three patterns define NVIDIA\u2019s 2026 open-weight strategy. First, the GPU-platform tie-in: NVIDIA releases open models that are optimised for NVIDIA NIM microservices, TensorRT-LLM serving, and Triton Inference Server, creating a tight integration between model release and GPU consumption. Second, the specialised model catalogue: NVIDIA releases dozens of specialised models for ASR, embedding, reranking, synthetic data, and agent workflows rather than competing on frontier language model benchmarks. Third, the Cosmos world model bet: by releasing world foundation models open-weight, NVIDIA positions the robotics and physical AI ecosystem to compound on NVIDIA-platform tools.

Brand Visibility Implications

NVIDIA open-weight releases are a high-citation procurement category for AI infrastructure decisions. AI assistant queries about "NVIDIA Nemotron vs Llama", "Cosmos world model robotics", "NV-Embed-v2 vs OpenAI", and similar terms drive direct procurement decisions. Brands selling AI infrastructure, robotics AI, NIM-adjacent products, and NVIDIA partner services face strong AI-mediated discovery surface for this category.

Methodology

Model data compiled from NVIDIA Hugging Face disclosures and NVIDIA developer documentation through 23 May 2026. Updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on NVIDIA Nemotron, Cosmos, and adjacent open-weight queries across ChatGPT, Claude, Gemini, and Perplexity. For AI infrastructure brands, robotics AI vendors, NIM-adjacent products, and NVIDIA partner services, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.