Research

NVIDIA Nemotron and Cosmos Open Releases 2026

NVIDIA 2026 open-weight releases: Nemotron family (Nano, Mini, Super, Ultra), Cosmos world foundation models, Llama-Nemotron specialised finetunes, Eagle vision models, plus the broader NVIDIA AI catalog.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 2026

NVIDIA ships open-weight models as a software-and-services adjacency to the GPU business. The 2026 NVIDIA open-weight catalogue includes the Nemotron family (Nano, Mini, Super, Ultra variants), the Cosmos world foundation models for robotics, the Llama-Nemotron specialised finetune family, Eagle vision-language models, plus dozens of specialised models for synthetic data, agentic workflows, and ASR. This page consolidates the catalogue.

Key Findings

  1. Llama 3.1 Nemotron Ultra 253B is NVIDIA\u2019s flagship 2026 open-weight model, a 253B-parameter dense reasoning model derived from Llama 3.1 with strong AIME and GPQA scores.
  2. Llama 3.3 Nemotron Super 49B is NVIDIA\u2019s most-deployed mid-size open model with strong instruction following and tool use, competing directly with Qwen 32B and Llama 3.1 70B.
  3. NVIDIA Cosmos (released January 2025) is a family of world foundation models for robotics and autonomous systems, generating physically realistic video from text or image prompts plus reasoning over physical-world dynamics.
  4. Eagle 2 and Eagle 2.5 are NVIDIA\u2019s vision-language model families covering general VLM and document understanding workloads with open weights.
  5. The NV-Embed-v2 embedding model holds the top position on the MTEB v2 leaderboard among open-weight English embedders; combined with related Nemotron variants for data generation, NVIDIA has a complete open AI primitive stack.

NVIDIA Open-Weight Model Catalogue (May 2026)

FamilyVariantCapabilityLicense
Nemotron 4 / Nemotron 5340B base / 15B / variousGeneral-purposeNVIDIA Open Model Licence
Llama 3.1 Nemotron Ultra 253B~253BReasoning, agenticNVIDIA Open Model Licence + Llama 3.1 Community
Llama 3.3 Nemotron Super 49B~49BGeneral-purpose mid-sizeNVIDIA Open Model Licence + Llama 3.3 Community
Llama 3.2 Nemotron Mini 4B~4BEdge-deploy textNVIDIA Open Model Licence + Llama 3.2 Community
Llama 3 Nemotron Nano 8B~8BGeneral-purpose smallNVIDIA Open Model Licence + Llama 3 Community
Cosmos Predict 1.0 (14B / 7B)~14B / ~7BWorld model video generationNVIDIA Open Model Licence
Cosmos Reason 1.0~7BPhysical-world reasoningNVIDIA Open Model Licence
Cosmos Transfer 1.0~7BVideo-to-video transformationNVIDIA Open Model Licence
Eagle 2 (varies)~variesVision-languageNVIDIA Open Model Licence
NV-Embed-v2~7BEmbeddingCC-BY-NC
Canary 1B~1BASR English-focusedCC-BY-4.0
Parakeet TDT 1.1B~1.1BASR streamingCC-BY-4.0
NVLM 1.0 72B~72BVision-languageCC-BY-NC + Research
Nemotron-CC pretraining dataset~6.3T tokensOpen pretraining dataMulti-licence (curated open)

Cosmos Detail

Cosmos VariantFunction
Cosmos PredictText or image to video; world-model future-state prediction
Cosmos ReasonReasoning over physical-world inputs including video
Cosmos TransferVideo-to-video transformation with control conditioning
Cosmos TokenizerVisual tokenisation for downstream training
Cosmos CuratorData curation for robotics training

Cosmos is positioned for robotics, autonomous vehicles, and embodied AI use cases where physically-realistic world simulation matters more than generic video aesthetic quality.

Strategic Context

Three patterns define NVIDIA\u2019s 2026 open-weight strategy. First, the GPU-platform tie-in: NVIDIA releases open models that are optimised for NVIDIA NIM microservices, TensorRT-LLM serving, and Triton Inference Server, creating a tight integration between model release and GPU consumption. Second, the specialised model catalogue: NVIDIA releases dozens of specialised models for ASR, embedding, reranking, synthetic data, and agent workflows rather than competing on frontier language model benchmarks. Third, the Cosmos world model bet: by releasing world foundation models open-weight, NVIDIA positions the robotics and physical AI ecosystem to compound on NVIDIA-platform tools.

Brand Visibility Implications

NVIDIA open-weight releases are a high-citation procurement category for AI infrastructure decisions. AI assistant queries about "NVIDIA Nemotron vs Llama", "Cosmos world model robotics", "NV-Embed-v2 vs OpenAI", and similar terms drive direct procurement decisions. Brands selling AI infrastructure, robotics AI, NIM-adjacent products, and NVIDIA partner services face strong AI-mediated discovery surface for this category.

Methodology

Model data compiled from NVIDIA Hugging Face disclosures and NVIDIA developer documentation through 23 May 2026. Updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on NVIDIA Nemotron, Cosmos, and adjacent open-weight queries across ChatGPT, Claude, Gemini, and Perplexity. For AI infrastructure brands, robotics AI vendors, NIM-adjacent products, and NVIDIA partner services, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.

Frequently Asked Questions

NVIDIA\u2019s family of open-weight LLMs. The 2026 lineup includes Llama 3.1 Nemotron Ultra 253B (flagship reasoning), Llama 3.3 Nemotron Super 49B (mid-size), Llama 3.2 Nemotron Mini 4B (edge), and Llama 3 Nemotron Nano 8B (small). Most are finetunes of Llama base models with NVIDIA reasoning, instruction, and tool-use training added.
A family of world foundation models released January 2025 for robotics and autonomous systems. Cosmos generates physically realistic video from text or image prompts and reasons over physical-world dynamics. Used by robotics labs (1X, Agility, Skild AI, Physical Intelligence) for synthetic data generation and policy training.
On the MTEB v2 English leaderboard among open-weight embedders, NV-Embed-v2 is at the top with approximately 72.3 percent average. However, the CC-BY-NC licence restricts commercial deployment. For commercial use, BGE-M3 (MIT) or Qwen3-Embedding-8B (Tongyi Qianwen) are the dominant alternatives.
Most Nemotron models use the NVIDIA Open Model Licence which permits commercial use subject to compliance with the underlying Llama Community Licence (where applicable). The licence is permissive for typical commercial deployment but has some restrictions on competitive AI service use.
Strategic tie-in to GPU consumption. NVIDIA models are optimised for NIM microservices, TensorRT-LLM serving, and Triton Inference Server, creating a tight integration between open-weight model release and NVIDIA GPU and software stack utilisation. The open releases position NVIDIA as the default infrastructure for the open-weight AI ecosystem.

Track Your AI Visibility

See how your brand appears across ChatGPT, Claude, Perplexity, and other AI platforms. Start monitoring today.