Allen AI's fully-open language model family, first released in late 2024 and updated through 2025-2026. OLMo 2 includes 1B, 7B, 13B, and 32B variants. Weights, training data (Dolma), and training recipes are all public under Apache 2.0. OLMo 2 32B Instruct is competitive with Llama 3.1 70B on many benchmarks at under half the parameter count.

How is Molmo different from Qwen2.5-VL?

Molmo is fully open: the weights, the training data (PixMo), and the recipe are all public under Apache 2.0. Qwen2.5-VL has open weights, but its training data is not released. Molmo trails Qwen2.5-VL on benchmarks (~54 vs ~70 MMMU for the 72B class) but is the reference choice for reproducible research.

Ai2's state-of-the-art open post-training recipe, including SFT, DPO, and RLVR (RL with Verifiable Rewards) stages. Tulu 3 applied to Llama backbones produces instruction-following models with fully reproducible training. The data (939k SFT + 270k DPO) and code are public.

Should I use Ai2 models in production?

For most production workloads, Qwen3 and Llama 4 outperform Ai2 models on benchmarks. Ai2 models are the right choice when you need reproducibility or regulatory transparency, or when you want to study or modify the training data. Ai2 models are also frequently used as research baselines and for fine-tuning experiments.

Is Ai2 funded sustainably?

Yes. Ai2 was founded in 2014 by Paul Allen and remains substantially funded by the Allen estate. Unlike commercial AI labs, Ai2 does not need to balance openness against revenue pressure, which is why the lab continues releasing complete training data and recipes when commercial labs have restricted access.

Allen AI Model Lineage 2026: OLMo, Molmo, Tulu

The Allen Institute for AI (Ai2) in Seattle is the world's leading fully-open AI lab, releasing model families with open weights, open training data, and open training code. The 2026 Ai2 lineage spans OLMo 2 (language), Molmo (vision-language), Tulu 3 (post-training recipe), and SciFive (scientific). This page consolidates the family tree, the licensing, and the impact on research reproducibility.

Key Findings

OLMo 2 (released late 2024 with continued updates through 2025-2026) is the strongest fully-open language model family, with 1B, 7B, 13B, and 32B variants. All weights, training data (Dolma), and training recipes are public under Apache 2.0.
Molmo (released September 2024 with continued updates) is the strongest fully-open vision-language model family, with 1B, 7B-O, 7B-D, and 72B variants. It is trained on PixMo, which is also released openly.
Tulu 3 (released late 2024) is Ai2's state-of-the-art post-training recipe, with full SFT, DPO, and RL data plus training code. Tulu 3 8B and 70B applied to Llama backbones produce strong instruction-following models with fully reproducible training.
SciFive and Ai2 scientific models continue Ai2's focus on scientific literature understanding, plus ScholarQA and Semantic Scholar AI tooling.
Ai2's broader mission positions it as the academic-research counterweight to closed-lab frontier development: every release ships with full data and code, making it the default citation for AI research reproducibility studies.

Ai2 Model Family (May 2026)

Model	Parameters	Modality	License
OLMo 2 32B	~32B	Text	Apache 2.0
OLMo 2 13B	~13B	Text	Apache 2.0
OLMo 2 7B	~7B	Text	Apache 2.0
OLMo 2 1B	~1B	Text	Apache 2.0
Molmo 72B	~72B	Vision-Language	Apache 2.0
Molmo 7B-D	~7B	Vision-Language	Apache 2.0
Molmo 7B-O	~7B	Vision-Language	Apache 2.0
Molmo 1B	~1B	Vision-Language	Apache 2.0
Tulu 3 70B	~70B (Llama base)	Text instruction	Apache 2.0 (recipe); Llama Community (weights)
Tulu 3 8B	~8B (Llama base)	Text instruction	Apache 2.0 (recipe); Llama Community (weights)
OLMoE 7B-A1B	~7B MoE (~1B active)	Text	Apache 2.0
SciFive	Varies	Scientific text	Apache 2.0
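
For deployment planning, the parameter counts in the table translate into a rough lower bound on weight memory. The sketch below is an illustration only: it assumes the nominal counts listed above and a fixed number of bytes per parameter (2 for bfloat16), and it ignores activations, the KV cache, and framework overhead. The helper name `weight_memory_gib` and the dictionaries are hypothetical, not part of any Ai2 tooling.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: nominal parameter counts from the table; real checkpoints differ slightly.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

# Nominal sizes in billions of parameters (illustrative subset of the table).
NOMINAL_PARAMS_B = {
    "OLMo 2 32B": 32,
    "OLMo 2 13B": 13,
    "OLMo 2 7B": 7,
    "OLMo 2 1B": 1,
    "Molmo 72B": 72,
}


def weight_memory_gib(params_billions: float, dtype: str = "bfloat16") -> float:
    """Return the approximate memory needed for the weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 2**30


if __name__ == "__main__":
    for name, size in NOMINAL_PARAMS_B.items():
        print(f"{name}: ~{weight_memory_gib(size):.1f} GiB of weights in bfloat16")
```

For a mixture-of-experts model such as OLMoE, the total parameter count is the relevant input, since all experts must be resident in memory even though only a fraction is active per token.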

OLMo 2 Benchmarks

Model	MMLU	GSM8K	Notes
OLMo 2 32B Instruct	~73.3	~78.4	Competitive with Llama 3.1 70B at under half the size
OLMo 2 13B Instruct	~63.0	~67.5	Strong mid-size
OLMo 2 7B Instruct	~57.4	~58.6	Above Llama 3.1 8B on many benchmarks
OLMo 2 1B Instruct	~50.3	~36.4	Strongest fully-open 1B
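
The size comparison in the first row can be checked with simple arithmetic. The snippet below assumes the nominal parameter counts implied by the model names (32B and 70B); the function name `size_ratio` is hypothetical.

```python
def size_ratio(small_params_b: float, large_params_b: float) -> float:
    """Return the smaller model's parameter count as a fraction of the larger one's."""
    return small_params_b / large_params_b


if __name__ == "__main__":
    # Nominal counts in billions, taken from the model names.
    ratio = size_ratio(32, 70)
    print(f"OLMo 2 32B has about {ratio:.0%} of the parameters of Llama 3.1 70B")
```

On these nominal counts the ratio is a little under one half.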

Molmo Benchmarks

Model	MMMU	OCRBench	Notes
Molmo 72B	~54.1	~705	Strongest fully-open VLM
Molmo 7B-D	~50.6	~688	Strong mid-size VLM
Molmo 7B-O	~48.7	~644	OLMo-based
Molmo 1B	~38.9	~516	Smallest variant

Tulu 3 Recipe Components

Component	Description
SFT Data	Approximately 939k high-quality instruction-following examples
DPO Data	Approximately 270k preference pairs
RLVR (Reinforcement Learning with Verifiable Rewards)	Math and code RL with rule-based reward signals
Training Code	Public on Ai2 GitHub
Evaluation Suite	Public Tulu Eval framework
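
The RLVR row can be illustrated with a toy rule-based reward. This is a minimal sketch and not Ai2's implementation: it assumes that a math completion ends with its final numeric answer and that the reward is binary (1.0 when the last number matches the reference, 0.0 otherwise). The function names and the answer-extraction regex are hypothetical.

```python
import re
from typing import Optional

# Matches integers and decimals, optionally signed, with optional thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(completion: str) -> Optional[str]:
    """Return the last number in the completion with commas stripped, or None."""
    matches = _NUMBER.findall(completion)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def verifiable_reward(completion: str, gold_answer: str) -> float:
    """Binary reward: 1.0 if the final number equals the reference answer, else 0.0."""
    predicted = extract_final_number(completion)
    if predicted is None:
        return 0.0
    return 1.0 if float(predicted) == float(gold_answer) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("6 * 3 = 18, so the answer is 18.", "18"))
    print(verifiable_reward("I believe the answer is 17", "18"))
```

Because the reward comes from a deterministic check rather than a learned reward model, it can be recomputed by anyone with the prompts and reference answers, which fits the reproducibility goal of the recipe.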

Strategic Context

Three patterns shape Ai2's 2026 position. First, Ai2 is the only AI lab that releases complete training data and recipes at frontier-adjacent quality. Other prominent "open" model developers (DeepSeek, Alibaba's Qwen team, Meta's Llama team) ship weights without training data. This gives Ai2 the reference position for research reproducibility studies. Second, the funding model is durable: Ai2 is endowed by the Allen estate, so it does not face the commercial pressure that pushed Mistral, Stability AI, and others to restrict open releases. Third, Ai2 is increasingly the home for AI policy research: its AI Policy & Governance work and ScholarQA tooling position it as the institutional voice for openness in AI.

Brand Visibility Implications

Allen AI is a high-citation institution in AI journalism, particularly on openness, reproducibility, and policy topics. AI assistant queries about "fully open LLM", "OLMo vs Llama", "open AI research", and similar terms drive sustained traffic. Brands selling AI research tools, AI evaluation, AI training infrastructure, and AI policy services have a large AI-mediated discovery surface in this category.

Methodology

Model and benchmark data compiled from Ai2 model card disclosures, peer-reviewed publications, and the Ai2 GitHub repositories through 23 May 2026. Updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on Allen AI and fully-open model queries across ChatGPT, Claude, Gemini, and Perplexity. For AI research tool vendors, AI evaluation brands, AI training infrastructure firms, and AI policy services, the platform identifies the prompts driving research-traffic patterns and the gaps where new content unlocks share of voice.