Hugging Face Most-Downloaded Models May 2026

Live rankings of the most-downloaded models on Hugging Face in May 2026. Top overall, top text-generation LLMs, top embeddings, and what the Qwen dominance means for open-weight AI adoption.

By Ramanath, CTO & Co-Founder at Presenc AI · Last updated: May 2026

What the World Is Actually Downloading from Hugging Face in May 2026

Hugging Face is the dominant model registry for open-weight AI. The all-time download counters on its public model API are the most direct signal of which open-weight models developers are actually pulling into production, fine-tuning pipelines, and research workflows. This page ranks the most-downloaded models on Hugging Face as of May 14, 2026, broken into "everything" and "text-generation LLMs only." The result is sharply different from the consumer narrative around frontier closed models.

Top 15 Models on Hugging Face by All-Time Downloads (Any Task)

| Rank | Model | Task | Downloads | Likes |
|---|---|---|---|---|
| 1 | sentence-transformers/all-MiniLM-L6-v2 | Embeddings (sentence-similarity) | 259,230,702 | 4,786 |
| 2 | Qwen/Qwen3-VL-2B-Instruct | Multimodal (image-text-to-text) | 183,999,222 | 403 |
| 3 | google-bert/bert-base-uncased | Classic NLP (fill-mask) | 63,940,780 | 2,651 |
| 4 | google/electra-base-discriminator | Classic NLP | 52,284,659 | 105 |
| 5 | cross-encoder/ms-marco-MiniLM-L6-v2 | Re-ranking (text-ranking) | 50,880,509 | 238 |
| 6 | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | Multilingual embeddings | 47,885,224 | 1,226 |
| 7 | BAAI/bge-small-en-v1.5 | Embeddings | 42,635,548 | 458 |
| 8 | sentence-transformers/all-mpnet-base-v2 | Embeddings | 35,940,012 | 1,290 |
| 9 | openai/clip-vit-large-patch14 | Vision (zero-shot) | 32,486,672 | 2,010 |
| 10 | BAAI/bge-m3 | Embeddings (multilingual) | 24,970,361 | 2,997 |
| 11 | openai/clip-vit-base-patch32 | Vision | 21,704,784 | 934 |
| 12 | FacebookAI/xlm-roberta-base | Multilingual NLP | 20,698,741 | 827 |
| 13 | laion/clap-htsat-fused | Audio | 20,255,799 | 85 |
| 14 | FacebookAI/roberta-large | Classic NLP | 20,140,914 | 286 |
| 15 | Qwen/Qwen3-0.6B | Text-generation (LLM) | 18,989,268 | 1,241 |

Top 20 Text-Generation Models (LLMs Specifically)

| Rank | Model | Family | Downloads | Likes |
|---|---|---|---|---|
| 1 | Qwen/Qwen3-0.6B | Qwen | 18,989,268 | 1,241 |
| 2 | openai-community/gpt2 | GPT-2 (legacy) | 16,088,904 | 3,239 |
| 3 | Qwen/Qwen2.5-7B-Instruct | Qwen | 12,418,113 | 1,276 |
| 4 | Qwen/Qwen2.5-1.5B-Instruct | Qwen | 12,081,067 | 693 |
| 5 | Qwen/Qwen3-8B | Qwen | 11,735,972 | 1,087 |
| 6 | Qwen/Qwen3-4B-Instruct-2507 | Qwen | 10,991,777 | 841 |
| 7 | meta-llama/Llama-3.1-8B-Instruct | Llama | 9,751,974 | 5,825 |
| 8 | facebook/opt-125m | OPT (legacy) | 9,208,234 | 251 |
| 9 | Qwen/Qwen2.5-3B-Instruct | Qwen | 8,126,945 | 455 |
| 10 | meta-llama/Llama-3.2-1B-Instruct | Llama | 7,498,621 | 1,402 |
| 11 | openai/gpt-oss-20b | OpenAI (open release) | 7,304,172 | 4,604 |
| 12 | Qwen/Qwen3-32B | Qwen | 6,853,884 | 692 |
| 13 | Qwen/Qwen2.5-0.5B-Instruct | Qwen | 5,631,806 | 515 |
| 14 | openai/gpt-oss-120b | OpenAI (open release) | 4,566,280 | 4,772 |
| 15 | Qwen/Qwen3-4B | Qwen | 4,293,722 | 611 |
| 16 | deepseek-ai/DeepSeek-V3.2 | DeepSeek | 4,087,017 | 1,434 |
| 17 | deepseek-ai/DeepSeek-R1 | DeepSeek | 3,819,050 | 13,329 |
| 18 | Qwen/Qwen3-1.7B | Qwen | 3,535,359 | 465 |
| 19 | mistralai/Mistral-7B-Instruct-v0.2 | Mistral | 3,249,539 | 3,135 |
| 20 | meta-llama/Meta-Llama-3-8B | Llama | 3,135,302 | 6,532 |

Family Share of the Top 20 LLMs

| Family | Models in Top 20 | Combined Downloads |
|---|---|---|
| Qwen (Alibaba) | 10 | ~94,660,000 |
| Llama (Meta) | 3 | ~20,390,000 |
| GPT-OSS (OpenAI open release) | 2 | ~11,870,000 |
| DeepSeek | 2 | ~7,910,000 |
| Legacy (GPT-2, OPT-125m) | 2 | ~25,300,000 |
| Mistral | 1 | ~3,250,000 |

Seven Things the Rankings Tell You

  1. The single most-downloaded model on Hugging Face is not an LLM. sentence-transformers/all-MiniLM-L6-v2, an 80MB English-only embedding model, leads at 259M downloads. RAG and search infrastructure run on this one model more than on any LLM. The runner-up across all tasks (Qwen3-VL-2B-Instruct, multimodal) is the only model above 100M downloads besides MiniLM.
  2. Qwen owns the open-weight LLM ecosystem. Ten of the top 20 text-generation models on Hugging Face are Qwen variants, and their combined ~95M downloads exceed every other family combined. Meta Llama (3 models, ~20M combined) is a distant second in deployment, despite dominating the open-weight narrative in the Western press.
  3. OpenAI's open release (gpt-oss-20b + gpt-oss-120b) is gaining ground. Two models, 11.9M combined downloads, 9,376 combined likes. The 120B variant in particular has a high like-to-download ratio (4,772 likes on 4.6M downloads), suggesting strong quality reception. OpenAI's first major open-weight release is being treated as a serious option, not a stunt.
  4. DeepSeek-R1 is the most-liked open LLM by a wide margin. 13,329 likes on 3.8M downloads is roughly 4x the like-rate of the typical top-20 model. R1 is the only open-weight reasoning model that consumers and researchers cite by name as a frontier-grade option; the like-to-download ratio confirms that perception.
  5. BERT, GPT-2, and OPT are still in the top 20. google-bert/bert-base-uncased at 64M downloads, gpt2 at 16M, opt-125m at 9.2M. Six-to-eight-year-old models continue to do enormous volume because they are the default baselines in research code, tutorials, education, and lightweight fine-tuning pipelines.
  6. Chinese embeddings are competitive with Western embeddings. BAAI bge family (bge-small-en + bge-m3 + bge-large) collectively does ~83M downloads. sentence-transformers (all-MiniLM + paraphrase + all-mpnet) does ~343M. BAAI is the leading non-sentence-transformers embedding family on the registry.
  7. Vision is mostly CLIP. openai/clip-vit-large-patch14 (32M) and clip-vit-base-patch32 (22M) are the only vision-encoder models near the top of the all-task list, despite the rapid expansion of vision-language models like Qwen3-VL (which is classified as multimodal, not vision).
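The like-to-download comparison in points 3 and 4 is simple arithmetic over the tables above. A quick sketch (figures copied from the LLM table; the helper name is illustrative):

```python
# Likes per million all-time downloads, using figures from the LLM table.
models = {
    "deepseek-ai/DeepSeek-R1": (3_819_050, 13_329),
    "openai/gpt-oss-120b": (4_566_280, 4_772),
    "Qwen/Qwen3-0.6B": (18_989_268, 1_241),
}

def likes_per_million(downloads: int, likes: int) -> float:
    """Normalise likes by download volume so models of different ages compare."""
    return likes / downloads * 1_000_000

for name, (downloads, likes) in models.items():
    print(f"{name}: {likes_per_million(downloads, likes):.0f} likes per 1M downloads")
```

By this measure DeepSeek-R1 sits roughly 3x above gpt-oss-120b and over 50x above Qwen3-0.6B, which is why a raw likes column understates how differently these models are received.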

What This Means for AI Visibility

Open-weight model adoption shapes which models run inside agent stacks, on-device assistants, RAG retrievers, and custom fine-tunes. The Qwen-led ranking on Hugging Face suggests that any AI visibility programme assuming the Llama family dominates open-weight deployment is overestimating Meta and underestimating Alibaba. For Western brands optimising visibility across multi-language agents, the implication is that Chinese-language brand inclusion (in Qwen training corpora) matters more than visibility programmes typically weight it. For embeddings specifically, the dominance of all-MiniLM-L6-v2 and BAAI bge means that retrieval quality is largely a function of these few canonical embedding models, and content optimisation for retrieval should be tested against them rather than against frontier-vendor embedding APIs alone.

Methodology

Download and like counts were pulled from the public Hugging Face Hub API on May 14, 2026 (endpoints used: /api/models?sort=downloads&direction=-1&limit=30 for the all-task list, and the same query with filter=text-generation for the LLM list). Download counts are all-time, not monthly, which favours older models. Pipeline tags are HF's own classification (text-generation, sentence-similarity, image-text-to-text, etc.). Rankings are refreshed quarterly. Hugging Face download metrics include CI/CD pipeline pulls and mirror sync requests, so absolute numbers should be treated as a proxy for cumulative deployment intensity rather than a count of unique applications.

How Presenc AI Helps

Presenc AI monitors brand-mention rates across the major AI platforms whose downstream deployments draw from these open-weight base models. When a brand performs well on closed-frontier models but underperforms on agent stacks running Qwen or DeepSeek backends, the gap is traceable to training-data presence in the open-weight family. The Hugging Face rankings above are the input; brand-visibility-by-base-model is the output Presenc AI tracks for enterprise customers.

Frequently Asked Questions

What is the most-downloaded model on Hugging Face?
sentence-transformers/all-MiniLM-L6-v2, an 80MB English sentence-similarity embedding model, leads with approximately 259 million all-time downloads. It is the workhorse of RAG and semantic-search infrastructure across the web and exceeds every LLM's download count on the platform.

Which family dominates open-weight LLM downloads?
Qwen (Alibaba). Ten of the top 20 text-generation models on Hugging Face are Qwen variants, with a combined ~95 million downloads. Meta Llama is a distant second at three models totalling ~20 million. This is sharply different from the Western press narrative that frames Llama as the dominant open-weight family.

Isn't Llama the leading open-weight family?
Llama is the most-discussed family in the Western press, but on Hugging Face download metrics it trails Qwen by approximately 5x in deployment intensity among the top 20 text-generation models. Meta's consumer reach via Meta AI in WhatsApp and Instagram is larger than Qwen's consumer reach, but developer-facing open-weight downloads tell a different story.

Why is GPT-2 still in the top 20?
GPT-2 (released 2019) at 16 million downloads remains the default baseline in academic papers, tutorials, distillation pipelines, and lightweight fine-tuning demonstrations. It is small, well-understood, licence-permissive, and exists in every fine-tuning toolkit's examples. The same explanation applies to BERT (64M), RoBERTa (37M combined variants), and OPT-125m (9.2M).

Can the download numbers be trusted?
Directionally yes, exactly no. Both CI/CD pipelines and mirror caches inflate absolute counts, and cumulative all-time totals favour older models. Relative rankings within similar age and task buckets are reliable; cross-vintage absolute comparisons are not. The Qwen dominance among text-generation models is robust because Qwen models are largely new (2024-2026), so the ranking is not an artifact of legacy accumulation.

Track Your AI Visibility

See how your brand appears across ChatGPT, Claude, Perplexity, and other AI platforms. Start monitoring today.