What is the best open-weight legal LLM?

SaulLM 141B from Equall leads at approximately 81 percent on LegalBench and is released under an MIT licence. For smaller deployments, SaulLM 54B and SaulLM 7B cover production use cases. Qwen3-235B-A22B, a general reasoning model, is also strong on legal benchmarks at approximately 78 percent.

Are open-weight legal LLMs production-ready?

For privileged-data on-prem workflows and as components in larger pipelines, yes. As standalone production legal AI (replacing Harvey or CoCounsel), not yet: the closed platforms have integrated retrieval, citation grounding, and law-firm-specific workflows that open-weight LLMs lack on their own. Most production deployments combine a general LLM with retrieval over a verified legal corpus.

What is SaulLM?

SaulLM is a family of open-weight legal LLMs from Equall (formerly Saul-AI) covering 7B, 54B, and 141B variants. SaulLM 141B is the largest open-weight legal LLM and uses a Mistral MoE base. The family is released under an MIT licence for unrestricted commercial deployment.

Can I trust legal LLM citations?

Only with explicit retrieval grounding. Without it, open-weight legal LLMs hallucinate citations at material rates. Production pipelines require retrieval over verified legal databases, validation that quoted text exists in the cited source, and human-in-the-loop review for case-bearing claims. Multiple court sanctions from 2024 to 2026 demonstrate the risk.

How does open-weight legal AI compare to Harvey?

Harvey is the dominant enterprise legal AI platform (~350+ law firms) with closed-platform integrations, training data, and law-firm-specific UX. Open-weight LLMs cover privileged-data on-prem deployments and components in custom legal AI pipelines. Most law firms use Harvey or CoCounsel for general production work and open-weight LLMs for specific cost-sensitive or data-residency-constrained workflows.

Open-Weight Legal LLMs 2026

Legal AI is a fast-growing vertical, dominated by closed proprietary platforms (Harvey, CoCounsel, Spellbook) for production deployment. Open-weight legal LLMs (SaulLM, Lawyer-LLaMA, LawGPT) cover research, on-prem privileged-data workflows, and components in larger legal AI pipelines. Quality on LegalBench and CaseHOLD benchmarks has improved materially in 2025-2026. This page consolidates the open-weight landscape.

Key Findings

SaulLM from Equall is the leading open-weight legal LLM family, with 7B, 54B, and 141B variants demonstrating strong LegalBench performance.
SaulLM 141B (released 2024) is the largest open-weight legal LLM, scoring approximately 81 percent on LegalBench and approximately 74 percent on CaseHOLD.
Production legal AI overwhelmingly uses closed proprietary platforms (Harvey, CoCounsel, Spellbook, Lexis+, Westlaw Precision AI) for deployment; open-weight LLMs cover privileged-data on-prem workflows, internal automation, and components in larger pipelines.
Legal hallucination remains the dominant deployment concern; production legal AI deployments require explicit citation grounding, retrieval over verified case databases, and human-in-the-loop review for case-bearing tasks.
The typical 2026 production stack is a general LLM (Claude 4.7 Opus, GPT-5.5) plus retrieval over verified legal corpora (Westlaw, Lexis, Bloomberg Law) plus citation validation; dedicated legal LLMs are typically components, not standalone solutions.
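
The retrieval step in that stack can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `Passage` type, the word-overlap scoring, and the prompt wording are hypothetical, and a real deployment would query a licensed legal database rather than an in-memory list.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    citation: str  # hypothetical label for a verified source
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase word set; a stand-in for a real retriever's index."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Rank passages by word overlap with the query; keep the top k that overlap at all."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(p.text)), p) for p in corpus]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_grounded_prompt(question: str, passages: list[Passage]) -> str:
    """Assemble a prompt that restricts the model to the retrieved sources."""
    sources = "\n".join(
        f"[{i}] {p.citation}: {p.text}" for i, p in enumerate(passages, 1)
    )
    return (
        "Answer using only the numbered sources below and cite them by number. "
        "If the sources do not answer the question, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Placeholder corpus; the citations and texts are invented examples.
corpus = [
    Passage("Example Case A", "A contract requires offer, acceptance, and consideration."),
    Passage("Example Case B", "Negligence requires duty, breach, causation, and damages."),
    Passage("Example Statute C", "Patent terms run twenty years from the filing date."),
]
question = "What does negligence require: duty and breach?"
hits = retrieve(question, corpus, k=2)
prompt = build_grounded_prompt(question, hits)
```

The prompt would then go to whichever LLM the pipeline uses; the generation call is omitted because it depends on the provider.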

Open-Weight Legal LLM Comparison (May 2026)

Model	Parameters	LegalBench	License
SaulLM 141B (Mistral-MoE base)	~141B / ~39B active	~81%	MIT
SaulLM 54B	~54B	~76%	MIT
SaulLM 7B	~7B	~62%	MIT
Lawyer-LLaMA 13B	~13B	~58%	Apache 2.0
LawGPT	~7B	~54%	Apache 2.0
Legal-BERT family	varies	n/a (classification model)	Apache 2.0
Llama 3.1 70B (general, reference)	~70B	~71%	Llama 3.1 Community
Qwen3-235B-A22B Thinking (general)	~235B MoE	~78%	Apache 2.0
Claude 4.7 Opus (closed reference)	n/a	~86%	Closed
GPT-5.5 (closed reference)	n/a	~87%	Closed

Production Legal AI Platforms (Closed, For Reference)

Platform	Status
Harvey	~350+ law firms; flagship enterprise legal AI
CoCounsel (Thomson Reuters)	Westlaw integration
Spellbook (contract drafting)	Mid-market law firms
Lexis+ AI	LexisNexis integration
Westlaw Precision AI	Thomson Reuters research
Casetext / Allen & Overy ContractMatrix	Contract analysis
Eve (Khaira Law)	Plaintiff-side AI
Robin AI	Contract review

Use Case Recommendations

Use Case	Recommended Approach
General legal research	Harvey, CoCounsel, or Lexis+ AI (closed) for production
Privileged-data on-prem deployment	SaulLM 141B or Qwen3-235B-A22B Thinking with legal RAG
Contract review	Spellbook, Robin AI, or SaulLM with custom contract RAG
Case law research with citation grounding	General LLM with Westlaw or Lexis retrieval API
Compliance and regulatory monitoring	General LLM with regulatory text RAG
Patent search and analysis	SaulLM with USPTO patent corpus RAG
Cost-sensitive routine legal tasks	SaulLM 7B or 54B for self-hosted
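
For the self-hosted rows, a rough sizing estimate helps decide which variant fits the available hardware. The sketch below is back-of-envelope arithmetic, not a vendor figure: it counts weight storage only (parameters times bits per weight) and ignores KV cache, activations, and runtime overhead. It also assumes that an MoE model keeps all of its parameters in memory even though only a subset is active per token; the function name is hypothetical.

```python
def weight_memory_gb(total_params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    total_bytes = total_params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts are the approximate values from the comparison table.
for name, params_b in [("SaulLM 141B", 141), ("SaulLM 54B", 54), ("SaulLM 7B", 7)]:
    fp16 = weight_memory_gb(params_b, 16)
    int4 = weight_memory_gb(params_b, 4)
    print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Actual requirements will be higher once context length and batch size are taken into account.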

Hallucination and Production Concerns

Legal AI hallucination is well-documented and has resulted in court sanctions (multiple Mata v. Avianca-style cases from 2024 to 2026). Production legal AI deployments require explicit citation grounding to verified case databases, retrieval validation that quoted text actually appears in the cited source, human-in-the-loop review for any case-bearing claim, and a clear UX label such as "AI-generated; verify before relying". Open-weight legal LLMs are used as components in these production pipelines, not as standalone solutions.
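
The quoted-text check can be sketched as a normalized substring match. This is a minimal illustration under stated assumptions: the claim dictionaries, the `sources` mapping, and the function names are hypothetical, and exact matching will miss quotes with ellipses or bracketed alterations, which a real validator would need to handle.

```python
import re
import unicodedata

# Map curly quotes to straight quotes so typographic variants still match.
_QUOTE_MAP = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"})


def normalize(text: str) -> str:
    """Fold Unicode forms, unify quote marks, collapse whitespace, lowercase."""
    text = unicodedata.normalize("NFKC", text).translate(_QUOTE_MAP)
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_in_source(quote: str, source_text: str) -> bool:
    """True if the non-empty quote appears verbatim (after normalization) in the source."""
    needle = normalize(quote)
    return bool(needle) and needle in normalize(source_text)


def failed_claims(claims: list[dict], sources: dict[str, str]) -> list[dict]:
    """Return claims whose citation is unknown or whose quote is not in the cited text."""
    flagged = []
    for claim in claims:
        source_text = sources.get(claim["citation"])
        if source_text is None:
            flagged.append({**claim, "reason": "citation not found in verified corpus"})
        elif not quote_in_source(claim["quote"], source_text):
            flagged.append({**claim, "reason": "quoted text not found in cited source"})
    return flagged


# Invented placeholder data, not real case law.
sources = {"Example v. Sample": "The court held that   the notice\nwas timely filed."}
claims = [
    {"citation": "Example v. Sample", "quote": "the notice was timely filed"},
    {"citation": "Example v. Sample", "quote": "the notice was untimely"},
    {"citation": "Missing v. Case", "quote": "anything"},
]
flagged = failed_claims(claims, sources)
```

A claim that passes this check is only confirmed to quote its source accurately; it still goes to human review, since the check says nothing about whether the source supports the legal argument.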

Brand Visibility Implications

Legal AI is a high-citation enterprise procurement category. AI assistant queries about "legal LLM open source", "law firm AI tools", "SaulLM vs Harvey", and similar terms drive procurement-research traffic from law firms and corporate legal teams. Brands selling legal AI tools, contract review platforms, and law firm software face a large AI-mediated discovery surface in this category.

Methodology

Benchmark data is compiled from primary model card disclosures, LegalBench and CaseHOLD evaluation publications, and the Hugging Face legal model leaderboard through 23 May 2026. Figures are updated quarterly.

How Presenc AI Helps

Presenc AI monitors brand visibility on legal AI queries across ChatGPT, Claude, Gemini, and Perplexity. For legal AI tools, contract review platforms, and law firm software brands, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.