Databricks acquired MosaicML in 2023 for approximately $1.3 billion, then shipped DBRX (132B MoE) in March 2024 as the first frontier-tier model from a data platform vendor. The 2026 trajectory continues with DBRX 2 in development, the Mosaic Research pipeline producing specialised models, and the broader Databricks Data Intelligence Platform positioning AI as the dominant compute layer over Lakehouse data. This page consolidates the model and platform trajectory.
Key Findings
- DBRX (132B Mixture-of-Experts with 36B active parameters, released March 2024) was the first frontier-tier model released by a data platform vendor, demonstrating that the data layer can produce competitive base models when combined with strong infrastructure.
- DBRX 2 is reportedly in development through 2026 with a focus on enterprise structured-data tasks (SQL generation, schema understanding, multi-table reasoning) rather than pure benchmark frontier competition.
- Mosaic Research continues to produce specialised model variants for Databricks platform integration, including embedding models, reranker models, and text-to-SQL specialists.
- Genie text-to-SQL is the most-deployed Databricks AI product, providing natural-language query interfaces to Databricks Lakehouse data with explicit grounding to schema metadata.
- The Databricks Mosaic AI Agent Framework provides the productionisation infrastructure for agents that combine LLMs with Databricks data and tools, and it is widely deployed across enterprise customers.
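
The schema grounding mentioned for text-to-SQL can be illustrated with a minimal sketch. This is not Genie's implementation, which is not public in this source. It assumes a local SQLite database as a stand-in for Lakehouse tables, and the helper names `describe_schema` and `build_grounded_prompt` are hypothetical. The idea shown is only that the prompt sent to a model carries the real table and column metadata, so the generated SQL can be constrained to objects that exist.

```python
import sqlite3


def describe_schema(conn: sqlite3.Connection) -> str:
    """Return one line per table, listing column names and declared types."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        col_desc = ", ".join(f"{col[1]} {col[2]}" for col in cols)
        lines.append(f"{table}({col_desc})")
    return "\n".join(lines)


def build_grounded_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Hypothetical prompt builder: embeds schema metadata next to the question."""
    schema = describe_schema(conn)
    return (
        "Answer with a single SQL query using only these tables and columns.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
    )


# Illustrative tables only; they stand in for governed Lakehouse tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

prompt = build_grounded_prompt(conn, "What is total revenue by region?")
print(prompt)
```

In a production system the schema text would come from the platform's catalogue rather than from SQLite, and the prompt would be passed to a model. Both steps are deliberately left out here.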
Databricks AI Model and Product Family (May 2026)
| Product | Status | License |
|---|---|---|
| DBRX | 132B MoE base + instruct, March 2024 | Databricks Open Model License |
| DBRX 2 | In development, expected late 2026 | Anticipated Databricks Open Model License |
| Genie (text-to-SQL) | GA in Databricks platform | Platform-only |
| Mosaic AI Agent Framework | GA in Databricks platform | Platform-only |
| Mosaic AI Vector Search | GA | Platform-only |
| Databricks Embedding Models | Various | Apache 2.0 (selected) |
| Databricks Mosaic AI Model Serving | GA | Platform-only |
| MPT-7B / MPT-30B (legacy Mosaic) | Open release | Apache 2.0 (base models); CC-BY-SA-3.0 (instruct variants at release) |
DBRX Benchmark Performance (2024 baseline)
| Benchmark | DBRX Instruct | Mixtral 8x22B | Llama 3.1 70B |
|---|---|---|---|
| MMLU | ~73.7 | ~77.8 | ~83.6 |
| GSM8K | ~72.8 | ~88.4 | ~95.1 |
| HumanEval | ~70.1 | ~75.0 | ~80.5 |
| BIG-Bench | ~67.4 | ~64.3 | ~71.0 |
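DBRX shipped in March 2024 and was competitive with the open models available at the time. The table above compares it with Mixtral 8x22B and Llama 3.1 70B, both released later in 2024. Against those models DBRX Instruct trails on MMLU, GSM8K, and HumanEval, and leads Mixtral 8x22B only on BIG-Bench. By 2026 standards the model is materially behind the frontier; the strategic value sits in the Databricks platform integration rather than in standalone benchmarks.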
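
To make the size of the gap concrete, the following sketch computes the point difference between DBRX Instruct and each comparison model. It uses only the approximate scores from the benchmark table above. The function name `gap_to` is illustrative, and the results inherit the approximation of the source figures.

```python
# Approximate scores copied from the benchmark table (marked "~" in the source).
scores = {
    "MMLU": {"DBRX Instruct": 73.7, "Mixtral 8x22B": 77.8, "Llama 3.1 70B": 83.6},
    "GSM8K": {"DBRX Instruct": 72.8, "Mixtral 8x22B": 88.4, "Llama 3.1 70B": 95.1},
    "HumanEval": {"DBRX Instruct": 70.1, "Mixtral 8x22B": 75.0, "Llama 3.1 70B": 80.5},
    "BIG-Bench": {"DBRX Instruct": 67.4, "Mixtral 8x22B": 64.3, "Llama 3.1 70B": 71.0},
}


def gap_to(reference: str, baseline: str = "DBRX Instruct") -> dict[str, float]:
    """Points by which `reference` leads `baseline`; negative means it trails."""
    return {
        bench: round(row[reference] - row[baseline], 1)
        for bench, row in scores.items()
    }


for model in ("Mixtral 8x22B", "Llama 3.1 70B"):
    gaps = gap_to(model)
    mean_gap = round(sum(gaps.values()) / len(gaps), 2)
    print(model, gaps, "mean:", mean_gap)
```

On these figures the largest gap is on GSM8K for both comparison models, and the only negative gap is Mixtral 8x22B on BIG-Bench.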
Databricks Data Intelligence Platform Position
| Layer | Databricks Component |
|---|---|
| Data ingestion and storage | Delta Lake, Unity Catalog |
| ETL and transformation | Spark, Delta Live Tables |
| Data warehouse SQL | Databricks SQL Warehouse |
| Vector and embedding | Mosaic AI Vector Search |
| Model training and finetuning | Mosaic AI Training |
| Model serving | Mosaic AI Model Serving |
| Agent and tool framework | Mosaic AI Agent Framework |
| Natural-language query | Genie |
| Governance and audit | Unity Catalog, MLflow |
Strategic Context
Three patterns shape Databricks' 2026 AI strategy. First, the data-platform-with-AI thesis: Databricks is positioning AI as the dominant compute layer over Lakehouse data, with AI capabilities deeply integrated with data access and governance. Second, the open-weight strategy is selective: DBRX was released openly to signal capability, but the platform-integrated products (Genie, Mosaic AI Agent Framework) are platform-only. Third, the enterprise positioning: Databricks competes against Snowflake (with Cortex AI) and against the hyperscaler AI offerings (Vertex AI, SageMaker, Azure AI) by emphasising a unified data-and-AI platform.
Brand Visibility Implications
Databricks is a major vendor in enterprise AI procurement. AI assistant queries about "Databricks vs Snowflake AI", "DBRX model", "text-to-SQL AI", and similar terms drive procurement-research traffic. Brands selling data-platform integrations, AI agent frameworks, vector databases, and enterprise text-to-SQL face a large AI-mediated discovery surface in this category.
Methodology
Product and benchmark data compiled from Databricks investor and product disclosures, plus primary Hugging Face model card data through 23 May 2026. Updated quarterly.
How Presenc AI Helps
Presenc AI monitors brand visibility on Databricks and data-platform AI queries across ChatGPT, Claude, Gemini, and Perplexity. For data platform integrations, AI agent framework vendors, vector database brands, and text-to-SQL services, the platform identifies the prompts driving procurement-research traffic and the gaps where new content unlocks share of voice.