Open-weight model

An open-weight model is a language model whose trained weights (parameters) are publicly released and can be downloaded for inference and fine-tuning, subject to the terms of the model's license.

Differences from "Open Source"

Although often confused, open-weight and open-source are not the same thing. Applied to AI models, "open source" in the strict sense means that the training code, the training data (or a detailed description of it), and the training procedures are all publicly available, so that anyone can reproduce or modify the model. Open-weight is a narrower concept, referring specifically to "the trained weight files being publicly available."

Meta releases the Llama 3 weights publicly, but the details of the training datasets remain undisclosed, and commercial use is subject to conditions based on monthly active users (a separate license must be requested above 700 million). Mistral similarly makes its weights public, but its licenses vary by model, mixing Apache 2.0 with proprietary licenses. Strictly speaking, these models are more accurately described as "open-weight" than as open-source.

Why the Release of Weights Matters

Having the weights on hand means that inference can be run entirely under your own organization's control. This has three key implications:

Freedom to Customize: You can fine-tune on your own data to create models specialized for specific domains. This enables a depth of customization that API-only access rarely allows. With parameter-efficient fine-tuning (PEFT) methods such as LoRA, fine-tuning smaller models becomes practical even on a single consumer-grade GPU.

Ensuring Data Sovereignty: When the model is hosted on your own infrastructure, no data is sent to external parties during inference, so it can be applied to tasks involving confidential information. This is why adoption is growing in heavily regulated industries such as finance, healthcare, and legal services.

Avoiding Vendor Lock-in: You are not dependent on a specific API provider. Your organization's AI infrastructure can be decoupled from the risks of pricing changes or service discontinuation.
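To make the LoRA point above concrete, the following is a minimal arithmetic sketch rather than a training recipe. LoRA freezes a weight matrix and trains two low-rank factors in its place, so the number of trainable parameters drops sharply. The 4096 x 4096 layer size and the rank of 8 are illustrative assumptions, not figures for any particular model, and the function names are hypothetical.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed, illustrative dimensions for one projection layer.
d_in = d_out = 4096
rank = 8

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  trainable fraction: {fraction:.4%}")
```

Under these assumptions the adapter trains well under 1% of the layer's parameters. This is the main reason gradient and optimizer-state memory can fit on modest hardware, although the frozen base weights still have to be loaded.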

Major Open-Weight Models (as of 2026)

Meta's Llama 4 series adopts a Mixture-of-Experts (MoE) architecture and spans a wide range of sizes, from Scout (17B active / 109B total parameters) to the announced Behemoth (288B active / roughly 2T total). Google's Gemma 3 takes a lightweight approach, with sizes ranging from 1B to 27B. Mistral targets commercial-grade performance with Mistral Large 2 while also releasing lightweight models in parallel. Among models from China, DeepSeek-V3 and Qwen 2.5 stand out for their strong multilingual performance.

When selecting a model, there is more to evaluate than performance alone. License terms (whether commercial use is permitted, user count restrictions), supported languages, and required hardware specifications must all be carefully examined in advance.
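As a rough aid for the hardware question, the sketch below estimates the memory needed just to hold a model's weights at a few common precisions, using the parameter counts quoted above. It is a back-of-the-envelope estimate under stated assumptions: it ignores the KV cache, activations, and runtime overhead, and the bytes-per-parameter values are nominal. In a Mixture-of-Experts model all experts typically need to be resident in memory, so the total parameter count drives memory while the active count drives per-token compute. The function and dictionary names are hypothetical.

```python
# Nominal bytes per parameter at common precisions (assumed, illustrative values).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params_billion: float, precision: str) -> float:
    """Decimal gigabytes for the weights alone (no KV cache or overhead)."""
    return total_params_billion * BYTES_PER_PARAM[precision]


def active_fraction(active_billion: float, total_billion: float) -> float:
    """Share of parameters used per token (1.0 for a dense model)."""
    return active_billion / total_billion


# (active, total) parameter counts in billions, taken from the text above.
MODELS = {
    "Llama 4 Scout": (17.0, 109.0),  # MoE: 17B active of 109B total
    "Gemma 3 27B": (27.0, 27.0),     # dense: every parameter is active
}

for name, (active, total) in MODELS.items():
    sizes = ", ".join(
        f"{p}: {weight_memory_gb(total, p):.1f} GB" for p in BYTES_PER_PARAM
    )
    print(f"{name}: {active_fraction(active, total):.0%} active per token; {sizes}")
```

An estimate like this only indicates whether a model is plausibly within reach of a given machine. Actual requirements depend on context length, batch size, and the inference runtime.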