Read the model, not the label.
The same three labels ("open," "open source," "open weights") get stamped on models that are worlds apart in what they let you do. The yes/no fight (OSI vs Meta vs everyone else) makes good copy, but it won't tell you whether a model fits the job in front of you.
CLEAR is our answer. Not another definition, but a decision layer on top of the existing ones: pick what you want to do with the model and get a plain Go / Caution / No-go for that job, with the reason attached.
There is no agreed definition. That's not a gap in your knowledge. It's the actual state of things.
If you feel confused, you're reading it right. The Economist called it "a battle… over the definition of open-source AI." Three things are true at once:
OSI published OSAID 1.0 in October 2024: it asks for the code, the weights, and enough information about the data to rebuild the model. Almost no commercial model meets it, and not everyone accepts it as the bar.
Google's own site separates "open models" (free weights, their terms) from "open source." "Open" and "open source" are not the same thing, and a major lab says so in writing.
Models you can truly rebuild (weights and code and data) are rare; AI2's OLMo and Switzerland's Apertus are among the few. The vast majority of "open" models share only the weights and call it open source.
So the word in the middle of all this, "open," does a lot of hidden work. The only way through is to stop arguing about the label and look at what's actually in the box.
Who's talking, and what they actually mean.
If you've heard someone talk about open source AI, you've heard one of these voices. Here's where each really stands, in plain terms, current to 2026.
The Open Source Initiative authored OSAID 1.0: code + weights + enough data information to rebuild the model. It even publishes a page warning that "open weights" is not the same as open source.
The purist line: without the actual training data you can't study or rebuild it, so it isn't free. A 2024 Nature paper argued many "open" AI systems are, in practice, closed.
Meta calls Llama "open source." But its custom license restricts some uses and the training data is undisclosed, so OSI and others say it doesn't qualify. The loudest, most-cited example of the gap.
On its own cloud site, Google lists Gemma and Llama as "open models," separate from "open source models." Honest about the distinction, while still marketing "open."
OpenAI built closed despite the name. Then in 2025 it released gpt-oss: open-weight, Apache-2.0. Even the closed camp now ships "open" models. The weights, not the data.
DeepSeek shipped V3 and R1 under the MIT license, weights free to all, a deliberate move that put open-weight models on the world's front pages and rattled Silicon Valley.
The Linux Foundation's MOF grades how open a model is (three classes); its OpenMDW license (2025) is built for models, because software licenses don't cleanly fit weights and data.
The EU's AI Act demands a public training-data summary and gives genuinely open models a narrow carve-out. Projects like Switzerland's Apertus push fully-open as a digital-sovereignty strategy.
Notice the pattern: almost everyone says "open." Almost no one means "you can rebuild it." That's the whole problem in one line.
CLEAR doesn't rate "openness." It tells you if a model is open enough for your job.
OSI's OSAID already answers "is it open source?" (yes / no). It's a good legal anchor, and almost every commercial model fails it. But a near-universal "no" tells you nothing about whether to use Llama in a product, deploy Mistral in a hospital, or fine-tune DeepSeek on your data. CLEAR takes the same facts OSAID looks at and maps them to those decisions.
OSAID answers "is it open source?" with one yes/no. Rigorous and important, but binary. The honest truth: almost everything fails it, so it can't help you pick between the models that fail.
MOF grades ~16 separate artifacts in three completeness classes. Thorough and producer-facing, made to prove what you released, not to help an adopter pick a model for next Monday.
CLEAR takes those same facts and answers Go / Caution / No-go for the job you actually have. OSAID protects the producer's claim of openness. CLEAR protects the adopter's decision.
Behind every CLEAR verdict are five facts: Code, License, Examples (training data), Access (weights), Reproducibility. They're the same ingredients OSAID inspects; CLEAR just reads them through the lens of what you want to do.
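To make the idea of "five facts in, one verdict per job out" concrete, here is a minimal sketch in Python. It is an illustration only, not the CLEAR v0.1 scoring rules: the job names, the must/should rule table, and the reduction of each fact to a simple yes/no are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    GO = "Go"
    CAUTION = "Caution"
    NO_GO = "No-go"


@dataclass(frozen=True)
class ClearFacts:
    """The five CLEAR facts, reduced to yes/no for illustration."""
    code: bool             # C: the code is published
    license: bool          # L: the license permits the intended use
    examples: bool         # E: the training data is disclosed
    access: bool           # A: the weights can be downloaded
    reproducibility: bool  # R: the model can be rebuilt from what is released


# Hypothetical rule table (NOT the real CLEAR criteria).
# "must" facts missing -> No-go; "should" facts missing -> Caution.
JOB_RULES = {
    "ship_in_product": {"must": ("license", "access"), "should": ("code",)},
    "fine_tune": {"must": ("license", "access"), "should": ("code", "examples")},
    "audit_training_data": {"must": ("examples",), "should": ("reproducibility",)},
}


def verdict(facts, job):
    """Return (Verdict, reason) for one model and one job."""
    rules = JOB_RULES[job]
    missing_must = [f for f in rules["must"] if not getattr(facts, f)]
    if missing_must:
        return Verdict.NO_GO, "missing required: " + ", ".join(missing_must)
    missing_should = [f for f in rules["should"] if not getattr(facts, f)]
    if missing_should:
        return Verdict.CAUTION, "missing recommended: " + ", ".join(missing_should)
    return Verdict.GO, "all checked facts present"


# A typical weights-only release: permissive license, downloadable weights,
# nothing else published.
weights_only = ClearFacts(code=False, license=True, examples=False,
                          access=True, reproducibility=False)

for job in JOB_RULES:
    v, reason = verdict(weights_only, job)
    print(f"{job}: {v.value} ({reason})")
```

The same model gets a different answer depending on the job: under these made-up rules, the weights-only release is a Caution for shipping in a product and a No-go for auditing the training data. That is the point of a per-job verdict rather than a single label.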
Pick a model. CLEAR shows what it lets you actually do.
Apache license, anyway?
Good instinct: it's half the confusion. Apache-2.0 and MIT are software licenses. They fit the code cleanly, but it's legally unclear whether they even cover the weights (a file of numbers may not be copyrightable) or the data. That mismatch is exactly why a model-specific license, OpenMDW, appeared in 2025. The takeaway: a permissive license stamped on a model does not mean the whole model is open. In CLEAR terms it usually scores only the L.
CLEAR is an open draft, not a verdict carved in stone. Help refine the criteria and the scores.
CLEAR is a draft. Help make it the standard.
Pointing out that "open source AI" has no agreed meaning is easy, and on its own, useless. CLEAR v0.1 is our attempt at the harder thing: square the circle with a concrete, factual scorecard anyone can apply. It's deliberately unfinished and openly governed. If you work with these models, the criteria and the scores are yours to shape, on GitHub, in the open.