Elon Musk Accidentally Leaked Anthropic’s Model Sizes

How many parameters does Claude have? Until recently, the honest answer was that nobody outside Anthropic knew. Anthropic has never officially released parameter counts for any of its Claude models. That changed when Elon Musk, while defending xAI’s Colossus 2 supercomputer and its training capacity, let something slip. He stated that Grok has a total parameter count of 0.5 trillion, then added that this is half the parameter count of Sonnet and one-tenth of Opus.

The math is not complicated. If Grok’s 0.5 trillion parameters are half of Sonnet’s, Sonnet comes out at roughly 1 trillion. If they are one-tenth of Opus’s, Opus comes out at roughly 5 trillion. Mark Krasman popularized the calculation publicly, but the numbers were sitting right there in the quote for anyone to read.

[Figure: bar chart showing Grok at 0.5T, Sonnet inferred at 1T, and Opus inferred at 5T parameters]

How Many Parameters Does Claude Sonnet Have?

Based on Musk’s statement, Claude Sonnet has approximately 1 trillion parameters. Grok is 0.5 trillion, and Musk explicitly said Grok is half the size of Sonnet. Doubling 0.5 trillion gives 1 trillion, so the arithmetic is direct. Anthropic has not confirmed this number, but Musk appeared to be using it as a known reference point rather than speculating.

How Many Parameters Does Claude Opus Have?

Based on the same statement, Claude Opus has approximately 5 trillion parameters. Musk said Grok is one-tenth the size of Opus, and ten times the stated 0.5 trillion Grok count is 5 trillion. That would make Opus five times larger than Sonnet within the same model family, which is at least consistent with the significant pricing gap between the two tiers. As with Sonnet, Anthropic has not confirmed the figure.
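The back-calculation for both models can be written out in a few lines of Python. This is only a sketch of the arithmetic: the 0.5T figure and the two ratios come from Musk’s quote, while the function and variable names are made up for illustration, and the results are inferences rather than confirmed counts.

```python
from fractions import Fraction

# Figure stated in the quote: Grok has 0.5T total parameters.
GROK_PARAMS = Fraction(1, 2) * 10**12

# Ratios stated in the quote: Grok is 1/2 of Sonnet and 1/10 of Opus.
GROK_TO_SONNET = Fraction(1, 2)
GROK_TO_OPUS = Fraction(1, 10)


def infer_size(known_params, ratio_to_target):
    """Solve known = ratio * target for target (hypothetical helper)."""
    return known_params / ratio_to_target


sonnet = infer_size(GROK_PARAMS, GROK_TO_SONNET)
opus = infer_size(GROK_PARAMS, GROK_TO_OPUS)

print(f"Sonnet (inferred): {float(sonnet) / 1e12:.1f}T")
print(f"Opus (inferred):   {float(opus) / 1e12:.1f}T")
print(f"Opus / Sonnet:     {float(opus / sonnet):.0f}x")
```

Using exact fractions avoids floating-point noise in the division. The output is only as good as the inputs: if the quoted ratios were rounded, the inferred sizes are rounded too.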

The Raw Quote

“The total parameter count is 0.5T (500 billion). The current Grok has half the parameter count of Sonnet and one-tenth of Opus.” The ratios themselves leave no ambiguity. The comment came in response to skepticism about xAI’s claim that Colossus 2 is training seven models, with the largest reaching 10 trillion parameters. Musk was positioning Grok as efficient relative to its size, saying “for its scale, it is a very powerful model.” In doing so, he gave the public the ratios needed to back-calculate Anthropic’s two flagship model sizes.

How This Compares to Earlier Estimates

Before this, the estimates floating around for Claude model sizes were considerably lower. Claude 3 Sonnet was commonly estimated somewhere between 70 billion and 250 billion parameters, and Claude 3 Opus at roughly 2 trillion. Those are outdated figures now, and the current generation has clearly scaled well past them. The new inferred figures of 1T for Sonnet and 5T for Opus reflect that progression, though which exact sub-version of Sonnet and Opus Musk was referring to adds some uncertainty. The numbers are now widely cited regardless.

For additional context, Grok 4 is reported at 1.7 trillion parameters, which would make it larger than the inferred Sonnet size. That fits with the broader picture of xAI scaling aggressively on Colossus 2. The cluster is reportedly training models up to 10 trillion parameters, which is exactly what Musk was defending when he made the comparison that leaked these figures.

Why Anthropic Keeps These Numbers Private

Anthropic keeps model sizes close to the chest. There has never been an official parameter count released for any Claude model. The community has always worked from inference, compute estimates, job postings, and the occasional slip. This is a much more direct slip than usual because it came from a competitor who apparently knows the numbers and chose to use them as a reference point in a public post.

There is a separate ongoing story around Claude Mythos, Anthropic’s next model after Opus, which was leaked through a misconfigured data store. Mythos is reportedly larger than Opus, which would put it above 5 trillion parameters if the Musk ratios are accurate. That lines up with Anthropic building progressively larger models and with the scale of compute infrastructure being deployed across the industry right now. If Opus is at 5T and Mythos is larger, the trajectory is clear even without official confirmation.

OpenAI’s model sizes remain similarly opaque. GPT-5.2 has no official parameter count. The pattern of frontier labs keeping these numbers private while occasionally letting them surface through competitor comparisons or accidental leaks is not going away anytime soon. It is commercially sensitive information, and the only reason we have these Anthropic figures at all is that Musk was trying to make a different argument entirely.

What the Numbers Actually Mean

Parameter count is one signal among many. A 5 trillion parameter model is not automatically better than a 1 trillion parameter model, and neither is automatically better than a well-optimized smaller model. What the numbers tell you is something about training compute, inference cost, and the ceiling of what the model can learn. Grok being 0.5T and competitive with 1T Sonnet on certain tasks would support Musk’s efficiency claim if the benchmarks held up, though benchmark context matters far more than raw counts.
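One concrete way to see why parameter count bears on inference cost is to estimate the memory needed just to hold the weights. The sketch below multiplies parameter count by bytes per parameter. The storage precisions are assumptions, as is the use of the inferred sizes. Nothing in the source says how any of these models is stored or served, or whether it is dense or mixture-of-experts, so this is a rough lower bound on weight storage rather than a claim about real deployments.

```python
# Bytes per parameter for common numeric formats (standard widths).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}

# Inferred sizes from the article; unconfirmed, used only for illustration.
INFERRED_PARAMS = {"Grok": 0.5e12, "Sonnet": 1e12, "Opus": 5e12}


def weight_memory_tb(params, dtype="bf16"):
    """Terabytes needed to store the weights alone (hypothetical helper).

    Ignores activations, KV cache, and any serving overhead.
    """
    return params * BYTES_PER_PARAM[dtype] / 1e12


for name, params in INFERRED_PARAMS.items():
    print(f"{name}: ~{weight_memory_tb(params):.0f} TB of weights in bf16")
```

Under these assumptions, a 5x gap in parameters is a 5x gap in weight storage. Compute per token is a separate question, since it depends on how many parameters are active for each token.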

The gap between Sonnet and Opus at 1T versus 5T is still notable. That is a 5x difference within the same model family from the same lab. Anthropic is clearly running a wide range of model sizes across its tier structure, from Haiku at the small end through Sonnet and Opus and now beyond into Mythos territory. The architecture and training choices within that range matter more than the raw counts, but having the counts at all gives researchers and developers a clearer picture of where these models sit relative to each other and relative to the competition.

The broader context here is that the entire industry is scaling in ways that were not publicly visible until moments like this one. Colossus 2 training models up to 10T, Anthropic running a 5T flagship, and the inference costs that come with serving models at that scale are all part of the same story. Parameter counts are not the whole picture, but they are a real window into how seriously these labs are investing in raw model capacity.

Adam Holter

Founder of Ironwood AI. Writing about AI models, agents, and what's actually happening in the space.