DeepSeek released cost figures for training their new AI models, claiming they spent just $5.6 million. This number is misleading: it accounts only for the compute cost of the final training run of the base DeepSeek-V3 model, priced at a rented GPU-hour rate.
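For reference, the headline number can be reconstructed from the figures in DeepSeek's V3 technical report, which prices the run at an assumed GPU-hour rental rate rather than an audited expense:

```python
# Reconstructing DeepSeek's headline training cost from the numbers
# in the DeepSeek-V3 technical report (reported, not independently audited).
gpu_hours = 2_788_000      # reported total H800 GPU-hours for the V3 training run
rate_per_gpu_hour = 2.00   # the report's assumed rental price in USD

cost = gpu_hours * rate_per_gpu_hour
print(f"${cost / 1e6:.3f}M")  # -> $5.576M, rounded to "$5.6 million"
```

That arithmetic is internally consistent, but notice what it scopes out: everything below.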
First, DeepSeek-R1 required significant additional training on top of V3, using reinforcement learning to develop its chain-of-thought reasoning. The compute cost of this post-training step isn't included in their figure.
Second, the engineering hours required to develop these models add substantial costs that DeepSeek conveniently ignores. Exact engineering costs are hard to pin down, but a research team capable of frontier-model work represents a major ongoing investment.
But the biggest omission is their GPU infrastructure. DeepSeek's own report describes a training cluster of roughly 2,000 GPUs, but sources indicate the company actually possesses closer to 50,000 NVIDIA H100s. At current prices, that's over $2 billion in hardware alone. They likely keep this number private because of export controls, but you can't simply ignore a multi-billion-dollar infrastructure investment when discussing development costs.
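A quick sanity check on that hardware number; the unit prices here are assumptions based on commonly quoted H100 street prices, and full server systems with networking would push the total higher:

```python
# Rough hardware cost for the rumored fleet. Unit prices are assumptions
# (commonly quoted H100 street prices), not confirmed figures.
gpu_count = 50_000
price_low, price_high = 30_000, 40_000   # assumed USD per H100

low, high = gpu_count * price_low, gpu_count * price_high
print(f"${low / 1e9:.1f}B to ${high / 1e9:.1f}B in GPUs alone")
# -> $1.5B to $2.0B; servers, networking, and datacenter build-out add more
```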
The reality is that training advanced AI models costs far more than DeepSeek suggests. Their $5.6 million figure leaves out critical expenses (a rough sketch of how these stack up follows the list):
– Additional training costs for R1
– Engineering talent and development time
– A massive GPU infrastructure investment worth billions
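To make the gap concrete, here is a minimal sketch of a fuller cost model. Every component other than the $5.6 million headline figure is a hypothetical placeholder for illustration, not a reported number:

```python
# Hypothetical fuller cost model. All components except the headline
# figure are illustrative placeholders, not reported data.
final_run_compute = 5.6e6    # DeepSeek's headline figure (final V3 run)
r1_post_training = 1.0e6     # placeholder: RL post-training compute for R1
engineering = 50.0e6         # placeholder: research staff over the project
hardware_capex = 1.75e9      # midpoint of the GPU fleet estimate above

# Placeholder amortization: 4-year hardware life, half attributed to this effort.
amortized_hardware = hardware_capex / 4 * 0.5

total = final_run_compute + r1_post_training + engineering + amortized_hardware
print(f"~${total / 1e6:.0f}M all-in vs. the $5.6M headline")  # -> ~$275M
```

Even with these placeholder assumptions, the all-in figure lands roughly fifty times above the headline number.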
This matters because accurate cost reporting helps set realistic expectations for AI development. When companies obscure true costs, it distorts the market’s understanding of what it takes to build competitive AI systems.
For more context on how AI companies report their development costs, see my previous analysis of market impacts here: https://adam.holter.com/deepseek-r1-just-erased-trillions-in-us-market-cap-but-the-numbers-dont-add-up/
The takeaway is simple: AI development is expensive. Really expensive. Claims of ultra-low-cost training should be met with skepticism, especially when they exclude major cost components.