DeepSeek reports shockingly low training costs for R1 in new paper
Briefly

"DeepSeek, the Chinese AI lab that shook up the market with its impressive open-source R1 model in January, has finally revealed the secret so many were wondering about: how it trained R1 more cheaply than the companies behind other, primarily American, frontier models. The company wrote in a paper published Wednesday that building R1 only cost them $249,000 -- a ridiculously low amount in the high-spending world of AI."
"For context, DeepSeek said in an earlier report that its V3 model, which is similar to a standard chatbot model family like Claude, cost $5.6 million to train. That number has been disputed, with some experts questioning whether it includes all development costs (including infrastructure, R&D, data, and more) or singles out its final training run. Regardless, it's still a fraction of what companies like OpenAI have spent building models (Sam Altman himself has said that GPT-4 cost north of $100 million)."
DeepSeek reported that building its R1 model cost $249,000, far lower than many frontier-model expenditures. The company previously reported spending $5.6 million to train a V3 model comparable to common chatbot families. Experts have disputed the R1 figure, questioning whether it covers full development costs or only the final training run. DeepSeek prices R1 at $0.14 per million tokens, compared with OpenAI's roughly $7.50 for an equivalent tier. AI model development consumes substantial data, GPUs, energy, water, and personnel. Export restrictions on US-made chips challenge Chinese labs, and older-chip optimization provided DeepSeek an advantage.
Read at ZDNET