
"DeepSeek, the Chinese AI lab that shook up the market with its impressive open-source R1 model in January, has finally revealed the secret so many were wondering about: how it trained R1 more cheaply than the companies behind other, primarily American, frontier models. The company wrote in a paper published Wednesday that building R1 cost only $249,000 -- a remarkably low amount in the high-spending world of AI."
"For context, DeepSeek said in an earlier paper that its V3 model, which is similar to a standard chatbot model family like Claude, cost $5.6 million to train. That number has been disputed, with some experts questioning whether it includes all development costs (including infrastructure, R&D, data, and more) or covers only the final training run. Regardless, it's still a fraction of what companies like OpenAI have spent building models (Sam Altman himself has said that GPT-4 cost north of $100 million)."
DeepSeek reported that building its R1 model cost $249,000, far lower than many frontier-model expenditures. The company previously reported spending $5.6 million to train a V3 model comparable to common chatbot families. Experts have disputed DeepSeek's cost figures, questioning whether they cover full development costs or only the final training run. DeepSeek prices R1 at $0.14 per million tokens, compared with OpenAI's roughly $7.50 for an equivalent tier. AI model development consumes substantial data, GPUs, energy, water, and personnel. Export restrictions on US-made chips challenge Chinese labs, and optimizing for older chips gave DeepSeek an advantage.
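The pricing gap cited above can be made concrete with a quick calculation. This is a minimal sketch using only the per-million-token prices quoted in the article; "equivalent tier" is the article's framing, not an exact model-to-model match.

```python
# Per-million-token API prices cited in the article (USD).
deepseek_r1_price = 0.14  # DeepSeek R1
openai_price = 7.50       # OpenAI, roughly equivalent tier per the article

# Ratio between the two prices.
ratio = openai_price / deepseek_r1_price
print(f"OpenAI's cited tier costs ~{ratio:.0f}x DeepSeek R1 per million tokens")
```

At these list prices, the same token volume would cost roughly 54 times more on the OpenAI tier the article compares against.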
Read at ZDNET