DeepSeek Discloses R1 AI Training Costs in Nature Journal

Mutib Khalid

6 months ago

DeepSeek Reveals Rare Details on AI Training Costs in Nature Journal

HANGZHOU – Chinese artificial intelligence firm DeepSeek has disclosed for the first time the cost of training its flagship reasoning-focused model, R1, in a rare update that has drawn global attention.

The company revealed in a peer-reviewed article published Wednesday in the journal Nature that training the R1 model cost $294,000 and relied on 512 Nvidia H800 chips. This disclosure marks the first public estimate from the Hangzhou-based firm, which has been largely absent from the spotlight in recent months.

DeepSeek sparked international concern in January when it unveiled what it claimed were lower-cost AI systems. The news shook investors and raised questions about whether the company could disrupt the dominance of U.S. tech leaders such as Nvidia. Since then, DeepSeek and its founder, Liang Wenfeng, who is listed as a co-author of the Nature article, have stayed largely out of public view apart from a few product updates.

The $294,000 figure is far below what U.S. rivals have described. OpenAI CEO Sam Altman has said training “foundational models” costs well over $100 million, though his company has never disclosed exact figures.

Training large AI models involves running clusters of powerful chips for extended periods to process vast datasets of text and code—a process that is both resource-intensive and costly.

DeepSeek’s claims, however, have been met with skepticism. U.S. officials previously said that the company had access to large volumes of Nvidia’s advanced H100 chips, despite American export restrictions imposed in 2022. Nvidia has maintained that DeepSeek used only lawfully acquired H800 chips.

In a supplementary document tied to the Nature paper, DeepSeek acknowledged for the first time that it also owns A100 chips, which were used during the early stages of R1’s development. Researchers explained that smaller models were initially trained on the A100 GPUs before R1 was ultimately trained for 80 hours on the H800 chip cluster.

Reuters has reported that DeepSeek’s access to an A100 supercomputing cluster has helped it attract some of China’s brightest AI talent, setting it apart from other domestic players.

