DeepSeek's AI costs far exceed $5.5 million claim, may have reached $1.6 billion with 50,000 Nvidia GPUs
www.techspot.com
In brief: China's DeepSeek threw the multi-billion-dollar AI industry into chaos recently with the release of its R1 model, which is said to compete with OpenAI's o1 despite being trained on 2,048 Nvidia H800s and at a cost of $5.576 million. However, a new report claims that the true costs incurred by the firm were $1.6 billion, and that DeepSeek has access to around 50,000 Hopper GPUs.

The claim that DeepSeek was able to train R1 using a fraction of the resources required by big tech companies invested in AI wiped a record $600 billion off Nvidia's market value in one day. If the Chinese startup could make a model this powerful without spending billions on Team Green's most powerful AI GPUs, what would stop everyone else from doing it?

But did DeepSeek really create its Mixture-of-Experts model, which still tops the Apple App Store charts, at such a low cost? SemiAnalysis claims that it didn't.

The market intelligence firm writes that DeepSeek has access to around 50,000 Hopper GPUs, including 10,000 H800s and 10,000 H100s. It also has orders for many more China-specific H20s. The GPUs are shared between High-Flyer, the quantitative hedge fund behind DeepSeek, and the startup. They are distributed across several geographical locations and are used for trading, inference, training, and research.

SemiAnalysis writes that DeepSeek has invested much more than the claimed $5.5 million figure that sent the stock market into a tailspin. The report states that this pre-training cost is a very narrow portion of the total. The company's overall investment in servers is around $1.6 billion, with around $944 million spent on operating costs. The GPU investments, meanwhile, account for more than $500 million.
As a reference example, Anthropic's Claude 3.5 Sonnet cost tens of millions of dollars to train, but the company still needed to raise billions of dollars of investment from Google and Amazon.

It's noted that DeepSeek has sourced all its talent exclusively from China. That is a contrast to reports of other Chinese tech companies, such as Huawei, trying to poach workers from overseas, with Taiwanese employees of TSMC being highly sought-after targets. DeepSeek allegedly offers salaries of over $1.3 million for promising candidates, much more than competing Chinese AI firms pay.

DeepSeek also has the advantage of mostly running its own datacenters, rather than having to rely on external cloud providers. This allows for more experimentation and innovation across its AI product stack. SemiAnalysis writes that DeepSeek is the single best "open weights" lab today, beating out Meta's Llama effort, Mistral, and others.

Masthead: Solen Feyissa