DeepSeek R2 AI Model Rumors Begin to Swirl Online; Reported to Feature 97% Lower Costs Than GPT-4 and to Be Fully Trained on Huawei's Ascend Chips
Well, it seems like the Chinese firm DeepSeek is set to drop another model onto the market fairly soon, as details about its next "DeepSeek R2" model have surfaced on the internet.

DeepSeek R2 Could Potentially Disrupt the AI Markets Once Again; Said to Be Trained Dominantly on Huawei's AI Chips

DeepSeek's first mainstream model, R1, showed the Western world that China isn't behind at all when it comes to developing high-end AI models. Its release shocked the US stock market to the point where it lost billions in valuation, and it also showed that developing AI models doesn't require costs as high as companies like OpenAI had disclosed to the public. Now, Chinese media outlets have started to report on rumors around DeepSeek's next R2 AI model, and it won't be wrong to say that the Western AI markets could see another surprising development coming out of China. Before we go into the details, it is important to take the rumors with a grain of salt, since DeepSeek has yet to confirm official figures for its next model.

The Chinese sources claim that R2 is set to adopt a hybrid MoE (Mixture of Experts) architecture, said to be an advanced version of the existing MoE implementation, likely featuring more sophisticated gating mechanisms or a combination of MoE and dense layers to handle high-end workloads. With this architecture, DeepSeek R2 is set to feature roughly double the parameters of R1, coming in at 1.2 trillion. Based on that figure alone, R2 is said to rival GPT-4 Turbo and Google's Gemini 2.0 Pro, but this isn't the only area where DeepSeek plans to make an impact.

The report claims that DeepSeek R2's unit cost per token will be 97.4% lower than GPT-4's, coming in at $0.07 per million input tokens and $0.27 per million output tokens. Compared with OpenAI's pricing, DeepSeek's R2 model would be a bargain for enterprises, as it would be the most cost-efficient model out there. The release could prove to be a decisive moment for AI and the economics around it.

Another interesting detail disclosed about DeepSeek R2 is that the model is said to achieve 82% utilization of a Huawei Ascend 910B chip cluster, with computing power measured at 512 PetaFLOPS at FP16 precision, which suggests that DeepSeek did decide to rely on domestic hardware for its next mainstream model. We knew that the Chinese AI firm was heavily interested in Huawei's AI chips, and by training R2 on domestic equipment, DeepSeek will have essentially "vertically integrated" the AI supply chain.

It is important to note once again that the developments around DeepSeek R2 are speculative and that the final model could turn out to be something different. However, based on what Chinese media sources are reporting, R2 seems like another release that will surprise the mainstream AI companies.
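The "hybrid MoE" design the report hints at has not been documented anywhere official, but the general idea can be conveyed with a minimal sketch: a layer that routes each token through a small set of experts via a learned gate while also keeping an always-on dense path, then sums the two contributions. Everything below (the class name, dimensions, and top-k routing) is an illustrative assumption, not DeepSeek's actual implementation.

```python
# Illustrative sketch of a "hybrid" MoE layer: a dense path every token passes
# through, plus a top-k gated expert path. This is NOT DeepSeek's code; all
# names and sizes are assumptions chosen only to show the concept.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class HybridMoELayer:
    def __init__(self, d_model=64, d_ff=256, n_experts=8, top_k=2):
        self.top_k = top_k
        # Always-on dense projection (the "dense" half of the hybrid design).
        self.w_dense = rng.standard_normal((d_model, d_model)) * 0.02
        # Router that scores each token against every expert.
        self.w_gate = rng.standard_normal((d_model, n_experts)) * 0.02
        # Expert feed-forward networks (two matrices per expert).
        self.w1 = rng.standard_normal((n_experts, d_model, d_ff)) * 0.02
        self.w2 = rng.standard_normal((n_experts, d_ff, d_model)) * 0.02

    def __call__(self, x):                       # x: (tokens, d_model)
        dense_out = x @ self.w_dense             # dense path, applied to every token
        scores = softmax(x @ self.w_gate)        # (tokens, n_experts) routing weights
        top = np.argsort(-scores, axis=-1)[:, :self.top_k]
        moe_out = np.zeros_like(x)
        for t in range(x.shape[0]):              # send each token to its top-k experts
            for e in top[t]:
                h = np.maximum(x[t] @ self.w1[e], 0.0)        # ReLU expert FFN
                moe_out[t] += scores[t, e] * (h @ self.w2[e])
        return dense_out + moe_out               # hybrid: dense + sparse expert paths

layer = HybridMoELayer()
tokens = rng.standard_normal((4, 64))
print(layer(tokens).shape)                       # (4, 64)
```

The appeal of such a design is that only a fraction of the 1.2 trillion rumored parameters would be active per token, which is how a model of that size could still be cheap to serve.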
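To put the rumored token pricing into perspective, the arithmetic behind the "97.4% cheaper" claim is easy to reproduce. The snippet below compares the rumored R2 rates against GPT-4 Turbo's list prices ($10 and $30 per million input/output tokens); the report does not say which GPT-4 tier it benchmarks against, so that reference point is an assumption and the exact percentage depends on it.

```python
# Back-of-the-envelope cost comparison based on the rumored R2 pricing.
# The GPT-4 reference prices are assumed (GPT-4 Turbo list rates); the report
# does not specify which GPT-4 tier the 97.4% figure was measured against.
R2_INPUT, R2_OUTPUT = 0.07, 0.27           # rumored, $ per 1M tokens
GPT4_INPUT, GPT4_OUTPUT = 10.00, 30.00     # assumed GPT-4 Turbo list prices

def cost(millions_in, millions_out, price_in, price_out):
    """Total spend for a workload measured in millions of tokens."""
    return millions_in * price_in + millions_out * price_out

# Example workload: 100M input tokens and 20M output tokens per month.
workload = (100, 20)
r2_bill = cost(*workload, R2_INPUT, R2_OUTPUT)
gpt4_bill = cost(*workload, GPT4_INPUT, GPT4_OUTPUT)

print(f"R2 (rumored):    ${r2_bill:,.2f}")       # $12.40
print(f"GPT-4 (assumed): ${gpt4_bill:,.2f}")     # $1,600.00
print(f"Reduction: {100 * (1 - r2_bill / gpt4_bill):.1f}%")  # ~99.2%
```

With these assumed reference prices the reduction actually comes out larger than the reported 97.4%, which suggests the report benchmarks against a cheaper GPT-4 tier or a blended rate.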
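The 82% utilization and 512 PetaFLOPS figures also imply a rough peak capacity for the cluster, which is simple to back out. Only those two numbers come from the report; the per-chip peak used to estimate a chip count below is a placeholder assumption, not a confirmed Ascend 910B specification.

```python
# Back out the implied peak FP16 capacity of the rumored Ascend 910B cluster.
# Only the measured throughput (512 PFLOPS) and utilization (82%) come from the
# report; the per-chip peak is an assumed placeholder, not an official spec.
measured_pflops = 512.0      # reported sustained FP16 throughput
utilization = 0.82           # reported cluster utilization

peak_pflops = measured_pflops / utilization
print(f"Implied cluster peak: {peak_pflops:.0f} PFLOPS FP16")   # ~624 PFLOPS

assumed_chip_pflops = 0.35   # placeholder per-chip FP16 peak, NOT a confirmed figure
print(f"Implied chip count at that assumption: {peak_pflops / assumed_chip_pflops:.0f}")
```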