What enterprises need to know about DeepSeek's game-changing R1 AI model
www.computerworld.com
Two years ago, OpenAI's ChatGPT launched a new wave of AI disruption that left the tech industry reassessing its future. Now, within the space of a week, a small Chinese startup called DeepSeek appears to have pulled off a similar coup, this time at OpenAI's expense.

Nevertheless, DeepSeek's sudden success (the company's free mobile app quickly surpassed even ChatGPT for downloads on Apple's App Store) has prompted questions. Is the DeepSeek story too good to be true? And should businesses in the US and allied countries allow employees to use an app when the company's Chinese background and operation are so opaque?

What happened

The DeepSeek storm hit on January 20, when DeepSeek launched its R1 LLM to the public, complete with big claims around performance.

Using smaller "distilled" LLMs, which require significantly less processing power while replicating the capability of larger models, DeepSeek's R1 matched or exceeded OpenAI's equivalent, o1-mini, in important math and reasoning tests.

That performance generated a surge of interest. By Monday, the DeepSeek app had overtaken ChatGPT and Temu to become the iPhone App Store's top free download, and DeepSeek was reporting delays in new registrations to use the app due to what it described as "large-scale malicious attacks" on its services.

Nobody saw this coming. Somehow, R1 was doing this with less hardware.
Moreover, DeepSeek-R1 is available through an open-source MIT license, which allows for unrestricted commercial use, including modification and distribution.

With AI sector share prices unsettled by all of this, the implication is that perhaps usable models don't need the huge chip clusters deployed by the established players, and organizations shouldn't be paying high prices to access them.

Furthermore, if a tiny startup can get by on more limited hardware while training LLMs for a fraction of the cost, perhaps strenuous US attempts to limit the export of the most powerful AI chips to most of the world, including China, are already obsolete before they've been fully implemented.

Zero day AI

The speed of DeepSeek's rise is a case of zero-day disruption. Organizations have no time to react, and not just because developers across the world have piled in to test DeepSeek-R1 via its API by the thousand. Releasing a free app gives this capability to everyone, including employees who might enter sensitive data into it. By now, DeepSeek is everywhere, which makes it difficult to control.

"The app has raced to the top of the app charts, but I would advise anyone considering installing it and using it to exercise some caution," warned tech commentator Graham Cluley, who also hosts the AI Fix podcast.

That said, organizations should already be used to coping with this issue. "Human nature being what it is, there will surely be just as much sensitive data entered into DeepSeek as we've seen entered into every other AI out there," said Cluley. Organizations should probably hold back until it has been more thoroughly audited in the same way they would with any new app.

Or perhaps focusing on the risks is too negative. DeepSeek will ignite more competition in the sector, potentially turning powerful LLMs from an expensive service for the deep-pocketed into a cheap utility anyone can access.
Rather than dumping existing AI services, organizations should demand a better deal while avoiding becoming too locked into one LLM as new innovations appear.

Censored language model

A lurking possibility is that DeepSeek isn't as good as it seems, with some skepticism already appearing around its price-performance claims. Stacy Rasgon, a senior analyst at Bernstein Research, questioned DeepSeek's underlying costs.

"Did DeepSeek really build OpenAI for $5M? Of course not," he wrote in a client note. "The oft quoted $5M number is calculated by assuming a $2/GPU-hour rental price for this infrastructure, which is fine, but not really what they did, and does not include all the other costs associated with prior research and experiments on architectures, algorithms, or data."

In use, DeepSeek makes elementary errors, not dissimilar to the ones that afflicted ChatGPT in its early days. Some of its responses also underline that the app imposes guard rails when run from a Chinese host. A good example is a report of its refusal to acknowledge the Tiananmen Square massacre, something the Chinese government goes to extreme lengths to hide.

In the short term, DeepSeek's appearance underlines the unstable nature of AI itself. Tech is used to periodic disruptions. AI suggests that these might become more routine, including disruptions to AI's own capabilities. DeepSeek is unlikely to be the last such breakthrough in a sector that will prove harder to dominate than has been assumed.

Investors and government regulators trying to control AI development won't like this, but if it offers cheaper and earlier AI access across the economy, it could still work as a net positive.
According to Cluley, DeepSeek should be something for Silicon Valley to worry about.

"If it's accurate that the Chinese have been able to develop a competitive AI that massively undercuts the US-based giants in terms of development cost and with a fraction of the hardware commitment then that is clearly going to upset the applecart and have a tech billionaire or two crying into their Cheerios this morning," he said.