Three things to know as the dust settles from DeepSeek
www.technologyreview.com
This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

The launch of a single new AI model does not normally cause much of a stir outside tech circles, nor does it typically spook investors enough to wipe out $1 trillion in the stock market. Now, a couple of weeks since DeepSeek's big moment, the dust has settled a bit. The news cycle has moved on to calmer things, like the dismantling of long-standing US federal programs, the purging of research and data sets to comply with recent executive orders, and the possible fallout from President Trump's new tariffs on Canada, Mexico, and China.

Within AI, though, what impact is DeepSeek likely to have in the longer term? Here are three seeds DeepSeek has planted that will grow even as the initial hype fades.

First, it's forcing a debate about how much energy AI models should be allowed to use up in pursuit of better answers.

You may have heard (including from me) that DeepSeek is energy efficient. That's true for its training phase, but for inference, which is when you actually ask the model something and it produces an answer, it's complicated. It uses a chain-of-thought technique, which breaks down complex questions (like whether it's ever okay to lie to protect someone's feelings) into chunks, and then logically answers each one. The method allows models like DeepSeek to do better at math, logic, coding, and more.

The problem, at least to some, is that this way of thinking uses up a lot more electricity than the AI we've been used to. Though AI is responsible for a small slice of total global emissions right now, there is increasing political support to radically increase the amount of energy going toward AI. Whether or not the energy intensity of chain-of-thought models is worth it, of course, depends on what we're using the AI for. Scientific research to cure the world's worst diseases seems worthy. Generating AI slop?
Less so. Some experts worry that the impressiveness of DeepSeek will lead companies to incorporate it into lots of apps and devices, and that users will ping it for scenarios that don't call for it. (Asking DeepSeek to explain Einstein's theory of relativity is a waste, for example, since it doesn't require logical reasoning steps, and any typical AI chat model can do it with less time and energy.) Read more from me here.

Second, DeepSeek made some creative advancements in how it trains, and other companies are likely to follow its lead.

Advanced AI models don't just learn from lots of text, images, and video. They rely heavily on humans to clean that data, annotate it, and help the AI pick better responses, often for paltry wages.

One way human workers are involved is through a technique called reinforcement learning with human feedback. The model generates an answer, human evaluators score that answer, and those scores are used to improve the model. OpenAI pioneered this technique, though it's now used widely by the industry.

As my colleague Will Douglas Heaven reports, DeepSeek did something different: It figured out a way to automate this process of scoring and reinforcement learning. "Skipping or cutting down on human feedback, that's a big thing," Itamar Friedman, a former research director at Alibaba and now cofounder and CEO of Qodo, an AI coding startup based in Israel, told him. "You're almost completely training models without humans needing to do the labor."

It works particularly well for subjects like math and coding, but not so well for others, so human workers are still relied upon. Still, DeepSeek then went one step further and used techniques reminiscent of how Google DeepMind trained its AI model back in 2016 to excel at the game Go, essentially having it map out possible moves and evaluate their outcomes. These steps forward, especially since they are outlined broadly in DeepSeek's open-source documentation, are sure to be followed by other companies.
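The shift from human-scored feedback to automated scoring can be sketched in a few lines. This is an illustrative toy, not DeepSeek's actual training code, and every name in it is hypothetical: the point is that for verifiable tasks such as arithmetic, a simple program can stand in for the human evaluator in the scoring step.

```python
# Toy sketch (hypothetical names throughout): for verifiable tasks,
# a rule-based check can replace a human evaluator's score.

def automated_reward(answer: str, expected: str) -> float:
    """Score an answer automatically: no human annotator needed
    when the task has a checkable right answer."""
    return 1.0 if answer.strip() == expected.strip() else 0.0

def pick_best(candidates: list[str], score_fn) -> str:
    """Reinforcement learning needs a scoring signal; here we simply
    select the highest-scoring candidate answer."""
    return max(candidates, key=score_fn)

# The model proposes several answers to "12 * 12"; the rule, not a
# person, decides which one gets rewarded.
candidates = ["121", "144", "142"]
best = pick_best(candidates, lambda a: automated_reward(a, "144"))
print(best)  # -> 144
```

Real systems update the model's weights from these scores rather than merely selecting an answer, but the key shift is the same: the reward comes from a rule, not a person, which is why the approach works well for math and coding and less well for subjective tasks.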
Read more from Will Douglas Heaven here.

Third, its success will fuel a key debate: Can you push for AI research to be open for all to see and push for US competitiveness against China at the same time?

Long before DeepSeek released its model for free, certain AI companies were arguing that the industry needs to be an open book. If researchers subscribed to certain open-source principles and showed their work, they argued, the global race to develop superintelligent AI could be treated like a scientific effort for public good, and the power of any one actor would be checked by other participants.

It's a nice idea. Meta has largely spoken in support of that vision, and venture capitalist Marc Andreessen has said that open-source approaches can be more effective at keeping AI safe than government regulation. OpenAI has been on the opposite side of that argument, keeping its models closed off on the grounds that doing so can help keep them out of the hands of bad actors.

DeepSeek has made those narratives a bit messier. "We have been on the wrong side of history here and need to figure out a different open-source strategy," OpenAI's Sam Altman said in a Reddit AMA on Friday, which is surprising given OpenAI's past stance. Others, including President Trump, doubled down on the need to make the US more competitive on AI, seeing DeepSeek's success as a wake-up call. Dario Amodei, a founder of Anthropic, said it's a reminder that the US needs to tightly control which types of advanced chips make their way to China in the coming years, and some lawmakers are pushing the same point.

The coming months, and future launches from DeepSeek and others, will stress-test every single one of these arguments.

Now read the rest of The Algorithm

Deeper Learning

OpenAI launches a research tool

On Sunday, OpenAI launched a tool called Deep Research. You can give it a complex question to look into, and it will spend up to 30 minutes reading sources, compiling information, and writing a report for you.
It's brand new, and we haven't tested the quality of its outputs yet. Since its computations take so much time (and therefore energy), right now it's available only to users on OpenAI's paid Pro tier ($200 per month), and the number of queries they can make per month is limited.

Why it matters: AI companies have been competing to build useful agents that can do things on your behalf. On January 23, OpenAI launched an agent called Operator that could use your computer for you to do things like book restaurants or check out flight options. The new research tool signals that OpenAI is not just trying to make these mundane online tasks slightly easier; it wants to position AI as able to handle professional research tasks. It claims that Deep Research accomplishes in tens of minutes what would take a human many hours. Time will tell if users will find it worth the high costs and the risk of including wrong information. Read more from Rhiannon Williams.

Bits and Bytes

Déjà vu: Elon Musk takes his Twitter takeover tactics to Washington
Federal agencies have offered exits to millions of employees and tested the prowess of engineers, just like when Elon Musk bought Twitter. The similarities have been uncanny. (The New York Times)

AI's use in art and movies gets a boost from the Copyright Office
The US Copyright Office finds that art produced with the help of AI should be eligible for copyright protection under existing law in most cases, but wholly AI-generated works probably are not. What will that mean? (The Washington Post)

OpenAI releases its new o3-mini reasoning model for free
OpenAI just released a reasoning model that's faster, cheaper, and more accurate than its predecessor. (MIT Technology Review)

Anthropic has a new way to protect large language models against jailbreaks
This line of defense could be the strongest yet. But no shield is perfect. (MIT Technology Review)