Microsoft now hosts AI model accused of copying OpenAI data
arstechnica.com
ET TU, BRUTE?
OpenAI's largest investor now sells access to the "R1" model accused of breaking OpenAI's terms.
Benj Edwards, Jan 30, 2025 11:37 am
Credit: Wong Yu Liang via Getty Images

Fresh on the heels of a controversy in which ChatGPT-maker OpenAI accused the Chinese company behind DeepSeek R1 of using its AI model outputs against its terms of service, OpenAI's largest investor, Microsoft, announced on Wednesday that it will now host DeepSeek R1 on its Azure cloud service.

DeepSeek R1 has been the talk of the AI world for the past week because it is a freely available simulated reasoning model that reportedly matches OpenAI's o1 in performance while allegedly being trained for a fraction of the cost.

Azure allows software developers to rent computing muscle from machines hosted in Microsoft-owned data centers, as well as rent access to software that runs on them.

"R1 offers a powerful, cost-efficient model that allows more users to harness state-of-the-art AI capabilities with minimal infrastructure investment," wrote Microsoft Corporate Vice President Asha Sharma in a news release.

DeepSeek R1 runs at a fraction of the cost of o1, at least through each company's own services. Comparative prices for R1 and o1 were not immediately available on Azure, but DeepSeek lists R1's API cost as $2.19 per million output tokens, while OpenAI's o1 costs $60 per million output tokens. That's a massive discount for a model that performs similarly to o1-pro in various tasks.

Promoting a controversial AI model

On its face, the decision to host R1 on Microsoft servers is not unusual: The company offers access to over 1,800 models on its Azure AI Foundry service with the hopes of allowing software developers to experiment with various AI models and integrate them into their products.
In some ways, whatever model they choose, Microsoft still wins because it's being hosted on the company's cloud service.

In another way, though, the move is a stamp of legitimacy on an AI model that has caused consternation for OpenAI over the past week. The controversy primarily centers on whether DeepSeek used OpenAI's models to produce outputs (synthetic data) that DeepSeek then used to train or fine-tune its own models, a practice often called "distillation," which is against OpenAI's terms of service.

Since the launch of DeepSeek V3 (a large language model that served as the progenitor of R1), users have reported that the model often calls itself ChatGPT, which suggests that at least some ChatGPT-produced data was used to fine-tune V3's behavior. It wouldn't be the first time AI researchers have cribbed off of OpenAI: AI experts accused Elon Musk's xAI of doing something similar with its Grok AI model in December 2023.

And that's not the only issue at hand. In addition to the terms-of-service accusation and testy tweets from OpenAI employees, Microsoft also reportedly launched a probe into DeepSeek after Microsoft's security researchers discovered that the Chinese company may have extracted substantial data for training purposes through OpenAI's API during the fall of 2024, according to Bloomberg.

Despite the controversies, OpenAI CEO Sam Altman welcomed the additional competition from DeepSeek earlier this week. On Monday, Altman tweeted, "deepseek's r1 is an impressive model, particularly around what they're able to deliver for the price. we will obviously deliver much better models and also it's legit invigorating to have a new competitor! we will pull up some releases."

As a response to R1's rise, OpenAI is expected to release o3-mini through ChatGPT as soon as later today.

Benj Edwards, Senior AI Reporter
Benj Edwards is Ars Technica's Senior AI Reporter and founder of the site's dedicated AI beat in 2022.
He's also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.