• MultiVersus will shut down on May 30 after disappointing performance
    www.gamesindustry.biz
    Season 5 will mark the end of the Warner Bros. fighting game. News by Rachel Weber, Senior Editor. Published on Jan. 31, 2025.
The Warner Bros. platform fighting game MultiVersus will shut down on May 30, according to a statement by developer Player First Games. The game was released on May 28, 2024. "We have an important update to share regarding MultiVersus. After careful consideration, our next Season will serve as the final seasonal content update for the game," said a post on the company's official blog. "All of us on the Player First Games and Warner Bros. Games teams have poured our heart and soul into this game. We will be forever grateful for the incredible support of the MultiVersus community throughout this journey."
For players, online features will remain accessible until the end of the game's Season 5, on May 30 at 9am PST. Real-money transactions for the game will cease as of today. Players will still be able to play an offline version of the game, solo or against AI, if they download the latest version of the game and log in before May 30.
Late last year, Warner Bros. Discovery President and CEO David Zaslav and Chief Financial Officer Gunnar Wiedenfels were clear that the game was not performing as expected. "Results were impacted by games, for which we took another $100+ million impairment due to the underperforming releases, primarily MultiVersus, this quarter, bringing the total writedown year-to-date to over $300 million in our games business, the key factor in this year's Studios profit decline," Zaslav said. "In Q4, we expect games to be flat to modestly better year-over-year, as last year's launch of Hogwarts Legacy on the Switch platform in November is offset by lower costs."
"But even in an industry of hits and misses, we must acknowledge that our studio business must deliver more consistently," he added. "This applies to our games business, which we recognise is substantially underperforming its potential right now."
  • Apple will pay $20 million to settle Apple Watch battery swelling lawsuit
    www.theverge.com
    Eligible models for a payout include the Series 0, 1, 2, and 3. Apple has agreed to a $20 million settlement in a class action lawsuit over battery swelling in early Apple Watch models. If you experienced the issue and owned an Apple Watch Series 0, 1, 2, or 3, you may be eligible for a small payout. The lawsuit, Smith et al. v. Apple Inc., was filed in the United States District Court for the Northern District of California. In both the settlement agreement and claim website, Apple explicitly denies that its smartwatches ever had battery swelling issues and denies all allegations of wrongdoing and liability. Instead, it says that Apple is choosing to settle to avoid further costs of litigation. In a statement sent to The Verge, Apple spokesperson Aushawna Collins says the company "strongly disagree[s] with the claims made against these early generation Apple Watch models."
To be eligible for a payout, you have to have owned an eligible watch model and have reported any potential battery swelling issues to Apple between April 24th, 2015, and February 6th, 2024. Anyone who fits those criteria has until April 10th to confirm or update their payment information to receive a payout. According to the settlement's FAQ site, the payment is estimated to be roughly $20 to $50 per covered watch. Accepting a payment means you also give up any future action regarding battery issues on these particular watches. Those who do not wish to be part of the settlement have until February 24th, 2025 to exclude themselves or object to the settlement. Update, January 31st: Added comment from Apple.
  • WhatsApp disrupts spyware campaign targeting journalists
    www.theverge.com
    WhatsApp says it disrupted a spyware campaign last month that targeted journalists and civil society members, according to reports from The Guardian and Reuters. The campaign originated from an Israeli spyware company called Paragon Solutions and impacted around 90 users. WhatsApp told The Guardian that it has reached out to affected users, saying it had "high confidence" that they were targeted and "possibly compromised." The Meta-owned app also sent a cease-and-desist order to Paragon and is exploring its legal options, The Guardian reports.
Paragon, which Reuters called a competitor to Pegasus maker NSO Group, bills itself as an "ethical cyber defense" company. It was acquired by the Florida-based private investment firm AE Industrial Partners last year, while a recent report from Wired revealed that US Immigration and Customs Enforcement signed a $2 million contract with Paragon in September 2024. WhatsApp sued NSO Group in 2019 for targeting 1,400 users, including journalists, activists, and government officials. The spyware company has since been found liable.
"This is the latest example of why spyware companies must be held accountable for their unlawful actions," WhatsApp said in a statement to The Guardian. "WhatsApp will continue to protect people's ability to communicate privately." Paragon didn't immediately respond to The Verge's request for more information.
  • DeepSeek Fine-Tuning Made Simple: Create Custom AI Models with Python
    towardsai.net
    Author: Krishan Walia. Originally published on Towards AI. Last updated on January 31, 2025 by the Editorial Team.
Learn to fine-tune the DeepSeek R1 model for your use cases. While everyone is racing to build applications on ChatGPT, savvy developers are quietly discovering DeepSeek-R1's fine-tuning capabilities, a hidden gem that turns a general-purpose AI into a specialized digital expert. Through this article, you will learn how to turn the general-purpose DeepSeek R1 model into a specialized, domain-specific LLM.
An emerging group of developers and founders are not just discovering the latest and well-performing DeepSeek-R1 but are also looking for ways to integrate this model into their own products. By fine-tuning the model, we can make it answer in a specialized, more domain-specific way. With its advanced reasoning capabilities, DeepSeek becomes an excellent choice for almost every task that involves thinking or problem-solving, all in a more organized and thoughtful manner. In this article, we will dive into the process of fine-tuning the DeepSeek-R1 model using Python. The full walkthrough is available on Medium.
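The full tutorial is behind Medium's paywall, but the first step of any supervised fine-tuning run is turning domain Q&A pairs into training records. Here is a minimal sketch of that step; the chat-template tags and field names are illustrative assumptions, not the article's actual code.

```python
# Minimal sketch: turning domain (question, answer) pairs into records
# for supervised fine-tuning. The template tags and field names here
# are illustrative assumptions, not the article's actual code.

def format_example(question, answer, system="You are a domain expert."):
    """Build one training record in a simple chat-style template."""
    prompt = (
        f"<|system|>{system}\n"
        f"<|user|>{question}\n"
        f"<|assistant|>"
    )
    return {"prompt": prompt, "completion": answer}

def build_dataset(pairs):
    """Convert raw (question, answer) tuples into fine-tuning records."""
    return [format_example(q, a) for q, a in pairs]

records = build_dataset([
    ("What is LoRA?", "A parameter-efficient fine-tuning method."),
])
print(records[0]["prompt"])
```

A dataset of such records would then be fed to a trainer (for example, a parameter-efficient LoRA run) against a DeepSeek-R1 checkpoint; the exact trainer setup depends on the library you choose.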
  • The Substance's Scariest Scene Is All About Demi Moore's Performance
    www.denofgeek.com
    It's already a rare win for a horror movie to get an Academy Award nomination for Best Picture, especially one as gleefully gross as Coralie Fargeat's The Substance. It would be rarer still for one to actually win, but Demi Moore's lead performance has collected at least one hefty early award at the Golden Globes, and she's now the odds-on favorite for Oscar night. No stranger to baring it all onscreen in previous roles, including Striptease and The Scarlet Letter, Moore's turn as an aging actress in crisis, and the nudity required for its body-swapping plot, is no longer a cause for moral panic. Instead Moore is celebrated as brave and transgressive, almost to the point of condescension, by virtue of her getting naked onscreen in her 60s.
Yet the scariest part of Elisabeth Sparkle's descent into self-absorbed madness and monstrosity isn't during all of those spine-splitting scenes of her giving birth to her younger, better half, Sue (a luminous Margaret Qualley, who was unfortunately among the actors snubbed by this year's Academy). Moore's physicality as she convulses on her bathroom floor and contorts into an ancient crone is brave, no doubt. But Elisabeth is most vulnerable, and affecting, in the quiet moments when she looks at herself in the mirror. If there's one scene that encapsulates The Substance, one perfect moment of realness in its glossy, gory satire that makes its audience squirm, it's directly in the middle.
With her acting prospects declared dead on her 50th birthday, Elisabeth turns to the mysterious Substance to get a new lease on life, thinking nothing about its true cost. But as younger Sue starts to steal vitality from her matrix mother, Elisabeth becomes even more obsessed with chasing the high of external validation. She's lived her life in the public eye; her sex appeal to men is her entire sense of worth. So when Sue is offline, Elisabeth has even less of an identity, a life, than she did before taking the Substance. She has no purpose. Thus she spends all of her time alone, eating in front of the TV.
Then she remembers Fred, her dorky high school classmate. She wouldn't have given him the time of day before, but now, lost, Elisabeth needs her fix of affirmation and calls the man who still remembers her as "the most beautiful girl in the whole wide world" and asks him to dinner. She sighs in relief when he naturally, excitedly, accepts. What happens next is devastating.
You Are One
Demi Moore's performance in her date-night preparation scene is the hinge of The Substance: we are all at war with ourselves. Maybe Fred is the thing she can use to fill those empty hours when she's not Sue. Maybe not. Fargeat plays up Elisabeth's anxiety with perfectly timed tension. The music gets high-pitched and ominous as Elisabeth tries on outfits, settling on one that does look too sexy and fussy for a casual Italian dinner with an old friend. But, you see, this is her last grasp at playing the bombshell, even if it's for an audience of one.
The camera cuts between Elisabeth checking her appearance in the mirror, the ticking clock, the sleeping beauty of Sue in her time-out closet, and the giant, taunting Sue billboard outside her window. The more Elisabeth compares herself to Sue, the more her self-doubt increases, and this is where the physical mastery of Moore's performance is at its best, in those growing cracks in the perfect Elisabeth mask.
If there's a human alive who hasn't even once looked in the mirror and hated their own reflection, well, they might not be human at all. We've all been in Elisabeth's shoes, trying to hype ourselves up for a big event. But when you're trapped inside with your own dark thoughts for too long, it's easy to become your own worst enemy. It's the voice inside your head that makes you try on every outfit in your closet and decide not one single thing looks right. It tells you that your hair is too thin, or it won't ever sit right. You look tired. You look old. Who are you trying to fool in that get-up? You look ridiculous. No one will want you, not even poor, dorky Fred. You're stupid. You're worthless. It's pathetic to even try. You shouldn't apply for that dream job. You shouldn't go to that meet-up. Maybe you shouldn't leave the house at all. Maybe you even get to a point where you can't leave the house at all.
Elisabeth's total desperation builds to an explosion of self-hate and ravaged makeup, and she is once again alone, in the dark, staring at Fred's incoming texts. Poor, dorky Fred. There was a moment where Fargeat could have shown a crack in Fred's facade as well. Another nice guy turning nasty in the face of female rejection is certainly not unheard of. It would've almost been a relief if Fred had called Elisabeth a "fucking ugly old bitch" for standing him up. Bullet dodged. But Fred's seemingly genuine concern makes Elisabeth's inability to leave her apartment even more heartbreaking. The date with Fred was the final off-ramp Elisabeth could have taken. Maybe she could have found enough with him to keep herself from slipping away into Sue and the grotesquerie that soon follows, but Elisabeth couldn't even let herself try. We are all at war with ourselves, and who knows better how to sabotage you than you?
After that aborted attempt, The Substance descends into Requiem for a Dream levels of unsubtle disaster. Elisabeth Sparkle may be just as deluded and lost as Sarah Goldfarb (played by Ellen Burstyn in another Oscar-nominated performance), but Demi Moore, and the pop culture baggage she brings, injects Elisabeth with an ironic, gleeful rage that turns The Substance's final half into an over-the-top bloodbath condemning the unfair, ridiculous industry that's destroying Elisabeth Sparkle and could be rewarding Demi Moore with its top prize; Hollywood and its monster, they are one.
  • Here's everything Apple TV+ has coming in February
    9to5mac.com
    Apple TV+ kicked off 2025 with two big returning series, Severance and Mythic Quest, that will continue airing new episodes throughout February. But the streamer also has a varied lineup of fresh debuts coming in the month ahead. Highlights include the Anya Taylor-Joy movie The Gorge, medical drama Berlin ER, a new season of Surface, a sports docu-series, a new children's show, and more. Here's everything coming to Apple TV+ in February.
Love You to Death (A muerte)
When: February 5
What: TV Show
Genre: Romantic Comedy (Spanish language)
Love You to Death (A muerte) tells the story of the cautious Raúl (Joan Amargós), who reconnects with free-spirited and newly pregnant Marta (Verónica Echegui) following his heart cancer diagnosis. They resume a friendship that began in childhood, and in a relationship brought together by fate, begin to test their beliefs about love. Can the commitment-phobic Marta fall in love? And can Raúl meet the love of his life?
Goldie
When: February 14
What: TV Show
Genre: Kids & Family
Inspired by Emily Brundige's award-winning 2019 animated short film of the same name, Goldie follows Goldie, a giant girl with a big heart, as she sets off on epic adventures with her best friends in their beloved town of Boysenberg. Across 13 half-hour episodes, together they learn that being different is something to celebrate, and that there's space for everyone in this world, even giants.
The Gorge
When: February 14
What: Movie
Genre: Thriller
In The Gorge, two highly trained operatives (Miles Teller and Anya Taylor-Joy) are appointed to posts in guard towers on opposite sides of a vast and highly classified gorge, protecting the world from an undisclosed, mysterious evil that lurks within. They bond from a distance while trying to stay vigilant in defending against an unseen enemy. When the cataclysmic threat to humanity is revealed to them, they must work together in a test of both their physical and mental strength to keep the secret in the gorge before it's too late.
Onside: Major League Soccer
When: February 21
What: Limited Series
Genre: Sports Documentary
Go beyond the pitch with the personalities that power MLS in Onside: Major League Soccer. With unprecedented access to players, coaches, and clubs, this series explores the electrifying moments and captivating stories that made the 2024 season unforgettable.
Surface (season 2)
When: February 21
What: TV Show
Genre: Mystery, Drama
Set in high-end San Francisco, Surface stars Gugu Mbatha-Raw (The Morning Show) as Sophie, a woman who has suffered a traumatic head injury that has left her with extreme memory loss, believed to be a result of a suicide attempt. As Sophie embarks on a quest to put the pieces of her life back together with the help of her husband and friends, she begins to question whether or not the truth she is told is in fact the truth she has lived. Through twists and turns and a shocking love triangle, this sexy, elevated thriller asks: What if you woke up one day and didn't know your own secrets? Surface is a story of self-discovery which contemplates if we are pre-programmed to become who we are, or if we choose our own identity.
Berlin ER
When: February 26
What: TV Show
Genre: Drama (German language)
Managing a chaotic emergency room in the toughest and most overcrowded hospital in Berlin is no small task for the young Dr. Parker, who is seeking a fresh start in the big city after her private life implodes in Munich. In Berlin ER, when she tries to implement necessary reforms, Parker is confronted with resistance from the underpaid, ill-equipped, and chronically fatigued hospital staff, who only survive with an indispensable dose of black humor. But in the face of an increasingly merciless healthcare system, the battered team must put aside their differences and pull together to save lives.
Apple TV+ is available for $9.99 per month and features hit TV shows and movies like Ted Lasso, Severance, The Morning Show, Silo, and Shrinking. Which of these Apple TV+ February releases are you most excited about? Let us know in the comments.
Add 9to5Mac to your Google News feed. FTC: We use income-earning auto affiliate links. You're reading 9to5Mac, experts who break news about Apple and its surrounding ecosystem, day after day.
  • Apple seeks delay in Google search case so it won't suffer "irreparable harm"
    9to5mac.com
    Google's $20 billion per year search deal with Apple was ruled last summer to violate antitrust law. Though Apple isn't a defendant in the case, its outcome could have a big impact on the company. As a result, Apple has just asked for a stay on proceedings.
New filing shows Apple's desire to get involved with proceedings
Earlier this week, Apple was denied a motion to present witnesses at the upcoming Google trial. Apple itself is not accused of wrongdoing in the case, but back when it first made the motion, the company explained: "Google can no longer adequately represent Apple's interests: Google must now defend against a broad effort to break up its business units." In other words, even though Apple isn't formally involved in the case, it knows that Google has to focus on its own defense interests, so Apple wants the chance to represent itself. Since its motion was just denied, Apple has followed up with another option: requesting a stay on proceedings to protect its rights pending appeal. The new filing reads:
"Absent a stay, Apple will suffer irreparable harm: the deprivation of its right to participate as a party in the remedial phase of this case moving forward, including possibly at the trial itself, while its undisputed property rights are adjudicated. These harms are magnified by a position Plaintiffs revealed in a recent meet-and-confer with Apple."
The full document cites that courts have commonly granted stays pending appeal of orders denying intervention, and thus Apple believes this court should do the same. If Apple isn't able to participate as desired, it has another ask instead: "In the alternative, the Court should at minimum afford Apple full access to the record as a nonparty until the D.C. Circuit rules." One way or another, Apple believes its participation in court proceedings, to some degree or another, will be necessary to avoid having to suffer "irreparable harm." Do you think Apple should be able to participate in Google's case? Why or why not? Let us know in the comments.
  • Trump Broke the Federal Email System and Government Employees Got Blasted With Astonishingly Vulgar Messages
    futurism.com
    But her emails...
Email Empowerment
Turns out that putting underqualified kids in charge of the federal government's HR agency wasn't the smartest move. Last night, an exploit in the Office of Personnel Management's (OPM) new home-cooked email server seems to have made it possible for anyone with an email address to blast messages to vast numbers of federal employees. As a result, over 13,000 employees with the National Oceanic and Atmospheric Administration found their inboxes bombarded with spam and messages from vulgar trolls. Some users signed the NOAA up for newsletters from entities like the Church of Scientology, or the Perfect Jean. "Welcome to Jean Perfection," a screenshot reads. One particularly vulgar email offered pointers on Trump's alleged performance at a sexual act. An "Important Weather Alert" warned that the next four years have a 99 percent chance of fecal showers. "Aren't you tired of working for a complete c***?" asked one sender. A missive from a sender identified as "Craig" simply reads "yo."
Ken Klippenstein, the national security reporter who revealed the breach, once again took the opportunity to plug his infamous newsletter with the subject line: "urgent, time sensitive." If you feel this paints a pretty grim picture of the state of our government agencies, you're not alone. "Goes to show you how fast this [new comms system] was cobbled together," one NOAA employee told Klippenstein. "No security or screening on this address."
Spam of God
The whole thing apparently stems from an overhaul at the OPM led by oligarch-in-chief Elon Musk. On Tuesday, Wired reported that Musk had been given free rein to replace the agency's high-level staff with lackeys from his previous ventures. Those included a 21-year-old who had previously worked for Peter Thiel, and a summer intern from Neuralink who just graduated high school. It also included Amanda Scales, a former xAI HR staffer who is reportedly in place as the OPM's new chief of staff. Scales is allegedly implementing what some have called a hostile takeover of the OPM, axing Chief Information Officer Melvin Brown II for refusing to implement the new regime's in-house email server. Brown evidently made the right call, as the new system, on top of all the aforementioned drama, was immediately hit with a class-action lawsuit for failing to pass Bush-era cybersecurity checks.
All this server drama is important, as it's reportedly key to DOGE's goal of gathering information on every government employee. Tuesday's much-reported "fork in the road" email memos came from this unsecured server, which unintentionally revealed the involvement of two non-government individuals, both heavily involved in Project 2025. As Trump's acolytes look to gut the federal government and install their own yes men, the drama swirling around this email server will have a lot to reveal about the new administration's unprecedented strategy.
More on email leaks: In Leaked Email, Elon Musk Admits Defeat on Twitter
  • OpenAI Strikes Deal With US Government to Use Its AI for Nuclear Weapon Security
    futurism.com
    Remember the plot of the 1984 sci-fi blockbuster "The Terminator"? "There was a nuclear war," a character explains. "Defense network computers. New... powerful... hooked into everything, trusted to run it all. They say it got smart, a new order of intelligence. Then it saw all people as a threat, not just the ones on the other side. Decided our fate in a microsecond: extermination." It seems like either the execs at OpenAI have never seen it or they're working overtime to make that premise a reality.
Don't believe us? OpenAI has announced that the US National Laboratories will use its deeply flawed AI models to help with a "comprehensive program in nuclear security." As CNBC reports, up to 15,000 scientists working at the institutions will get access to OpenAI's latest o1 series of AI models, the ones that Chinese startup DeepSeek embarrassed on the world stage earlier this month. According to OpenAI CEO Sam Altman, who announced the partnership at an event in Washington, DC, the tech will be "focused on reducing the risk of nuclear war and securing nuclear materials and weapons worldwide," as quoted by CNBC.
If any alarm bells are ringing by this point, you're not alone. We've seen plenty of instances of OpenAI's AI models leaking sensitive user data and hallucinating false claims with abandon. OpenAI has been making a huge push into government. Earlier this week, the Sam Altman-led company released ChatGPT Gov, a platform specifically designed for US government use that focuses on security. But whether the company can deliver on some sky-high expectations while also ensuring that its frequently lying AI chatbots won't leak the nuclear codes or trigger the next nuclear war is anyone's guess.
The news comes after the Wall Street Journal reported that OpenAI is in early talks for a new round of funding that would value it at a gargantuan $340 billion, double its previous valuation last year. Altman has also fully embraced President Donald Trump, gifting him $1 million for his inauguration and claiming that Trump had "really changed my perspective on him" after trashing him in years past. OpenAI also signed onto Trump's $500 billion AI infrastructure deal, dubbed Stargate, with the plan of contributing tens of billions of dollars within the next year.
Whether the company's o1 reasoning models will prove useful in any meaningful way to the researchers at the US National Laboratories remains to be seen. But given the widespread dismantling of regulations under the Trump administration, it also feels like an unbelievably precarious moment to be handing over any amount of control over nuclear weapons to a busted AI system.
  • How DeepSeek ripped up the AI playbook, and why everyone's going to follow its lead
    www.technologyreview.com
    When the Chinese firm DeepSeek dropped a large language model called R1 last week, it sent shock waves through the US tech industry. Not only did R1 match the best of the homegrown competition, it was built for a fraction of the cost, and given away for free. The US stock market lost $1 trillion, President Trump called it a wake-up call, and the hype was dialed up yet again. "DeepSeek R1 is one of the most amazing and impressive breakthroughs I've ever seen, and as open source, a profound gift to the world," Silicon Valley's kingpin investor Marc Andreessen posted on X.
But DeepSeek's innovations are not the only takeaway here. By publishing details about how R1 and a previous model called V3 were built and releasing the models for free, DeepSeek has pulled back the curtain to reveal that reasoning models are a lot easier to build than people thought. The company has closed the gap on the world's very top labs.
The news kicked competitors everywhere into gear. This week, the Chinese tech giant Alibaba announced a new version of its large language model Qwen, and the Allen Institute for AI (AI2), a top US nonprofit lab, announced an update to its large language model Tulu. Both claim that their latest models beat DeepSeek's equivalent. Sam Altman, cofounder and CEO of OpenAI, called R1 impressive "for the price" but hit back with a bullish promise: "We will obviously deliver much better models." OpenAI then pushed out ChatGPT Gov, a version of its chatbot tailored to the security needs of US government agencies, in an apparent nod to concerns that DeepSeek's app was sending data to China. There's more to come.
DeepSeek has suddenly become the company to beat. What exactly did it do to rattle the tech world so fully? Is the hype justified? And what can we learn from the buzz about what's coming next? Here's what you need to know.
Training steps
Let's start by unpacking how large language models are trained. There are two main stages, known as pretraining and post-training.
Pretraining is the stage most people talk about. In this process, billions of documents (huge numbers of websites, books, code repositories, and more) are fed into a neural network over and over again until it learns to generate text that looks like its source material, one word at a time. What you end up with is known as a base model. Pretraining is where most of the work happens, and it can cost huge amounts of money. But as Andrej Karpathy, a cofounder of OpenAI and former head of AI at Tesla, noted in a talk at Microsoft Build last year: "Base models are not assistants. They just want to complete internet documents."
Turning a large language model into a useful tool takes a number of extra steps. This is the post-training stage, where the model learns to do specific tasks like answer questions (or answer questions step by step, as with OpenAI's o3 and DeepSeek's R1). The way this has been done for the last few years is to take a base model and train it to mimic examples of question-answer pairs provided by armies of human testers. This step is known as supervised fine-tuning. OpenAI then pioneered yet another step, in which sample answers from the model are scored, again by human testers, and those scores are used to train the model to produce future answers more like those that score well and less like those that don't. This technique, known as reinforcement learning with human feedback (RLHF), is what makes chatbots like ChatGPT so slick. RLHF is now used across the industry.
But those post-training steps take time. What DeepSeek has shown is that you can get the same results without using people at all, at least most of the time. DeepSeek replaces supervised fine-tuning and RLHF with a reinforcement-learning step that is fully automated. Instead of using human feedback to steer its models, the firm uses feedback scores produced by a computer.
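The idea of computer-produced feedback scores is easy to sketch for problems with checkable answers. The following toy is only an illustration of the concept, not DeepSeek's code: for a math question with a known solution, a program can assign the reward with no human in the loop.

```python
import re

# Toy "verifiable reward": score a model's answer to a math problem by
# extracting the final number and comparing it to the known solution.
# An illustration of computer-scored feedback, not DeepSeek's code.

def extract_final_number(answer_text):
    """Pull the last number out of a (possibly step-by-step) answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", answer_text)
    return float(numbers[-1]) if numbers else None

def reward(answer_text, ground_truth):
    """Return 1.0 if the final answer matches the known solution, else 0.0."""
    value = extract_final_number(answer_text)
    return 1.0 if value is not None and value == ground_truth else 0.0

print(reward("First add 2 and 3 to get 5, then double it: 10", 10.0))  # 1.0
print(reward("The answer is 7", 10.0))                                 # 0.0
```

Because the scoring program, unlike a human rater, is free and instant, it can grade millions of sampled answers, which is what makes the fully automated reinforcement-learning step practical.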
"Skipping or cutting down on human feedback, that's a big thing," says Itamar Friedman, a former research director at Alibaba and now cofounder and CEO of Qodo, an AI coding startup based in Israel. "You're almost completely training models without humans needing to do the labor."
Cheap labor
The downside of this approach is that computers are good at scoring answers to questions about math and code but not very good at scoring answers to open-ended or more subjective questions. That's why R1 performs especially well on math and code tests. To train its models to answer a wider range of non-math questions or perform creative tasks, DeepSeek still has to ask people to provide the feedback. But even that is cheaper in China. "Relative to Western markets, the cost to create high-quality data is lower in China and there is a larger talent pool with university qualifications in math, programming, or engineering fields," says Si Chen, a vice president at the Australian AI firm Appen and a former head of strategy at both Amazon Web Services China and the Chinese tech giant Tencent.
DeepSeek used this approach to build a base model, called V3, that rivals OpenAI's flagship model GPT-4o. The firm released V3 a month ago. Last week's R1, the new model that matches OpenAI's o1, was built on top of V3. To build R1, DeepSeek took V3 and ran its reinforcement-learning loop over and over again. In 2016 Google DeepMind showed that this kind of automated trial-and-error approach, with no human input, could take a board-game-playing model that made random moves and train it to beat grand masters. DeepSeek does something similar with large language models: potential answers are treated as possible moves in a game. To start with, the model did not produce answers that worked through a question step by step, as DeepSeek wanted. But by scoring the model's sample answers automatically, the training process nudged it bit by bit toward the desired behavior.
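The trial-and-error loop described above (sample several candidate answers, score them automatically, reinforce the ones that score well) can be sketched abstractly. This toy stands in for the real thing: the fake "model" just guesses numbers, and the function names are my own, not DeepSeek's.

```python
import random

# Toy sketch of the automated trial-and-error loop: treat each sampled
# answer as a possible "move," score every move programmatically, and
# return the best-scoring one as the behavior to reinforce. The fake
# sampler below stands in for a real language model. Illustrative only.

def sample_answers(question, n=4):
    """Stand-in for a model generating n candidate answers."""
    return [str(random.randint(1, 10)) for _ in range(n)]

def reward(answer, ground_truth):
    """Automated scoring: 1.0 for a correct answer, 0.0 otherwise."""
    return 1.0 if answer == ground_truth else 0.0

def best_candidate(question, ground_truth, n=4):
    """Score all candidates and return the (reward, answer) to reinforce."""
    candidates = sample_answers(question, n)
    scored = [(reward(c, ground_truth), c) for c in candidates]
    return max(scored, key=lambda rc: rc[0])

random.seed(0)
print(best_candidate("What is 2+2?", "4", n=8))
```

In real training, the rewards would feed a gradient update that makes high-scoring answers more likely; run over and over, this is the nudging "bit by bit toward the desired behavior" the article describes.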
Eventually, DeepSeek produced a model that performed well on a number of benchmarks. But this model, called R1-Zero, gave answers that were hard to read and were written in a mix of multiple languages. To give it one last tweak, DeepSeek seeded the reinforcement-learning process with a small data set of example responses provided by people. Training R1-Zero on those produced the model that DeepSeek named R1.

There's more. To make its use of reinforcement learning as efficient as possible, DeepSeek has also developed a new algorithm called Group Relative Policy Optimization (GRPO). It first used GRPO a year ago, to build a model called DeepSeekMath.

We'll skip the details: you just need to know that reinforcement learning involves calculating a score to determine whether a potential move is good or bad. Many existing reinforcement-learning techniques require a whole separate model to make this calculation. In the case of large language models, that means a second model that could be as expensive to build and run as the first. Instead of using a second model to predict a score, GRPO just makes an educated guess. It's cheap, but still accurate enough to work.

A common approach

DeepSeek's use of reinforcement learning is the main innovation that the company describes in its R1 paper. But DeepSeek is not the only firm experimenting with this technique. Two weeks before R1 dropped, a team at Microsoft Asia announced a model called rStar-Math, which was trained in a similar way. "It has similarly huge leaps in performance," says Matt Zeiler, founder and CEO of the AI firm Clarifai.

AI2's Tulu was also built using efficient reinforcement-learning techniques (but on top of, not instead of, human-led steps like supervised fine-tuning and RLHF). And the US firm Hugging Face is racing to replicate R1 with OpenR1, a clone of DeepSeek's model that Hugging Face hopes will expose even more of the ingredients in R1's special sauce.
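GRPO's "educated guess" is worth sketching, because the idea is simple: rather than training a separate value model to estimate how good an answer should be, sample a group of answers to the same prompt and judge each one against the group's own average. This is a minimal illustration of that group-relative scoring, not DeepSeek's implementation:

```python
import statistics

def grpo_advantages(group_rewards: list[float]) -> list[float]:
    """Score each answer relative to its group, GRPO-style.

    Instead of a learned value model predicting a baseline score,
    the baseline is simply the mean reward of the sampled group,
    normalized by the group's spread.
    """
    mean = statistics.mean(group_rewards)
    std = statistics.pstdev(group_rewards) or 1.0  # guard against a zero-spread group
    return [(r - mean) / std for r in group_rewards]

# Answers that beat the group average get positive advantages (reinforced);
# answers below it get negative ones (discouraged).
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # [1.0, -1.0, -1.0, 1.0]
```

Because the baseline is computed from the samples already in hand, the second, equally expensive model disappears from the pipeline entirely, which is where the efficiency gain comes from.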
What's more, it's an open secret that top firms like OpenAI, Google DeepMind, and Anthropic may already be using their own versions of DeepSeek's approach to train their new generation of models. "I'm sure they're doing almost the exact same thing, but they'll have their own flavor of it," says Zeiler.

But DeepSeek has more than one trick up its sleeve. It trained its base model V3 to do something called multi-token prediction, where the model learns to predict a string of words at once instead of one at a time. This training is cheaper and turns out to boost accuracy as well. "If you think about how you speak, when you're halfway through a sentence, you know what the rest of the sentence is going to be," says Zeiler. "These models should be capable of that too."

It has also found cheaper ways to create large data sets. To train last year's model DeepSeekMath, it took a free data set called Common Crawl (a huge number of documents scraped from the internet) and used an automated process to extract just the documents that included math problems. This was far cheaper than building a new data set of math problems by hand. It was also more effective: Common Crawl includes a lot more math than any other specialist math data set that's available.

And on the hardware side, DeepSeek has found new ways to juice old chips, allowing it to train top-tier models without coughing up for the latest hardware on the market. "Half their innovation comes from straight engineering," says Zeiler: "They definitely have some really, really good GPU engineers on that team."

Nvidia provides software called CUDA that engineers use to tweak the settings of their chips. But DeepSeek bypassed this code using assembler, a programming language that talks to the hardware itself, to go far beyond what Nvidia offers out of the box. "That's as hardcore as it gets in optimizing these things," says Zeiler. "You can do it, but basically it's so difficult that nobody does."
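The Common Crawl trick above is essentially automated filtering: scan scraped documents for signals that math is present and keep only those pages. The signals and thresholds below are hypothetical stand-ins to show the shape of such a filter, not DeepSeek's actual classifier (which the DeepSeekMath work built with a trained model rather than hand-written rules):

```python
import re

# Crude textual signals that a document contains math: arithmetic
# expressions, LaTeX commands, or problem-statement phrases.
MATH_SIGNALS = re.compile(
    r"(\d+\s*[+\-*/=]\s*\d+|\\frac|\\sum|\btheorem\b|\bsolve for\b)",
    re.IGNORECASE,
)

def looks_mathy(document: str, min_hits: int = 2) -> bool:
    """Keep a document only if it trips enough math signals."""
    return len(MATH_SIGNALS.findall(document)) >= min_hits

docs = [
    "Solve for x: 3x + 4 = 19. Then check that 3 * 5 = 15.",
    "The cat sat on the mat and watched the rain.",
]
print([looks_mathy(d) for d in docs])  # [True, False]
```

Run over billions of scraped pages, even a cheap filter like this turns a general web crawl into a specialist training set at a fraction of the cost of writing problems by hand.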
DeepSeek's string of innovations on multiple models is impressive. But it also shows that the firm's claim to have spent less than $6 million to train V3 is not the whole story. R1 and V3 were built on a stack of existing tech. "Maybe the very last step, the last click of the button, cost them $6 million, but the research that led up to that probably cost 10 times as much, if not more," says Friedman. And in a blog post that cut through a lot of the hype, Anthropic cofounder and CEO Dario Amodei pointed out that DeepSeek probably has around $1 billion worth of chips, an estimate based on reports that the firm in fact used 50,000 Nvidia H100 GPUs.

A new paradigm

But why now? There are hundreds of startups around the world trying to build the next big thing. Why have we seen a string of reasoning models like OpenAI's o1 and o3, Google DeepMind's Gemini 2.0 Flash Thinking, and now R1 appear within weeks of each other?

The answer is that the base models (GPT-4o, Gemini 2.0, V3) are all now good enough to have reasoning-like behavior coaxed out of them. "What R1 shows is that with a strong enough base model, reinforcement learning is sufficient to elicit reasoning from a language model without any human supervision," says Lewis Tunstall, a scientist at Hugging Face.

In other words, top US firms may have figured out how to do it but were keeping quiet. "It seems that there's a clever way of taking your base model, your pretrained model, and turning it into a much more capable reasoning model," says Zeiler. "And up to this point, the procedure that was required for converting a pretrained model into a reasoning model wasn't well known. It wasn't public."

What's different about R1 is that DeepSeek published how they did it. "And it turns out that it's not that expensive a process," says Zeiler. "The hard part is getting that pretrained model in the first place." As Karpathy revealed at Microsoft Build last year, pretraining a model represents 99% of the work and most of the cost.
If building reasoning models is not as hard as people thought, we can expect a proliferation of free models that are far more capable than we've yet seen. With the know-how out in the open, Friedman thinks, there will be more collaboration between small companies, blunting the edge that the biggest companies have enjoyed. "I think this could be a monumental moment," he says.