• In a world where vibe coders are apparently too busy feeling the code to notice their catastrophic errors, Cursor has decided to swoop in like a digital superhero with its new Bugbot. Designed to "save vibe coders from themselves," this AI marvel promises to supercharge error detection. Because let’s be honest, who needs actual coding skills when you can just vibe your way into a bug-free zone?

    Next thing you know, we'll have robots debugging while we meditate on our latest project. Can't wait for the day when these bots will also start sending us inspirational quotes about debugging life!

    #VibeCoding #CursorBugbot #ErrorDetection #AIProgramming #TechHumor
    In a world where vibe coders are apparently too busy feeling the code to notice their catastrophic errors, Cursor has decided to swoop in like a digital superhero with its new Bugbot. Designed to "save vibe coders from themselves," this AI marvel promises to supercharge error detection. Because let’s be honest, who needs actual coding skills when you can just vibe your way into a bug-free zone? Next thing you know, we'll have robots debugging while we meditate on our latest project. Can't wait for the day when these bots will also start sending us inspirational quotes about debugging life! #VibeCoding #CursorBugbot #ErrorDetection #AIProgramming #TechHumor
    www.wired.com
    One of the most popular platforms for AI-assisted programming says the next era of vibe coding is all about supercharging error detection.
    Like
    Love
    Wow
    Sad
    Angry
    173
    · 1 التعليقات ·0 المشاركات ·0 معاينة
  • How AI is reshaping the future of healthcare and medical research

    Transcript       
    PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”          
    This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.   
    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?    
    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.” 
    In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.   
    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open. 
    As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.  
    Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home. 
    Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.     
    Here’s my conversation with Bill Gates and Sébastien Bubeck. 
    LEE: Bill, welcome. 
    BILL GATES: Thank you. 
    LEE: Seb … 
    SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here. 
    LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening? 
    And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?  
    GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines. 
    And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.  
    And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning. 
    LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that? 
    GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, … 
    LEE: Right.  
    GATES: … that is a bit weird.  
    LEE: Yeah. 
    GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training. 
    LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. 
    BUBECK: Yes.  
    LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you. 
    BUBECK: Yeah. 
    LEE: And so what were your first encounters? Because I actually don’t remember what happened then. 
    BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3. 
    I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1. 
    So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts. 
    So this was really, to me, the first moment where I saw some understanding in those models.  
    LEE: So this was, just to get the timing right, that was before I pulled you into the tent. 
    BUBECK: That was before. That was like a year before. 
    LEE: Right.  
    BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4. 
    So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.  
    So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x. 
    And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?  
    LEE: Yeah.
    BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.  
    LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine. 
    And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.  
    And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.  
    I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book. 
    But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements. 
    But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today? 
    You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.  
    Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork? 
    GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.  
    It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision. 
    But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view. 
    LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you? 
    BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong? 
    Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.  
    Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them. 
    And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.  
    Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way. 
    It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine. 
    LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all? 
    GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that. 
    The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa,
    So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.  
    LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking? 
    GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.  
    The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.  
    LEE: Right.  
    GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.  
    LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication. 
    BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI. 
    It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for. 
    LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes. 
    I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?  
    That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that? 
    BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there. 
    Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad. 
    But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model. 
    So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model. 
    LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and … 
    BUBECK: It’s a very difficult, very difficult balance. 
    LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models? 
    GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there. 
    Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?  
    Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there.
    LEE: Yeah.
    GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake. 
    LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on. 
    BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything. 
    That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind. 
    LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two? 
    BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it. 
    LEE: So we have about three hours of stuff to talk about, but our time is actually running low.
    BUBECK: Yes, yes, yes.  
    LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now? 
    GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.  
    The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities. 
    And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period. 
    LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers? 
    GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them. 
    LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.  
    I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why. 
    BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.  
    And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.  
    LEE: Yeah. 
    BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.  
    Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not. 
    Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision. 
    LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist … 
    BUBECK: Yeah.
    LEE: … or an endocrinologist might not.
    BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.
    LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today? 
    BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later. 
    And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …  
    LEE: Will AI prescribe your medicines? Write your prescriptions? 
    BUBECK: I think yes. I think yes. 
    LEE: OK. Bill? 
    GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate?
    And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries. 
    You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that. 
    LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.  
    I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  
    GATES: Yeah. Thanks, you guys. 
    BUBECK: Thank you, Peter. Thanks, Bill. 
    LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.   
    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.  
    And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.  
    One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.  
    HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings. 
    You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.  
    If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  
    I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.  
    Until next time.  
    #how #reshaping #future #healthcare #medical
    How AI is reshaping the future of healthcare and medical research
    Transcript        PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”           This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.      Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent.  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.  But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.  So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.  LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …  BUBECK: It’s a very difficult, very difficult balance.  LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?  GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.  Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.  LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.  BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.  That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind.  LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?  BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.  LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?  GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.  And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.  LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?  GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.  LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.  BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah.  BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.  Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.  LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …  BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?  BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.  And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions?  BUBECK: I think yes. I think yes.  LEE: OK. Bill?  GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.  You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.  LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.   GATES: Yeah. Thanks, you guys.  BUBECK: Thank you, Peter. Thanks, Bill.  LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.  You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.   I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   #how #reshaping #future #healthcare #medical
    How AI is reshaping the future of healthcare and medical research
    www.microsoft.com
    Transcript [MUSIC]      [BOOK PASSAGE]   PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”   [END OF BOOK PASSAGE]     [THEME MUSIC]     This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.   [THEME MUSIC FADES] The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.    [TRANSITION MUSIC]   Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weakness [LAUGHTER] that, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. [LAUGHS]  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSR [Microsoft Research] to join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well. [LAUGHS] My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair. [LAUGHTER] And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE: [LAUGHS] One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce about [LAUGHS] or indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients. [LAUGHTER] Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT (opens in new tab). And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE [United States Medical Licensing Examination], for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential. [LAUGHTER] What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back that [LAUGHS] version of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF [reinforcement learning from human feedback], where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.  But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.  So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.  LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …  BUBECK: It’s a very difficult, very difficult balance.  LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?  GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.  Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.  LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.  BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGI [artificial general intelligence] that kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.  That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects. [LAUGHTER] So it’s … I think it’s an important example to have in mind.  LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?  BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.  LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?  GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.  And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.  LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?  GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.  LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.  BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and see [if you have] produced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab). So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah.  BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.  Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.  LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …  BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?  BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.  And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions?  BUBECK: I think yes. I think yes.  LEE: OK. Bill?  GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelected [LAUGHTER] just on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.  You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.  LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  [TRANSITION MUSIC]  GATES: Yeah. Thanks, you guys.  BUBECK: Thank you, Peter. Thanks, Bill.  LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.  You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  [THEME MUSIC]  I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   [MUSIC FADES]
    0 التعليقات ·0 المشاركات ·0 معاينة
  • Cursor’s Anysphere nabs $9.9B valuation, soars past $500M ARR

    Anysphere, the maker of AI coding assistant Cursor, has raised million at a billion valuation, Bloomberg reported. The round was led by returning investor Thrive Capital, with participation from Andreessen Horowitz, Accel, and DST Global.
    The massive round is Anysphere’s third fundraise in less than a year. The 3-year-old startup secured its previous capital haul of million at a pre-money valuation of billion late last year, as TechCrunch was first to report. 
    AI coding assistants, often referred to as “vibe coders,” have emerged as one of AI’s most popular applications, with Cursor leading the category. Anysphere’s annualized revenuehas been doubling approximately every two months, a person familiar with the company told TechCrunch. The company has surpassed million in ARR, sources told Bloomberg, a 60% increase from the million we reported in mid-April.
    Cursor offers developers tiered pricing. After a two-week free trial, the company converts users into paying customers, who can opt for either a Pro offering or a monthly business subscription.
    Until recently, the majority of the company’s revenue came from individual user subscriptions, Bloomberg reported. However, Anysphere is now offering enterprise licenses, allowing companies to purchase the application for their teams at a higher price point.
    Earlier this year, the company was approached by OpenAI and other potential buyers, but Anysphere turned down those offers. The ChatGPT maker bought Windsurf, another fast-growing AI assistant, reportedly for billion.
    #cursors #anysphere #nabs #99b #valuation
    Cursor’s Anysphere nabs $9.9B valuation, soars past $500M ARR
    Anysphere, the maker of AI coding assistant Cursor, has raised million at a billion valuation, Bloomberg reported. The round was led by returning investor Thrive Capital, with participation from Andreessen Horowitz, Accel, and DST Global. The massive round is Anysphere’s third fundraise in less than a year. The 3-year-old startup secured its previous capital haul of million at a pre-money valuation of billion late last year, as TechCrunch was first to report.  AI coding assistants, often referred to as “vibe coders,” have emerged as one of AI’s most popular applications, with Cursor leading the category. Anysphere’s annualized revenuehas been doubling approximately every two months, a person familiar with the company told TechCrunch. The company has surpassed million in ARR, sources told Bloomberg, a 60% increase from the million we reported in mid-April. Cursor offers developers tiered pricing. After a two-week free trial, the company converts users into paying customers, who can opt for either a Pro offering or a monthly business subscription. Until recently, the majority of the company’s revenue came from individual user subscriptions, Bloomberg reported. However, Anysphere is now offering enterprise licenses, allowing companies to purchase the application for their teams at a higher price point. Earlier this year, the company was approached by OpenAI and other potential buyers, but Anysphere turned down those offers. The ChatGPT maker bought Windsurf, another fast-growing AI assistant, reportedly for billion. #cursors #anysphere #nabs #99b #valuation
    Cursor’s Anysphere nabs $9.9B valuation, soars past $500M ARR
    techcrunch.com
    Anysphere, the maker of AI coding assistant Cursor, has raised $900 million at a $9.9 billion valuation, Bloomberg reported. The round was led by returning investor Thrive Capital, with participation from Andreessen Horowitz, Accel, and DST Global. The massive round is Anysphere’s third fundraise in less than a year. The 3-year-old startup secured its previous capital haul of $100 million at a pre-money valuation of $2.5 billion late last year, as TechCrunch was first to report.  AI coding assistants, often referred to as “vibe coders,” have emerged as one of AI’s most popular applications, with Cursor leading the category. Anysphere’s annualized revenue (ARR) has been doubling approximately every two months, a person familiar with the company told TechCrunch. The company has surpassed $500 million in ARR, sources told Bloomberg, a 60% increase from the $300 million we reported in mid-April. Cursor offers developers tiered pricing. After a two-week free trial, the company converts users into paying customers, who can opt for either a $20 Pro offering or a $40 monthly business subscription. Until recently, the majority of the company’s revenue came from individual user subscriptions, Bloomberg reported. However, Anysphere is now offering enterprise licenses, allowing companies to purchase the application for their teams at a higher price point. Earlier this year, the company was approached by OpenAI and other potential buyers, but Anysphere turned down those offers. The ChatGPT maker bought Windsurf, another fast-growing AI assistant, reportedly for $3 billion.
    Like
    Love
    Wow
    Angry
    Sad
    223
    · 0 التعليقات ·0 المشاركات ·0 معاينة
  • Thrustmaster T598 + Hypercar Wheel review: a great value PC/PS5 sim racing wheel and pedals built on novel tech

    Thrustmaster T598 + Hypercar Wheel review: a great value PC/PS5 sim racing wheel and pedals built on novel tech
    Direct axial drive impresses, despite limited software and a firmly mid stock wheel.

    Image credit: Digital Foundry

    Review

    by Will Judd
    Deputy Editor, Digital Foundry

    Published on June 1, 2025

    We've seen an explosion in the number of affordable direct driveracing wheels over the past couple of years, with Fanatec and Moza offering increasingly inexpensive options that still deliver the precise, quick and long-lasting force feedback that cheaper gear- or belt-driven wheels can't match.
    Now, Thrustmaster is intruding on that territory with the T598, a PlayStation/PC direct drive wheel, wheel base and pedals that costs just £449/That's on a similar level to the PC-only £459/Moza R5 bundle and the €399/Fanatec CSL DD bundle, so how does the newcomer compare? And what's changed from the more expensive T818 we reviewed before?
    We've been testing the T598 - and the fancy upgraded HyperCar wheel that's available as an upgrade option - for weeks to find out. Our full review follows, so read on - or check out the quick links below to jump to what you're most interested in.

    To see this content please enable targeting cookies.

    Thrustmaster T598 wheel base review: direct axial drive vs traditional direct drive
    Interestingly, the T598 arguably comes with a more advanced DD motor than the more expensive T818 does. It uses a "direct axial drive" versus the standard "direct radial drive", where the magnets are aligned parallel to the wheel shaft rather than perpendicular. This ought to allow for more efficient torque generation, producing less waste heat, minimising precision-sapping magnetic interference and requiring less copper to produce. It also means the T598 can "overshoot" to deliver more than its rated 5nm of constant torque for short periods.
    However, this design also requires a physically taller yet slimmer enclosure, potentially blocking the view forward and requiring a different bolt pattern to attach the base to your desk or sim racing cockpit - both of which are slight annoyances with the T598.Interestingly, you can also feel a slight vibration and hear a quiet crackling noise emanating from the T598 base while idle - something I haven't heard or felt with other direct drive motors and is reportedly inherent to this design.

    There's a lot going on inside this wheel base - including some genuine innovation. | Image credit: Thrustmaster/Digital Foundry

    Thrustmaster has written a pair of white papers to explain why their take on direct driveis better than what came before. Image credit: Thrustmaster

    In terms of the force feedback itself, Thrustmaster have achieved something quite special here. In some titles with a good force feedback implementation - Assetto Corsa, Assetto Corsa Evo and F1 23 stood out to me here - the wheel feels great, with strong force feedback and plenty of detail. If you run up on a kerb or start to lose traction, you know about it right away and can take corrective action. I also appreciated the way that turning the wheel feels perfectly smooth when turning, without any cogging - the slightly jerky sensation common to low-end and mid-range direct drive motors that corresponds to slight attraction as you pass each magnet.
    However, balancing this, the wheel's force feedback feels a little less consistent than others I've tested from the likes of Fanatec or Moza at a similar price point, with some games like Project Cars 3 and Forza Motorsport feeling almost bereft of force feedback by comparison. You also have that slight vibration when the wheel is stationary, which is potentially more noticeable than the cogging sensation in traditional DD designs. The overshoot is also a mixed bag - as the sudden jump in torque can feel a little artificial in some scenarios, eg when you're warming your tyres by weaving in F1 before a safety car restart.
    I'd say that these positives and negatives largely cancel each other out, and you're left with force feedback that is good, way better than non-DD wheels, but not noticeably better than more common radial direct drive designs. Depending on the games you play, either DD style could be preferable. It'll be interesting to see if Thrustmaster are able to tune out some of these negative characteristics through firmware updates - or simply in later products using the same technology.

    1 of 7

    Caption

    Attribution

    Here's how the T598 looks IRL - from the wheel base itself to the default rim, the upgraded Hypercar wheel and the included dual pedals. Click to enlarge.

    Apart from the novel motor, the rest of the wheel base is fairly standard - there's a smalldisplay on the top for adjusting your settings and seeing in-game info like a rev counter, four large circular buttons buttons, the usual Thrustmaster quick release lock for securing your wheel rim and a small button on the back to turn the wheel base on and off. There are connection options for power, USB and connecting other components like pedals or shifters on the back too.
    Weirdly, there's no ability to change settings in the PC Thrustmaster Panel app - it just says this functionality is "coming soon!" - so right now you can only use it for updating firmware, testing buttons and changing between profiles.

    "Coming soon!" starts to become a little less believable six months after the first reviews hit. | Image credit: Digital Foundry

    Instead, you'll be using the built-in screen for making changes, which works well enough but doesn't provide any allowance for extra information - so you'll be sticking to the four basic pre-made profiles, referring to the manual or checking suggested setups online rather than reading built-in tool tips.
    You still get access to the full whack of settings here, and of course this works well for PS5/PS4 users who wouldn't expect a software experience anyway, but PC users may be disappointed to learn that there's no intuitive software interface here. I found the Boosted Media YT review of the wheelbase to offer some good insight into what settings you're likely to want to change from their default values.
    Thrustmaster T598 Sportcar wheel review: a workable default option

    The Sportcar wheel rim looks good - but a plastic construction and relatively spartan controls make it "OK" at best.

    The "Sportcar" wheel provided in the bundle is a little less impressive-looking than the base itself, with a plasticky feel throughout and fairly mushy buttons - though the paddles are snappy enough and feel good to use. The usual PS-style face buttons are split into two clumps up top with L2 and R2, which is a bit odd, with four individual directional buttons in the lower left, start/select/PS in the lower middle and four configuration buttons in the lower right.
    Those configuration buttons require extra explanation, so here we go: the P button at the top swaps between four different pages, indicated with a different colour LED, allowing the remaining three physical buttons to activate up to 12 different functions.There are no rotary encoders or other additional controls here, so PC players that prefer more complicated racing sims may feel a bit underserved by this clunky, cost-saving solution.
    The 815g wheel is at least sized reasonably, with 300mm circular shape that particularly suits drifting, rally and trucking - though all forms of driving and racing are of course possible. The rubber grips under your hands are reasonably comfortable, but you can still feel seams in various places. Overall, the wheel is possibly the weakest part of the package, but perfectly usable and acceptable for the price point.
    Thrustmaster Hypercar Wheel Add-On review: true luxury

    An incredible wheel with premium materials, excellent controls and a more specialised shape.

    Thrustmaster also sent over the £339/Hypercar wheel rim for testing, which is an upgrade option that uses significantly better materials - leather, alcantara, aluminium and carbon - and offers a huge number of extra controls. Its oval shape feels a bit more responsive for faster vehiclesthat require a quick change of direction, but drifting and rally doesn't feel natural. It supports the same PS4, PS5 and PC platforms as the stock option, but there are no legends printed on the buttons to help you.
    The difference in quality here is immediately apparent, with much better tactile feedback from the buttons and a huge number of additional controls for adjusting stuff like ERS deployment or brake bias. Each control feels well-placed, even if the T-shaped layout for the face buttons is slightly unnatural at first, and the paddles for shifting and the clutch are particularly well engineered. I also found holding the wheel a bit more comfortable thanks to that flattened out shape, the more premium materials and the absence of bumps or seams anywhere you're likely to hold.
    It's a huge upgrade in terms of feel and features then, as you'd hope for a wheel rim that costs nearly as much as the entire T598 kit and caboodle. As an upgrade option, I do rate it, though it perhaps makes slightly more sense for T818 owners that have already invested a bit more in the Thrusmaster ecosystem. Regardless, it was this rim that I used for the majority of my time with the T598, and the wheel base feels significantly better with the upgrade.
    Thrustmaster T598 Raceline pedals review: great feedback, but no clutch and no load cell upgrade offered at present

    Surprisingly good for two add-in pedals, in terms of feedback and flexibility.

    The pedals that come with the T598 are surprisingly good, with an accelerator, a brake pedaland no clutch pedal. Each pedal's spring assembly can be pushed into one of three positions to change the amount of pre-load - ie make it a bit softer or harder to press and the pedal plates can be shifted up and down. The narrow dimensions of the metal wheel plate meant that it was impossible to mount directly in the centre of the Playseat Trophy I used for testing, but the slightly off-centre installation I ended up with still worked just fine. They connect using a non-USB connection, so you can't use the pedals with other wheel bases.
    Using the middle distance setting and the firmer of the two springs for the brake, I found the T598 produced good results, on par or perhaps even a tad better than other metal-construction Hall Effect position sensorpedals I've tested such as the Moza SR-P Lite and Fanatec CSL. Braking is the critical point here, as you want to be able to feel when the brake has mechanically reached its threshold and then modulate your inputs from there, and the T598 pedals do allow for this quite easily. They're also not so hard to actuate that you end up having to hard-mount them to a sim rig for good results, and the included carpet spikes are reasonably effective in keeping the pedals in place.
    Presumably, it ought to be possible to add on a load cell brake pedal down the line to upgrade to a properthree pedal setup. For the F1 style driving that I prefer, the clutch pedal isn't used anyway, so it wasn't a massive issue for me - and we frequently see companies like Moza and Fanatec drop the clutch pedal on these aggessively priced bundles so Thrustmaster aren't losing ground by following suit.
    Thrustmaster T598 final verdict: a competitive £450 package with potential

    For PlayStation owners, this is an incredible value pickup that ranks among the cheapest DD options - and PC owners ought to consider it too.

    For £449/the Thrustmaster T598 is an excellent value direct drive wheel and pedal bundle for PlayStation and PC with some relatively minor quirks. The wheel base is powerful, detailed and responsive in most games, with some advantages over traditional DD designs but also some disadvantages - notably the taller shape and a slight hum/vibration while stationary. Traditional DD designs from the likes of Fanatec and Moza can offer more reliable force feedback that works over a wider range of games, cars and tracks, while also benefitting from better PC software, but there's certainly potential for Thrustmaster to improve here.
    The included wheel feels a bit cheap, with a predominantly plastic design, spongey buttons and a slightly odd layout, but the full circle shape and full PS5/PS4 compatibility is most welcome. Upgrading to the HyperCar wheel provides a huge uptick in materials, tactile feedback and number of controls, though this does come at a fairly steep price of £339/If you plan to use the T598 for years and have the budget for it, this is a super upgrade to aim for.
    The included Raceline LTE pedals are the most surprising element for me. These consist of only an accelerator and a brake with only moderate adjustability and a narrow base plate, but they feel great to use, are made from durable metal with HE sensors, and only really lose out to significantly more expensive load cell options. For an add-in for a relatively cheap DD bundle, they're a solid inclusion, and I hope Thrustmaster release a load cell brake pedal for users to upgrade to a better three-pedal setup later.
    Overall, it's an competitive first outing for Thrustmaster with the T598 and direct axial drive, and I'm curious to see where the company - and the tech - goes from here. With Fanatec still on the rebuild after being acquired by Corsair and Moza's offerings being hard to order online in some regions, Thrustmaster has a golden opportunity to seize a share of the mid-range and entry-level sim racing market, and the T598 is a positive start.
    #thrustmaster #t598 #hypercar #wheel #review
    Thrustmaster T598 + Hypercar Wheel review: a great value PC/PS5 sim racing wheel and pedals built on novel tech
    Thrustmaster T598 + Hypercar Wheel review: a great value PC/PS5 sim racing wheel and pedals built on novel tech Direct axial drive impresses, despite limited software and a firmly mid stock wheel. Image credit: Digital Foundry Review by Will Judd Deputy Editor, Digital Foundry Published on June 1, 2025 We've seen an explosion in the number of affordable direct driveracing wheels over the past couple of years, with Fanatec and Moza offering increasingly inexpensive options that still deliver the precise, quick and long-lasting force feedback that cheaper gear- or belt-driven wheels can't match. Now, Thrustmaster is intruding on that territory with the T598, a PlayStation/PC direct drive wheel, wheel base and pedals that costs just £449/That's on a similar level to the PC-only £459/Moza R5 bundle and the €399/Fanatec CSL DD bundle, so how does the newcomer compare? And what's changed from the more expensive T818 we reviewed before? We've been testing the T598 - and the fancy upgraded HyperCar wheel that's available as an upgrade option - for weeks to find out. Our full review follows, so read on - or check out the quick links below to jump to what you're most interested in. To see this content please enable targeting cookies. Thrustmaster T598 wheel base review: direct axial drive vs traditional direct drive Interestingly, the T598 arguably comes with a more advanced DD motor than the more expensive T818 does. It uses a "direct axial drive" versus the standard "direct radial drive", where the magnets are aligned parallel to the wheel shaft rather than perpendicular. This ought to allow for more efficient torque generation, producing less waste heat, minimising precision-sapping magnetic interference and requiring less copper to produce. It also means the T598 can "overshoot" to deliver more than its rated 5nm of constant torque for short periods. However, this design also requires a physically taller yet slimmer enclosure, potentially blocking the view forward and requiring a different bolt pattern to attach the base to your desk or sim racing cockpit - both of which are slight annoyances with the T598.Interestingly, you can also feel a slight vibration and hear a quiet crackling noise emanating from the T598 base while idle - something I haven't heard or felt with other direct drive motors and is reportedly inherent to this design. There's a lot going on inside this wheel base - including some genuine innovation. | Image credit: Thrustmaster/Digital Foundry Thrustmaster has written a pair of white papers to explain why their take on direct driveis better than what came before. Image credit: Thrustmaster In terms of the force feedback itself, Thrustmaster have achieved something quite special here. In some titles with a good force feedback implementation - Assetto Corsa, Assetto Corsa Evo and F1 23 stood out to me here - the wheel feels great, with strong force feedback and plenty of detail. If you run up on a kerb or start to lose traction, you know about it right away and can take corrective action. I also appreciated the way that turning the wheel feels perfectly smooth when turning, without any cogging - the slightly jerky sensation common to low-end and mid-range direct drive motors that corresponds to slight attraction as you pass each magnet. However, balancing this, the wheel's force feedback feels a little less consistent than others I've tested from the likes of Fanatec or Moza at a similar price point, with some games like Project Cars 3 and Forza Motorsport feeling almost bereft of force feedback by comparison. You also have that slight vibration when the wheel is stationary, which is potentially more noticeable than the cogging sensation in traditional DD designs. The overshoot is also a mixed bag - as the sudden jump in torque can feel a little artificial in some scenarios, eg when you're warming your tyres by weaving in F1 before a safety car restart. I'd say that these positives and negatives largely cancel each other out, and you're left with force feedback that is good, way better than non-DD wheels, but not noticeably better than more common radial direct drive designs. Depending on the games you play, either DD style could be preferable. It'll be interesting to see if Thrustmaster are able to tune out some of these negative characteristics through firmware updates - or simply in later products using the same technology. 1 of 7 Caption Attribution Here's how the T598 looks IRL - from the wheel base itself to the default rim, the upgraded Hypercar wheel and the included dual pedals. Click to enlarge. Apart from the novel motor, the rest of the wheel base is fairly standard - there's a smalldisplay on the top for adjusting your settings and seeing in-game info like a rev counter, four large circular buttons buttons, the usual Thrustmaster quick release lock for securing your wheel rim and a small button on the back to turn the wheel base on and off. There are connection options for power, USB and connecting other components like pedals or shifters on the back too. Weirdly, there's no ability to change settings in the PC Thrustmaster Panel app - it just says this functionality is "coming soon!" - so right now you can only use it for updating firmware, testing buttons and changing between profiles. "Coming soon!" starts to become a little less believable six months after the first reviews hit. | Image credit: Digital Foundry Instead, you'll be using the built-in screen for making changes, which works well enough but doesn't provide any allowance for extra information - so you'll be sticking to the four basic pre-made profiles, referring to the manual or checking suggested setups online rather than reading built-in tool tips. You still get access to the full whack of settings here, and of course this works well for PS5/PS4 users who wouldn't expect a software experience anyway, but PC users may be disappointed to learn that there's no intuitive software interface here. I found the Boosted Media YT review of the wheelbase to offer some good insight into what settings you're likely to want to change from their default values. Thrustmaster T598 Sportcar wheel review: a workable default option The Sportcar wheel rim looks good - but a plastic construction and relatively spartan controls make it "OK" at best. The "Sportcar" wheel provided in the bundle is a little less impressive-looking than the base itself, with a plasticky feel throughout and fairly mushy buttons - though the paddles are snappy enough and feel good to use. The usual PS-style face buttons are split into two clumps up top with L2 and R2, which is a bit odd, with four individual directional buttons in the lower left, start/select/PS in the lower middle and four configuration buttons in the lower right. Those configuration buttons require extra explanation, so here we go: the P button at the top swaps between four different pages, indicated with a different colour LED, allowing the remaining three physical buttons to activate up to 12 different functions.There are no rotary encoders or other additional controls here, so PC players that prefer more complicated racing sims may feel a bit underserved by this clunky, cost-saving solution. The 815g wheel is at least sized reasonably, with 300mm circular shape that particularly suits drifting, rally and trucking - though all forms of driving and racing are of course possible. The rubber grips under your hands are reasonably comfortable, but you can still feel seams in various places. Overall, the wheel is possibly the weakest part of the package, but perfectly usable and acceptable for the price point. Thrustmaster Hypercar Wheel Add-On review: true luxury An incredible wheel with premium materials, excellent controls and a more specialised shape. Thrustmaster also sent over the £339/Hypercar wheel rim for testing, which is an upgrade option that uses significantly better materials - leather, alcantara, aluminium and carbon - and offers a huge number of extra controls. Its oval shape feels a bit more responsive for faster vehiclesthat require a quick change of direction, but drifting and rally doesn't feel natural. It supports the same PS4, PS5 and PC platforms as the stock option, but there are no legends printed on the buttons to help you. The difference in quality here is immediately apparent, with much better tactile feedback from the buttons and a huge number of additional controls for adjusting stuff like ERS deployment or brake bias. Each control feels well-placed, even if the T-shaped layout for the face buttons is slightly unnatural at first, and the paddles for shifting and the clutch are particularly well engineered. I also found holding the wheel a bit more comfortable thanks to that flattened out shape, the more premium materials and the absence of bumps or seams anywhere you're likely to hold. It's a huge upgrade in terms of feel and features then, as you'd hope for a wheel rim that costs nearly as much as the entire T598 kit and caboodle. As an upgrade option, I do rate it, though it perhaps makes slightly more sense for T818 owners that have already invested a bit more in the Thrusmaster ecosystem. Regardless, it was this rim that I used for the majority of my time with the T598, and the wheel base feels significantly better with the upgrade. Thrustmaster T598 Raceline pedals review: great feedback, but no clutch and no load cell upgrade offered at present Surprisingly good for two add-in pedals, in terms of feedback and flexibility. The pedals that come with the T598 are surprisingly good, with an accelerator, a brake pedaland no clutch pedal. Each pedal's spring assembly can be pushed into one of three positions to change the amount of pre-load - ie make it a bit softer or harder to press and the pedal plates can be shifted up and down. The narrow dimensions of the metal wheel plate meant that it was impossible to mount directly in the centre of the Playseat Trophy I used for testing, but the slightly off-centre installation I ended up with still worked just fine. They connect using a non-USB connection, so you can't use the pedals with other wheel bases. Using the middle distance setting and the firmer of the two springs for the brake, I found the T598 produced good results, on par or perhaps even a tad better than other metal-construction Hall Effect position sensorpedals I've tested such as the Moza SR-P Lite and Fanatec CSL. Braking is the critical point here, as you want to be able to feel when the brake has mechanically reached its threshold and then modulate your inputs from there, and the T598 pedals do allow for this quite easily. They're also not so hard to actuate that you end up having to hard-mount them to a sim rig for good results, and the included carpet spikes are reasonably effective in keeping the pedals in place. Presumably, it ought to be possible to add on a load cell brake pedal down the line to upgrade to a properthree pedal setup. For the F1 style driving that I prefer, the clutch pedal isn't used anyway, so it wasn't a massive issue for me - and we frequently see companies like Moza and Fanatec drop the clutch pedal on these aggessively priced bundles so Thrustmaster aren't losing ground by following suit. Thrustmaster T598 final verdict: a competitive £450 package with potential For PlayStation owners, this is an incredible value pickup that ranks among the cheapest DD options - and PC owners ought to consider it too. For £449/the Thrustmaster T598 is an excellent value direct drive wheel and pedal bundle for PlayStation and PC with some relatively minor quirks. The wheel base is powerful, detailed and responsive in most games, with some advantages over traditional DD designs but also some disadvantages - notably the taller shape and a slight hum/vibration while stationary. Traditional DD designs from the likes of Fanatec and Moza can offer more reliable force feedback that works over a wider range of games, cars and tracks, while also benefitting from better PC software, but there's certainly potential for Thrustmaster to improve here. The included wheel feels a bit cheap, with a predominantly plastic design, spongey buttons and a slightly odd layout, but the full circle shape and full PS5/PS4 compatibility is most welcome. Upgrading to the HyperCar wheel provides a huge uptick in materials, tactile feedback and number of controls, though this does come at a fairly steep price of £339/If you plan to use the T598 for years and have the budget for it, this is a super upgrade to aim for. The included Raceline LTE pedals are the most surprising element for me. These consist of only an accelerator and a brake with only moderate adjustability and a narrow base plate, but they feel great to use, are made from durable metal with HE sensors, and only really lose out to significantly more expensive load cell options. For an add-in for a relatively cheap DD bundle, they're a solid inclusion, and I hope Thrustmaster release a load cell brake pedal for users to upgrade to a better three-pedal setup later. Overall, it's an competitive first outing for Thrustmaster with the T598 and direct axial drive, and I'm curious to see where the company - and the tech - goes from here. With Fanatec still on the rebuild after being acquired by Corsair and Moza's offerings being hard to order online in some regions, Thrustmaster has a golden opportunity to seize a share of the mid-range and entry-level sim racing market, and the T598 is a positive start. #thrustmaster #t598 #hypercar #wheel #review
    Thrustmaster T598 + Hypercar Wheel review: a great value PC/PS5 sim racing wheel and pedals built on novel tech
    www.eurogamer.net
    Thrustmaster T598 + Hypercar Wheel review: a great value PC/PS5 sim racing wheel and pedals built on novel tech Direct axial drive impresses, despite limited software and a firmly mid stock wheel. Image credit: Digital Foundry Review by Will Judd Deputy Editor, Digital Foundry Published on June 1, 2025 We've seen an explosion in the number of affordable direct drive (DD) racing wheels over the past couple of years, with Fanatec and Moza offering increasingly inexpensive options that still deliver the precise, quick and long-lasting force feedback that cheaper gear- or belt-driven wheels can't match. Now, Thrustmaster is intruding on that territory with the T598, a PlayStation/PC direct drive wheel, wheel base and pedals that costs just £449/$499. That's on a similar level to the PC-only £459/$599 Moza R5 bundle and the €399/$569 Fanatec CSL DD bundle, so how does the newcomer compare? And what's changed from the more expensive T818 we reviewed before? We've been testing the T598 - and the fancy upgraded HyperCar wheel that's available as an upgrade option - for weeks to find out. Our full review follows, so read on - or check out the quick links below to jump to what you're most interested in. To see this content please enable targeting cookies. Thrustmaster T598 wheel base review: direct axial drive vs traditional direct drive Interestingly, the T598 arguably comes with a more advanced DD motor than the more expensive T818 does. It uses a "direct axial drive" versus the standard "direct radial drive", where the magnets are aligned parallel to the wheel shaft rather than perpendicular (see the diagram below). This ought to allow for more efficient torque generation, producing less waste heat, minimising precision-sapping magnetic interference and requiring less copper to produce. It also means the T598 can "overshoot" to deliver more than its rated 5nm of constant torque for short periods. However, this design also requires a physically taller yet slimmer enclosure (measuring 210x210x120mm), potentially blocking the view forward and requiring a different bolt pattern to attach the base to your desk or sim racing cockpit - both of which are slight annoyances with the T598. (You do get an angle bracket to allow for wider and potentially more compatible holes for your cockpit... but this makes the tall wheel base even taller. Table clamps are also included.) Interestingly, you can also feel a slight vibration and hear a quiet crackling noise emanating from the T598 base while idle - something I haven't heard or felt with other direct drive motors and is reportedly inherent to this design. There's a lot going on inside this wheel base - including some genuine innovation. | Image credit: Thrustmaster/Digital Foundry Thrustmaster has written a pair of white papers to explain why their take on direct drive ("axial flux") is better than what came before ("radial flux"). Image credit: Thrustmaster In terms of the force feedback itself, Thrustmaster have achieved something quite special here. In some titles with a good force feedback implementation - Assetto Corsa, Assetto Corsa Evo and F1 23 stood out to me here - the wheel feels great, with strong force feedback and plenty of detail. If you run up on a kerb or start to lose traction, you know about it right away and can take corrective action. I also appreciated the way that turning the wheel feels perfectly smooth when turning, without any cogging - the slightly jerky sensation common to low-end and mid-range direct drive motors that corresponds to slight attraction as you pass each magnet. However, balancing this, the wheel's force feedback feels a little less consistent than others I've tested from the likes of Fanatec or Moza at a similar price point, with some games like Project Cars 3 and Forza Motorsport feeling almost bereft of force feedback by comparison. You also have that slight vibration when the wheel is stationary, which is potentially more noticeable than the cogging sensation in traditional DD designs. The overshoot is also a mixed bag - as the sudden jump in torque can feel a little artificial in some scenarios, eg when you're warming your tyres by weaving in F1 before a safety car restart. I'd say that these positives and negatives largely cancel each other out, and you're left with force feedback that is good, way better than non-DD wheels, but not noticeably better than more common radial direct drive designs. Depending on the games you play, either DD style could be preferable. It'll be interesting to see if Thrustmaster are able to tune out some of these negative characteristics through firmware updates - or simply in later products using the same technology. 1 of 7 Caption Attribution Here's how the T598 looks IRL - from the wheel base itself to the default rim, the upgraded Hypercar wheel and the included dual pedals. Click to enlarge. Apart from the novel motor, the rest of the wheel base is fairly standard - there's a small (colour!) display on the top for adjusting your settings and seeing in-game info like a rev counter, four large circular buttons buttons (L3, R3, Mode and Settings), the usual Thrustmaster quick release lock for securing your wheel rim and a small button on the back to turn the wheel base on and off. There are connection options for power, USB and connecting other components like pedals or shifters on the back too. Weirdly, there's no ability to change settings in the PC Thrustmaster Panel app - it just says this functionality is "coming soon!" - so right now you can only use it for updating firmware, testing buttons and changing between profiles. "Coming soon!" starts to become a little less believable six months after the first reviews hit. | Image credit: Digital Foundry Instead, you'll be using the built-in screen for making changes, which works well enough but doesn't provide any allowance for extra information - so you'll be sticking to the four basic pre-made profiles, referring to the manual or checking suggested setups online rather than reading built-in tool tips. You still get access to the full whack of settings here, and of course this works well for PS5/PS4 users who wouldn't expect a software experience anyway, but PC users may be disappointed to learn that there's no intuitive software interface here. I found the Boosted Media YT review of the wheelbase to offer some good insight into what settings you're likely to want to change from their default values. Thrustmaster T598 Sportcar wheel review: a workable default option The Sportcar wheel rim looks good - but a plastic construction and relatively spartan controls make it "OK" at best. The "Sportcar" wheel provided in the bundle is a little less impressive-looking than the base itself, with a plasticky feel throughout and fairly mushy buttons - though the paddles are snappy enough and feel good to use. The usual PS-style face buttons are split into two clumps up top with L2 and R2, which is a bit odd, with four individual directional buttons in the lower left, start/select/PS in the lower middle and four configuration buttons in the lower right. Those configuration buttons require extra explanation, so here we go: the P button at the top swaps between four different pages, indicated with a different colour LED, allowing the remaining three physical buttons to activate up to 12 different functions. (The Fanatec GT DD Pro, by contrast, has dedicated five-way controls for each of its four functions. This costs more to produce, but allows you to use the controls without looking down to see what coloured light is active.) There are no rotary encoders or other additional controls here, so PC players that prefer more complicated racing sims may feel a bit underserved by this clunky, cost-saving solution. The 815g wheel is at least sized reasonably, with 300mm circular shape that particularly suits drifting, rally and trucking - though all forms of driving and racing are of course possible. The rubber grips under your hands are reasonably comfortable, but you can still feel seams in various places. Overall, the wheel is possibly the weakest part of the package, but perfectly usable and acceptable for the price point. Thrustmaster Hypercar Wheel Add-On review: true luxury An incredible wheel with premium materials, excellent controls and a more specialised shape. Thrustmaster also sent over the £339/$350 Hypercar wheel rim for testing, which is an upgrade option that uses significantly better materials - leather, alcantara, aluminium and carbon - and offers a huge number of extra controls (25 buttons, including four rotary encoders and two pairs of analogue paddles). Its oval shape feels a bit more responsive for faster vehicles (like F1 cars) that require a quick change of direction, but drifting and rally doesn't feel natural. It supports the same PS4, PS5 and PC platforms as the stock option, but there are no legends printed on the buttons to help you. The difference in quality here is immediately apparent, with much better tactile feedback from the buttons and a huge number of additional controls for adjusting stuff like ERS deployment or brake bias. Each control feels well-placed, even if the T-shaped layout for the face buttons is slightly unnatural at first, and the paddles for shifting and the clutch are particularly well engineered. I also found holding the wheel a bit more comfortable thanks to that flattened out shape, the more premium materials and the absence of bumps or seams anywhere you're likely to hold. It's a huge upgrade in terms of feel and features then, as you'd hope for a wheel rim that costs nearly as much as the entire T598 kit and caboodle. As an upgrade option, I do rate it, though it perhaps makes slightly more sense for T818 owners that have already invested a bit more in the Thrusmaster ecosystem. Regardless, it was this rim that I used for the majority of my time with the T598, and the wheel base feels significantly better with the upgrade. Thrustmaster T598 Raceline pedals review: great feedback, but no clutch and no load cell upgrade offered at present Surprisingly good for two add-in pedals, in terms of feedback and flexibility. The pedals that come with the T598 are surprisingly good, with an accelerator, a brake pedal (with a choice of two different spring options) and no clutch pedal. Each pedal's spring assembly can be pushed into one of three positions to change the amount of pre-load - ie make it a bit softer or harder to press and the pedal plates can be shifted up and down. The narrow dimensions of the metal wheel plate meant that it was impossible to mount directly in the centre of the Playseat Trophy I used for testing, but the slightly off-centre installation I ended up with still worked just fine. They connect using a non-USB connection, so you can't use the pedals with other wheel bases. Using the middle distance setting and the firmer of the two springs for the brake, I found the T598 produced good results, on par or perhaps even a tad better than other metal-construction Hall Effect position sensor (ie non-load cell) pedals I've tested such as the Moza SR-P Lite and Fanatec CSL. Braking is the critical point here, as you want to be able to feel when the brake has mechanically reached its threshold and then modulate your inputs from there, and the T598 pedals do allow for this quite easily. They're also not so hard to actuate that you end up having to hard-mount them to a sim rig for good results, and the included carpet spikes are reasonably effective in keeping the pedals in place. Presumably, it ought to be possible to add on a load cell brake pedal down the line to upgrade to a proper (if slightly cramped) three pedal setup. For the F1 style driving that I prefer, the clutch pedal isn't used anyway, so it wasn't a massive issue for me - and we frequently see companies like Moza and Fanatec drop the clutch pedal on these aggessively priced bundles so Thrustmaster aren't losing ground by following suit. Thrustmaster T598 final verdict: a competitive £450 package with potential For PlayStation owners, this is an incredible value pickup that ranks among the cheapest DD options - and PC owners ought to consider it too. For £449/$499, the Thrustmaster T598 is an excellent value direct drive wheel and pedal bundle for PlayStation and PC with some relatively minor quirks. The wheel base is powerful, detailed and responsive in most games, with some advantages over traditional DD designs but also some disadvantages - notably the taller shape and a slight hum/vibration while stationary. Traditional DD designs from the likes of Fanatec and Moza can offer more reliable force feedback that works over a wider range of games, cars and tracks, while also benefitting from better PC software, but there's certainly potential for Thrustmaster to improve here. The included wheel feels a bit cheap, with a predominantly plastic design, spongey buttons and a slightly odd layout, but the full circle shape and full PS5/PS4 compatibility is most welcome. Upgrading to the HyperCar wheel provides a huge uptick in materials, tactile feedback and number of controls, though this does come at a fairly steep price of £339/$350. If you plan to use the T598 for years and have the budget for it, this is a super upgrade to aim for. The included Raceline LTE pedals are the most surprising element for me. These consist of only an accelerator and a brake with only moderate adjustability and a narrow base plate, but they feel great to use, are made from durable metal with HE sensors, and only really lose out to significantly more expensive load cell options. For an add-in for a relatively cheap DD bundle, they're a solid inclusion, and I hope Thrustmaster release a load cell brake pedal for users to upgrade to a better three-pedal setup later. Overall, it's an competitive first outing for Thrustmaster with the T598 and direct axial drive, and I'm curious to see where the company - and the tech - goes from here. With Fanatec still on the rebuild after being acquired by Corsair and Moza's offerings being hard to order online in some regions, Thrustmaster has a golden opportunity to seize a share of the mid-range and entry-level sim racing market, and the T598 is a positive start.
    0 التعليقات ·0 المشاركات ·0 معاينة
  • The Legal Accountability of AI-Generated Deepfakes in Election Misinformation

    How Deepfakes Are Created

    Generative AI models enable the creation of highly realistic fake media. Most deepfakes today are produced by training deep neural networks on real images, video or audio of a target person. The two predominant AI architectures are generative adversarial networksand autoencoders. A GAN consists of a generator network that produces synthetic images and a discriminator network that tries to distinguish fakes from real data. Through iterative training, the generator learns to produce outputs that increasingly fool the discriminator¹. Autoencoder-based tools similarly learn to encode a target face and then decode it onto a source video. In practice, deepfake creators use accessible software: open-source tools like DeepFaceLab and FaceSwap dominate video face-swapping². Voice-cloning toolscan mimic a person’s speech from minutes of audio. Commercial platforms like Synthesia allow text-to-video avatars, which have already been misused in disinformation campaigns³. Even mobile appslet users do basic face swaps in minutes⁴. In short, advances in GANs and related models make deepfakes cheaper and easier to generate than ever.

    Diagram of a generative adversarial network: A generator network creates fake images from random input and a discriminator network distinguishes fakes from real examples. Over time the generator improves until its outputs “fool” the discriminator⁵

    During creation, a deepfake algorithm is typically trained on a large dataset of real images or audio from the target. The more varied and high-quality the training data, the more realistic the deepfake. The output often then undergoes post-processingto enhance believability¹. Technical defenses focus on two fronts: detection and authentication. Detection uses AI models to spot inconsistenciesthat betray a synthetic origin⁵. Authentication embeds markers before dissemination – for example, invisible watermarks or cryptographically signed metadata indicating authenticity⁶. The EU AI Act will soon mandate that major AI content providers embed machine-readable “watermark” signals in synthetic media⁷. However, as GAO notes, detection is an arms race – even a marked deepfake can sometimes evade notice – and labels alone don’t stop false narratives from spreading⁸⁹.

    Deepfakes in Recent Elections: Examples

    Deepfakes and AI-generated imagery already have made headlines in election cycles around the world. In the 2024 U.S. primary season, a digitally-altered audio robocall mimicked President Biden’s voice urging Democrats not to vote in the New Hampshire primary. The callerwas later fined million by the FCC and indicted under existing telemarketing laws¹⁰¹¹.Also in 2024, former President Trump posted on social media a collage implying that pop singer Taylor Swift endorsed his campaign, using AI-generated images of Swift in “Swifties for Trump” shirts¹². The posts sparked media uproar, though analysts noted the same effect could have been achieved without AI¹². Similarly, Elon Musk’s X platform carried AI-generated clips, including a parody “Ad” depicting Vice-President Harris’s voice via an AI clone¹³.

    Beyond the U.S., deepfake-like content has appeared globally. In Indonesia’s 2024 presidential election, a video surfaced on social media in which a convincingly generated image of the late President Suharto appeared to endorse the candidate of the Golkar Party. Days later, the endorsed candidatewon the presidency¹⁴. In Bangladesh, a viral deepfake video superimposed the face of opposition leader Rumeen Farhana onto a bikini-clad body – an incendiary fabrication designed to discredit her in the conservative Muslim-majority society¹⁵. Moldova’s pro-Western President Maia Sandu has been repeatedly targeted by AI-driven disinformation; one deepfake video falsely showed her resigning and endorsing a Russian-friendly party, apparently to sow distrust in the electoral process¹⁶. Even in Taiwan, a TikTok clip circulated that synthetically portrayed a U.S. politician making foreign-policy statements – stoking confusion ahead of Taiwanese elections¹⁷. In Slovakia’s recent campaign, AI-generated audio mimicking the liberal party leader suggested he plotted vote-rigging and beer-price hikes – instantly spreading on social media just days before the election¹⁸. These examples show that deepfakes have touched diverse polities, often aiming to undermine candidates or confuse voters¹⁵¹⁸.

    Notably, many of the most viral “deepfakes” in 2024 were actually circulated as obvious memes or claims, rather than subtle deceptions. Experts observed that outright undetectable AI deepfakes were relatively rare; more common were AI-generated memes plainly shared by partisans, or cheaply doctored “cheapfakes” made with basic editing tools¹³¹⁹. For instance, social media was awash with memes of Kamala Harris in Soviet garb or of Black Americans holding Trump signs¹³, but these were typically used satirically, not meant to be secretly believed. Nonetheless, even unsophisticated fakes can sway opinion: a U.S. study found that false presidential adsdid change voter attitudes in swing states. In sum, deepfakes are a real and growing phenomenon in election campaigns²⁰²¹ worldwide – a trend taken seriously by voters and regulators alike.

    U.S. Legal Framework and Accountability

    In the U.S., deepfake creators and distributors of election misinformation face a patchwork of tools, but no single comprehensive federal “deepfake law.” Existing laws relevant to disinformation include statutes against impersonating government officials, electioneering, and targeted statutes like criminal electioneering communications. In some cases ordinary laws have been stretched: the NH robocall used the Telephone Consumer Protection Act and mail/telemarketing fraud provisions, resulting in the M fine and a criminal charge. Similarly, voice impostors can potentially violate laws against “false advertising” or “unlawful corporate communications.” However, these laws were enacted before AI, and litigators have warned they often do not fit neatly. For example, deceptive deepfake claims not tied to a specific victim do not easily fit into defamation or privacy torts. Voter intimidation lawsalso leave a gap for non-threatening falsehoods about voting logistics or endorsements.

    Recognizing these gaps, some courts and agencies are invoking other theories. The U.S. Department of Justice has recently charged individuals under broad fraud statutes, and state attorneys general have considered deepfake misinformation as interference with voting rights. Notably, the Federal Election Commissionis preparing to enforce new rules: in April 2024 it issued an advisory opinion limiting “non-candidate electioneering communications” that use falsified media, effectively requiring that political ads use only real images of the candidate. If finalized, that would make it unlawful for campaigns to pay for ads depicting a candidate saying things they never did. Similarly, the Federal Trade Commissionand Department of Justicehave signaled that purely commercial deepfakes could violate consumer protection or election laws.

    U.S. Legislation and Proposals

    Federal lawmakers have proposed new statutes. The DEEPFAKES Accountability Actwould, among other things, impose a disclosure requirement: political ads featuring a manipulated media likeness would need clear disclaimers identifying the content as synthetic. It also increases penalties for producing false election videos or audio intended to influence the vote. While not yet enacted, supporters argue it would provide a uniform rule for all federal and state campaigns. The Brennan Center supports transparency requirements over outright bans, suggesting laws should narrowly target deceptive deepfakes in paid ads or certain categorieswhile carving out parody and news coverage.

    At the state level, over 20 states have passed deepfake laws specifically for elections. For example, Florida and California forbid distributing falsified audio/visual media of candidates with intent to deceive voters. Some statesdefine “deepfake” in statutes and allow candidates to sue or revoke candidacies of violators. These measures have had mixed success: courts have struck down overly broad provisions that acted as prior restraints. Critically, these state laws raise First Amendment issues: political speech is highly protected, so any restriction must be tightly tailored. Already, Texas and Virginia statutes are under legal review, and Elon Musk’s company has sued under California’s lawas unconstitutional. In practice, most lawsuits have so far centered on defamation or intellectual property, rather than election-focused statutes.

    Policy Recommendations: Balancing Integrity and Speech

    Given the rapidly evolving technology, experts recommend a multi-pronged approach. Most stress transparency and disclosure as core principles. For example, the Brennan Center urges requiring any political communication that uses AI-synthesized images or voice to include a clear label. This could be a digital watermark or a visible disclaimer. Transparency has two advantages: it forces campaigns and platforms to “own” the use of AI, and it alerts audiences to treat the content with skepticism.

    Outright bans on all deepfakes would likely violate free speech, but targeted bans on specific harmsmay be defensible. Indeed, Florida already penalizes misuse of recordings in voter suppression. Another recommendation is limited liability: tying penalties to demonstrable intent to mislead, not to the mere act of content creation. Both U.S. federal proposals and EU law generally condition fines on the “appearance of fraud” or deception.

    Technical solutions can complement laws. Watermarking original mediacould deter the reuse of authentic images in doctored fakes. Open tools for deepfake detection – some supported by government research grants – should be deployed by fact-checkers and social platforms. Making detection datasets publicly availablehelps improve AI models to spot fakes. International cooperation is also urged: cross-border agreements on information-sharing could help trace and halt disinformation campaigns. The G7 and APEC have all recently committed to fighting election interference via AI, which may lead to joint norms or rapid response teams.

    Ultimately, many analysts believe the strongest “cure” is a well-informed public: education campaigns to teach voters to question sensational media, and a robust independent press to debunk falsehoods swiftly. While the law can penalize the worst offenders, awareness and resilience in the electorate are crucial buffers against influence operations. As Georgia Tech’s Sean Parker quipped in 2019, “the real question is not if deepfakes will influence elections, but who will be empowered by the first effective one.” Thus policies should aim to deter malicious use without unduly chilling innovation or satire.

    References:

    /.

    /.

    .

    .

    .

    .

    .

    .

    .

    /.

    .

    .

    /.

    /.

    .

    The post The Legal Accountability of AI-Generated Deepfakes in Election Misinformation appeared first on MarkTechPost.
    #legal #accountability #aigenerated #deepfakes #election
    The Legal Accountability of AI-Generated Deepfakes in Election Misinformation
    How Deepfakes Are Created Generative AI models enable the creation of highly realistic fake media. Most deepfakes today are produced by training deep neural networks on real images, video or audio of a target person. The two predominant AI architectures are generative adversarial networksand autoencoders. A GAN consists of a generator network that produces synthetic images and a discriminator network that tries to distinguish fakes from real data. Through iterative training, the generator learns to produce outputs that increasingly fool the discriminator¹. Autoencoder-based tools similarly learn to encode a target face and then decode it onto a source video. In practice, deepfake creators use accessible software: open-source tools like DeepFaceLab and FaceSwap dominate video face-swapping². Voice-cloning toolscan mimic a person’s speech from minutes of audio. Commercial platforms like Synthesia allow text-to-video avatars, which have already been misused in disinformation campaigns³. Even mobile appslet users do basic face swaps in minutes⁴. In short, advances in GANs and related models make deepfakes cheaper and easier to generate than ever. Diagram of a generative adversarial network: A generator network creates fake images from random input and a discriminator network distinguishes fakes from real examples. Over time the generator improves until its outputs “fool” the discriminator⁵ During creation, a deepfake algorithm is typically trained on a large dataset of real images or audio from the target. The more varied and high-quality the training data, the more realistic the deepfake. The output often then undergoes post-processingto enhance believability¹. Technical defenses focus on two fronts: detection and authentication. Detection uses AI models to spot inconsistenciesthat betray a synthetic origin⁵. Authentication embeds markers before dissemination – for example, invisible watermarks or cryptographically signed metadata indicating authenticity⁶. The EU AI Act will soon mandate that major AI content providers embed machine-readable “watermark” signals in synthetic media⁷. However, as GAO notes, detection is an arms race – even a marked deepfake can sometimes evade notice – and labels alone don’t stop false narratives from spreading⁸⁹. Deepfakes in Recent Elections: Examples Deepfakes and AI-generated imagery already have made headlines in election cycles around the world. In the 2024 U.S. primary season, a digitally-altered audio robocall mimicked President Biden’s voice urging Democrats not to vote in the New Hampshire primary. The callerwas later fined million by the FCC and indicted under existing telemarketing laws¹⁰¹¹.Also in 2024, former President Trump posted on social media a collage implying that pop singer Taylor Swift endorsed his campaign, using AI-generated images of Swift in “Swifties for Trump” shirts¹². The posts sparked media uproar, though analysts noted the same effect could have been achieved without AI¹². Similarly, Elon Musk’s X platform carried AI-generated clips, including a parody “Ad” depicting Vice-President Harris’s voice via an AI clone¹³. Beyond the U.S., deepfake-like content has appeared globally. In Indonesia’s 2024 presidential election, a video surfaced on social media in which a convincingly generated image of the late President Suharto appeared to endorse the candidate of the Golkar Party. Days later, the endorsed candidatewon the presidency¹⁴. In Bangladesh, a viral deepfake video superimposed the face of opposition leader Rumeen Farhana onto a bikini-clad body – an incendiary fabrication designed to discredit her in the conservative Muslim-majority society¹⁵. Moldova’s pro-Western President Maia Sandu has been repeatedly targeted by AI-driven disinformation; one deepfake video falsely showed her resigning and endorsing a Russian-friendly party, apparently to sow distrust in the electoral process¹⁶. Even in Taiwan, a TikTok clip circulated that synthetically portrayed a U.S. politician making foreign-policy statements – stoking confusion ahead of Taiwanese elections¹⁷. In Slovakia’s recent campaign, AI-generated audio mimicking the liberal party leader suggested he plotted vote-rigging and beer-price hikes – instantly spreading on social media just days before the election¹⁸. These examples show that deepfakes have touched diverse polities, often aiming to undermine candidates or confuse voters¹⁵¹⁸. Notably, many of the most viral “deepfakes” in 2024 were actually circulated as obvious memes or claims, rather than subtle deceptions. Experts observed that outright undetectable AI deepfakes were relatively rare; more common were AI-generated memes plainly shared by partisans, or cheaply doctored “cheapfakes” made with basic editing tools¹³¹⁹. For instance, social media was awash with memes of Kamala Harris in Soviet garb or of Black Americans holding Trump signs¹³, but these were typically used satirically, not meant to be secretly believed. Nonetheless, even unsophisticated fakes can sway opinion: a U.S. study found that false presidential adsdid change voter attitudes in swing states. In sum, deepfakes are a real and growing phenomenon in election campaigns²⁰²¹ worldwide – a trend taken seriously by voters and regulators alike. U.S. Legal Framework and Accountability In the U.S., deepfake creators and distributors of election misinformation face a patchwork of tools, but no single comprehensive federal “deepfake law.” Existing laws relevant to disinformation include statutes against impersonating government officials, electioneering, and targeted statutes like criminal electioneering communications. In some cases ordinary laws have been stretched: the NH robocall used the Telephone Consumer Protection Act and mail/telemarketing fraud provisions, resulting in the M fine and a criminal charge. Similarly, voice impostors can potentially violate laws against “false advertising” or “unlawful corporate communications.” However, these laws were enacted before AI, and litigators have warned they often do not fit neatly. For example, deceptive deepfake claims not tied to a specific victim do not easily fit into defamation or privacy torts. Voter intimidation lawsalso leave a gap for non-threatening falsehoods about voting logistics or endorsements. Recognizing these gaps, some courts and agencies are invoking other theories. The U.S. Department of Justice has recently charged individuals under broad fraud statutes, and state attorneys general have considered deepfake misinformation as interference with voting rights. Notably, the Federal Election Commissionis preparing to enforce new rules: in April 2024 it issued an advisory opinion limiting “non-candidate electioneering communications” that use falsified media, effectively requiring that political ads use only real images of the candidate. If finalized, that would make it unlawful for campaigns to pay for ads depicting a candidate saying things they never did. Similarly, the Federal Trade Commissionand Department of Justicehave signaled that purely commercial deepfakes could violate consumer protection or election laws. U.S. Legislation and Proposals Federal lawmakers have proposed new statutes. The DEEPFAKES Accountability Actwould, among other things, impose a disclosure requirement: political ads featuring a manipulated media likeness would need clear disclaimers identifying the content as synthetic. It also increases penalties for producing false election videos or audio intended to influence the vote. While not yet enacted, supporters argue it would provide a uniform rule for all federal and state campaigns. The Brennan Center supports transparency requirements over outright bans, suggesting laws should narrowly target deceptive deepfakes in paid ads or certain categorieswhile carving out parody and news coverage. At the state level, over 20 states have passed deepfake laws specifically for elections. For example, Florida and California forbid distributing falsified audio/visual media of candidates with intent to deceive voters. Some statesdefine “deepfake” in statutes and allow candidates to sue or revoke candidacies of violators. These measures have had mixed success: courts have struck down overly broad provisions that acted as prior restraints. Critically, these state laws raise First Amendment issues: political speech is highly protected, so any restriction must be tightly tailored. Already, Texas and Virginia statutes are under legal review, and Elon Musk’s company has sued under California’s lawas unconstitutional. In practice, most lawsuits have so far centered on defamation or intellectual property, rather than election-focused statutes. Policy Recommendations: Balancing Integrity and Speech Given the rapidly evolving technology, experts recommend a multi-pronged approach. Most stress transparency and disclosure as core principles. For example, the Brennan Center urges requiring any political communication that uses AI-synthesized images or voice to include a clear label. This could be a digital watermark or a visible disclaimer. Transparency has two advantages: it forces campaigns and platforms to “own” the use of AI, and it alerts audiences to treat the content with skepticism. Outright bans on all deepfakes would likely violate free speech, but targeted bans on specific harmsmay be defensible. Indeed, Florida already penalizes misuse of recordings in voter suppression. Another recommendation is limited liability: tying penalties to demonstrable intent to mislead, not to the mere act of content creation. Both U.S. federal proposals and EU law generally condition fines on the “appearance of fraud” or deception. Technical solutions can complement laws. Watermarking original mediacould deter the reuse of authentic images in doctored fakes. Open tools for deepfake detection – some supported by government research grants – should be deployed by fact-checkers and social platforms. Making detection datasets publicly availablehelps improve AI models to spot fakes. International cooperation is also urged: cross-border agreements on information-sharing could help trace and halt disinformation campaigns. The G7 and APEC have all recently committed to fighting election interference via AI, which may lead to joint norms or rapid response teams. Ultimately, many analysts believe the strongest “cure” is a well-informed public: education campaigns to teach voters to question sensational media, and a robust independent press to debunk falsehoods swiftly. While the law can penalize the worst offenders, awareness and resilience in the electorate are crucial buffers against influence operations. As Georgia Tech’s Sean Parker quipped in 2019, “the real question is not if deepfakes will influence elections, but who will be empowered by the first effective one.” Thus policies should aim to deter malicious use without unduly chilling innovation or satire. References: /. /. . . . . . . . /. . . /. /. . The post The Legal Accountability of AI-Generated Deepfakes in Election Misinformation appeared first on MarkTechPost. #legal #accountability #aigenerated #deepfakes #election
    The Legal Accountability of AI-Generated Deepfakes in Election Misinformation
    www.marktechpost.com
    How Deepfakes Are Created Generative AI models enable the creation of highly realistic fake media. Most deepfakes today are produced by training deep neural networks on real images, video or audio of a target person. The two predominant AI architectures are generative adversarial networks (GANs) and autoencoders. A GAN consists of a generator network that produces synthetic images and a discriminator network that tries to distinguish fakes from real data. Through iterative training, the generator learns to produce outputs that increasingly fool the discriminator¹. Autoencoder-based tools similarly learn to encode a target face and then decode it onto a source video. In practice, deepfake creators use accessible software: open-source tools like DeepFaceLab and FaceSwap dominate video face-swapping (one estimate suggests DeepFaceLab was used for over 95% of known deepfake videos)². Voice-cloning tools (often built on similar AI principles) can mimic a person’s speech from minutes of audio. Commercial platforms like Synthesia allow text-to-video avatars (turning typed scripts into lifelike “spokespeople”), which have already been misused in disinformation campaigns³. Even mobile apps (e.g. FaceApp, Zao) let users do basic face swaps in minutes⁴. In short, advances in GANs and related models make deepfakes cheaper and easier to generate than ever. Diagram of a generative adversarial network (GAN): A generator network creates fake images from random input and a discriminator network distinguishes fakes from real examples. Over time the generator improves until its outputs “fool” the discriminator⁵ During creation, a deepfake algorithm is typically trained on a large dataset of real images or audio from the target. The more varied and high-quality the training data, the more realistic the deepfake. The output often then undergoes post-processing (color adjustments, lip-syncing refinements) to enhance believability¹. Technical defenses focus on two fronts: detection and authentication. Detection uses AI models to spot inconsistencies (blinking irregularities, audio artifacts or metadata mismatches) that betray a synthetic origin⁵. Authentication embeds markers before dissemination – for example, invisible watermarks or cryptographically signed metadata indicating authenticity⁶. The EU AI Act will soon mandate that major AI content providers embed machine-readable “watermark” signals in synthetic media⁷. However, as GAO notes, detection is an arms race – even a marked deepfake can sometimes evade notice – and labels alone don’t stop false narratives from spreading⁸⁹. Deepfakes in Recent Elections: Examples Deepfakes and AI-generated imagery already have made headlines in election cycles around the world. In the 2024 U.S. primary season, a digitally-altered audio robocall mimicked President Biden’s voice urging Democrats not to vote in the New Hampshire primary. The caller (“Susan Anderson”) was later fined $6 million by the FCC and indicted under existing telemarketing laws¹⁰¹¹. (Importantly, FCC rules on robocalls applied regardless of AI: the perpetrator could have used a voice actor or recording instead.) Also in 2024, former President Trump posted on social media a collage implying that pop singer Taylor Swift endorsed his campaign, using AI-generated images of Swift in “Swifties for Trump” shirts¹². The posts sparked media uproar, though analysts noted the same effect could have been achieved without AI (e.g., by photoshopping text on real images)¹². Similarly, Elon Musk’s X platform carried AI-generated clips, including a parody “Ad” depicting Vice-President Harris’s voice via an AI clone¹³. Beyond the U.S., deepfake-like content has appeared globally. In Indonesia’s 2024 presidential election, a video surfaced on social media in which a convincingly generated image of the late President Suharto appeared to endorse the candidate of the Golkar Party. Days later, the endorsed candidate (who is Suharto’s son-in-law) won the presidency¹⁴. In Bangladesh, a viral deepfake video superimposed the face of opposition leader Rumeen Farhana onto a bikini-clad body – an incendiary fabrication designed to discredit her in the conservative Muslim-majority society¹⁵. Moldova’s pro-Western President Maia Sandu has been repeatedly targeted by AI-driven disinformation; one deepfake video falsely showed her resigning and endorsing a Russian-friendly party, apparently to sow distrust in the electoral process¹⁶. Even in Taiwan (amidst tensions with China), a TikTok clip circulated that synthetically portrayed a U.S. politician making foreign-policy statements – stoking confusion ahead of Taiwanese elections¹⁷. In Slovakia’s recent campaign, AI-generated audio mimicking the liberal party leader suggested he plotted vote-rigging and beer-price hikes – instantly spreading on social media just days before the election¹⁸. These examples show that deepfakes have touched diverse polities (from Bangladesh and Indonesia to Moldova, Slovakia, India and beyond), often aiming to undermine candidates or confuse voters¹⁵¹⁸. Notably, many of the most viral “deepfakes” in 2024 were actually circulated as obvious memes or claims, rather than subtle deceptions. Experts observed that outright undetectable AI deepfakes were relatively rare; more common were AI-generated memes plainly shared by partisans, or cheaply doctored “cheapfakes” made with basic editing tools¹³¹⁹. For instance, social media was awash with memes of Kamala Harris in Soviet garb or of Black Americans holding Trump signs¹³, but these were typically used satirically, not meant to be secretly believed. Nonetheless, even unsophisticated fakes can sway opinion: a U.S. study found that false presidential ads (not necessarily AI-made) did change voter attitudes in swing states. In sum, deepfakes are a real and growing phenomenon in election campaigns²⁰²¹ worldwide – a trend taken seriously by voters and regulators alike. U.S. Legal Framework and Accountability In the U.S., deepfake creators and distributors of election misinformation face a patchwork of tools, but no single comprehensive federal “deepfake law.” Existing laws relevant to disinformation include statutes against impersonating government officials, electioneering (such as the Bipartisan Campaign Reform Act, which requires disclaimers on political ads), and targeted statutes like criminal electioneering communications. In some cases ordinary laws have been stretched: the NH robocall used the Telephone Consumer Protection Act and mail/telemarketing fraud provisions, resulting in the $6M fine and a criminal charge. Similarly, voice impostors can potentially violate laws against “false advertising” or “unlawful corporate communications.” However, these laws were enacted before AI, and litigators have warned they often do not fit neatly. For example, deceptive deepfake claims not tied to a specific victim do not easily fit into defamation or privacy torts. Voter intimidation laws (prohibiting threats or coercion) also leave a gap for non-threatening falsehoods about voting logistics or endorsements. Recognizing these gaps, some courts and agencies are invoking other theories. The U.S. Department of Justice has recently charged individuals under broad fraud statutes (e.g. for a plot to impersonate an aide to swing votes in 2020), and state attorneys general have considered deepfake misinformation as interference with voting rights. Notably, the Federal Election Commission (FEC) is preparing to enforce new rules: in April 2024 it issued an advisory opinion limiting “non-candidate electioneering communications” that use falsified media, effectively requiring that political ads use only real images of the candidate. If finalized, that would make it unlawful for campaigns to pay for ads depicting a candidate saying things they never did. Similarly, the Federal Trade Commission (FTC) and Department of Justice (DOJ) have signaled that purely commercial deepfakes could violate consumer protection or election laws (for example, liability for mass false impersonation or for foreign-funded electioneering). U.S. Legislation and Proposals Federal lawmakers have proposed new statutes. The DEEPFAKES Accountability Act (H.R.5586 in the 118th Congress) would, among other things, impose a disclosure requirement: political ads featuring a manipulated media likeness would need clear disclaimers identifying the content as synthetic. It also increases penalties for producing false election videos or audio intended to influence the vote. While not yet enacted, supporters argue it would provide a uniform rule for all federal and state campaigns. The Brennan Center supports transparency requirements over outright bans, suggesting laws should narrowly target deceptive deepfakes in paid ads or certain categories (e.g. false claims about time/place/manner of voting) while carving out parody and news coverage. At the state level, over 20 states have passed deepfake laws specifically for elections. For example, Florida and California forbid distributing falsified audio/visual media of candidates with intent to deceive voters (though Florida’s law exempts parody). Some states (like Texas) define “deepfake” in statutes and allow candidates to sue or revoke candidacies of violators. These measures have had mixed success: courts have struck down overly broad provisions that acted as prior restraints (e.g. Minnesota’s 2023 law was challenged for threatening injunctions against anyone “reasonably believed” to violate it). Critically, these state laws raise First Amendment issues: political speech is highly protected, so any restriction must be tightly tailored. Already, Texas and Virginia statutes are under legal review, and Elon Musk’s company has sued under California’s law (which requires platforms to label or block deepfakes) as unconstitutional. In practice, most lawsuits have so far centered on defamation or intellectual property (for instance, a celebrity suing over a botched celebrity-deepfake video), rather than election-focused statutes. Policy Recommendations: Balancing Integrity and Speech Given the rapidly evolving technology, experts recommend a multi-pronged approach. Most stress transparency and disclosure as core principles. For example, the Brennan Center urges requiring any political communication that uses AI-synthesized images or voice to include a clear label. This could be a digital watermark or a visible disclaimer. Transparency has two advantages: it forces campaigns and platforms to “own” the use of AI, and it alerts audiences to treat the content with skepticism. Outright bans on all deepfakes would likely violate free speech, but targeted bans on specific harms (e.g. automated phone calls impersonating voters, or videos claiming false polling information) may be defensible. Indeed, Florida already penalizes misuse of recordings in voter suppression. Another recommendation is limited liability: tying penalties to demonstrable intent to mislead, not to the mere act of content creation. Both U.S. federal proposals and EU law generally condition fines on the “appearance of fraud” or deception. Technical solutions can complement laws. Watermarking original media (as encouraged by the EU AI Act) could deter the reuse of authentic images in doctored fakes. Open tools for deepfake detection – some supported by government research grants – should be deployed by fact-checkers and social platforms. Making detection datasets publicly available (e.g. the MIT OpenDATATEST) helps improve AI models to spot fakes. International cooperation is also urged: cross-border agreements on information-sharing could help trace and halt disinformation campaigns. The G7 and APEC have all recently committed to fighting election interference via AI, which may lead to joint norms or rapid response teams. Ultimately, many analysts believe the strongest “cure” is a well-informed public: education campaigns to teach voters to question sensational media, and a robust independent press to debunk falsehoods swiftly. While the law can penalize the worst offenders, awareness and resilience in the electorate are crucial buffers against influence operations. As Georgia Tech’s Sean Parker quipped in 2019, “the real question is not if deepfakes will influence elections, but who will be empowered by the first effective one.” Thus policies should aim to deter malicious use without unduly chilling innovation or satire. References: https://www.security.org/resources/deepfake-statistics/. https://www.wired.com/story/synthesia-ai-deepfakes-it-control-riparbelli/. https://www.gao.gov/products/gao-24-107292. https://technologyquotient.freshfields.com/post/102jb19/eu-ai-act-unpacked-8-new-rules-on-deepfakes. https://knightcolumbia.org/blog/we-looked-at-78-election-deepfakes-political-misinformation-is-not-an-ai-problem. https://www.npr.org/2024/12/21/nx-s1-5220301/deepfakes-memes-artificial-intelligence-elections. https://apnews.com/article/artificial-intelligence-elections-disinformation-chatgpt-bc283e7426402f0b4baa7df280a4c3fd. https://www.lawfaremedia.org/article/new-and-old-tools-to-tackle-deepfakes-and-election-lies-in-2024. https://www.brennancenter.org/our-work/research-reports/regulating-ai-deepfakes-and-synthetic-media-political-arena. https://firstamendment.mtsu.edu/article/political-deepfakes-and-elections/. https://www.ncsl.org/technology-and-communication/deceptive-audio-or-visual-media-deepfakes-2024-legislation. https://law.unh.edu/sites/default/files/media/2022/06/nagumotu_pp113-157.pdf. https://dfrlab.org/2024/10/02/brazil-election-ai-research/. https://dfrlab.org/2024/11/26/brazil-election-ai-deepfakes/. https://freedomhouse.org/article/eu-digital-services-act-win-transparency. The post The Legal Accountability of AI-Generated Deepfakes in Election Misinformation appeared first on MarkTechPost.
    0 التعليقات ·0 المشاركات ·0 معاينة
  • WordPad is dead in Windows 11, but Notepad is absorbing its skills

    For years, Notepad has existed as a bare-bones text editor. No longer. Microsoft keeps adding to it, including a new update that includes capabilities that you might have expected in another Windows application, WordPad.
    Microsoft said Friday that it is adding “lightweight formatting” in Notepad, including markdown input and file support, but also bold and italic fonts and even hyperlinks. They’re all accessible via a new toolbar that includes these new formatting options.
    Microsoft didn’t indicate that it is testing the new Notepad or offering these features in preview, so presumably these updates will arrive on your Windows PC soon.
    Two things seem to be going on here. In late 2023, Microsoft killed off WordPad, the rich text editor that served as a poor man’s alternative to Microsoft Word.Traditionally, Notepad has been the Windows answer to a light text editor that coders can use or write in, although more sophisticated alternatives like vim exist.

    It seems that Microsoft is adding more features to try and help those users, while moving towards a WordPad replacement.
    Microsoft, meanwhile, is bringing the Edit app to Windows as well. Edit is an open-source app that was basically designed as a command-line interface, or CLI, and Microsoft specifically references how obtuse vim was to use when it announced it. Either way, by beefing up Notepad –heck, even with Copilot! — and adding the Edit option as well, Windows is offering a number of lightweight CLI and text-editing interfaces without bloating the operating system even further.
    #wordpad #dead #windows #but #notepad
    WordPad is dead in Windows 11, but Notepad is absorbing its skills
    For years, Notepad has existed as a bare-bones text editor. No longer. Microsoft keeps adding to it, including a new update that includes capabilities that you might have expected in another Windows application, WordPad. Microsoft said Friday that it is adding “lightweight formatting” in Notepad, including markdown input and file support, but also bold and italic fonts and even hyperlinks. They’re all accessible via a new toolbar that includes these new formatting options. Microsoft didn’t indicate that it is testing the new Notepad or offering these features in preview, so presumably these updates will arrive on your Windows PC soon. Two things seem to be going on here. In late 2023, Microsoft killed off WordPad, the rich text editor that served as a poor man’s alternative to Microsoft Word.Traditionally, Notepad has been the Windows answer to a light text editor that coders can use or write in, although more sophisticated alternatives like vim exist. It seems that Microsoft is adding more features to try and help those users, while moving towards a WordPad replacement. Microsoft, meanwhile, is bringing the Edit app to Windows as well. Edit is an open-source app that was basically designed as a command-line interface, or CLI, and Microsoft specifically references how obtuse vim was to use when it announced it. Either way, by beefing up Notepad –heck, even with Copilot! — and adding the Edit option as well, Windows is offering a number of lightweight CLI and text-editing interfaces without bloating the operating system even further. #wordpad #dead #windows #but #notepad
    WordPad is dead in Windows 11, but Notepad is absorbing its skills
    www.pcworld.com
    For years, Notepad has existed as a bare-bones text editor. No longer. Microsoft keeps adding to it, including a new update that includes capabilities that you might have expected in another Windows application, WordPad. Microsoft said Friday that it is adding “lightweight formatting” in Notepad, including markdown input and file support, but also bold and italic fonts and even hyperlinks. They’re all accessible via a new toolbar that includes these new formatting options. Microsoft didn’t indicate that it is testing the new Notepad or offering these features in preview, so presumably these updates will arrive on your Windows PC soon. Two things seem to be going on here. In late 2023, Microsoft killed off WordPad, the rich text editor that served as a poor man’s alternative to Microsoft Word. (There’s a way to bring WordPad back, but you’d need access to an older version of Windows where WordPad still existed.) Traditionally, Notepad has been the Windows answer to a light text editor that coders can use or write in, although more sophisticated alternatives like vim exist. It seems that Microsoft is adding more features to try and help those users, while moving towards a WordPad replacement. Microsoft, meanwhile, is bringing the Edit app to Windows as well. Edit is an open-source app that was basically designed as a command-line interface, or CLI, and Microsoft specifically references how obtuse vim was to use when it announced it. Either way, by beefing up Notepad –heck, even with Copilot! — and adding the Edit option as well, Windows is offering a number of lightweight CLI and text-editing interfaces without bloating the operating system even further.
    0 التعليقات ·0 المشاركات ·0 معاينة
CGShares https://cgshares.com