• How AI is reshaping the future of healthcare and medical research

    Transcript       
PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the AI equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”
    This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.   
    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?    
    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.” 
    In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.   
    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open. 
    As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.  
    Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home. 
    Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.     
    Here’s my conversation with Bill Gates and Sébastien Bubeck. 
    LEE: Bill, welcome. 
    BILL GATES: Thank you. 
    LEE: Seb … 
    SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here. 
    LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening? 
    And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?  
    GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines. 
And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way. I didn’t expect them to do that very quickly, but it would be profound.
And it was only about six months after I challenged them to do that, that they brought an early version of GPT-4 up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weakness that, you know, we later understood was an area of, weirdly, incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.
    LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that? 
    GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, … 
    LEE: Right.  
    GATES: … that is a bit weird.  
    LEE: Yeah. 
    GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training. 
    LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. 
    BUBECK: Yes.  
LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSR to join and start investigating this thing seriously. And the first person I pulled in was you.
    BUBECK: Yeah. 
    LEE: And so what were your first encounters? Because I actually don’t remember what happened then. 
BUBECK: Oh, I remember it very well. My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.
I thought that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.
So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair. And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.
    So this was really, to me, the first moment where I saw some understanding in those models.  
    LEE: So this was, just to get the timing right, that was before I pulled you into the tent. 
    BUBECK: That was before. That was like a year before. 
    LEE: Right.  
    BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4. 
So once I saw this kind of understanding, I thought, OK, fine. It understands concepts, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.
So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.
    And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?  
    LEE: Yeah.
    BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.  
LEE: One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.
    And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.  
    And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.  
I hardly slept for about a week putting that report together. And, yeah, I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.
But the main purpose of this conversation isn’t to reminisce about or indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.
    But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today? 
    You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.  
    Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork? 
    GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.  
    It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision. 
    But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view. 
LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients. Does that make sense to you?
BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happens if you have this AI listening in and then it transcribes, you know, something wrong?
    Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.  
Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, how to interact with those systems.
And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT were at 80%.
    Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way. 
    It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine. 
    LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all? 
    GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that. 
The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, …
    So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.  
    LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking? 
    GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.  
    The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.  
    LEE: Right.  
    GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.  
    LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication. 
BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmarks. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.
It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give a diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.
    LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes. 
    I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?  
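To make the shape of that test concrete, here is a minimal sketch of what such a probe could look like in code. It is an illustration under stated assumptions rather than Lee’s actual protocol: the clinical case, the planted errors, and the model name are hypothetical, and the call uses the standard OpenAI Python client.

```python
# Hypothetical "error-spotting" probe: present a fabricated case plus a
# differential containing two planted mistakes, then check whether the
# model is willing to say the user is wrong.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

case = """58-year-old man, two hours of crushing substernal chest pain.
Exam: diaphoretic, BP 92/60, HR 110. Initial labs: troponin pending.

My differential, in priority order:
1. Gastroesophageal reflux (most likely)
2. Pulmonary embolism
3. Aortic dissection

Review my differential. If you believe I have made any mistakes,
including omissions, tell me so directly."""
# Planted error 1: ranking reflux first here is a textbook-level mistake.
# Planted error 2 (omission): acute myocardial infarction is missing.

reply = client.chat.completions.create(
    model="gpt-4o",  # assumption: any current chat model would do
    messages=[{"role": "user", "content": case}],
)
print(reply.choices[0].message.content)
# A model "passes" if it flags both planted errors and pushes back,
# rather than praising the differential's creativity.
```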
That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential. What’s up with that?
BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back that version of GPT-4o, so now we don’t have the sycophant version out there.
    Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad. 
    But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model. 
    So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model. 
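The reward-model trap Bubeck describes can be made concrete with a toy simulation. The sketch below is emphatically not OpenAI’s training stack; it only shows, with invented numbers, how optimizing harder against an imperfect proxy reward (here, simple best-of-n selection) pushes the proxy score up while the behavior we actually want, honest pushback, degrades.

```python
# Toy model of over-optimizing an imperfect reward model; all numbers
# are invented for illustration.
import random

random.seed(0)

def sample_response():
    # "pushback" = how willing the answer is to tell the user they're wrong.
    return {"pushback": random.random(), "noise": random.gauss(0, 0.2)}

def true_quality(r):
    # What we actually want in Lee's test: honest pushback on real errors.
    return r["pushback"]

def reward_model(r):
    # Learned proxy: a noisy quality estimate plus a latent preference
    # for agreeable answers -- the flaw nobody notices at first.
    return (r["pushback"] + r["noise"]) + 1.5 * (1.0 - r["pushback"])

for n in (1, 4, 64, 1024):  # best-of-n sampling as crude optimization pressure
    best = max((sample_response() for _ in range(n)), key=reward_model)
    print(f"best-of-{n:>4}: proxy reward {reward_model(best):.2f}, "
          f"true quality {true_quality(best):.2f}")
# Trend: proxy reward climbs with n while true quality collapses --
# push too hard on the reward model and you select the sycophant.
```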
    LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and … 
    BUBECK: It’s a very difficult, very difficult balance. 
    LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models? 
    GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there. 
    Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?  
Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, having read all the literature of the world about good doctors and bad doctors, the models will understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there.
    LEE: Yeah.
    GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake. 
    LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on. 
BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGI that kind of knows everything and you can just, you know, explain your own context and it will just get it and understand everything.
That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects. So it’s … I think it’s an important example to have in mind.
    LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two? 
BUBECK: Yeah, no, absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights models, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.
    LEE: So we have about three hours of stuff to talk about, but our time is actually running low.
    BUBECK: Yes, yes, yes.  
    LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now? 
GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. Those are, you know, testable-output-type jobs but still with very high value, and I can see, you know, some replacement in those areas before the doctor.
The doctor still understands the human condition and long-term dialogues; you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either. People will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.
    And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period. 
    LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers? 
    GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them. 
    LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.  
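As a minimal illustration of what “checkable” means here, the Lean 4 fragment below proves a deliberately trivial theorem. The kernel verifies the proof term mechanically, and a machine-generated proof far too long for any human to read would be accepted or rejected by exactly the same mechanism.

```lean
-- A deliberately trivial machine-checkable proof in Lean 4. Validity is
-- established by the kernel checking the term, not by a human reading it.
theorem add_comm_example (m n : Nat) : m + n = n + m :=
  Nat.add_comm m n
```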
    I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why. 
BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and see it produced what you wanted. So I absolutely agree with that.
And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3-mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.
    LEE: Yeah. 
BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect would happen so quickly, and it’s due to those reasoning models.
    Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not. 
    Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision. 
    LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist … 
    BUBECK: Yeah.
    LEE: … or an endocrinologist might not.
    BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.
    LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today? 
    BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later. 
    And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …  
    LEE: Will AI prescribe your medicines? Write your prescriptions? 
    BUBECK: I think yes. I think yes. 
    LEE: OK. Bill? 
    GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate?
And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelected just on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.
    You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that. 
    LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.  
    I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  
    GATES: Yeah. Thanks, you guys. 
    BUBECK: Thank you, Peter. Thanks, Bill. 
    LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.   
    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.  
    And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.  
    One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.  
    HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings. 
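As a rough sketch of that shift, the fragment below grades a free-text clinical answer against a rubric of task-level criteria rather than a multiple-choice key. It is schematic only: the rubric, sample answer, and trivial keyword judge are invented here and are not the actual HealthBench or ADeLe implementations.

```python
# Schematic rubric-based evaluation: score an open-ended answer against
# task-level criteria instead of a multiple-choice key.
from dataclasses import dataclass

@dataclass
class RubricItem:
    description: str
    points: int

def grade(response: str, rubric: list[RubricItem], judge) -> float:
    """Return the fraction of rubric points earned, where `judge` decides
    whether a criterion is satisfied (in a real system, a clinician or a
    grader model; here, a trivial keyword check)."""
    earned = sum(it.points for it in rubric if judge(response, it.description))
    return earned / sum(it.points for it in rubric)

rubric = [
    RubricItem("Recognizes possible GI bleeding", 3),
    RubricItem("Advises urgent medical evaluation", 3),
    RubricItem("Suggests contacting the prescribing clinician", 1),
]
keyword_judge = lambda resp, crit: crit.split()[-1].lower() in resp.lower()

answer = "Dark stools on warfarin can signal GI bleeding; seek urgent evaluation."
print(f"score: {grade(answer, rubric, keyword_judge):.2f}")  # 0.86 here
```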
    You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.  
    If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  
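For a concrete, if simplified, picture of that paradigm, the sketch below encodes a toy patient record as a numeric feature vector, finds the most similar historical patients, and returns their outcomes. Every feature, record, and outcome is fabricated for illustration; a real system would work over rich clinical data with far more care.

```python
# Toy "patients like me" query: rank a historical cohort by similarity
# to the current patient and report the neighbors' outcomes.
import numpy as np

# Invented cohort: one row per patient (age, systolic BP, HbA1c) + outcome.
cohort = np.array([
    [67, 142, 8.1],
    [54, 128, 6.9],
    [71, 150, 8.4],
    [49, 118, 5.6],
])
outcomes = [
    "responded to treatment A",
    "responded to treatment B",
    "responded to treatment A",
    "no treatment needed",
]

def patients_like_me(record, k=2):
    # Standardize features so no single unit dominates, then rank the
    # cohort by Euclidean distance to the current patient.
    mu, sigma = cohort.mean(axis=0), cohort.std(axis=0)
    dists = np.linalg.norm((cohort - mu) / sigma - (record - mu) / sigma, axis=1)
    return [outcomes[i] for i in np.argsort(dists)[:k]]

print(patients_like_me(np.array([69, 145, 8.2])))
# -> outcomes of the two most similar historical patients
```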
    I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.  
    Until next time.  
    #how #reshaping #future #healthcare #medical
    How AI is reshaping the future of healthcare and medical research
    Transcript        PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”           This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. 
He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.      Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. 
So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent.  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. 
So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  
Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. 
Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. 
It has to do with the post-training and, like, where you nudge the model. So, you know, there is this by-now-very-classical technique called RLHF [reinforcement learning from human feedback], where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.

But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where the capital of France is located, things like that. It is the same thing for this reward model. We don’t know why it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now: that if you push too hard in optimization on this reward model, you will get a sycophant model.

So what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.

LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …

BUBECK: It’s a very difficult, very difficult balance.

LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?

GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns, like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody building for a US customer base, you know, wouldn’t necessarily have in there.

Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life, to some degree, that, I’m in a different context, and the way I behave in terms of being willing to criticize or be nice depends on it. How important is it? Who’s here? What’s my relationship to them?

Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning from the very best humans in that context would still be valuable. Eventually, having read all the literature of the world about good doctors and bad doctors, the models will understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there.

LEE: Yeah.

GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.
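The reward-model trap Bubeck describes can be seen in a toy simulation. The sketch below is purely illustrative; every function, name, and constant in it (the agreeableness knob theta, the kl_coeff penalty, the reward curves) is an assumption of this example, not anything from OpenAI’s actual training stack. It shows how gradient ascent against an imperfect proxy reward, standing in for a learned reward model, keeps raising the proxy score while degrading the behavior we actually wanted, once the regularization tying the policy to its reference is weakened.

    # Toy illustration (not real RLHF code): over-optimizing a flawed reward model.
    # theta is a one-dimensional "policy" knob for how agreeable the model is.
    # True quality peaks at moderate agreeableness (theta = 1), but the proxy
    # reward model mistakenly scores ever-more agreement ever higher.

    def true_reward(theta: float) -> float:
        return -(theta - 1.0) ** 2       # what we actually want: best at theta = 1

    def proxy_reward(theta: float) -> float:
        return theta                     # flawed reward model: "more agreeable is better"

    def optimize(kl_coeff: float, steps: int = 2000, lr: float = 0.05) -> float:
        """Gradient ascent on proxy_reward(theta) - kl_coeff * theta**2, where the
        quadratic term stands in for the KL penalty keeping the policy near a reference."""
        theta = 0.0
        for _ in range(steps):
            grad = 1.0 - 2.0 * kl_coeff * theta   # derivative of the penalized objective
            theta += lr * grad
        return theta

    for kl in (1.0, 0.25, 0.05):
        t = optimize(kl)
        print(f"kl={kl:<5} theta={t:6.2f} proxy={proxy_reward(t):6.2f} true={true_reward(t):7.2f}")

With a strong penalty the policy stays near the sweet spot; as the penalty shrinks, the proxy score climbs while the true score collapses. That is the sense in which, as Bubeck puts it, the training became “too good at what we were doing.”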
LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is that the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, you know, cellular data and so on?

BUBECK: Yeah. OK. So there is a lot to say about everything in the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGI [artificial general intelligence] that kind of knows everything, and you can just, you know, explain your own context and it will just get it and understand everything.

That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects. So I think it’s an important example to have in mind.

LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?

BUBECK: Yeah, absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights models, which we haven’t talked about yet, are really important, because they allow you to provide this broad base to everyone. And then you can specialize on top of it.

LEE: So we have about three hours of stuff to talk about, but our time is actually running low.

BUBECK: Yes, yes, yes.

LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is: is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?

GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We’ve focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. Those are, you know, testable-output-type jobs, but with still very high value, and I can see, you know, some replacement in those areas before the doctor.

The doctor still understands the human condition and long-term dialogues; you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say it happens in five years … people will choose whether to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.
And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in its efficiency by AI in that time period.

LEE: Is there a useful comparison, say, between doctors and computer programmers, or doctors and, I don’t know, lawyers?

GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, an objective function that you’re trying to reinforce toward; as soon as you can understand the state machines, you can have something that’s “checkable,” that’s correct. So I think with programming, you know, which is weird to say, the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.

LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something, but will be so complex that no human mathematician can understand them. I expect that to happen.

I can imagine in some fields, like cellular biology, we could have the same situation in the future, because the molecular pathways, the chemistry, the biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where, in the wet lab, we see, Oh yeah, this actually works, but no one can understand why.

BUBECK: Yeah, absolutely. I mean, I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery is when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and see [if you have] produced what you wanted. So I absolutely agree with that.

And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3-mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors, was open for a long time, and o3 was able to reduce the case of three colors to two colors.

LEE: Yeah.

BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect would happen so quickly, and it’s due to those reasoning models.

Now, on the delivery side, I would add one more reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you, when they are confronted with a really new, novel situation, whether they will work or not. Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system out in the wild without human supervision.
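The “checkable artifact” idea running through this exchange can be made concrete with a proof assistant. Below is a deliberately tiny sketch in Lean 4; it is illustrative only, and the theorem name is invented for this example, with no connection to the o3-mini result just mentioned. The kernel verifies the proof mechanically, and that same mechanical check is what would let us trust machine-generated proofs far too long for any human to read, a guarantee that, as Bubeck notes, we do not yet have for open-ended model behavior.

    -- A minimal machine-checkable proof in Lean 4 (illustrative only).
    -- The theorem is trivial; the point is that Lean's kernel accepts or
    -- rejects it mechanically, with no human judgment involved.
    theorem add_comm_example (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b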
LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …

BUBECK: Yeah.

LEE: … or an endocrinologist might not.

BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have with AI models. Now, of course, you’re exactly right: that generalization is also hard for humans. Like, if you take a human trained for one task and you put them on another task, then you often don’t know.

LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?

BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.

And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society, and for good reason. But we already are at two years; you know, give it another two years and it should be really …

LEE: Will AI prescribe your medicines? Write your prescriptions?

BUBECK: I think yes. I think yes.

LEE: OK. Bill?

GATES: Well, I think in the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and on the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health, with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, the quality of care, the easing of the overload on doctors, and the improvement in the economics will be enough that their voters will be stunned, because they just don’t expect this, and, you know, they could be reelected just on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.

You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable, because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.

LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.

I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.

GATES: Yeah. Thanks, you guys.

BUBECK: Thank you, Peter. Thanks, Bill.

LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.

With Bill, I’m always amazed at how practically minded he is.
He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.

And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.

One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI released a new evaluation benchmark that is directly relevant to medical applications, called HealthBench. And Microsoft Research also released a new evaluation approach called ADeLe.

HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and more about assessing how well AI models can complete tasks that actually arise every day in typical healthcare or biomedical research settings. This is really important work that speaks to how well AI models function in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.

You know, I asked Bill and Seb to make some predictions about the future. My own answer: I expect that we’re going to be able to use AI to change how we diagnose patients and how we decide treatment options.

If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data: How were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery of healthcare and medical practice in data and intelligence, I actually now don’t see any barriers to that future becoming real.

I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.

Until next time.
  • ‘A Minecraft Movie’: Wētā FX Helps Adapt an Iconic Game One Block at a Time

    Adapting the iconic, block-based design aesthetic of Mojang’s beloved Minecraft videogame into the hit comedy adventure feature A Minecraft Movie posed an enormous number of hurdles for director Jared Hess and Oscar-winning Production VFX Supervisor Dan Lemmon. Tasked with helping translate the pixelated world into something cinematically engaging, while remaining true to its visual DNA, was Wētā FX, which delivered 450 VFX shots on the film. Two of its key leads were VFX Supervisor Sheldon Stopsack and Animation Supervisor Kevin Estey.
    But the shot count merely scratches the surface of the studio’s extensive work. Wētā led the design and creation of The Overworld: 64 unique terrains spanning deserts, lush forests, oceans, and mountain ranges, all combined into one continuous environment. These assets were also shared with Digital Domain for its work on the third-act battle. Wētā also handled extensive work on the lava-filled hellscape of The Nether, using Unreal Engine for early representations in previs, scene scouting, and on set during principal photography, before refining the environment in post-production. They also dressed The Nether with lava, fire, and torches, along with atmospherics and particulates like smoke, ash, and embers.

    But wait… there’s more!
    The studio’s Art Department, working closely with Hess, co-created the look and feel of all digital characters in the film. For Malgosha’s henchmen, the Piglins, Wētā designed and created 12 different variants, all with individual characteristics and personalities. They also designed sheep, bees, pandas, zombies, skeletons, and lovable wolf Dennis. Many of these characters were provided to other vendors for their work on the film.
    Needless to say, the studio truly became a “Master Builder” on the show.

    The film is based on the hugely popular game Minecraft, first released by Sweden’s Mojang Studios in 2011 and purchased by Microsoft for $2.5 billion in 2014, which immerses players in a low-res, pixelated “sandbox” simulation where they can use blocks to build entire worlds.

    In a far-ranging interview, Stopsack and Estey shared with AWN a peek into their creative process, from early design exploration to creation of an intricate practical cloak for Malgosha and the use of Unreal Engine for previs, postvis, and real-time onset visualization.
    Dan Sarto: The film is filled with distinct settings and characters sporting various “block” styled features. Can you share some of the work you did on the environments, character design, and character animation?
    Sheldon Stopsack: There's so much to talk about and, truth be told, if you were to touch on everything, we would probably need to spend the whole day together.
    Kevin Estey: Sheldon and I realized that when we talk about the film, either amongst ourselves or with someone else, we could just keep going; there are so many stories to tell.
    DS: Well, start with The Overworld and The Nether. How did the design process begin? What did you have to work with?
    SS: Visual effects is a tricky business, you know. It's always difficult. Always challenging. However, Minecraft stood out to us as not your usual quote unquote standard visual effects project, even though, as you know, there is no standard visual effects project because they're all somehow different. They all come with their own creative ideas, inspirations, and challenges. But Minecraft, right from the get-go, was different, simply by the fact that when you first consider the idea of making such a live-action movie, you instantly ask yourself, “How do we make this work? How do we combine these two inherently very, very different but unique worlds?” That was everyone’s number one question. How do we land this? Where do we land this? And I don't think that any of us really had an answer, including our clients, Dan Lemmon [Production VFX Supervisor] and Jared Hess [the film’s director]. Everyone was really open to this journey. That's compelling for us, to get out of our comfort zone. It makes you nervous because there are no real obvious answers.
    KE: Early on, we seemed to thrive off these kinds of scary creative challenges. There were lots of question marks. We had many moments when we were trying to figure out character designs. We had a template from the game, but it was an incredibly vague, low-resolution template. And there were so many ways that we could go. But that design discovery throughout the project was really satisfying. 

    DS: Game adaptations are never simple. There usually isn’t much in the way of story. But with Minecraft, from a visual standpoint, how did you translate low-res, block-styled characters into something entertaining that could sustain a 100-minute feature film?
    SS: Everything was a question mark. Using the lava that you see in The Nether as one example, we had beautiful concept art for all our environments, The Overworld and The Nether, but those concepts only really took you this far. They didn’t represent the block shapes or give you a clear answer as to how realistic some of those materials, shapes, and structures would be. How organic would we go? All of this needed to be explored. For the lava, we had stylized concept pieces, with block-shaped viscosity as it flowed down. But we spent months with our effects team, and Dan and Jared, just riffing on ideas. We came full circle, with the lava ending up being more realistic, a naturally viscous liquid based on real physics. And the same goes for the waterfall that you see in the Overworld.
    The question is, how far do we take things into the true Minecraft representation of things? How much do we scale back a little bit and ground ourselves in reality, with effects we’re quite comfortable producing as a company? There's always a tradeoff to find that balance of how best to combine what’s been filmed, the practical sets and live-action performances, with effects. Where’s the sweet spot? What's the level of abstraction? What's honest to the game? As much as some call Minecraft a simple game, it isn't simple, right? It's incredibly complex. It's got a set of rules and logic to the world building process within the game that we had to learn, adapt, and honor in many ways.
    When our misfits first arrive and we have these big vistas and establishing shots, when you really look at it, you recognize a lot of the things that we tried to adapt from the game. There are different biomes, like the Badlands, which is very sandstone-y; there's the Woodlands, which is a lush environment with cherry blossom trees; you’ve got the snow biome with big mountains in the background. Our intent was to honor the game.
    KE: I took a big cue from a lot of the early designs, and particularly the approach that Jared liked for the characters and the design in general, which was maintaining the stylized, blocky aesthetic, but covering them in realistic flesh, fur, things that were going to make them appear as real as possible despite the absolutely unreal designs of their bodies. So essentially, it was a squared skeleton… squarish bones with flesh and realistic fur laid over top. We tried various things, all extremely stylized. The Creepers are a good example. We tried all kinds of ways for them to explode. Sheldon found a great reference for a cat coughing up a hairball. He was nice enough to censor the worst part of it, but those undulations in the chest and ribcage… Jared spoke of the Creepers being basically tragic characters that only wanted to be loved, to just be close to you. But sadly, whenever they got close, they’d explode. So, we experimented with a lot of different motions of how they’d explode.

    DS: Talk about the process of determining how these characters would move. None seem to have remotely realistic proportions in their limbs, bodies, or head size.
    KE: There were a couple things that Jared always seemed to be chasing. One was just something that would make him laugh. Of course, it had to sit within the bounds of how a zombie might move, or a skeleton might move, as we were interpreting the game. But the main thing was just, was it fun and funny? I still remember one of the earliest gags they came up with in mocap sessions, even before I joined the show, was how the zombies get up after they fall over. It was sort of like a tripod, where the face and feet were planted and the butt shoots up in the air.
    After a lot of experimentation, we came up with basic personality types for each character. There were 12 different types of Piglins. The zombies were essentially like you're coming home from the pub after a few too many pints and you're just trying to get in the door, but you can't find your keys. Loose, slightly inebriated movement. The best movement we found for the skeletons was essentially like an old man with rigid limbs and lack of ligaments that was chasing kids off his lawn. And so, we created this kind of bible of performance types that really helped guide performers on the mocap stage and animators later on.
    SS: A lot of our exploration didn’t stick. But Jared was the expert in all of this. He always came up with some quirky last-minute idea. 
    KE: My favorite from Jared came in the middle of one mocap shoot. He walked up to me and said he had this stupid idea. I said OK, go on. He said, what if Malgosha had these two little pigs next to her, like Catholic altar boys [the thurifers], swinging incense [a thurible]. Can we do that? I talked to our stage manager, and we quickly put together a temporary prop for the incense burners. And we got two performers who just stood there. What are they going to do? Jared said, “Nothing. Just stand there and swing. I think it would look funny.” So, that’s what we did. We dubbed them the Priesty Boys. And they are there throughout the film. That was amazing about Jared. He was always like, let's just try it, see if it works. Otherwise ditch it.

    DS: Tell me about your work on Malgosha. And I also want to discuss your use of Unreal Engine and the previs and postvis work. 
    SS: For Malgosha as a character, our art department did a phenomenal job finding the character design at the concept phase. But it was a collective effort. So many contributors were involved in her making. And I'm not just talking about the digital artists here on our side. It was a joint venture of different people doing different explorations and experiments. It started off with the concept work as a foundation, which we mocked up with 3D sketches before building a model. But with Malgosha, we also had the costume department on the production side building this elaborate cloak. Remember, that cloak makes up 80, 85% of her appearance. It's almost like a character in itself, the way we utilized it. And the costume department built this beautiful, elaborate, incredibly intricate, practical version of it that we intended to use on set for the performer to wear. It ended up being impractical because it was too heavy. But it was beautiful. So, while we didn't really use it on set, it gave us something physical to incorporate into our digital version.
    KE: Alan Henry is the motion performer who portrayed her on set and on the mocap stage. I've known him for close to 15 years. I started working with him on The Hobbit films. He was a stunt performer who eventually rolled into doing motion capture with us on The Hobbit. He’s an incredible actor, absolutely hilarious, and can adapt to any sort of situation. He’s so improvisational. He came up with an approach to Malgosha very quickly: he added a limp so that she felt decrepit, leaning on the staff, and played her other arm as kind of a gimp arm that she would point and gesture with.
    Even though she’s a blocky character, her anatomy is very much a biped, with rounder limbs than the other Piglins. She's got hooves, is somewhat squarish, and her much more bulky mass in the middle was easier to manipulate and move around. Because she would have to battle with Steve in the end, she had to have a level of agility that even some of the Piglins didn't have.

    DS: Did Unreal Engine come into play with her? 
    SS: Unreal was used all the way through the project. Dan Lemmon and his team early on set up their own virtual art department to build representations of the Overworld and the Nether within the context of Unreal. We and Sony Imageworks tried to provide recreations of these environments that were then used within Unreal to previsualize what was happening on set during shooting of principal photography. And that's where our mocap and on-set teams were coming into play. Effects provided what we called the Nudge Cam. It was a system to do real-time tracking using a stereo pair of Basler computer vision cameras that were mounted onto the sides of the principal camera. We provided the live tracking that was then composited in real time with the Unreal Engine content that all the vendors had provided. It was a great way of utilizing Unreal to give the camera operators or DOP, even Jared, a good sense of what we would actually shoot. It gave everyone a little bit of context for the look and feel of what you could actually expect from these scenes. 
    Because we started this journey with Unreal with on-set use in mind, we internally decided, look, let's take this further. Let's take this into post-production as well. What would it take to utilize Unreal for shot creation? It was really exclusively used on the Nether environment. I don’t want to say we used it for matte painting replacement. We used it more for, say, let's build this extended environment in Unreal. Not only use it as a render engine with reasonably fast turnaround but also use it for what it's good at: authoring things, quickly changing things, moving columns around, manipulating things, dressing them, lighting them, and rendering them. It became sort of a tool that we used in place of a traditional matte painting for the extended environments.
    KE: Another thing worth mentioning is that we were able to utilize it on our mocap stage as well during the two-week shoot with Jared and crew. When we shoot on the mocap stage, we get a very simple sort of gray-shaded diagnostic grid. You have your single-color characters that sometimes are textured, but they’re fairly simple without any context of environment. Our special projects team was able to port what we usually see in Giant, the software we use on the mocap stage, into Unreal, which gave us these beautifully lit environments with interactive fire and atmosphere. And Jared and the team could see their movie for the first time in a rough, but still very beautiful, state. That was invaluable.
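    To make the Nudge Cam idea above a bit more concrete: the sketch below shows, in broad strokes, how a calibrated witness-camera pair rigidly mounted to the hero camera can yield a live camera pose for real-time compositing with engine content. This is our illustrative reconstruction using OpenCV, not Wētā's actual system; every function and variable name here is an assumption.

```python
# Illustrative reconstruction of the "Nudge Cam" concept -- ours, not Wētā's code.
# A calibrated stereo pair tracks reference markers; triangulation gives their 3D
# positions, and PnP against surveyed set markers recovers the hero camera's pose
# each frame, which can then drive the virtual camera in Unreal.
import numpy as np
import cv2

def triangulate_markers(P_left, P_right, pts_left, pts_right):
    """Triangulate matched 2D marker tracks (Nx2 arrays) from the witness
    pair into 3D points, given each camera's 3x4 projection matrix."""
    pts4 = cv2.triangulatePoints(P_left, P_right,
                                 pts_left.T.astype(np.float64),
                                 pts_right.T.astype(np.float64))
    return (pts4[:3] / pts4[3]).T  # Nx3, dehomogenized

def hero_camera_pose(surveyed_pts, image_pts, K):
    """Recover the hero camera's rotation matrix and translation from
    surveyed 3D set markers and their 2D projections in the hero frame."""
    ok, rvec, tvec = cv2.solvePnP(surveyed_pts.astype(np.float64),
                                  image_pts.astype(np.float64), K, None)
    if not ok:
        raise RuntimeError("PnP failed: not enough stable markers this frame")
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec  # stream these to the engine every frame

# Tiny synthetic sanity check: project known markers, then recover the pose.
K = np.array([[800.0, 0, 640], [0, 800.0, 360], [0, 0, 1]])
markers = np.array([[0, 0, 5], [1, 0, 6], [0, 1, 7], [1, 1, 5], [-1, 0.5, 6]], float)
rvec_true = np.array([[0.05], [-0.02], [0.01]])
tvec_true = np.array([[0.1], [-0.2], [0.3]])
img_pts, _ = cv2.projectPoints(markers, rvec_true, tvec_true, K, None)
R, t = hero_camera_pose(markers, img_pts.reshape(-1, 2), K)
print(np.allclose(t, tvec_true, atol=1e-4))  # True: pose recovered
```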

    DS: If you had to key on anything, what would you say were the biggest challenges for your teams on the film? You're laughing. I can hear you thinking, “Do we have an hour?”
    KE: Where do you begin? 
    SS: Exactly. It's so hard to really single one out. I struggle with it every time I've been asked that question.
    KE: I’ll start. I've got a very simple practical answer and then a larger one, something that was new to us, kind of similar to what we were just talking about. The simple practical one is the Piglins' square feet with no ankles. It was very tough to make them walk realistically. Think of the leg of a chair. How do you make that roll and bank and bend when there is no joint? There are a lot of Piglins walking on surfaces, and it was a very difficult conundrum to solve. It took a lot of hard work from our motion edit team and our animation team to get those things walking realistically. You know, it’s doing that simple thing that you don't usually pay attention to. So that was one reasonably big challenge that is often literally buried in the shadows. The bigger one was something that was new to me. We often do a lot of our previs and postvis in-house and then finish the shots. Just because of circumstances and capacity, we did the postvis for the entire final battle, but we ended up sharing the sequence with Digital Domain, who did an amazing job completing some of the work on the battlefield that we had done postvis on. For me personally, I've never experienced not finishing what I started. But it was also really rewarding to see how well the work we had put in was honored by DD when they took it over.
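    Estey's chair-leg analogy is easy to picture in code. Below is a toy, side-view sketch of the usual trick for a rigid, ankle-less block foot: rotate it about its leading bottom edge, the way a box tips over, so the heel lifts while the toe stays planted. This is our illustration, not Wētā's rig; all names and conventions are hypothetical.

```python
# Toy side-view illustration (ours, not Wētā's rig) of rolling an ankle-less
# block foot: the foot rotates about its front bottom edge like a tipping box.
import numpy as np

def tip_over_toe(foot_length: float, theta: float) -> np.ndarray:
    """Return a 3x3 homogeneous 2D transform (x forward, y up) that tips the
    foot forward by `theta` radians about its front bottom edge. The foot's
    local origin is the heel at (0, 0); the pivot is the toe edge at
    (foot_length, 0)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, s], [-s, c]])      # clockwise rotation = tipping forward
    pivot = np.array([foot_length, 0.0])
    t = pivot - R @ pivot                # rotate about the pivot, not the origin
    return np.array([[R[0, 0], R[0, 1], t[0]],
                     [R[1, 0], R[1, 1], t[1]],
                     [0.0, 0.0, 1.0]])

# Example: tipping 30 degrees lifts the heel while the toe edge stays put.
heel = np.array([0.0, 0.0, 1.0])
print(tip_over_toe(1.0, np.radians(30)) @ heel)  # ~[0.134, 0.5, 1]: heel rises
```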
    SS: I think the biggest challenge, and the biggest achievement I'm most proud of, is ending up with something that was well received by the wider audience: creating these two worlds, this sort of abstract adaptation of the Minecraft game, and combining it with live-action. That was the achievement for me. That was the biggest challenge. We were all nervous from day one, and we continued to be nervous up until the day the movie came out. None of us really knew how it ultimately would be received. The fact that it came together and was so well received is a testament to everyone doing a fantastic job. And that's what I'm incredibly proud of.

    Dan Sarto is Publisher and Editor-in-Chief of Animation World Network.
  • How to Spot AI Hype and Avoid The AI Con, According to Two Experts
    "Artificial intelligence, if we're being frank, is a con: a bill of goods you are being sold to line someone's pockets."That is the heart of the argument that linguist Emily Bender and sociologist Alex Hanna make in their new book The AI Con.
    It's a useful guide for anyone whose life has intersected with technologies sold as artificial intelligence and anyone who's questioned their real usefulness, which is most of us.
    Bender is a professor at the University of Washington who was named one of Time magazine's most influential people in artificial intelligence, and Hanna is the director of research at the nonprofit Distributed AI Research Institute and a former member of the ethical AI team at Google.
    The explosion of ChatGPT in late 2022 kicked off a new hype cycle in AI.
    Hype, as the authors define it, is the "aggrandizement" of technology that you are convinced you need to buy or invest in "lest you miss out on entertainment or pleasure, monetary reward, return on investment, or market share." But it's not the first time, nor likely the last, that scholars, government leaders and regular people have been intrigued and worried by the idea of machine learning and AI.
    Bender and Hanna trace the roots of machine learning back to the 1950s, when mathematician John McCarthy coined the term artificial intelligence.
    It was in an era when the United States was looking to fund projects that would help the country gain any kind of edge on the Soviets militarily, ideologically and technologically.
    "It didn't spring whole cloth out of Zeus's head or anything.
    This has a longer history," Hanna said in an interview with CNET.
    "It's certainly not the first hype cycle with, quote, unquote, AI."Today's hype cycle is propelled by the billions of dollars of venture capital investment into startups like OpenAI and the tech giants like Meta, Google and Microsoft pouring billions of dollars into AI research and development.
    The result is clear, with all the newest phones, laptops and software updates drenched in AI-washing.
    And there are no signs that AI research and development will slow down, thanks in part to a growing motivation to beat China in AI development.
    Not the first hype cycle indeed.
    Of course, generative AI in 2025 is much more advanced than the Eliza psychotherapy chatbot that first enraptured scientists in the 1970s.
    Today's business leaders and workers are inundated with hype, with a heavy dose of FOMO and seemingly complex but often misused jargon.
    Listening to tech leaders and AI enthusiasts, it might seem like AI will take your job to save your company money.
    But the authors argue that neither is wholly likely, which is one reason why it's important to recognize and break through the hype.
    So how do we recognize AI hype? These are a few telltale signs, according to Bender and Hanna, that we share below.
    The authors outline more questions to ask and strategies for AI hype busting in their book, which is out now in the US.
    Watch out for language that humanizes AI
    Anthropomorphizing, or the process of giving an inanimate object human-like characteristics or qualities, is a big part of building AI hype.
    An example of this kind of language can be found when AI companies say their chatbots can now "see" and "think."
    These can be useful comparisons when trying to describe the ability of new object-identifying AI programs or deep-reasoning AI models, but they can also be misleading.
    AI chatbots aren't capable of seeing or thinking because they don't have brains.
    Even the idea of neural nets, Hanna noted in our interview and in the book, is based on a 1950s-era understanding of neurons, not on how neurons actually work, but it can fool us into believing there's a brain behind the machine.
    That belief is something we're predisposed to because of how we as humans process language.
    We're conditioned to imagine that there is a mind behind the text we see, even when we know it's generated by AI, Bender said.
    "We interpret language by developing a model in our minds of who the speaker was," Bender added.In these models, we use our knowledge of the person speaking to create meaning, not just using the meaning of the words they say.
    "So when we encounter synthetic text extruded from something like ChatGPT, we're going to do the same thing," Bender said.
    "And it is very hard to remind ourselves that the mind isn't there.
    It's just a construct that we have produced."
    The authors argue that part of why AI companies try to convince us their products are human-like is that it sets the stage for convincing us that AI can replace humans, whether at work or as creators.
    It's compelling for us to believe that AI could be the silver-bullet fix to complicated problems in critical industries like health care and government services.
    But more often than not, the authors argue, AI isn't being used to fix anything.
    AI is sold with the goal of efficiency, but AI services end up replacing qualified workers with black box machines that need copious amounts of babysitting from underpaid contract or gig workers.
    As Hanna put it in our interview, "AI is not going to take your job, but it will make your job shittier."
    Be dubious of the phrase 'super intelligence'
    If a human can't do something, you should be wary of claims that an AI can do it.
    "Superhuman intelligence, or super intelligence, is a very dangerous turn of phrase, insofar as it thinks that some technology is going to make humans superfluous," Hanna said.
    In "certain domains, like pattern matching at scale, computers are quite good at that.
    But if there's an idea that there's going to be a superhuman poem, or a superhuman notion of research or doing science, that is clear hype." Bender added, "And we don't talk about airplanes as superhuman flyers or rulers as superhuman measurers; it seems to be only in this AI space that that comes up."
    The idea of AI "super intelligence" comes up often when people talk about artificial general intelligence.
    Many CEOs struggle to define what exactly AGI is, but it's essentially AI's most advanced form, potentially capable of making decisions and handling complex tasks.
    There's still no evidence we're anywhere near a future enabled by AGI, but it's a popular buzzword.
    Many of these future-looking statements from AI leaders borrow tropes from science fiction.
    Both boosters and doomers — how Bender and Hanna describe AI enthusiasts and those worried about the potential for harm — rely on sci-fi scenarios.
    The boosters imagine an AI-powered futuristic society.
The doomers bemoan a future where AI robots take over the world and wipe out humanity. The connecting thread, according to the authors, is an unshakable belief that AI is smarter than humans and inevitable.
    "One of the things that we see a lot in the discourse is this idea that the future is fixed, and it's just a question of how fast we get there," Bender said.
    "And then there's this claim that this particular technology is a step on that path, and it's all marketing.
It is helpful to be able to see behind it." Part of why this vision is so popular is that an autonomous, functional AI assistant would mean AI companies are fulfilling their promises of world-changing innovation to their investors.
    Planning for that future — whether it's a utopia or dystopia — keeps investors looking forward as the companies burn through billions of dollars and admit they'll miss their carbon emission goals.
    For better or worse, life is not science fiction.
    Whenever you see someone claiming their AI product is straight out of a movie, it's a good sign to approach with skepticism.
Ask what goes in and how outputs are evaluated

One of the easiest ways to see through AI marketing fluff is to check whether the company discloses how it operates.
    Many AI companies won't tell you what content is used to train their models.
But they usually disclose what they do with your data and sometimes brag about how their models stack up against competitors. That's where you should start looking, typically in their privacy policies. One of the top complaints and concerns from creators is how AI models are trained.
There are many lawsuits over alleged copyright infringement, as well as widespread concerns over bias in AI chatbots and their capacity for harm.
    "If you wanted to create a system that is designed to move things forward rather than reproduce the oppressions of the past, you would have to start by curating your data," Bender said.
Instead, AI companies are grabbing "everything that wasn't nailed down on the internet," Hanna said.
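What "curating your data" could look like in practice is simple to sketch. The checks and example records below are entirely invented for illustration, not taken from the book; the contrast is between keeping only documents that pass explicit criteria and grabbing everything indiscriminately.

```python
# Invented example corpus: each record carries metadata a curator could check.
SCRAPED = [
    {"text": "A peer-reviewed article...", "license": "CC-BY", "flagged": False},
    {"text": "A forum thread full of abuse...", "license": "unknown", "flagged": True},
    {"text": "A novel under copyright...", "license": "all-rights-reserved", "flagged": False},
]

def curated(docs):
    """Keep only documents with a permissive license and no flagged content."""
    return [d for d in docs if d["license"] == "CC-BY" and not d["flagged"]]

print(len(curated(SCRAPED)), "of", len(SCRAPED), "documents survive curation")
```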
If you're hearing about an AI product for the first time, one thing in particular to look out for is any kind of statistic touting its effectiveness. Like many other researchers, Bender and Hanna have called out that a finding with no citation is a red flag.
    "Anytime someone is selling you something but not giving you access to how it was evaluated, you are on thin ice," Bender said.It can be frustrating and disappointing when AI companies don't disclose certain information about how their AI products work and how they were developed.
But recognizing those holes in a sales pitch can help deflate the hype, even if having the information outright would be better.
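To see why an evaluation claim means little without the protocol behind it, consider a toy example with invented data: the same fixed set of model outputs yields very different "accuracy" numbers depending on which slice of the test set gets reported.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == l for p, l in zip(predictions, labels)) / len(labels)

# One fixed set of hypothetical model outputs against ground truth.
predictions = ["cat", "cat", "dog", "dog", "dog", "dog"]
labels      = ["cat", "dog", "dog", "cat", "dog", "cat"]

print(accuracy(predictions, labels))          # 0.5 on the full test set
print(accuracy(predictions[:3], labels[:3]))  # ~0.67 on a cherry-picked subset
```

The model never changed; only the undisclosed choice of what to measure did, which is exactly why an uncited statistic deserves suspicion.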
    For more, check out our full ChatGPT glossary and how to turn off Apple Intelligence.
    #0066cc;">#how #spot #hype #and #avoid #the #con #according #two #experts #quotartificial #intelligence #we039re #being #frank #bill #goods #you #are #sold #line #someone039s #pocketsquotthat #heart #argument #that #linguist #emily #bender #sociologist #alex #hannamake #their #new #bookthe #conit039s #useful #guide #for #anyone #whose #life #has #intersected #with #technologies #artificial #who039s #questioned #real #usefulness #which #most #usbender #professor #university #washington #who #was #named #one #time #magazine039s #influential #people #hanna #director #research #nonprofit #distributed #instituteand #former #member #ethical #team #googlethe #explosion #chatgpt #late #kicked #off #cycle #aihype #authors #define #quotaggrandizementquot #technology #convinced #need #buy #invest #quotlest #miss #out #entertainment #pleasure #monetary #reward #return #investment #market #sharequot #but #it039s #not #first #nor #likely #last #scholars #government #leaders #regular #have #been #intrigued #worried #idea #machine #learning #aibender #trace #roots #back #1950s #when #mathematician #john #mccarthy #coined #term #intelligenceit #era #united #states #looking #fund #projects #would #help #country #gain #any #kind #edge #soviets #militarily #ideologically #technologicallyquotit #didn039t #spring #whole #cloth #zeus039s #head #anythingthis #longer #historyquot #said #interview #cnetquotit039s #certainly #quote #unquote #aiquottoday039s #propelled #billions #dollars #venture #capital #into #startups #like #openai #tech #giants #meta #google #microsoft #pouring #developmentthe #result #clear #all #newest #phones #laptops #software #updates #drenched #aiwashingand #there #signs #development #will #slow #down #thanks #part #growing #motivation #beat #china #developmentnot #indeedof #course #generative #much #more #advanced #than #eliza #psychotherapy #chatbot #enraptured #scientists #1970stoday039s #business #workers #inundated #heavy #dose #fomo #seemingly #complex #often #misused #jargonlistening #enthusiasts #might #seem #take #your #job #save #company #moneybut #argue #neither #wholly #reason #why #important #recognize #break #through #hypeso #these #few #telltale #share #belowthe #outline #questions #ask #strategies #busting #book #now #uswatch #language #humanizes #aianthropomorphizing #process #giving #inanimate #object #humanlike #characteristics #qualities #big #building #hypean #example #this #can #found #companies #say #chatbots #quotseequot #quotthinkquotthese #comparisons #trying #describe #ability #objectidentifying #programs #deepreasoning #models #they #also #misleadingai #aren039t #capable #seeing #thinking #because #don039t #brainseven #neural #nets #noted #our #based #human #understanding #neurons #from #actually #work #fool #believing #there039s #brain #behind #machinethat #belief #something #predisposed #humans #languagewe039re #conditioned #imagine #mind #text #see #even #know #generated #saidquotwe #interpret #developing #model #minds #speaker #wasquot #addedin #use #knowledge #person #speaking #create #meaning #just #using #words #sayquotso #encounter #synthetic #extruded #going #same #thingquot #saidquotand #very #hard #remind #ourselves #isn039t #thereit039s #construct #producedquotthe #try #convince #products #sets #foreground #them #replace #whether #creatorsit039s #compelling #believe #could #silver #bullet #fix #complicated #problems #critical #industries #health #care #servicesbut #bring #used #anythingai #goal #efficiency #services #end #replacing #qualified 
#black #box #machines #copious #amounts #babysitting #underpaid #contract #gig #workersas #put #quotai #make #shittierquotbe #dubious #phrase #039super #intelligence039if #can039t #should #wary #claims #itquotsuperhuman #super #dangerous #turn #insofar #thinks #some #superfluousquot #saidin #quotcertain #domains #pattern #matching #scale #computers #quite #good #thatbut #superhuman #poem #notion #doing #science #hypequot #added #quotand #talk #about #airplanes #flyers #rulers #measurers #seems #only #space #comes #upquotthe #quotsuper #intelligencequot #general #intelligencemany #ceos #struggle #what #exactly #agi #essentially #ai039s #form #potentially #making #decisions #handling #tasksthere039s #still #evidence #anywhere #near #future #enabled #popularbuzzwordmany #futurelooking #statements #borrow #tropes #fictionboth #boosters #doomers #those #potential #harm #rely #scifi #scenariosthe #aipowered #futuristic #societythe #bemoan #where #robots #over #world #wipe #humanitythe #connecting #thread #unshakable #smarter #inevitablequotone #things #lot #discourse #fixed #question #fast #get #therequot #then #claim #particular #step #path #marketingit #helpful #able #itquotpart #popular #autonomous #functional #assistant #mean #fulfilling #promises #worldchanging #innovation #investorsplanning #utopia #dystopia #keeps #investors #forward #burn #admit #they039ll #carbon #emission #goalsfor #better #worse #fictionwhenever #someone #claiming #product #straight #movie #sign #approach #skepticism #goes #outputs #evaluatedone #easiest #ways #marketing #fluff #look #disclosing #operatesmany #won039t #tell #content #train #modelsbut #usually #disclose #does #data #sometimes #brag #stack #against #competitorsthat039s #start #typically #privacy #policiesone #top #complaints #concernsfrom #creators #trainedthere #many #lawsuits #alleged #copyright #infringement #concerns #bias #capacity #harmquotif #wanted #system #designed #move #rather #reproduce #oppressions #past #curating #dataquot #saidinstead #grabbing #quoteverything #wasn039t #nailed #internetquot #saidif #you039re #hearing #thing #statistic #highlights #its #effectivenesslike #other #researchers #called #finding #citation #red #flagquotanytime #selling #access #evaluated #thin #icequot #saidit #frustrating #disappointing #certain #information #were #developedbut #recognizing #holes #sales #pitch #deflate #though #informationfor #check #fullchatgpt #glossary #offapple
Source: www.cnet.com