• How AI is reshaping the future of healthcare and medical research

    Transcript       
    PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”          
    This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.   
    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?    
    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.” 
    In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.   
    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open. 
    As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.  
    Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home. 
    Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.     
    Here’s my conversation with Bill Gates and Sébastien Bubeck. 
    LEE: Bill, welcome. 
    BILL GATES: Thank you. 
    LEE: Seb … 
    SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here. 
    LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening? 
    And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?  
    GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines. 
    And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.  
    And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning. 
    LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that? 
    GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, … 
    LEE: Right.  
    GATES: … that is a bit weird.  
    LEE: Yeah. 
    GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training. 
    LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. 
    BUBECK: Yes.  
    LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you. 
    BUBECK: Yeah. 
    LEE: And so what were your first encounters? Because I actually don’t remember what happened then. 
    BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3. 
    I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1. 
    So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts. 
    So this was really, to me, the first moment where I saw some understanding in those models.  
    LEE: So this was, just to get the timing right, that was before I pulled you into the tent. 
    BUBECK: That was before. That was like a year before. 
    LEE: Right.  
    BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4. 
    So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.  
    So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x. 
    And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?  
    LEE: Yeah.
    BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.  
    LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine. 
    And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.  
    And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.  
    I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book. 
    But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements. 
    But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today? 
    You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.  
    Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork? 
    GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.  
    It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision. 
    But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view. 
    LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you? 
    BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong? 
    Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.  
    Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them. 
    And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.  
    Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way. 
    It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine. 
    LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all? 
    GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that. 
    The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa,
    So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.  
    LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking? 
    GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.  
    The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.  
    LEE: Right.  
    GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.  
    LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication. 
    BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI. 
    It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for. 
    LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes. 
    I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?  
    That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that? 
    BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there. 
    Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad. 
    But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model. 
    So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model. 
    LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and … 
    BUBECK: It’s a very difficult, very difficult balance. 
    LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models? 
    GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there. 
    Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?  
    Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there.
    LEE: Yeah.
    GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake. 
    LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on. 
    BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything. 
    That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind. 
    LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two? 
    BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it. 
    LEE: So we have about three hours of stuff to talk about, but our time is actually running low.
    BUBECK: Yes, yes, yes.  
    LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now? 
    GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.  
    The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities. 
    And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period. 
    LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers? 
    GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them. 
    LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.  
    I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why. 
    BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.  
    And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.  
    LEE: Yeah. 
    BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.  
    Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not. 
    Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision. 
    LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist … 
    BUBECK: Yeah.
    LEE: … or an endocrinologist might not.
    BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.
    LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today? 
    BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later. 
    And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …  
    LEE: Will AI prescribe your medicines? Write your prescriptions? 
    BUBECK: I think yes. I think yes. 
    LEE: OK. Bill? 
    GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate?
    And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries. 
    You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that. 
    LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.  
    I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  
    GATES: Yeah. Thanks, you guys. 
    BUBECK: Thank you, Peter. Thanks, Bill. 
    LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.   
    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.  
    And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.  
    One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.  
    HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings. 
    You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.  
    If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  
    I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.  
    Until next time.  
    #how #reshaping #future #healthcare #medical
    How AI is reshaping the future of healthcare and medical research
    Transcript        PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”           This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.      Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent.  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.  But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.  So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.  LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …  BUBECK: It’s a very difficult, very difficult balance.  LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?  GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.  Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.  LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.  BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.  That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind.  LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?  BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.  LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?  GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.  And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.  LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?  GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.  LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.  BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah.  BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.  Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.  LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …  BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?  BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.  And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions?  BUBECK: I think yes. I think yes.  LEE: OK. Bill?  GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.  You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.  LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.   GATES: Yeah. Thanks, you guys.  BUBECK: Thank you, Peter. Thanks, Bill.  LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.  You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.   I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   #how #reshaping #future #healthcare #medical
    WWW.MICROSOFT.COM
    How AI is reshaping the future of healthcare and medical research
    Transcript [MUSIC]      [BOOK PASSAGE]   PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”   [END OF BOOK PASSAGE]     [THEME MUSIC]     This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.   [THEME MUSIC FADES] The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.    [TRANSITION MUSIC]   Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weakness [LAUGHTER] that, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. [LAUGHS]  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSR [Microsoft Research] to join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well. [LAUGHS] My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair. [LAUGHTER] And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE: [LAUGHS] One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce about [LAUGHS] or indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients. [LAUGHTER] Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT (opens in new tab). And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE [United States Medical Licensing Examination], for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential. [LAUGHTER] What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back that [LAUGHS] version of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF [reinforcement learning from human feedback], where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.  But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.  So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.  LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …  BUBECK: It’s a very difficult, very difficult balance.  LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?  GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.  Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.  LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.  BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGI [artificial general intelligence] that kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.  That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects. [LAUGHTER] So it’s … I think it’s an important example to have in mind.  LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?  BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.  LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?  GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.  And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.  LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?  GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.  LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.  BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and see [if you have] produced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab). So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah.  BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.  Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.  LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …  BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?  BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.  And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions?  BUBECK: I think yes. I think yes.  LEE: OK. Bill?  GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelected [LAUGHTER] just on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.  You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.  LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  [TRANSITION MUSIC]  GATES: Yeah. Thanks, you guys.  BUBECK: Thank you, Peter. Thanks, Bill.  LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.  You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  [THEME MUSIC]  I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   [MUSIC FADES]
    0 Reacties 0 aandelen
  • Elden Ring Nightreign Guide – How To Level Up Quickly

    Elden Ring Nightreign is a markedly different gameplay experience from the mainline title, with a focus on co-operative multiplayer in a rogue-lite experience. In the final showdown of each run, nothing matters as much as your level, and maximizing your time and effort to achieve that fast will be your primary focus.

    This Elden Ring Nightreign guide has everything you can possibly do to level up to the maximum cap of 15 quickly and efficiently.

    Relics

    There are a number of Relic effects that you can acquire to both directly and indirectly earn runes for both you and your allies. Since the Relics are RNG-based and gained at the end of runs, whether successful or unsuccessful, it may be a while before you have the right combination in your Chalice.

    There are also Nightfarer-specific, quest-based Relics such as Revenant’s Small Makeup Brush that increases rune acquisition for self and allies, and Ironeye’s Cracked Sealing Wax which earns runes for critical hits.

    Your earned Murk will come in handy in order to potentially buy the Relics you want from the Small Jar Bazaar in Roundtable Hold. There is even a Relic to lower your expenditures at merchants: Night of the Demon, which offers up a huge rune discount for shop purchases while on expedition.

    You will also want to upgrade your Chalices there, to use more runes in a variety of color combinations.

    Talismans

    Certain Talismans such as the Gold Scarab which grants an increase in runes from defeated enemies by 15%, will come in handy, and can potentially be stacked in the two slots available to Nightfarers. Talismans are boss drops, as well as obtainable from Teardrop Scarabs, so farm those whenever you encounter them.

    Consumables

    Certain consumable items such as the Gold-Pickled Fowl Foot can boost rune acquisition for a time. Because their drops are not guaranteed and can be few and far between, it is recommended that you save these for use right before a major combat encounter where you expect to gain a large amount of runes, in order to best take advantage of their small window of effect.

    Stonesword Key

    While this item does not grant runes in and of itself, it does put you in the path of receiving massive windfalls of them. There are certain Relic effects that provide you a free Stonesword Key at the beginning of your run, but also drop from the chest secreted behind the altar, near the boss of the Great Church location. Either way, the key can then be used to access the locked Evergaols with their Field Boss challenge. Felling these great enemies grants a choice between multiple selectable Dormant Powers, including an immediate rune lumpsum in multiples of 10K, and another two that grant a persistent increase in number of runes obtained by 10%.

    Another potential location for a chest with a Stonesword Key is atop the tower of a Fort. An additional bonus to clearing the Fort is being able to view the locations of nearby Field Bosses and Scarabs, by interacting with the maps on the desk within the tower, which will inform you that you have ‘Acquired a local clue’.

    Merchants are also a potential source for Stonesword Keys.

    Scale-Bearing Merchant

    This NPC encounter at the arena of Libra, Creature of Night in Limveld has multiple options to choose from when interacting with him, including ‘Take runes’, which is a combination of buff and debuff that grants a rune bonus from defeating enemies, while constantly draining your health. Fortunately, you can choose to get rid of the phantom chest it places on your head, through your inventory, if you find that the health debuff is untenable on your run. Unfortunately, the choices themselves appear to be RNG and you may not necessarily even see the rune boost option on your run.

    Maximize Your Time

    At a certain point in your run, you may want to simply breeze right past low-level cannon fodder, and head straight for the major opponent in any given area. This is especially true once you achieve higher levels, as you really want to optimize for the limited time you have left before your final battle. Slay the mini boss, gather your loot and Dormant Power, and venture forward.

    Spectral trees allow you to use Spectral Hawks to fast travel to locations where you might farm for runes, or collect rewards.

    Shifting Earth

    These unique world-changing events completely modify the open world of Limveld, which in turn open up opportunities for lots of rune acquisition. Take the map variations when offered them, and your run will significantly benefit from both the primary reward, and the secret buff from defeating the mini-boss.

    Additional variants appear to unlock as you attempt new runs.

    Losing Runes on Death

    Daytime battles run the risk of losing runes and entire levels, and while the former can be recovered from the site, they can sometimes also be picked up by enemies before you can get to them. Fortunately, you can always track down the rune-thief visually, by identifying their glowing, golden aura. Simply slay them to recover your runes and levels. Dying before recovering dropped runes loses them permanently, and will oblige you to waste time farming more.

    Those are some of the best ways to boost your rune acquisition and level up quickly in Elden Ring Nightreign.
    #elden #ring #nightreign #guide #how
    Elden Ring Nightreign Guide – How To Level Up Quickly
    Elden Ring Nightreign is a markedly different gameplay experience from the mainline title, with a focus on co-operative multiplayer in a rogue-lite experience. In the final showdown of each run, nothing matters as much as your level, and maximizing your time and effort to achieve that fast will be your primary focus. This Elden Ring Nightreign guide has everything you can possibly do to level up to the maximum cap of 15 quickly and efficiently. Relics There are a number of Relic effects that you can acquire to both directly and indirectly earn runes for both you and your allies. Since the Relics are RNG-based and gained at the end of runs, whether successful or unsuccessful, it may be a while before you have the right combination in your Chalice. There are also Nightfarer-specific, quest-based Relics such as Revenant’s Small Makeup Brush that increases rune acquisition for self and allies, and Ironeye’s Cracked Sealing Wax which earns runes for critical hits. Your earned Murk will come in handy in order to potentially buy the Relics you want from the Small Jar Bazaar in Roundtable Hold. There is even a Relic to lower your expenditures at merchants: Night of the Demon, which offers up a huge rune discount for shop purchases while on expedition. You will also want to upgrade your Chalices there, to use more runes in a variety of color combinations. Talismans Certain Talismans such as the Gold Scarab which grants an increase in runes from defeated enemies by 15%, will come in handy, and can potentially be stacked in the two slots available to Nightfarers. Talismans are boss drops, as well as obtainable from Teardrop Scarabs, so farm those whenever you encounter them. Consumables Certain consumable items such as the Gold-Pickled Fowl Foot can boost rune acquisition for a time. Because their drops are not guaranteed and can be few and far between, it is recommended that you save these for use right before a major combat encounter where you expect to gain a large amount of runes, in order to best take advantage of their small window of effect. Stonesword Key While this item does not grant runes in and of itself, it does put you in the path of receiving massive windfalls of them. There are certain Relic effects that provide you a free Stonesword Key at the beginning of your run, but also drop from the chest secreted behind the altar, near the boss of the Great Church location. Either way, the key can then be used to access the locked Evergaols with their Field Boss challenge. Felling these great enemies grants a choice between multiple selectable Dormant Powers, including an immediate rune lumpsum in multiples of 10K, and another two that grant a persistent increase in number of runes obtained by 10%. Another potential location for a chest with a Stonesword Key is atop the tower of a Fort. An additional bonus to clearing the Fort is being able to view the locations of nearby Field Bosses and Scarabs, by interacting with the maps on the desk within the tower, which will inform you that you have ‘Acquired a local clue’. Merchants are also a potential source for Stonesword Keys. Scale-Bearing Merchant This NPC encounter at the arena of Libra, Creature of Night in Limveld has multiple options to choose from when interacting with him, including ‘Take runes’, which is a combination of buff and debuff that grants a rune bonus from defeating enemies, while constantly draining your health. Fortunately, you can choose to get rid of the phantom chest it places on your head, through your inventory, if you find that the health debuff is untenable on your run. Unfortunately, the choices themselves appear to be RNG and you may not necessarily even see the rune boost option on your run. Maximize Your Time At a certain point in your run, you may want to simply breeze right past low-level cannon fodder, and head straight for the major opponent in any given area. This is especially true once you achieve higher levels, as you really want to optimize for the limited time you have left before your final battle. Slay the mini boss, gather your loot and Dormant Power, and venture forward. Spectral trees allow you to use Spectral Hawks to fast travel to locations where you might farm for runes, or collect rewards. Shifting Earth These unique world-changing events completely modify the open world of Limveld, which in turn open up opportunities for lots of rune acquisition. Take the map variations when offered them, and your run will significantly benefit from both the primary reward, and the secret buff from defeating the mini-boss. Additional variants appear to unlock as you attempt new runs. Losing Runes on Death Daytime battles run the risk of losing runes and entire levels, and while the former can be recovered from the site, they can sometimes also be picked up by enemies before you can get to them. Fortunately, you can always track down the rune-thief visually, by identifying their glowing, golden aura. Simply slay them to recover your runes and levels. Dying before recovering dropped runes loses them permanently, and will oblige you to waste time farming more. Those are some of the best ways to boost your rune acquisition and level up quickly in Elden Ring Nightreign. #elden #ring #nightreign #guide #how
    GAMINGBOLT.COM
    Elden Ring Nightreign Guide – How To Level Up Quickly
    Elden Ring Nightreign is a markedly different gameplay experience from the mainline title, with a focus on co-operative multiplayer in a rogue-lite experience. In the final showdown of each run, nothing matters as much as your level, and maximizing your time and effort to achieve that fast will be your primary focus. This Elden Ring Nightreign guide has everything you can possibly do to level up to the maximum cap of 15 quickly and efficiently. Relics There are a number of Relic effects that you can acquire to both directly and indirectly earn runes for both you and your allies. Since the Relics are RNG-based and gained at the end of runs, whether successful or unsuccessful, it may be a while before you have the right combination in your Chalice. There are also Nightfarer-specific, quest-based Relics such as Revenant’s Small Makeup Brush that increases rune acquisition for self and allies, and Ironeye’s Cracked Sealing Wax which earns runes for critical hits. Your earned Murk will come in handy in order to potentially buy the Relics you want from the Small Jar Bazaar in Roundtable Hold. There is even a Relic to lower your expenditures at merchants: Night of the Demon, which offers up a huge rune discount for shop purchases while on expedition. You will also want to upgrade your Chalices there, to use more runes in a variety of color combinations. Talismans Certain Talismans such as the Gold Scarab which grants an increase in runes from defeated enemies by 15%, will come in handy, and can potentially be stacked in the two slots available to Nightfarers. Talismans are boss drops, as well as obtainable from Teardrop Scarabs, so farm those whenever you encounter them. Consumables Certain consumable items such as the Gold-Pickled Fowl Foot can boost rune acquisition for a time. Because their drops are not guaranteed and can be few and far between, it is recommended that you save these for use right before a major combat encounter where you expect to gain a large amount of runes, in order to best take advantage of their small window of effect. Stonesword Key While this item does not grant runes in and of itself, it does put you in the path of receiving massive windfalls of them. There are certain Relic effects that provide you a free Stonesword Key at the beginning of your run, but also drop from the chest secreted behind the altar, near the boss of the Great Church location. Either way, the key can then be used to access the locked Evergaols with their Field Boss challenge. Felling these great enemies grants a choice between multiple selectable Dormant Powers, including an immediate rune lumpsum in multiples of 10K, and another two that grant a persistent increase in number of runes obtained by 10%. Another potential location for a chest with a Stonesword Key is atop the tower of a Fort. An additional bonus to clearing the Fort is being able to view the locations of nearby Field Bosses and Scarabs, by interacting with the maps on the desk within the tower, which will inform you that you have ‘Acquired a local clue’. Merchants are also a potential source for Stonesword Keys. Scale-Bearing Merchant This NPC encounter at the arena of Libra, Creature of Night in Limveld has multiple options to choose from when interacting with him, including ‘Take runes’, which is a combination of buff and debuff that grants a rune bonus from defeating enemies, while constantly draining your health. Fortunately, you can choose to get rid of the phantom chest it places on your head, through your inventory, if you find that the health debuff is untenable on your run. Unfortunately, the choices themselves appear to be RNG and you may not necessarily even see the rune boost option on your run. Maximize Your Time At a certain point in your run, you may want to simply breeze right past low-level cannon fodder, and head straight for the major opponent in any given area. This is especially true once you achieve higher levels, as you really want to optimize for the limited time you have left before your final battle. Slay the mini boss, gather your loot and Dormant Power, and venture forward. Spectral trees allow you to use Spectral Hawks to fast travel to locations where you might farm for runes, or collect rewards. Shifting Earth These unique world-changing events completely modify the open world of Limveld, which in turn open up opportunities for lots of rune acquisition. Take the map variations when offered them, and your run will significantly benefit from both the primary reward, and the secret buff from defeating the mini-boss. Additional variants appear to unlock as you attempt new runs. Losing Runes on Death Daytime battles run the risk of losing runes and entire levels, and while the former can be recovered from the site, they can sometimes also be picked up by enemies before you can get to them. Fortunately, you can always track down the rune-thief visually, by identifying their glowing, golden aura. Simply slay them to recover your runes and levels. Dying before recovering dropped runes loses them permanently, and will oblige you to waste time farming more. Those are some of the best ways to boost your rune acquisition and level up quickly in Elden Ring Nightreign.
    0 Reacties 0 aandelen
  • Is Nightreign Solo Play Really Impossible?

    Elden Ring Nightreign is a tough-as-nails game that blends the beloved roguelike and soulslike genres into something fans of both should find appealing. However, unlike most games in either genre, this one’s inherently designed around working together in a group of three. So, you may be wondering if you can strike out on your own in Elden Ring Nightreign. While the game is about to get easier for folks who choose to go it alone, right now such a style proves an exceptionally difficult challenge.Suggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History

    Share SubtitlesOffEnglishview videoSuggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History

    Share SubtitlesOffEnglish Elden Ring Nightreign solo?Let’s get this out of the way first: Yes, Elden Ring Nightreign offers the option for solo play. To do so, you’ll need to open the expedition menu at Roundtable Hold, then switch over to the matchmaking settings tab. At the bottom of the menu, set the Expedition Type to “Singleplayer.”The real question is whether Elden Ring Nightreign’s single-player experience is manageable or fun, and that really depends on your skill level, class choice, and patience more so than in any other similar game I can remember playing. If you really want to go at it by yourself,Ironeye or Wylder.Elden Ring Nightreign is already pretty damn challenging when running with a group of three other folks. The game’s sense of randomness adds a lot of unknowns to an expedition, and things can go wrong very quickly. But with a team, you can be revived, have someone else available to take some aggro from you when things get hairy, and use your character’s abilities to complement one another in difficult showdowns. It’s often still hard as hell, but victory usually feels possible even when things don’t go quite as planned.However, when you’re alone…Well, you’re all alone. If you die on a solo expedition, that’s it. You’re done. Back to the Roundable Hold with you, loser.With this in mind, some folks may find the anxiety-inducing pacing and chaotic showdowns enjoyable even while solo; but those who struggle to succeed without a group may find it demoralizing to watch hours go by without making any meaningful progress. And since some classes are much better for solo play than others, it can be even more frustrating to go it alone for someone who prefers to play one of the support-focused classes.If you really want to go at it by yourself, I’d recommend taking a look at Ironeye or Wylder.Ironeye’s ranged playstyle is the safest in the game, giving you a lot of freedom to tackle enemies your own way. For instance, you can take the high ground against some foes to avoid their attacks altogether, or use his sliding ability to dodge an attack and get behind an enemy for better positioning.Wylder, meanwhile, is a jack-of-all-trades character with a solid health pool and balanced stats that make him great at adapting to whatever type of loot a run provides. Simply grab any melee weapon and you’ll probably be doing alright with this fella. Plus, he has some of the coolest skins in the game. That doesn’t help you in battle, but like…come on. He looks rad.In conclusion, while things can certainly go poorly even with a team, I’d argue playing by your lonesome leaves too little room for error for a game that requires such a hefty time investment and minimal payoff for failure. Elden Ring Nightreign is designed from the ground up to be played with others, after all. Your mileage may vary, though, so play however you have fun with it! You can pick up Nightreign now on PS5, Xbox Series X/S, and Windows PCs. You’ll have to look elsewhere to pick up two other friends to play with, though.
    #nightreign #solo #play #really #impossible
    Is Nightreign Solo Play Really Impossible?
    Elden Ring Nightreign is a tough-as-nails game that blends the beloved roguelike and soulslike genres into something fans of both should find appealing. However, unlike most games in either genre, this one’s inherently designed around working together in a group of three. So, you may be wondering if you can strike out on your own in Elden Ring Nightreign. While the game is about to get easier for folks who choose to go it alone, right now such a style proves an exceptionally difficult challenge.Suggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglishview videoSuggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglish Elden Ring Nightreign solo?Let’s get this out of the way first: Yes, Elden Ring Nightreign offers the option for solo play. To do so, you’ll need to open the expedition menu at Roundtable Hold, then switch over to the matchmaking settings tab. At the bottom of the menu, set the Expedition Type to “Singleplayer.”The real question is whether Elden Ring Nightreign’s single-player experience is manageable or fun, and that really depends on your skill level, class choice, and patience more so than in any other similar game I can remember playing. If you really want to go at it by yourself,Ironeye or Wylder.Elden Ring Nightreign is already pretty damn challenging when running with a group of three other folks. The game’s sense of randomness adds a lot of unknowns to an expedition, and things can go wrong very quickly. But with a team, you can be revived, have someone else available to take some aggro from you when things get hairy, and use your character’s abilities to complement one another in difficult showdowns. It’s often still hard as hell, but victory usually feels possible even when things don’t go quite as planned.However, when you’re alone…Well, you’re all alone. If you die on a solo expedition, that’s it. You’re done. Back to the Roundable Hold with you, loser.With this in mind, some folks may find the anxiety-inducing pacing and chaotic showdowns enjoyable even while solo; but those who struggle to succeed without a group may find it demoralizing to watch hours go by without making any meaningful progress. And since some classes are much better for solo play than others, it can be even more frustrating to go it alone for someone who prefers to play one of the support-focused classes.If you really want to go at it by yourself, I’d recommend taking a look at Ironeye or Wylder.Ironeye’s ranged playstyle is the safest in the game, giving you a lot of freedom to tackle enemies your own way. For instance, you can take the high ground against some foes to avoid their attacks altogether, or use his sliding ability to dodge an attack and get behind an enemy for better positioning.Wylder, meanwhile, is a jack-of-all-trades character with a solid health pool and balanced stats that make him great at adapting to whatever type of loot a run provides. Simply grab any melee weapon and you’ll probably be doing alright with this fella. Plus, he has some of the coolest skins in the game. That doesn’t help you in battle, but like…come on. He looks rad.In conclusion, while things can certainly go poorly even with a team, I’d argue playing by your lonesome leaves too little room for error for a game that requires such a hefty time investment and minimal payoff for failure. Elden Ring Nightreign is designed from the ground up to be played with others, after all. Your mileage may vary, though, so play however you have fun with it! You can pick up Nightreign now on PS5, Xbox Series X/S, and Windows PCs. You’ll have to look elsewhere to pick up two other friends to play with, though. #nightreign #solo #play #really #impossible
    KOTAKU.COM
    Is Nightreign Solo Play Really Impossible?
    Elden Ring Nightreign is a tough-as-nails game that blends the beloved roguelike and soulslike genres into something fans of both should find appealing. However, unlike most games in either genre, this one’s inherently designed around working together in a group of three. So, you may be wondering if you can strike out on your own in Elden Ring Nightreign. While the game is about to get easier for folks who choose to go it alone, right now such a style proves an exceptionally difficult challenge.Suggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglishview videoSuggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglish Elden Ring Nightreign solo?Let’s get this out of the way first: Yes, Elden Ring Nightreign offers the option for solo play. To do so, you’ll need to open the expedition menu at Roundtable Hold, then switch over to the matchmaking settings tab. At the bottom of the menu, set the Expedition Type to “Singleplayer.”The real question is whether Elden Ring Nightreign’s single-player experience is manageable or fun, and that really depends on your skill level, class choice, and patience more so than in any other similar game I can remember playing. If you really want to go at it by yourself, [play as] Ironeye or Wylder.Elden Ring Nightreign is already pretty damn challenging when running with a group of three other folks. The game’s sense of randomness adds a lot of unknowns to an expedition, and things can go wrong very quickly. But with a team, you can be revived, have someone else available to take some aggro from you when things get hairy, and use your character’s abilities to complement one another in difficult showdowns. It’s often still hard as hell, but victory usually feels possible even when things don’t go quite as planned.However, when you’re alone…Well, you’re all alone. If you die on a solo expedition, that’s it. You’re done. Back to the Roundable Hold with you, loser.With this in mind, some folks may find the anxiety-inducing pacing and chaotic showdowns enjoyable even while solo; but those who struggle to succeed without a group may find it demoralizing to watch hours go by without making any meaningful progress. And since some classes are much better for solo play than others, it can be even more frustrating to go it alone for someone who prefers to play one of the support-focused classes.If you really want to go at it by yourself, I’d recommend taking a look at Ironeye or Wylder.Ironeye’s ranged playstyle is the safest in the game, giving you a lot of freedom to tackle enemies your own way. For instance, you can take the high ground against some foes to avoid their attacks altogether, or use his sliding ability to dodge an attack and get behind an enemy for better positioning.Wylder, meanwhile, is a jack-of-all-trades character with a solid health pool and balanced stats that make him great at adapting to whatever type of loot a run provides. Simply grab any melee weapon and you’ll probably be doing alright with this fella. Plus, he has some of the coolest skins in the game. That doesn’t help you in battle, but like…come on. He looks rad.In conclusion, while things can certainly go poorly even with a team, I’d argue playing by your lonesome leaves too little room for error for a game that requires such a hefty time investment and minimal payoff for failure. Elden Ring Nightreign is designed from the ground up to be played with others, after all. Your mileage may vary, though, so play however you have fun with it! You can pick up Nightreign now on PS5, Xbox Series X/S, and Windows PCs. You’ll have to look elsewhere to pick up two other friends to play with, though.
    0 Reacties 0 aandelen
  • Yes, You Can Change Outfits In Nightreign, But Not At First

    The original Elden Ring allowed you to create a character and deck them out in any number of unique pieces of armor to truly show off your fashion skills. However, multiplayer spin-off Nightreign features fixed character classes and progress that resets after every run, so you won’t be finding any sick armor to wear in this one. Instead, you’ll need to purchase and change full outfits for each character if you want to look different. Suggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History

    Share SubtitlesOffEnglishview videoSuggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History

    Share SubtitlesOffEnglishElden Ring: NightreignTo change your outfit in Elden Ring Nightreign, you’ll first need to complete two full expeditions. This could take quite a few hours, depending on your skill level, luck, and access to a decent group. This is a brutally challenging game, after all, so nothing comes quickly. Give it time.After completing two full expeditions of your choosing, you’ll unlock the “Change Garb” feature in your menu, which you can select to instantly access the outfit-changing menu. However, you can also manually access a mirror on the east side of the Roundtable Hold to access the same menu—though that generally sounds like a bit of unnecessary work.Screenshot: FromSoftware / Billy Givens / KotakuWhen you first unlock the ability to change garbs, you’ll find that you only have access to a total of three outfits per character. You’ll have the default skin automatically, of course, and you can purchase two additional skins, Dawn and Darkness, for each character using Murk.If you don’t like either of the extra outfits available in the beginning, don’t worry about it. You can purchase additional outfits much later in the game, so keep at it. It’s an ultra-hard game, but practice makes perfect. Elden Ring Nightreign is available now on PS5, Xbox Series X/S, and Windows PCs.
    #yes #you #can #change #outfits
    Yes, You Can Change Outfits In Nightreign, But Not At First
    The original Elden Ring allowed you to create a character and deck them out in any number of unique pieces of armor to truly show off your fashion skills. However, multiplayer spin-off Nightreign features fixed character classes and progress that resets after every run, so you won’t be finding any sick armor to wear in this one. Instead, you’ll need to purchase and change full outfits for each character if you want to look different. Suggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglishview videoSuggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglishElden Ring: NightreignTo change your outfit in Elden Ring Nightreign, you’ll first need to complete two full expeditions. This could take quite a few hours, depending on your skill level, luck, and access to a decent group. This is a brutally challenging game, after all, so nothing comes quickly. Give it time.After completing two full expeditions of your choosing, you’ll unlock the “Change Garb” feature in your menu, which you can select to instantly access the outfit-changing menu. However, you can also manually access a mirror on the east side of the Roundtable Hold to access the same menu—though that generally sounds like a bit of unnecessary work.Screenshot: FromSoftware / Billy Givens / KotakuWhen you first unlock the ability to change garbs, you’ll find that you only have access to a total of three outfits per character. You’ll have the default skin automatically, of course, and you can purchase two additional skins, Dawn and Darkness, for each character using Murk.If you don’t like either of the extra outfits available in the beginning, don’t worry about it. You can purchase additional outfits much later in the game, so keep at it. It’s an ultra-hard game, but practice makes perfect. Elden Ring Nightreign is available now on PS5, Xbox Series X/S, and Windows PCs. #yes #you #can #change #outfits
    KOTAKU.COM
    Yes, You Can Change Outfits In Nightreign, But Not At First
    The original Elden Ring allowed you to create a character and deck them out in any number of unique pieces of armor to truly show off your fashion skills. However, multiplayer spin-off Nightreign features fixed character classes and progress that resets after every run, so you won’t be finding any sick armor to wear in this one. Instead, you’ll need to purchase and change full outfits for each character if you want to look different. Suggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglishview videoSuggested ReadingThe Most Sought After Elden Ring Sword Has A Storied History Share SubtitlesOffEnglishElden Ring: NightreignTo change your outfit in Elden Ring Nightreign, you’ll first need to complete two full expeditions. This could take quite a few hours, depending on your skill level, luck, and access to a decent group. This is a brutally challenging game, after all, so nothing comes quickly. Give it time.After completing two full expeditions of your choosing, you’ll unlock the “Change Garb” feature in your menu, which you can select to instantly access the outfit-changing menu. However, you can also manually access a mirror on the east side of the Roundtable Hold to access the same menu—though that generally sounds like a bit of unnecessary work.Screenshot: FromSoftware / Billy Givens / KotakuWhen you first unlock the ability to change garbs, you’ll find that you only have access to a total of three outfits per character. You’ll have the default skin automatically, of course, and you can purchase two additional skins, Dawn and Darkness, for each character using Murk (currency).If you don’t like either of the extra outfits available in the beginning, don’t worry about it. You can purchase additional outfits much later in the game, so keep at it. It’s an ultra-hard game, but practice makes perfect. Elden Ring Nightreign is available now on PS5, Xbox Series X/S, and Windows PCs.
    0 Reacties 0 aandelen
  • Pick up these helpful tips on advanced profiling

    In June, we hosted a webinar featuring experts from Arm, the Unity Accelerate Solutions team, and SYBO Games, the creator of Subway Surfers. The resulting roundtable focused on profiling tips and strategies for mobile games, the business implications of poor performance, and how SYBO shipped a hit mobile game with 3 billion downloads to date.Let’s dive into some of the follow-up questions we didn’t have time to cover during the webinar. You can also watch the full recording.We hear a lot about the Unity Profiler in relation to CPU profiling, but not as much about the Profile Analyzer. Are there any plans to improve it or integrate it into the core Profiler toolset?There are no immediate plans to integrate the Profile Analyzer into the core Editor, but this might change as our profiling tools evolve.Does Unity have any plans to add an option for the GPU Usage Profiler module to appear in percentages like it does in milliseconds?That’s a great idea, and while we can’t say yes or no at the time of this blog post, it’s a request that’s been shared with our R&D teams for possible future consideration.Do you have plans for tackling “Application Not Responding”errors that are reported by the Google Play store and don’t contain any stack trace?Although we don’t have specific plans for tracking ANR without stack trace at the moment, we will consider it for the future roadmap.How can I share my feedback to help influence the future development of Unity’s profiling tools?You can keep track of upcoming features and share feedback via our product board and forums. We are also conducting a survey to learn more about our customers’ experience with the profiling tools. If you’ve used profiling tools beforeor are working on a project that requires optimization, we would love to get your input. The survey is designed to take no more than 5–10 minutes to complete.By participating, you’ll also have the chance to opt into a follow-up interview to share more feedback directly with the development team, including the opportunity to discuss potential prototypes of new features.Is there a good rule for determining what counts as a viable low-end device to target?A rule of thumb we hear from many Unity game developers is to target devices that are five years old at the time of your game’s release, as this helps to ensure the largest user base. But we also see teams reducing their release-date scope to devices that are only three years old if they’re aiming for higher graphical quality. A visually complex 3D application, for example, will have higher device requirements than a simple 2D application. This approach allows for a higher “min spec,” but reduces the size of the initial install base. It’s essentially a business decision: Will it cost more to develop for and support old devices than what your game will earn running on them?Sometimes the technical requirements of your game will dictate your minimum target specifications. So if your game uses up large amounts of texture memory even after optimization, but you absolutely cannot reduce quality or resolution, that probably rules out running on phones with insufficient memory. If your rendering solution requires compute shaders, that likely rules out devices with drivers that can’t support OpenGL ES 3.1, Metal, or Vulkan.It’s a good idea to look at market data for your priority target audience. For instance, mobile device specs can vary a lot between countries and regions. Remember to define some target “budgets” so that benchmarking goals for what’s acceptable are set prior to choosing low-end devices for testing.For live service games that will run for years, you’ll need to monitor their compatibility continuously and adapt over time based on both your actual user base and current devices on the market.Is it enough to test performance exclusively on low-end devices to ensure that the game will also run smoothly on high-end ones?It might be, if you have a uniform workload on all devices. However, you still need to consider variations across hardware from different vendors and/or driver versions.It’s common for graphically rich games to have tiers of graphical fidelity – the higher the visual tier, the more resources required on capable devices. This tier selection might be automatic, but increasingly, users themselves can control the choice via a graphical settings menu. For this style of development, you’ll need to test at least one “min spec” target device per feature/workload tier that your game supports.If your game detects the capabilities of the device it’s running on and adapts the graphics output as needed, it could perform differently on higher end devices. So be sure to test on a range of devices with the different quality levels you’ve programmed the title for.Note: In this section, we’ve specified whether the expert answering is from Arm or Unity.Do you have advice for detecting the power range of a device to support automatic quality settings, particularly for mobile?Arm: We typically see developers doing coarse capability binning based on CPU and GPU models, as well as the GPU shader core count. This is never perfect, but it’s “about right.” A lot of studios collect live analytics from deployed devices, so they can supplement the automated binning with device-specific opt-in/opt-out to work around point issues where the capability binning isn’t accurate enough.As related to the previous question, for graphically rich content, we see a trend in mobile toward settings menus where users can choose to turn effects on or off, thereby allowing them to make performance choices that suit their preferences.Unity: Device memory and screen resolution are also important factors for choosing quality settings. Regarding textures, developers should be aware that Render Textures used by effects or post-processing can become a problem on devices with high resolution screens, but without a lot of memory to match.Given the breadth of configurations available, can you suggest a way to categorize devices to reduce the number of tiers you need to optimize for?Arm: The number of tiers your team optimizes for is really a game design and business decision, and should be based on how important pushing visual quality is to the value proposition of the game. For some genres it might not matter at all, but for others, users will have high expectations for the visual fidelity.Does the texture memory limit differ among models and brands of Android devices that have the same amount of total system memory?Arm: To a first-order approximation, we would expect the total amount of texture memory to be similar across vendors and hardware generations. There will be minor differences caused by memory layout and alignment restrictions, so it won’t be exactly the same.Is it CPU or GPU usage that contributes the most to overheating on mobile devices?Arm: It’s entirely content dependent. The CPU, GPU, or the DRAM can individually overheat a high-end device if pushed hard enough, even if you ignore the other two completely. The exact balance will vary based on the workload you are running.What tips can you give for profiling on devices that have thermal throttling? What margin would you target to avoid thermal throttling?Arm: Optimizing for frame time can be misleading on Android because devices will constantly adjust frequency to optimize energy usage, making frame time an incomplete measure by itself. Preferably, monitor CPU and GPU cycles per frame, as well as GPU memory bandwidth per frame, to get some value that is independent of frequency. The cycle target you need will depend on each device’s chip design, so you’ll need to experiment.Any optimization helps when it comes to managing power consumption, even if it doesn’t directly improve frame rate. For example, reducing CPU cycles will reduce thermal load even if the CPU isn’t the critical path for your game.Beyond that, optimizing memory bandwidth is one of the biggest savings you can make. Accessing DRAM is orders of magnitude more expensive than accessing local data on-chip, so watch your triangle budget and keep data types in memory as small as possible.Unity: To limit the impact of CPU clock frequency on the performance metrics, we recommend trying to run at a consistent temperature. There are a couple of approaches for doing this:Run warm: Run the device for a while so that it reaches a stable warm state before profiling.Run cool: Leave the device to cool for a while before profiling. This strategy can eliminate confusion and inconsistency in profiling sessions by taking captures that are unlikely to be thermally throttled. However, such captures will always represent the best case performance a user will see rather than what they might actually see after long play sessions. This strategy can also delay the time between profiling runs due to the need to wait for the cooling period first.With some hardware, you can fix the clock frequency for more stable performance metrics. However, this is not representative of most devices your users will be using, and will not report accurate real-world performance. Basically, it’s a handy technique if you are using a continuous integration setup to check for performance changes in your codebase over time.Any thoughts on Vulkan vs OpenGL ES 3 on Android? Vulkan is generally slower performance-wise. At the same time, many devices lack support for various features on ES3.Arm: Recent drivers and engine builds have vastly improved the quality of the Vulkan implementations available; so for an equivalent workload, there shouldn’t be a performance gap between OpenGL ES and Vulkan. The switch to Vulkan is picking up speed and we expect to see more people choosing Vulkan by default over the next year or two. If you have counterexamples of areas where Vulkan isn’t performing well, please get in touch with us. We’d love to hear from you.What tools can we use to monitor memory bandwidth?Arm: The Streamline Profiler in Arm Mobile Studio can measure bandwidth between Mali GPUs and the external DRAM.Should you split graphical assets by device tiers or device resolution?Arm: You can get the best result by retuning assets, but it’s expensive to do. Start by reducing resolution and frame rate, or disabling some optional post-processing effects.What is the best way to record performance metric statistics from our development build?Arm: You can use the Performance Advisor tool in Arm Mobile Studio to automatically capture and export performance metrics from the Mali GPUs, although this comes with a caveat: The generation of JSON reports requires a Professional Edition license.Unity: The Unity Profiler can be used to view common rendering metrics, such as vertex and triangle counts in the Rendering module. Plus you can include custom packages, such as System Metrics Mali, in your project to add low-level Mali GPU metrics to the Unity Profiler.What are your recommendations for profiling shader code?You need a GPU Profiler to do this. The one you choose depends on your target platform. For example, on iOS devices, Xcode’s GPU Profiler includes the Shader Profiler, which breaks down shader performance on a line-by-line basis.Arm Mobile Studio supports Mali Offline Compiler, a static analysis tool for shader code and compute kernels. This tool provides some overall performance estimates and recommendations for the Arm Mali GPU family.When profiling, the general rule is to test your game or app on the target device. With the industry moving toward more types of chipsets, how can developers profile and pinpoint issues on the many different hardware configurations in a reasonable amount of time?The proliferation of chipsets is primarily a concern on desktop platforms. There are a limited number of hardware architectures to test for console games. On mobile, there’s Apple’s A Series for iOS devices and a range of Arm and Qualcomm architectures for Android – but selecting a manageable list of representative mobile devices is pretty straightforward.On desktop it’s trickier because there’s a wide range of available chipsets and architectures, and buying Macs and PCs for testing can be expensive. Our best advice is to do what you can. No studio has infinite time and money for testing. We generally wouldn’t expect any huge surprises when comparing performance between an Intel x86 CPU and a similarly specced AMD processor, for instance. As long as the game performs comfortably on your minimum spec machine, you should be reasonably confident about other machines. It’s also worth considering using analytics, such as Unity Analytics, to record frame rates, system specs, and player options’ settings to identify hotspots or problematic configurations.We’re seeing more studios move to using at least some level of automated testing for regular on-device profiling, with summary stats published where the whole team can keep an eye on performance across the range of target devices. With well-designed test scenes, this can usually be made into a mechanical process that’s suited for automation, so you don’t need an experienced technical artist or QA tester running builds through the process manually.Do you ever see performance issues on high-end devices that don’t occur on the low-end ones?It’s uncommon, but we have seen it. Often the issue lies in how the project is configured, such as with the use of fancy shaders and high-res textures on high-end devices, which can put extra pressure on the GPU or memory. Sometimes a high-end mobile device or console will use a high-res phone screen or 4K TV output as a selling point but not necessarily have enough GPU power or memory to live up to that promise without further optimization.If you make use of the current versions of the C# Job System, verify whether there’s a job scheduling overhead that scales with the number of worker threads, which in turn, scales with the number of CPU cores. This can result in code that runs more slowly on a 64+ core Threadripper™ than on a modest 4-core or 8-core CPU. This issue will be addressed in future versions of Unity, but in the meantime, try limiting the number of job worker threads by setting JobsUtility.JobWorkerCount.What are some pointers for setting a good frame budget?Most of the time when we talk about frame budgets, we’re talking about the overall time budget for the frame. You calculate 1000/target frames per secondto get your frame budget: 33.33 ms for 30 fps, 16.66 ms for 60 fps, 8.33 ms for 120 Hz, etc. Reduce that number by around 35% if you’re on mobile to give the chips a chance to cool down between each frame. Dividing the budget up to get specific sub-budgets for different features and/or systems is probably overkill except for projects with very specific, predictable systems, or those that make heavy use of Time Slicing.Generally, profiling is the process of finding the biggest bottlenecks – and therefore, the biggest potential performance gains. So rather than saying, “Physics is taking 1.2 ms when the budget only allows for 1 ms,” you might look at a frame and say, “Rendering is taking 6 ms, making it the biggest main thread CPU cost in the frame. How can we reduce that?”It seems like profiling early and often is still not common knowledge. What are your thoughts on why this might be the case?Building, releasing, promoting, and managing a game is difficult work on multiple fronts. So there will always be numerous priorities vying for a developer’s attention, and profiling can fall by the wayside. They know it’s something they should do, but perhaps they’re unfamiliar with the tools and don’t feel like they have time to learn. Or, they don’t know how to fit profiling into their workflows because they’re pushed toward completing features rather than performance optimization.Just as with bugs and technical debt, performance issues are cheaper and less risky to address early on, rather than later in a project’s development cycle. Our focus is on helping to demystify profiling tools and techniques for those developers who are unfamiliar with them. That’s what the profiling e-book and its related blog post and webinar aim to support.Is there a way to exclude certain methods from instrumentation or include only specific methods when using Deep Profiling in the Unity Profiler? When using a lot of async/await tasks, we create large stack traces, but how can we avoid slowing down both the client and the Profiler when Deep Profiling?You can enable Allocation call stacks to see the full call stacks that lead to managed allocations. Additionally, you can – and should! – manually instrument long-running methods and processes by sprinkling ProfilerMarkers throughout your code. There’s currently no way to automatically enable Deep Profiling or disable profiling entirely in specific parts of your application. But manually adding ProfilerMarkers and enabling Allocation call stacks when required can help you dig down into problem areas without having to resort to Deep Profiling.As of Unity 2022.2, you can also use our IgnoredByDeepProfilerAttribute to prevent the Unity Profiler from capturing method calls. Just add the IgnoredByDeepProfiler attribute to classes, structures, and methods.Where can I find more information on Deep Profiling in Unity?Deep Profiling is covered in our Profiler documentation. Then there’s the most in-depth, single resource for profiling information, the Ultimate Guide to profiling Unity games e-book, which links to relevant documentation and other resources throughout.Is it correct that Deep Profiling is only useful for the Allocations Profiler and that it skews results so much that it’s not useful for finding hitches in the game?Deep Profiling can be used to find the specific causes of managed allocations, although Allocation call stacks can do the same thing with less overhead, overall. At the same time, Deep Profiling can be helpful for quickly investigating why one specific ProfilerMarker seems to be taking so long, as it’s more convenient to enable than to add numerous ProfilerMarkers to your scripts and rebuild your game. But yes, it does skew performance quite heavily and so shouldn’t be enabled for general profiling.Is VSync worth setting to every VBlank? My mobile game runs at a very low fps when it’s disabled.Mobile devices force VSync to be enabled at a driver/hardware level, so disabling it in Unity’s Quality settings shouldn’t make any difference on those platforms. We haven’t heard of a case where disabling VSync negatively affects performance. Try taking a profile capture with VSync enabled, along with another capture of the same scene but with VSync disabled. Then compare the captures using Profile Analyzer to try to understand why the performance is so different.How can you determine if the main thread is waiting for the GPU and not the other way around?This is covered in the Ultimate Guide to profiling Unity games. You can also get more information in the blog post, Detecting performance bottlenecks with Unity Frame Timing Manager.Generally speaking, the telltale sign is that the main thread waits for the Render thread while the Render thread waits for the GPU. The specific marker names will differ depending on your target platform and graphics API, but you should look out for markers with names such as “PresentFrame” or “WaitForPresent.”Is there a solid process for finding memory leaks in profiling?Use the Memory Profiler to compare memory snapshots and check for leaks. For example, you can take a snapshot in your main menu, enter your game and then quit, go back to the main menu, and take a second snapshot. Comparing these two will tell you whether any objects/allocations from the game are still hanging around in memory.Does it make sense to optimize and rewrite part of the code for the DOTS system, for mobile devices including VR/AR? Do you use this system in your projects?A number of game projects now make use of parts of the Data-Oriented Technology Stack. Native Containers, the C# Job System, Mathematics, and the Burst compilerare all fully supported packages that you can use right away to write optimal, parallelized, high-performance C#code to improve your project’s CPU performance.A smaller number of projects are also using Entities and associated packages, such as the Hybrid Renderer, Unity Physics, and NetCode. However, at this time, the packages listed are experimental, and using them involves accepting a degree of technical risk. This risk derives from an API that is still evolving, missing or incomplete features, as well as the engineering learning curve required to understand Data-Oriented Designto get the most out of Unity’s Entity Component System. Unity engineer Steve McGreal wrote a guide on DOTS best practices, which includes some DOD fundamentals and tips for improving ECS performance.How do you go about setting limits on SetPass calls or shader complexity? Can you even set limits beforehand?Rendering is a complex process and there is no practical way to set a hard limit on the maximum number of SetPass calls or a metric for shader complexity. Even on a fixed hardware platform, such as a single console, the limits will depend on what kind of scene you want to render, and what other work is happening on the CPU and GPU during a frame.That’s why the rule on when to profile is “early and often.” Teams tend to create a “vertical slice” demo early on during production – usually a short burst of gameplay developed to the level of visual fidelity intended for the final game. This is your first opportunity to profile rendering and figure out what optimizations and limits might be needed. The profiling process should be repeated every time a new area or other major piece of visual content is added.Here are additional resources for learning about performance optimization:BlogsOptimize your mobile game performance: Expert tips on graphics and assetsOptimize your mobile game performance: Expert tips on physics, UI, and audio settingsOptimize your mobile game performance: Expert tips on profiling, memory, and code architecture from Unity’s top engineersExpert tips on optimizing your game graphics for consolesProfiling in Unity 2021 LTS: What, when, and howHow-to pagesProfiling and debugging toolsHow to profile memory in UnityBest practices for profiling game performanceE-booksOptimize your console and PC game performanceOptimize your mobile game performanceUltimate guide to profiling Unity gamesLearn tutorialsProfiling CPU performance in Android builds with Android StudioProfiling applications – Made with UnityEven more advanced technical content is coming soon – but in the meantime, please feel free to suggest topics for us to cover on the forum and check out the full roundtable webinar recording.
    #pick #these #helpful #tips #advanced
    Pick up these helpful tips on advanced profiling
    In June, we hosted a webinar featuring experts from Arm, the Unity Accelerate Solutions team, and SYBO Games, the creator of Subway Surfers. The resulting roundtable focused on profiling tips and strategies for mobile games, the business implications of poor performance, and how SYBO shipped a hit mobile game with 3 billion downloads to date.Let’s dive into some of the follow-up questions we didn’t have time to cover during the webinar. You can also watch the full recording.We hear a lot about the Unity Profiler in relation to CPU profiling, but not as much about the Profile Analyzer. Are there any plans to improve it or integrate it into the core Profiler toolset?There are no immediate plans to integrate the Profile Analyzer into the core Editor, but this might change as our profiling tools evolve.Does Unity have any plans to add an option for the GPU Usage Profiler module to appear in percentages like it does in milliseconds?That’s a great idea, and while we can’t say yes or no at the time of this blog post, it’s a request that’s been shared with our R&D teams for possible future consideration.Do you have plans for tackling “Application Not Responding”errors that are reported by the Google Play store and don’t contain any stack trace?Although we don’t have specific plans for tracking ANR without stack trace at the moment, we will consider it for the future roadmap.How can I share my feedback to help influence the future development of Unity’s profiling tools?You can keep track of upcoming features and share feedback via our product board and forums. We are also conducting a survey to learn more about our customers’ experience with the profiling tools. If you’ve used profiling tools beforeor are working on a project that requires optimization, we would love to get your input. The survey is designed to take no more than 5–10 minutes to complete.By participating, you’ll also have the chance to opt into a follow-up interview to share more feedback directly with the development team, including the opportunity to discuss potential prototypes of new features.Is there a good rule for determining what counts as a viable low-end device to target?A rule of thumb we hear from many Unity game developers is to target devices that are five years old at the time of your game’s release, as this helps to ensure the largest user base. But we also see teams reducing their release-date scope to devices that are only three years old if they’re aiming for higher graphical quality. A visually complex 3D application, for example, will have higher device requirements than a simple 2D application. This approach allows for a higher “min spec,” but reduces the size of the initial install base. It’s essentially a business decision: Will it cost more to develop for and support old devices than what your game will earn running on them?Sometimes the technical requirements of your game will dictate your minimum target specifications. So if your game uses up large amounts of texture memory even after optimization, but you absolutely cannot reduce quality or resolution, that probably rules out running on phones with insufficient memory. If your rendering solution requires compute shaders, that likely rules out devices with drivers that can’t support OpenGL ES 3.1, Metal, or Vulkan.It’s a good idea to look at market data for your priority target audience. For instance, mobile device specs can vary a lot between countries and regions. Remember to define some target “budgets” so that benchmarking goals for what’s acceptable are set prior to choosing low-end devices for testing.For live service games that will run for years, you’ll need to monitor their compatibility continuously and adapt over time based on both your actual user base and current devices on the market.Is it enough to test performance exclusively on low-end devices to ensure that the game will also run smoothly on high-end ones?It might be, if you have a uniform workload on all devices. However, you still need to consider variations across hardware from different vendors and/or driver versions.It’s common for graphically rich games to have tiers of graphical fidelity – the higher the visual tier, the more resources required on capable devices. This tier selection might be automatic, but increasingly, users themselves can control the choice via a graphical settings menu. For this style of development, you’ll need to test at least one “min spec” target device per feature/workload tier that your game supports.If your game detects the capabilities of the device it’s running on and adapts the graphics output as needed, it could perform differently on higher end devices. So be sure to test on a range of devices with the different quality levels you’ve programmed the title for.Note: In this section, we’ve specified whether the expert answering is from Arm or Unity.Do you have advice for detecting the power range of a device to support automatic quality settings, particularly for mobile?Arm: We typically see developers doing coarse capability binning based on CPU and GPU models, as well as the GPU shader core count. This is never perfect, but it’s “about right.” A lot of studios collect live analytics from deployed devices, so they can supplement the automated binning with device-specific opt-in/opt-out to work around point issues where the capability binning isn’t accurate enough.As related to the previous question, for graphically rich content, we see a trend in mobile toward settings menus where users can choose to turn effects on or off, thereby allowing them to make performance choices that suit their preferences.Unity: Device memory and screen resolution are also important factors for choosing quality settings. Regarding textures, developers should be aware that Render Textures used by effects or post-processing can become a problem on devices with high resolution screens, but without a lot of memory to match.Given the breadth of configurations available, can you suggest a way to categorize devices to reduce the number of tiers you need to optimize for?Arm: The number of tiers your team optimizes for is really a game design and business decision, and should be based on how important pushing visual quality is to the value proposition of the game. For some genres it might not matter at all, but for others, users will have high expectations for the visual fidelity.Does the texture memory limit differ among models and brands of Android devices that have the same amount of total system memory?Arm: To a first-order approximation, we would expect the total amount of texture memory to be similar across vendors and hardware generations. There will be minor differences caused by memory layout and alignment restrictions, so it won’t be exactly the same.Is it CPU or GPU usage that contributes the most to overheating on mobile devices?Arm: It’s entirely content dependent. The CPU, GPU, or the DRAM can individually overheat a high-end device if pushed hard enough, even if you ignore the other two completely. The exact balance will vary based on the workload you are running.What tips can you give for profiling on devices that have thermal throttling? What margin would you target to avoid thermal throttling?Arm: Optimizing for frame time can be misleading on Android because devices will constantly adjust frequency to optimize energy usage, making frame time an incomplete measure by itself. Preferably, monitor CPU and GPU cycles per frame, as well as GPU memory bandwidth per frame, to get some value that is independent of frequency. The cycle target you need will depend on each device’s chip design, so you’ll need to experiment.Any optimization helps when it comes to managing power consumption, even if it doesn’t directly improve frame rate. For example, reducing CPU cycles will reduce thermal load even if the CPU isn’t the critical path for your game.Beyond that, optimizing memory bandwidth is one of the biggest savings you can make. Accessing DRAM is orders of magnitude more expensive than accessing local data on-chip, so watch your triangle budget and keep data types in memory as small as possible.Unity: To limit the impact of CPU clock frequency on the performance metrics, we recommend trying to run at a consistent temperature. There are a couple of approaches for doing this:Run warm: Run the device for a while so that it reaches a stable warm state before profiling.Run cool: Leave the device to cool for a while before profiling. This strategy can eliminate confusion and inconsistency in profiling sessions by taking captures that are unlikely to be thermally throttled. However, such captures will always represent the best case performance a user will see rather than what they might actually see after long play sessions. This strategy can also delay the time between profiling runs due to the need to wait for the cooling period first.With some hardware, you can fix the clock frequency for more stable performance metrics. However, this is not representative of most devices your users will be using, and will not report accurate real-world performance. Basically, it’s a handy technique if you are using a continuous integration setup to check for performance changes in your codebase over time.Any thoughts on Vulkan vs OpenGL ES 3 on Android? Vulkan is generally slower performance-wise. At the same time, many devices lack support for various features on ES3.Arm: Recent drivers and engine builds have vastly improved the quality of the Vulkan implementations available; so for an equivalent workload, there shouldn’t be a performance gap between OpenGL ES and Vulkan. The switch to Vulkan is picking up speed and we expect to see more people choosing Vulkan by default over the next year or two. If you have counterexamples of areas where Vulkan isn’t performing well, please get in touch with us. We’d love to hear from you.What tools can we use to monitor memory bandwidth?Arm: The Streamline Profiler in Arm Mobile Studio can measure bandwidth between Mali GPUs and the external DRAM.Should you split graphical assets by device tiers or device resolution?Arm: You can get the best result by retuning assets, but it’s expensive to do. Start by reducing resolution and frame rate, or disabling some optional post-processing effects.What is the best way to record performance metric statistics from our development build?Arm: You can use the Performance Advisor tool in Arm Mobile Studio to automatically capture and export performance metrics from the Mali GPUs, although this comes with a caveat: The generation of JSON reports requires a Professional Edition license.Unity: The Unity Profiler can be used to view common rendering metrics, such as vertex and triangle counts in the Rendering module. Plus you can include custom packages, such as System Metrics Mali, in your project to add low-level Mali GPU metrics to the Unity Profiler.What are your recommendations for profiling shader code?You need a GPU Profiler to do this. The one you choose depends on your target platform. For example, on iOS devices, Xcode’s GPU Profiler includes the Shader Profiler, which breaks down shader performance on a line-by-line basis.Arm Mobile Studio supports Mali Offline Compiler, a static analysis tool for shader code and compute kernels. This tool provides some overall performance estimates and recommendations for the Arm Mali GPU family.When profiling, the general rule is to test your game or app on the target device. With the industry moving toward more types of chipsets, how can developers profile and pinpoint issues on the many different hardware configurations in a reasonable amount of time?The proliferation of chipsets is primarily a concern on desktop platforms. There are a limited number of hardware architectures to test for console games. On mobile, there’s Apple’s A Series for iOS devices and a range of Arm and Qualcomm architectures for Android – but selecting a manageable list of representative mobile devices is pretty straightforward.On desktop it’s trickier because there’s a wide range of available chipsets and architectures, and buying Macs and PCs for testing can be expensive. Our best advice is to do what you can. No studio has infinite time and money for testing. We generally wouldn’t expect any huge surprises when comparing performance between an Intel x86 CPU and a similarly specced AMD processor, for instance. As long as the game performs comfortably on your minimum spec machine, you should be reasonably confident about other machines. It’s also worth considering using analytics, such as Unity Analytics, to record frame rates, system specs, and player options’ settings to identify hotspots or problematic configurations.We’re seeing more studios move to using at least some level of automated testing for regular on-device profiling, with summary stats published where the whole team can keep an eye on performance across the range of target devices. With well-designed test scenes, this can usually be made into a mechanical process that’s suited for automation, so you don’t need an experienced technical artist or QA tester running builds through the process manually.Do you ever see performance issues on high-end devices that don’t occur on the low-end ones?It’s uncommon, but we have seen it. Often the issue lies in how the project is configured, such as with the use of fancy shaders and high-res textures on high-end devices, which can put extra pressure on the GPU or memory. Sometimes a high-end mobile device or console will use a high-res phone screen or 4K TV output as a selling point but not necessarily have enough GPU power or memory to live up to that promise without further optimization.If you make use of the current versions of the C# Job System, verify whether there’s a job scheduling overhead that scales with the number of worker threads, which in turn, scales with the number of CPU cores. This can result in code that runs more slowly on a 64+ core Threadripper™ than on a modest 4-core or 8-core CPU. This issue will be addressed in future versions of Unity, but in the meantime, try limiting the number of job worker threads by setting JobsUtility.JobWorkerCount.What are some pointers for setting a good frame budget?Most of the time when we talk about frame budgets, we’re talking about the overall time budget for the frame. You calculate 1000/target frames per secondto get your frame budget: 33.33 ms for 30 fps, 16.66 ms for 60 fps, 8.33 ms for 120 Hz, etc. Reduce that number by around 35% if you’re on mobile to give the chips a chance to cool down between each frame. Dividing the budget up to get specific sub-budgets for different features and/or systems is probably overkill except for projects with very specific, predictable systems, or those that make heavy use of Time Slicing.Generally, profiling is the process of finding the biggest bottlenecks – and therefore, the biggest potential performance gains. So rather than saying, “Physics is taking 1.2 ms when the budget only allows for 1 ms,” you might look at a frame and say, “Rendering is taking 6 ms, making it the biggest main thread CPU cost in the frame. How can we reduce that?”It seems like profiling early and often is still not common knowledge. What are your thoughts on why this might be the case?Building, releasing, promoting, and managing a game is difficult work on multiple fronts. So there will always be numerous priorities vying for a developer’s attention, and profiling can fall by the wayside. They know it’s something they should do, but perhaps they’re unfamiliar with the tools and don’t feel like they have time to learn. Or, they don’t know how to fit profiling into their workflows because they’re pushed toward completing features rather than performance optimization.Just as with bugs and technical debt, performance issues are cheaper and less risky to address early on, rather than later in a project’s development cycle. Our focus is on helping to demystify profiling tools and techniques for those developers who are unfamiliar with them. That’s what the profiling e-book and its related blog post and webinar aim to support.Is there a way to exclude certain methods from instrumentation or include only specific methods when using Deep Profiling in the Unity Profiler? When using a lot of async/await tasks, we create large stack traces, but how can we avoid slowing down both the client and the Profiler when Deep Profiling?You can enable Allocation call stacks to see the full call stacks that lead to managed allocations. Additionally, you can – and should! – manually instrument long-running methods and processes by sprinkling ProfilerMarkers throughout your code. There’s currently no way to automatically enable Deep Profiling or disable profiling entirely in specific parts of your application. But manually adding ProfilerMarkers and enabling Allocation call stacks when required can help you dig down into problem areas without having to resort to Deep Profiling.As of Unity 2022.2, you can also use our IgnoredByDeepProfilerAttribute to prevent the Unity Profiler from capturing method calls. Just add the IgnoredByDeepProfiler attribute to classes, structures, and methods.Where can I find more information on Deep Profiling in Unity?Deep Profiling is covered in our Profiler documentation. Then there’s the most in-depth, single resource for profiling information, the Ultimate Guide to profiling Unity games e-book, which links to relevant documentation and other resources throughout.Is it correct that Deep Profiling is only useful for the Allocations Profiler and that it skews results so much that it’s not useful for finding hitches in the game?Deep Profiling can be used to find the specific causes of managed allocations, although Allocation call stacks can do the same thing with less overhead, overall. At the same time, Deep Profiling can be helpful for quickly investigating why one specific ProfilerMarker seems to be taking so long, as it’s more convenient to enable than to add numerous ProfilerMarkers to your scripts and rebuild your game. But yes, it does skew performance quite heavily and so shouldn’t be enabled for general profiling.Is VSync worth setting to every VBlank? My mobile game runs at a very low fps when it’s disabled.Mobile devices force VSync to be enabled at a driver/hardware level, so disabling it in Unity’s Quality settings shouldn’t make any difference on those platforms. We haven’t heard of a case where disabling VSync negatively affects performance. Try taking a profile capture with VSync enabled, along with another capture of the same scene but with VSync disabled. Then compare the captures using Profile Analyzer to try to understand why the performance is so different.How can you determine if the main thread is waiting for the GPU and not the other way around?This is covered in the Ultimate Guide to profiling Unity games. You can also get more information in the blog post, Detecting performance bottlenecks with Unity Frame Timing Manager.Generally speaking, the telltale sign is that the main thread waits for the Render thread while the Render thread waits for the GPU. The specific marker names will differ depending on your target platform and graphics API, but you should look out for markers with names such as “PresentFrame” or “WaitForPresent.”Is there a solid process for finding memory leaks in profiling?Use the Memory Profiler to compare memory snapshots and check for leaks. For example, you can take a snapshot in your main menu, enter your game and then quit, go back to the main menu, and take a second snapshot. Comparing these two will tell you whether any objects/allocations from the game are still hanging around in memory.Does it make sense to optimize and rewrite part of the code for the DOTS system, for mobile devices including VR/AR? Do you use this system in your projects?A number of game projects now make use of parts of the Data-Oriented Technology Stack. Native Containers, the C# Job System, Mathematics, and the Burst compilerare all fully supported packages that you can use right away to write optimal, parallelized, high-performance C#code to improve your project’s CPU performance.A smaller number of projects are also using Entities and associated packages, such as the Hybrid Renderer, Unity Physics, and NetCode. However, at this time, the packages listed are experimental, and using them involves accepting a degree of technical risk. This risk derives from an API that is still evolving, missing or incomplete features, as well as the engineering learning curve required to understand Data-Oriented Designto get the most out of Unity’s Entity Component System. Unity engineer Steve McGreal wrote a guide on DOTS best practices, which includes some DOD fundamentals and tips for improving ECS performance.How do you go about setting limits on SetPass calls or shader complexity? Can you even set limits beforehand?Rendering is a complex process and there is no practical way to set a hard limit on the maximum number of SetPass calls or a metric for shader complexity. Even on a fixed hardware platform, such as a single console, the limits will depend on what kind of scene you want to render, and what other work is happening on the CPU and GPU during a frame.That’s why the rule on when to profile is “early and often.” Teams tend to create a “vertical slice” demo early on during production – usually a short burst of gameplay developed to the level of visual fidelity intended for the final game. This is your first opportunity to profile rendering and figure out what optimizations and limits might be needed. The profiling process should be repeated every time a new area or other major piece of visual content is added.Here are additional resources for learning about performance optimization:BlogsOptimize your mobile game performance: Expert tips on graphics and assetsOptimize your mobile game performance: Expert tips on physics, UI, and audio settingsOptimize your mobile game performance: Expert tips on profiling, memory, and code architecture from Unity’s top engineersExpert tips on optimizing your game graphics for consolesProfiling in Unity 2021 LTS: What, when, and howHow-to pagesProfiling and debugging toolsHow to profile memory in UnityBest practices for profiling game performanceE-booksOptimize your console and PC game performanceOptimize your mobile game performanceUltimate guide to profiling Unity gamesLearn tutorialsProfiling CPU performance in Android builds with Android StudioProfiling applications – Made with UnityEven more advanced technical content is coming soon – but in the meantime, please feel free to suggest topics for us to cover on the forum and check out the full roundtable webinar recording. #pick #these #helpful #tips #advanced
    UNITY.COM
    Pick up these helpful tips on advanced profiling
    In June, we hosted a webinar featuring experts from Arm, the Unity Accelerate Solutions team, and SYBO Games, the creator of Subway Surfers. The resulting roundtable focused on profiling tips and strategies for mobile games, the business implications of poor performance, and how SYBO shipped a hit mobile game with 3 billion downloads to date.Let’s dive into some of the follow-up questions we didn’t have time to cover during the webinar. You can also watch the full recording.We hear a lot about the Unity Profiler in relation to CPU profiling, but not as much about the Profile Analyzer (available as a Unity package). Are there any plans to improve it or integrate it into the core Profiler toolset?There are no immediate plans to integrate the Profile Analyzer into the core Editor, but this might change as our profiling tools evolve.Does Unity have any plans to add an option for the GPU Usage Profiler module to appear in percentages like it does in milliseconds?That’s a great idea, and while we can’t say yes or no at the time of this blog post, it’s a request that’s been shared with our R&D teams for possible future consideration.Do you have plans for tackling “Application Not Responding” (ANR) errors that are reported by the Google Play store and don’t contain any stack trace?Although we don’t have specific plans for tracking ANR without stack trace at the moment, we will consider it for the future roadmap.How can I share my feedback to help influence the future development of Unity’s profiling tools?You can keep track of upcoming features and share feedback via our product board and forums. We are also conducting a survey to learn more about our customers’ experience with the profiling tools. If you’ve used profiling tools before (either daily or just once) or are working on a project that requires optimization, we would love to get your input. The survey is designed to take no more than 5–10 minutes to complete.By participating, you’ll also have the chance to opt into a follow-up interview to share more feedback directly with the development team, including the opportunity to discuss potential prototypes of new features.Is there a good rule for determining what counts as a viable low-end device to target?A rule of thumb we hear from many Unity game developers is to target devices that are five years old at the time of your game’s release, as this helps to ensure the largest user base. But we also see teams reducing their release-date scope to devices that are only three years old if they’re aiming for higher graphical quality. A visually complex 3D application, for example, will have higher device requirements than a simple 2D application. This approach allows for a higher “min spec,” but reduces the size of the initial install base. It’s essentially a business decision: Will it cost more to develop for and support old devices than what your game will earn running on them?Sometimes the technical requirements of your game will dictate your minimum target specifications. So if your game uses up large amounts of texture memory even after optimization, but you absolutely cannot reduce quality or resolution, that probably rules out running on phones with insufficient memory. If your rendering solution requires compute shaders, that likely rules out devices with drivers that can’t support OpenGL ES 3.1, Metal, or Vulkan.It’s a good idea to look at market data for your priority target audience. For instance, mobile device specs can vary a lot between countries and regions. Remember to define some target “budgets” so that benchmarking goals for what’s acceptable are set prior to choosing low-end devices for testing.For live service games that will run for years, you’ll need to monitor their compatibility continuously and adapt over time based on both your actual user base and current devices on the market.Is it enough to test performance exclusively on low-end devices to ensure that the game will also run smoothly on high-end ones?It might be, if you have a uniform workload on all devices. However, you still need to consider variations across hardware from different vendors and/or driver versions.It’s common for graphically rich games to have tiers of graphical fidelity – the higher the visual tier, the more resources required on capable devices. This tier selection might be automatic, but increasingly, users themselves can control the choice via a graphical settings menu. For this style of development, you’ll need to test at least one “min spec” target device per feature/workload tier that your game supports.If your game detects the capabilities of the device it’s running on and adapts the graphics output as needed, it could perform differently on higher end devices. So be sure to test on a range of devices with the different quality levels you’ve programmed the title for.Note: In this section, we’ve specified whether the expert answering is from Arm or Unity.Do you have advice for detecting the power range of a device to support automatic quality settings, particularly for mobile?Arm: We typically see developers doing coarse capability binning based on CPU and GPU models, as well as the GPU shader core count. This is never perfect, but it’s “about right.” A lot of studios collect live analytics from deployed devices, so they can supplement the automated binning with device-specific opt-in/opt-out to work around point issues where the capability binning isn’t accurate enough.As related to the previous question, for graphically rich content, we see a trend in mobile toward settings menus where users can choose to turn effects on or off, thereby allowing them to make performance choices that suit their preferences.Unity: Device memory and screen resolution are also important factors for choosing quality settings. Regarding textures, developers should be aware that Render Textures used by effects or post-processing can become a problem on devices with high resolution screens, but without a lot of memory to match.Given the breadth of configurations available (CPU, GPU, SOC, memory, mobile, desktop, console, etc.), can you suggest a way to categorize devices to reduce the number of tiers you need to optimize for?Arm: The number of tiers your team optimizes for is really a game design and business decision, and should be based on how important pushing visual quality is to the value proposition of the game. For some genres it might not matter at all, but for others, users will have high expectations for the visual fidelity.Does the texture memory limit differ among models and brands of Android devices that have the same amount of total system memory?Arm: To a first-order approximation, we would expect the total amount of texture memory to be similar across vendors and hardware generations. There will be minor differences caused by memory layout and alignment restrictions, so it won’t be exactly the same.Is it CPU or GPU usage that contributes the most to overheating on mobile devices?Arm: It’s entirely content dependent. The CPU, GPU, or the DRAM can individually overheat a high-end device if pushed hard enough, even if you ignore the other two completely. The exact balance will vary based on the workload you are running.What tips can you give for profiling on devices that have thermal throttling? What margin would you target to avoid thermal throttling (i.e., targeting 20 ms instead of 33 ms)?Arm: Optimizing for frame time can be misleading on Android because devices will constantly adjust frequency to optimize energy usage, making frame time an incomplete measure by itself. Preferably, monitor CPU and GPU cycles per frame, as well as GPU memory bandwidth per frame, to get some value that is independent of frequency. The cycle target you need will depend on each device’s chip design, so you’ll need to experiment.Any optimization helps when it comes to managing power consumption, even if it doesn’t directly improve frame rate. For example, reducing CPU cycles will reduce thermal load even if the CPU isn’t the critical path for your game.Beyond that, optimizing memory bandwidth is one of the biggest savings you can make. Accessing DRAM is orders of magnitude more expensive than accessing local data on-chip, so watch your triangle budget and keep data types in memory as small as possible.Unity: To limit the impact of CPU clock frequency on the performance metrics, we recommend trying to run at a consistent temperature. There are a couple of approaches for doing this:Run warm: Run the device for a while so that it reaches a stable warm state before profiling.Run cool: Leave the device to cool for a while before profiling. This strategy can eliminate confusion and inconsistency in profiling sessions by taking captures that are unlikely to be thermally throttled. However, such captures will always represent the best case performance a user will see rather than what they might actually see after long play sessions. This strategy can also delay the time between profiling runs due to the need to wait for the cooling period first.With some hardware, you can fix the clock frequency for more stable performance metrics. However, this is not representative of most devices your users will be using, and will not report accurate real-world performance. Basically, it’s a handy technique if you are using a continuous integration setup to check for performance changes in your codebase over time.Any thoughts on Vulkan vs OpenGL ES 3 on Android? Vulkan is generally slower performance-wise. At the same time, many devices lack support for various features on ES3.Arm: Recent drivers and engine builds have vastly improved the quality of the Vulkan implementations available; so for an equivalent workload, there shouldn’t be a performance gap between OpenGL ES and Vulkan (if there is, please let us know). The switch to Vulkan is picking up speed and we expect to see more people choosing Vulkan by default over the next year or two. If you have counterexamples of areas where Vulkan isn’t performing well, please get in touch with us. We’d love to hear from you.What tools can we use to monitor memory bandwidth (RAM <-> VRAM)?Arm: The Streamline Profiler in Arm Mobile Studio can measure bandwidth between Mali GPUs and the external DRAM (or system cache).Should you split graphical assets by device tiers or device resolution?Arm: You can get the best result by retuning assets, but it’s expensive to do. Start by reducing resolution and frame rate, or disabling some optional post-processing effects.What is the best way to record performance metric statistics from our development build?Arm: You can use the Performance Advisor tool in Arm Mobile Studio to automatically capture and export performance metrics from the Mali GPUs, although this comes with a caveat: The generation of JSON reports requires a Professional Edition license.Unity: The Unity Profiler can be used to view common rendering metrics, such as vertex and triangle counts in the Rendering module. Plus you can include custom packages, such as System Metrics Mali, in your project to add low-level Mali GPU metrics to the Unity Profiler.What are your recommendations for profiling shader code?You need a GPU Profiler to do this. The one you choose depends on your target platform. For example, on iOS devices, Xcode’s GPU Profiler includes the Shader Profiler, which breaks down shader performance on a line-by-line basis.Arm Mobile Studio supports Mali Offline Compiler, a static analysis tool for shader code and compute kernels. This tool provides some overall performance estimates and recommendations for the Arm Mali GPU family.When profiling, the general rule is to test your game or app on the target device(s). With the industry moving toward more types of chipsets (Apple M1, Arm, x86 by Intel, AMD, etc.), how can developers profile and pinpoint issues on the many different hardware configurations in a reasonable amount of time?The proliferation of chipsets is primarily a concern on desktop platforms. There are a limited number of hardware architectures to test for console games. On mobile, there’s Apple’s A Series for iOS devices and a range of Arm and Qualcomm architectures for Android – but selecting a manageable list of representative mobile devices is pretty straightforward.On desktop it’s trickier because there’s a wide range of available chipsets and architectures, and buying Macs and PCs for testing can be expensive. Our best advice is to do what you can. No studio has infinite time and money for testing. We generally wouldn’t expect any huge surprises when comparing performance between an Intel x86 CPU and a similarly specced AMD processor, for instance. As long as the game performs comfortably on your minimum spec machine, you should be reasonably confident about other machines. It’s also worth considering using analytics, such as Unity Analytics, to record frame rates, system specs, and player options’ settings to identify hotspots or problematic configurations.We’re seeing more studios move to using at least some level of automated testing for regular on-device profiling, with summary stats published where the whole team can keep an eye on performance across the range of target devices. With well-designed test scenes, this can usually be made into a mechanical process that’s suited for automation, so you don’t need an experienced technical artist or QA tester running builds through the process manually.Do you ever see performance issues on high-end devices that don’t occur on the low-end ones?It’s uncommon, but we have seen it. Often the issue lies in how the project is configured, such as with the use of fancy shaders and high-res textures on high-end devices, which can put extra pressure on the GPU or memory. Sometimes a high-end mobile device or console will use a high-res phone screen or 4K TV output as a selling point but not necessarily have enough GPU power or memory to live up to that promise without further optimization.If you make use of the current versions of the C# Job System, verify whether there’s a job scheduling overhead that scales with the number of worker threads, which in turn, scales with the number of CPU cores. This can result in code that runs more slowly on a 64+ core Threadripper™ than on a modest 4-core or 8-core CPU. This issue will be addressed in future versions of Unity, but in the meantime, try limiting the number of job worker threads by setting JobsUtility.JobWorkerCount.What are some pointers for setting a good frame budget?Most of the time when we talk about frame budgets, we’re talking about the overall time budget for the frame. You calculate 1000/target frames per second (fps) to get your frame budget: 33.33 ms for 30 fps, 16.66 ms for 60 fps, 8.33 ms for 120 Hz, etc. Reduce that number by around 35% if you’re on mobile to give the chips a chance to cool down between each frame. Dividing the budget up to get specific sub-budgets for different features and/or systems is probably overkill except for projects with very specific, predictable systems, or those that make heavy use of Time Slicing.Generally, profiling is the process of finding the biggest bottlenecks – and therefore, the biggest potential performance gains. So rather than saying, “Physics is taking 1.2 ms when the budget only allows for 1 ms,” you might look at a frame and say, “Rendering is taking 6 ms, making it the biggest main thread CPU cost in the frame. How can we reduce that?”It seems like profiling early and often is still not common knowledge. What are your thoughts on why this might be the case?Building, releasing, promoting, and managing a game is difficult work on multiple fronts. So there will always be numerous priorities vying for a developer’s attention, and profiling can fall by the wayside. They know it’s something they should do, but perhaps they’re unfamiliar with the tools and don’t feel like they have time to learn. Or, they don’t know how to fit profiling into their workflows because they’re pushed toward completing features rather than performance optimization.Just as with bugs and technical debt, performance issues are cheaper and less risky to address early on, rather than later in a project’s development cycle. Our focus is on helping to demystify profiling tools and techniques for those developers who are unfamiliar with them. That’s what the profiling e-book and its related blog post and webinar aim to support.Is there a way to exclude certain methods from instrumentation or include only specific methods when using Deep Profiling in the Unity Profiler? When using a lot of async/await tasks, we create large stack traces, but how can we avoid slowing down both the client and the Profiler when Deep Profiling?You can enable Allocation call stacks to see the full call stacks that lead to managed allocations (shown as magenta in the Unity CPU Profiler Timeline view). Additionally, you can – and should! – manually instrument long-running methods and processes by sprinkling ProfilerMarkers throughout your code. There’s currently no way to automatically enable Deep Profiling or disable profiling entirely in specific parts of your application. But manually adding ProfilerMarkers and enabling Allocation call stacks when required can help you dig down into problem areas without having to resort to Deep Profiling.As of Unity 2022.2, you can also use our IgnoredByDeepProfilerAttribute to prevent the Unity Profiler from capturing method calls. Just add the IgnoredByDeepProfiler attribute to classes, structures, and methods.Where can I find more information on Deep Profiling in Unity?Deep Profiling is covered in our Profiler documentation. Then there’s the most in-depth, single resource for profiling information, the Ultimate Guide to profiling Unity games e-book, which links to relevant documentation and other resources throughout.Is it correct that Deep Profiling is only useful for the Allocations Profiler and that it skews results so much that it’s not useful for finding hitches in the game?Deep Profiling can be used to find the specific causes of managed allocations, although Allocation call stacks can do the same thing with less overhead, overall. At the same time, Deep Profiling can be helpful for quickly investigating why one specific ProfilerMarker seems to be taking so long, as it’s more convenient to enable than to add numerous ProfilerMarkers to your scripts and rebuild your game. But yes, it does skew performance quite heavily and so shouldn’t be enabled for general profiling.Is VSync worth setting to every VBlank? My mobile game runs at a very low fps when it’s disabled.Mobile devices force VSync to be enabled at a driver/hardware level, so disabling it in Unity’s Quality settings shouldn’t make any difference on those platforms. We haven’t heard of a case where disabling VSync negatively affects performance. Try taking a profile capture with VSync enabled, along with another capture of the same scene but with VSync disabled. Then compare the captures using Profile Analyzer to try to understand why the performance is so different.How can you determine if the main thread is waiting for the GPU and not the other way around?This is covered in the Ultimate Guide to profiling Unity games. You can also get more information in the blog post, Detecting performance bottlenecks with Unity Frame Timing Manager.Generally speaking, the telltale sign is that the main thread waits for the Render thread while the Render thread waits for the GPU. The specific marker names will differ depending on your target platform and graphics API, but you should look out for markers with names such as “PresentFrame” or “WaitForPresent.”Is there a solid process for finding memory leaks in profiling?Use the Memory Profiler to compare memory snapshots and check for leaks. For example, you can take a snapshot in your main menu, enter your game and then quit, go back to the main menu, and take a second snapshot. Comparing these two will tell you whether any objects/allocations from the game are still hanging around in memory.Does it make sense to optimize and rewrite part of the code for the DOTS system, for mobile devices including VR/AR? Do you use this system in your projects?A number of game projects now make use of parts of the Data-Oriented Technology Stack (DOTS). Native Containers, the C# Job System, Mathematics, and the Burst compilerare all fully supported packages that you can use right away to write optimal, parallelized, high-performance C# (HPC#) code to improve your project’s CPU performance.A smaller number of projects are also using Entities and associated packages, such as the Hybrid Renderer, Unity Physics, and NetCode. However, at this time, the packages listed are experimental, and using them involves accepting a degree of technical risk. This risk derives from an API that is still evolving, missing or incomplete features, as well as the engineering learning curve required to understand Data-Oriented Design (DOD) to get the most out of Unity’s Entity Component System (ECS). Unity engineer Steve McGreal wrote a guide on DOTS best practices, which includes some DOD fundamentals and tips for improving ECS performance.How do you go about setting limits on SetPass calls or shader complexity? Can you even set limits beforehand?Rendering is a complex process and there is no practical way to set a hard limit on the maximum number of SetPass calls or a metric for shader complexity. Even on a fixed hardware platform, such as a single console, the limits will depend on what kind of scene you want to render, and what other work is happening on the CPU and GPU during a frame.That’s why the rule on when to profile is “early and often.” Teams tend to create a “vertical slice” demo early on during production – usually a short burst of gameplay developed to the level of visual fidelity intended for the final game. This is your first opportunity to profile rendering and figure out what optimizations and limits might be needed. The profiling process should be repeated every time a new area or other major piece of visual content is added.Here are additional resources for learning about performance optimization:BlogsOptimize your mobile game performance: Expert tips on graphics and assetsOptimize your mobile game performance: Expert tips on physics, UI, and audio settingsOptimize your mobile game performance: Expert tips on profiling, memory, and code architecture from Unity’s top engineersExpert tips on optimizing your game graphics for consolesProfiling in Unity 2021 LTS: What, when, and howHow-to pagesProfiling and debugging toolsHow to profile memory in UnityBest practices for profiling game performanceE-booksOptimize your console and PC game performanceOptimize your mobile game performanceUltimate guide to profiling Unity gamesLearn tutorialsProfiling CPU performance in Android builds with Android StudioProfiling applications – Made with UnityEven more advanced technical content is coming soon – but in the meantime, please feel free to suggest topics for us to cover on the forum and check out the full roundtable webinar recording.
    0 Reacties 0 aandelen
  • Your last opportunity to vote on the TechCrunch Disrupt 2025 agenda lineup

    We’re thrilled by the overwhelming response to our call for speakers at TechCrunch Disrupt 2025, taking place October 27–29 at Moscone West in San Francisco. After a careful selection process, we’ve narrowed it down to 20 impressive finalists—10 breakout sessions and 10 roundtables. Now it’s your turn to help shape the agenda. Audience Choice voting
    #your #last #opportunity #vote #techcrunch
    Your last opportunity to vote on the TechCrunch Disrupt 2025 agenda lineup
    We’re thrilled by the overwhelming response to our call for speakers at TechCrunch Disrupt 2025, taking place October 27–29 at Moscone West in San Francisco. After a careful selection process, we’ve narrowed it down to 20 impressive finalists—10 breakout sessions and 10 roundtables. Now it’s your turn to help shape the agenda. Audience Choice voting #your #last #opportunity #vote #techcrunch
    TECHCRUNCH.COM
    Your last opportunity to vote on the TechCrunch Disrupt 2025 agenda lineup
    We’re thrilled by the overwhelming response to our call for speakers at TechCrunch Disrupt 2025, taking place October 27–29 at Moscone West in San Francisco. After a careful selection process, we’ve narrowed it down to 20 impressive finalists—10 breakout sessions and 10 roundtables. Now it’s your turn to help shape the agenda. Audience Choice voting […]
    0 Reacties 0 aandelen