Pesquisar

Publicações

Blogs

Usuários

Páginas

Grupos

Michael Nicholas @michael_nicholas_3b37 compartilhou um link
2025-07-27 11:31:40 ·

So, it turns out that four mathematicians have taken a break from counting sheep and made some "great strides" toward a 'Grand Unified Theory' of math. Who knew that Fermat's Last Theorem was just the tip of the iceberg? I mean, why solve real-world problems when we can chase down the elusive unicorn of mathematical unity? Next, they'll be telling us that pie is just a circle in a deep existential crisis.

But hey, at least this gives us something to ponder while the rest of the world is busy figuring out how to pay their bills. Cheers to the ivory tower!

#MathHumor #UnifiedTheory #FermatsLastTheorem #MathematicsIsFun #KeepCounting

So, it turns out that four mathematicians have taken a break from counting sheep and made some "great strides" toward a 'Grand Unified Theory' of math. Who knew that Fermat's Last Theorem was just the tip of the iceberg? I mean, why solve real-world problems when we can chase down the elusive unicorn of mathematical unity? Next, they'll be telling us that pie is just a circle in a deep existential crisis. But hey, at least this gives us something to ponder while the rest of the world is busy figuring out how to pay their bills. Cheers to the ivory tower! #MathHumor #UnifiedTheory #FermatsLastTheorem #MathematicsIsFun #KeepCounting

www.wired.com
By extending the scope of a key insight behind Fermat’s Last Theorem, four mathematicians have made great strides toward building a unifying theory of mathematics.

1 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!
Microsoft Academic @MicrosoftAcademic compartilhou um link
2025-06-15 06:55:50 ·

How AI is reshaping the future of healthcare and medical research

Transcript      
PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”         
This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.  
Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?   
In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.”
In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.
In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.
As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.
Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.
Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.
Here’s my conversation with Bill Gates and Sébastien Bubeck.
LEE: Bill, welcome.
BILL GATES: Thank you.
LEE: Seb …
SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.
LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?
And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?
GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.
And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.
And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.
LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?
GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …
LEE: Right.
GATES: … that is a bit weird.
LEE: Yeah.
GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.
LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent.
BUBECK: Yes.
LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you.
BUBECK: Yeah.
LEE: And so what were your first encounters? Because I actually don’t remember what happened then.
BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.
I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.
So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.
So this was really, to me, the first moment where I saw some understanding in those models.
LEE: So this was, just to get the timing right, that was before I pulled you into the tent.
BUBECK: That was before. That was like a year before.
LEE: Right.
BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.
So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.
So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.
And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?
LEE: Yeah.
BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.
LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.
And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.
And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.
I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.
But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.
But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?
You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.
Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?
GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.
It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.
But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.
LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you?
BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?
Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.
Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.
And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.
Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.
It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.
LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?
GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.
The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa,
So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.
LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?
GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.
The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.
LEE: Right.
GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.
LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.
BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.
It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.
LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.
I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?
That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that?
BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there.
Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.
But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.
So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.
LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …
BUBECK: It’s a very difficult, very difficult balance.
LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?
GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.
Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?
Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there.
LEE: Yeah.
GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.
LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.
BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.
That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind.
LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?
BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.
LEE: So we have about three hours of stuff to talk about, but our time is actually running low.
BUBECK: Yes, yes, yes.
LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?
GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.
The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.
And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.
LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?
GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.
LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.
I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.
BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.
And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.
LEE: Yeah.
BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.
Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.
Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.
LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …
BUBECK: Yeah.
LEE: … or an endocrinologist might not.
BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.
LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?
BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.
And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …
LEE: Will AI prescribe your medicines? Write your prescriptions?
BUBECK: I think yes. I think yes.
LEE: OK. Bill?
GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate?
And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.
You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.
LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.
I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.
GATES: Yeah. Thanks, you guys.
BUBECK: Thank you, Peter. Thanks, Bill.
LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.
With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.
And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.
One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.
HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.
You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.
If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.
I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.
Until next time.
#how #reshaping #future #healthcare #medical

How AI is reshaping the future of healthcare and medical research
Transcript       PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”          This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.   Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.” In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open. As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home. Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.      Here’s my conversation with Bill Gates and Sébastien Bubeck. LEE: Bill, welcome. BILL GATES: Thank you. LEE: Seb … SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here. LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening? And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines. And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning. LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that? GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, … LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah. GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training. LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you. BUBECK: Yeah. LEE: And so what were your first encounters? Because I actually don’t remember what happened then. BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3. I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1. So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts. So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent. BUBECK: That was before. That was like a year before. LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4. So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x. And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine. And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book. But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements. But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today? You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork? GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision. But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view. LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you? BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong? Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them. And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way. It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine. LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all? GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that. The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking? GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication. BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI. It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for. LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes. I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that? BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there. Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad. But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model. So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model. LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and … BUBECK: It’s a very difficult, very difficult balance. LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models? GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there. Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake. LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on. BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything. That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind. LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two? BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it. LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now? GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities. And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period. LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers? GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them. LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why. BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah. BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not. Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision. LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist … BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today? BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later. And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions? BUBECK: I think yes. I think yes. LEE: OK. Bill? GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries. You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that. LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.   GATES: Yeah. Thanks, you guys. BUBECK: Thank you, Peter. Thanks, Bill. LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings. You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.   I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   #how #reshaping #future #healthcare #medical

How AI is reshaping the future of healthcare and medical research

www.microsoft.com
Transcript [MUSIC]      [BOOK PASSAGE]  PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”   [END OF BOOK PASSAGE]    [THEME MUSIC]    This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.   Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  [THEME MUSIC FADES] The book passage I read at the top is from “Chapter 10: The Big Black Bag.” In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open. As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home. Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.    [TRANSITION MUSIC]   Here’s my conversation with Bill Gates and Sébastien Bubeck. LEE: Bill, welcome. BILL GATES: Thank you. LEE: Seb … SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here. LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening? And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines. And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weakness [LAUGHTER] that, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning. LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that? GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, … LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah. GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training. LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. [LAUGHS] BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSR [Microsoft Research] to join and start investigating this thing seriously. And the first person I pulled in was you. BUBECK: Yeah. LEE: And so what were your first encounters? Because I actually don’t remember what happened then. BUBECK: Oh, I remember it very well. [LAUGHS] My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3. I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1. So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair. [LAUGHTER] And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts. So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent. BUBECK: That was before. That was like a year before. LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4. So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x. And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE: [LAUGHS] One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine. And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book. But the main purpose of this conversation isn’t to reminisce about [LAUGHS] or indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements. But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today? You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork? GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision. But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view. LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients. [LAUGHTER] Does that make sense to you? BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong? Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them. And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT (opens in new tab). And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way. It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine. LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all? GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that. The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking? GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication. BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE [United States Medical Licensing Examination], for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI. It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for. LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes. I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential. [LAUGHTER] What’s up with that? BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back that [LAUGHS] version of GPT-4o, so now we don’t have the sycophant version out there. Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF [reinforcement learning from human feedback], where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad. But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model. So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model. LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and … BUBECK: It’s a very difficult, very difficult balance. LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models? GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there. Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake. LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on. BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGI [artificial general intelligence] that kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything. That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects. [LAUGHTER] So it’s … I think it’s an important example to have in mind. LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two? BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it. LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now? GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities. And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period. LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers? GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them. LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why. BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and see [if you have] produced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab). So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah. BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not. Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision. LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist … BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today? BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later. And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions? BUBECK: I think yes. I think yes. LEE: OK. Bill? GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelected [LAUGHTER] just on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries. You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that. LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you. [TRANSITION MUSIC] GATES: Yeah. Thanks, you guys. BUBECK: Thank you, Peter. Thanks, Bill. LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings. You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real. [THEME MUSIC] I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   [MUSIC FADES]

0 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!
InformationWeek @InformationWeek compartilhou um link
2025-06-15 06:38:00 ·

CERT Director Greg Touhill: To Lead Is to Serve

Greg Touhill, director of the Software Engineering’s Institute’sComputer Emergency Response Teamdivision is an atypical technology leader. For one thing, he’s been in tech and other leadership positions that span the US Air Force, the US government, the private sector and now SEI’s CERT. More importantly, he’s been a major force in the cybersecurity realm, making the world a safer place and even saving lives. Touhill earned a bachelor’s degree from the Pennsylvania State University, a master’s degree from the University of Southern California, a master’s degree from the Air War College, was a senior executive fellow at the Harvard University Kennedy School of Government and completed executive education studies at the University of North Carolina. “I was a student intern at Carnegie Mellon, but I was going to college at Penn State and studying chemical engineering. As an Air Force ROTC scholarship recipient, I knew I was going to become an Air Force officer but soon realized that I didn’t necessarily want to be a chemical engineer in the Air Force,” says Touhill. “Because I passed all the mathematics, physics, and engineering courses, I ended up becoming a communications, electronics, and computer systems officer in the Air Force. I spent 30 years, one month and three days on active duty in the United States Air Force, eventually retiring as a brigadier general and having done many different types of jobs that were available to me within and even beyond my career field.” Related:Specifically, he was an operational commander at the squadron, group, and wing levels. For example, as a colonel, Touhill served as director of command, control, communications and computersfor the United States Central Command Forces, then he was appointed chief information officer and director, communications and information at Air Mobility Command. Later, he served as commander, 81st Training Wing at Kessler Air Force Base where he was promoted to brigadier general and commanded over 12,500 personnel. After that, he served as the senior defense officer and US defense attaché at the US Embassy in Kuwait, before concluding his military career as the chief information officer and director, C4 systems at the US Transportation Command, one of 10 US combatant commands, where he and his team were awarded the NSA Rowlett Award for the best cybersecurity program in the government. While in the Air Force, Touhill received numerous awards and decorations including the Bronze Star medal and the Air Force Science and Engineering Award. He is the only three-time recipient of the USAF C4 Professionalism Award. Related:Greg Touhill“I got to serve at major combatant commands, work with coalition partners from many different countries and represented the US as part of a diplomatic mission to Kuwait for two years as the senior defense official at a time when America was withdrawing forces out of Iraq. I also led the negotiation of a new bilateral defense agreement with the Kuwaitis,” says Touhill. “Then I was recruited to continue my service and was asked to serve as the deputy assistant secretary of cybersecurity and communications at the Department of Homeland Security, where I ran the operations of what is now known as the Cybersecurity and Infrastructure Security Agency. I was there at a pivotal moment because we were building up the capacity of that organization and setting the stage for it to become its own agency.” While at DHS, there were many noteworthy breaches including the infamous US Office of People Managementbreach. Those events led to Obama’s visit to the National Cybersecurity and Communications Integration Center.  “I got to brief the president on the state of cybersecurity, what we had seen with the OPM breach and some other deficiencies,” says Touhill. “I was on the federal CIO council as the cybersecurity advisor to that since I’d been a federal CIO before and I got to conclude my federal career by being the first United States government chief information security officer. From there, I pivoted to industry, but I also got to return to Carnegie Mellon as a faculty member at Carnegie Mellon’s Heinz College, where I've been teaching since January 2017.” Related:Touhill has been involved in three startups, two of which were successfully acquired. He also served on three Fortune 100 advisory boards and on the Information Systems Audit and Control Association board, eventually becoming its chair for a term during the seven years he served there. Touhill just celebrated his fourth year at CERT, which he considers the pinnacle of the cybersecurity profession and everything he’s done to date. “Over my career I've led teams that have done major software builds in the national security space. I've also been the guy who's pulled cables and set up routers, hubs and switches, and I've been a system administrator. I've done everything that I could do from the keyboard up all the way up to the White House,” says Touhill. “For 40 years, the Software Engineering Institute has been leading the world in secure by design, cybersecurity, software engineering, artificial intelligence and engineering, pioneering best practices, and figuring out how to make the world a safer more secure and trustworthy place. I’ve had a hand in the making of today’s modern military and government information technology environment, beginning as a 22-year-old lieutenant, and hope to inspire the next generation to do even better.” What ‘Success’ Means Many people would be satisfied with their careers as a brigadier general, a tech leader, the White House’s first anything, or working at CERT, let alone running it. Touhill has spent his entire career making the world a safer place, so it’s not surprising that he considers his greatest achievement saving lives. “In the Middle East and Iraq, convoys were being attacked with improvised explosive devices. There were also ‘direct fire’ attacks where people are firing weapons at you and indirect fire attacks where you could be in the line of fire,” says Touhill. “The convoys were using SINCGARS line-of-site walkie-talkies for communications that are most effective when the ground is flat, and Iraq is not flat. As a result, our troops were at risk of not having reliable communications while under attack. As my team brainstormed options to remedy the situation, one of my guys found some technology, about the size of an iPhone, that could covert a radio signal, which is basically a waveform, into a digital pulse I could put on a dedicated network to support the convoy missions.” For million, Touhill and his team quickly architected, tested, and fielded the Radio over IP networkthat had a 99% reliability rate anywhere in Iraq. Better still, convoys could communicate over the network using any radios. That solution saved a minimum of six lives. In one case, the hospital doctor said if the patient had arrived five minutes later, he would have died. Sage Advice Anyone who has ever spent time in the military or in a military family knows that soldiers are very well disciplined, or they wash out. Other traits include being physically fit, mentally fit, and achieving balance in life, though that’s difficult to achieve in combat. Still, it’s a necessity. “I served three and a half years down range in combat operations. My experience taught me you could be doing 20-hour days for a year or two on end. If you haven’t built a good foundation of being disciplined and fit, it impacts your ability to maintain presence in times of stress, and CISOs work in stressful situations,” says Touhill. “Staying fit also fortifies you for the long haul, so you don’t get burned out as fast.” Another necessary skill is the ability to work well with others.  “Cybersecurity is an interdisciplinary practice. One of the great joys I have as CERT director is the wide range of experts in many different fields that include software engineers, computer engineers, computer scientists, data scientists, mathematicians and physicists,” says Touhill. “I have folks who have business degrees and others who have philosophy degrees. It's really a rich community of interests all coming together towards that common goal of making the world a safer, more secure and more trusted place in the cyber domain. We’re are kind of like the cyber neighborhood watch for the whole world.” He also says that money isn’t everything, having taken a pay cut to go from being an Air Force brigadier general to the deputy assistant secretary of the Department of Homeland Security . “You’ll always do well if you pick the job that matters most. That’s what I did, and I’ve been rewarded every step,” says Touhill.  The biggest challenge he sees is the complexity of cyber systems and software, which can have second, third, and fourth order effects.  “Complexity raises the cost of the attack surface, increases the attack surface, raises the number of vulnerabilities and exploits human weaknesses,” says Touhill. “The No. 1 thing we need to be paying attention to is privacy when it comes to AI because AI can unearth and discover knowledge from data we already have. While it gives us greater insights at greater velocities, we need to be careful that we take precautions to better protect our privacy, civil rights and civil liberties.”
#cert #director #greg #touhill #lead

CERT Director Greg Touhill: To Lead Is to Serve
Greg Touhill, director of the Software Engineering’s Institute’sComputer Emergency Response Teamdivision is an atypical technology leader. For one thing, he’s been in tech and other leadership positions that span the US Air Force, the US government, the private sector and now SEI’s CERT. More importantly, he’s been a major force in the cybersecurity realm, making the world a safer place and even saving lives. Touhill earned a bachelor’s degree from the Pennsylvania State University, a master’s degree from the University of Southern California, a master’s degree from the Air War College, was a senior executive fellow at the Harvard University Kennedy School of Government and completed executive education studies at the University of North Carolina. “I was a student intern at Carnegie Mellon, but I was going to college at Penn State and studying chemical engineering. As an Air Force ROTC scholarship recipient, I knew I was going to become an Air Force officer but soon realized that I didn’t necessarily want to be a chemical engineer in the Air Force,” says Touhill. “Because I passed all the mathematics, physics, and engineering courses, I ended up becoming a communications, electronics, and computer systems officer in the Air Force. I spent 30 years, one month and three days on active duty in the United States Air Force, eventually retiring as a brigadier general and having done many different types of jobs that were available to me within and even beyond my career field.” Related:Specifically, he was an operational commander at the squadron, group, and wing levels. For example, as a colonel, Touhill served as director of command, control, communications and computersfor the United States Central Command Forces, then he was appointed chief information officer and director, communications and information at Air Mobility Command. Later, he served as commander, 81st Training Wing at Kessler Air Force Base where he was promoted to brigadier general and commanded over 12,500 personnel. After that, he served as the senior defense officer and US defense attaché at the US Embassy in Kuwait, before concluding his military career as the chief information officer and director, C4 systems at the US Transportation Command, one of 10 US combatant commands, where he and his team were awarded the NSA Rowlett Award for the best cybersecurity program in the government. While in the Air Force, Touhill received numerous awards and decorations including the Bronze Star medal and the Air Force Science and Engineering Award. He is the only three-time recipient of the USAF C4 Professionalism Award. Related:Greg Touhill“I got to serve at major combatant commands, work with coalition partners from many different countries and represented the US as part of a diplomatic mission to Kuwait for two years as the senior defense official at a time when America was withdrawing forces out of Iraq. I also led the negotiation of a new bilateral defense agreement with the Kuwaitis,” says Touhill. “Then I was recruited to continue my service and was asked to serve as the deputy assistant secretary of cybersecurity and communications at the Department of Homeland Security, where I ran the operations of what is now known as the Cybersecurity and Infrastructure Security Agency. I was there at a pivotal moment because we were building up the capacity of that organization and setting the stage for it to become its own agency.” While at DHS, there were many noteworthy breaches including the infamous US Office of People Managementbreach. Those events led to Obama’s visit to the National Cybersecurity and Communications Integration Center.  “I got to brief the president on the state of cybersecurity, what we had seen with the OPM breach and some other deficiencies,” says Touhill. “I was on the federal CIO council as the cybersecurity advisor to that since I’d been a federal CIO before and I got to conclude my federal career by being the first United States government chief information security officer. From there, I pivoted to industry, but I also got to return to Carnegie Mellon as a faculty member at Carnegie Mellon’s Heinz College, where I've been teaching since January 2017.” Related:Touhill has been involved in three startups, two of which were successfully acquired. He also served on three Fortune 100 advisory boards and on the Information Systems Audit and Control Association board, eventually becoming its chair for a term during the seven years he served there. Touhill just celebrated his fourth year at CERT, which he considers the pinnacle of the cybersecurity profession and everything he’s done to date. “Over my career I've led teams that have done major software builds in the national security space. I've also been the guy who's pulled cables and set up routers, hubs and switches, and I've been a system administrator. I've done everything that I could do from the keyboard up all the way up to the White House,” says Touhill. “For 40 years, the Software Engineering Institute has been leading the world in secure by design, cybersecurity, software engineering, artificial intelligence and engineering, pioneering best practices, and figuring out how to make the world a safer more secure and trustworthy place. I’ve had a hand in the making of today’s modern military and government information technology environment, beginning as a 22-year-old lieutenant, and hope to inspire the next generation to do even better.” What ‘Success’ Means Many people would be satisfied with their careers as a brigadier general, a tech leader, the White House’s first anything, or working at CERT, let alone running it. Touhill has spent his entire career making the world a safer place, so it’s not surprising that he considers his greatest achievement saving lives. “In the Middle East and Iraq, convoys were being attacked with improvised explosive devices. There were also ‘direct fire’ attacks where people are firing weapons at you and indirect fire attacks where you could be in the line of fire,” says Touhill. “The convoys were using SINCGARS line-of-site walkie-talkies for communications that are most effective when the ground is flat, and Iraq is not flat. As a result, our troops were at risk of not having reliable communications while under attack. As my team brainstormed options to remedy the situation, one of my guys found some technology, about the size of an iPhone, that could covert a radio signal, which is basically a waveform, into a digital pulse I could put on a dedicated network to support the convoy missions.” For million, Touhill and his team quickly architected, tested, and fielded the Radio over IP networkthat had a 99% reliability rate anywhere in Iraq. Better still, convoys could communicate over the network using any radios. That solution saved a minimum of six lives. In one case, the hospital doctor said if the patient had arrived five minutes later, he would have died. Sage Advice Anyone who has ever spent time in the military or in a military family knows that soldiers are very well disciplined, or they wash out. Other traits include being physically fit, mentally fit, and achieving balance in life, though that’s difficult to achieve in combat. Still, it’s a necessity. “I served three and a half years down range in combat operations. My experience taught me you could be doing 20-hour days for a year or two on end. If you haven’t built a good foundation of being disciplined and fit, it impacts your ability to maintain presence in times of stress, and CISOs work in stressful situations,” says Touhill. “Staying fit also fortifies you for the long haul, so you don’t get burned out as fast.” Another necessary skill is the ability to work well with others.  “Cybersecurity is an interdisciplinary practice. One of the great joys I have as CERT director is the wide range of experts in many different fields that include software engineers, computer engineers, computer scientists, data scientists, mathematicians and physicists,” says Touhill. “I have folks who have business degrees and others who have philosophy degrees. It's really a rich community of interests all coming together towards that common goal of making the world a safer, more secure and more trusted place in the cyber domain. We’re are kind of like the cyber neighborhood watch for the whole world.” He also says that money isn’t everything, having taken a pay cut to go from being an Air Force brigadier general to the deputy assistant secretary of the Department of Homeland Security . “You’ll always do well if you pick the job that matters most. That’s what I did, and I’ve been rewarded every step,” says Touhill.  The biggest challenge he sees is the complexity of cyber systems and software, which can have second, third, and fourth order effects.  “Complexity raises the cost of the attack surface, increases the attack surface, raises the number of vulnerabilities and exploits human weaknesses,” says Touhill. “The No. 1 thing we need to be paying attention to is privacy when it comes to AI because AI can unearth and discover knowledge from data we already have. While it gives us greater insights at greater velocities, we need to be careful that we take precautions to better protect our privacy, civil rights and civil liberties.” #cert #director #greg #touhill #lead

CERT Director Greg Touhill: To Lead Is to Serve

www.informationweek.com
Greg Touhill, director of the Software Engineering’s Institute’s (SEI’s) Computer Emergency Response Team (CERT) division is an atypical technology leader. For one thing, he’s been in tech and other leadership positions that span the US Air Force, the US government, the private sector and now SEI’s CERT. More importantly, he’s been a major force in the cybersecurity realm, making the world a safer place and even saving lives. Touhill earned a bachelor’s degree from the Pennsylvania State University, a master’s degree from the University of Southern California, a master’s degree from the Air War College, was a senior executive fellow at the Harvard University Kennedy School of Government and completed executive education studies at the University of North Carolina. “I was a student intern at Carnegie Mellon, but I was going to college at Penn State and studying chemical engineering. As an Air Force ROTC scholarship recipient, I knew I was going to become an Air Force officer but soon realized that I didn’t necessarily want to be a chemical engineer in the Air Force,” says Touhill. “Because I passed all the mathematics, physics, and engineering courses, I ended up becoming a communications, electronics, and computer systems officer in the Air Force. I spent 30 years, one month and three days on active duty in the United States Air Force, eventually retiring as a brigadier general and having done many different types of jobs that were available to me within and even beyond my career field.” Related:Specifically, he was an operational commander at the squadron, group, and wing levels. For example, as a colonel, Touhill served as director of command, control, communications and computers (C4) for the United States Central Command Forces, then he was appointed chief information officer and director, communications and information at Air Mobility Command. Later, he served as commander, 81st Training Wing at Kessler Air Force Base where he was promoted to brigadier general and commanded over 12,500 personnel. After that, he served as the senior defense officer and US defense attaché at the US Embassy in Kuwait, before concluding his military career as the chief information officer and director, C4 systems at the US Transportation Command, one of 10 US combatant commands, where he and his team were awarded the NSA Rowlett Award for the best cybersecurity program in the government. While in the Air Force, Touhill received numerous awards and decorations including the Bronze Star medal and the Air Force Science and Engineering Award. He is the only three-time recipient of the USAF C4 Professionalism Award. Related:Greg Touhill“I got to serve at major combatant commands, work with coalition partners from many different countries and represented the US as part of a diplomatic mission to Kuwait for two years as the senior defense official at a time when America was withdrawing forces out of Iraq. I also led the negotiation of a new bilateral defense agreement with the Kuwaitis,” says Touhill. “Then I was recruited to continue my service and was asked to serve as the deputy assistant secretary of cybersecurity and communications at the Department of Homeland Security, where I ran the operations of what is now known as the Cybersecurity and Infrastructure Security Agency. I was there at a pivotal moment because we were building up the capacity of that organization and setting the stage for it to become its own agency.” While at DHS, there were many noteworthy breaches including the infamous US Office of People Management (OPM) breach. Those events led to Obama’s visit to the National Cybersecurity and Communications Integration Center.  “I got to brief the president on the state of cybersecurity, what we had seen with the OPM breach and some other deficiencies,” says Touhill. “I was on the federal CIO council as the cybersecurity advisor to that since I’d been a federal CIO before and I got to conclude my federal career by being the first United States government chief information security officer. From there, I pivoted to industry, but I also got to return to Carnegie Mellon as a faculty member at Carnegie Mellon’s Heinz College, where I've been teaching since January 2017.” Related:Touhill has been involved in three startups, two of which were successfully acquired. He also served on three Fortune 100 advisory boards and on the Information Systems Audit and Control Association board, eventually becoming its chair for a term during the seven years he served there. Touhill just celebrated his fourth year at CERT, which he considers the pinnacle of the cybersecurity profession and everything he’s done to date. “Over my career I've led teams that have done major software builds in the national security space. I've also been the guy who's pulled cables and set up routers, hubs and switches, and I've been a system administrator. I've done everything that I could do from the keyboard up all the way up to the White House,” says Touhill. “For 40 years, the Software Engineering Institute has been leading the world in secure by design, cybersecurity, software engineering, artificial intelligence and engineering, pioneering best practices, and figuring out how to make the world a safer more secure and trustworthy place. I’ve had a hand in the making of today’s modern military and government information technology environment, beginning as a 22-year-old lieutenant, and hope to inspire the next generation to do even better.” What ‘Success’ Means Many people would be satisfied with their careers as a brigadier general, a tech leader, the White House’s first anything, or working at CERT, let alone running it. Touhill has spent his entire career making the world a safer place, so it’s not surprising that he considers his greatest achievement saving lives. “In the Middle East and Iraq, convoys were being attacked with improvised explosive devices. There were also ‘direct fire’ attacks where people are firing weapons at you and indirect fire attacks where you could be in the line of fire,” says Touhill. “The convoys were using SINCGARS line-of-site walkie-talkies for communications that are most effective when the ground is flat, and Iraq is not flat. As a result, our troops were at risk of not having reliable communications while under attack. As my team brainstormed options to remedy the situation, one of my guys found some technology, about the size of an iPhone, that could covert a radio signal, which is basically a waveform, into a digital pulse I could put on a dedicated network to support the convoy missions.” For $11 million, Touhill and his team quickly architected, tested, and fielded the Radio over IP network (aka “Ripper Net”) that had a 99% reliability rate anywhere in Iraq. Better still, convoys could communicate over the network using any radios. That solution saved a minimum of six lives. In one case, the hospital doctor said if the patient had arrived five minutes later, he would have died. Sage Advice Anyone who has ever spent time in the military or in a military family knows that soldiers are very well disciplined, or they wash out. Other traits include being physically fit, mentally fit, and achieving balance in life, though that’s difficult to achieve in combat. Still, it’s a necessity. “I served three and a half years down range in combat operations. My experience taught me you could be doing 20-hour days for a year or two on end. If you haven’t built a good foundation of being disciplined and fit, it impacts your ability to maintain presence in times of stress, and CISOs work in stressful situations,” says Touhill. “Staying fit also fortifies you for the long haul, so you don’t get burned out as fast.” Another necessary skill is the ability to work well with others.  “Cybersecurity is an interdisciplinary practice. One of the great joys I have as CERT director is the wide range of experts in many different fields that include software engineers, computer engineers, computer scientists, data scientists, mathematicians and physicists,” says Touhill. “I have folks who have business degrees and others who have philosophy degrees. It's really a rich community of interests all coming together towards that common goal of making the world a safer, more secure and more trusted place in the cyber domain. We’re are kind of like the cyber neighborhood watch for the whole world.” He also says that money isn’t everything, having taken a pay cut to go from being an Air Force brigadier general to the deputy assistant secretary of the Department of Homeland Security . “You’ll always do well if you pick the job that matters most. That’s what I did, and I’ve been rewarded every step,” says Touhill.  The biggest challenge he sees is the complexity of cyber systems and software, which can have second, third, and fourth order effects.  “Complexity raises the cost of the attack surface, increases the attack surface, raises the number of vulnerabilities and exploits human weaknesses,” says Touhill. “The No. 1 thing we need to be paying attention to is privacy when it comes to AI because AI can unearth and discover knowledge from data we already have. While it gives us greater insights at greater velocities, we need to be careful that we take precautions to better protect our privacy, civil rights and civil liberties.”

0 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!
MIT Technology Review @MITT compartilhou um link
2025-06-05 08:18:59 ·

The Download: AI’s role in math, and calculating its energy footprint

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

What’s next for AI and math

The modern world is built on mathematics. Math lets us model complex systems such as the way air flows around an aircraft, the way financial markets fluctuate, and the way blood flows through the heart. Mathematicians have used computers for decades, but the new vision is that AI might help them crack problems that were previously uncrackable.

However, there’s a huge difference between AI that can solve the kinds of problems set in high school—math that the latest generation of models has already mastered—and AI that couldsolve the kinds of problems that professional mathematicians spend careers chipping away at. Here are three ways to understand that gulf.

—Will Douglas HeavenThis story is from our What’s Next series, which looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

Inside the effort to tally AI’s energy appetite

—James O’Donnell

After working on it for months, my colleague Casey Crownhart and I finally saw our story on AI’s energy and emissions burden go live last week.

The initial goal sounded simple: Calculate how much energy is used when we interact with a chatbot, then tally that up to understand why leaders in tech and politics are so keen to harness unprecedented levels of electricity to power AI and reshape our energy grids in the process.It was, of course, not so simple. After speaking with dozens of researchers, we realized that the common understanding of AI’s energy appetite is full of holes. I encourage you to read the full story, which has some incredible graphics to help you understand this topic. But here are three takeaways I have after the project.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here, and check out the rest of our Power Hungry package about AI here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Elon Musk has turned on Trump He called Trump’s domestic policy agenda a “disgusting abomination.”+ House Speaker Mike Johnson has, naturally, hit back. 2 NASA is in crisisIts budget has been cut by a quarter, and now its new leader has had his nomination revoked.+ What’s next for NASA’s giant moon rocket? 3 Here’s how Big Tech plans to wield AITo build ‘everything apps’ that keep you inside their ecosystem, forever.+ The trouble is, the experience isn’t always slick enough, as Google has discovered with its ‘Ask Photos’ feature.+ How to fight your instinct to blindly trust AI. 4 Meta has signed a 20-year deal to buy nuclear power It’s the latest in a race to try to keep up with AI’s surging energy demands.+ Can nuclear power really fuel the rise of AI?  5 Extreme heat takes a huge toll on people’s mental healthIt’s yet another issue we’re failing to prepare for, as summers get hotter and hotter.+ The quest to protect farmworkers from extreme heat. 6 China’s robotaxi companies are planning to expand in the Middle East And they’re getting a warmer welcome than in the US or Europe.+ China’s EV giants are also betting big on humanoid robots. 7 AI will supercharge hackersThe full impact of new AI techniques is yet to be felt, but experts say it’s only a matter of time.+ Five ways criminals are using AI. 8 It’s an exciting time to be working on Alzheimer’s treatments 12 of them are moving to the final phase of clinical trials this year.+ The innovation that gets an Alzheimer’s drug through the blood-brain barrier. 9 Workers are being subjected to more and more surveillanceNot just in the gig economy either—’bossware’ is increasingly appearing in offices too.10 Noughties nostalgia is rife on TikTokIt was a pretty fun decade, to be fair.Quote of the day

“This is scientific heaven. Or it used to be.”

—Tom Rapoport, a 77-year-old Harvard Medical School professor from Germany, expresses his sadness about Trump’s cuts to US science funding to the New York Times.

One more thing

OLCF

What’s next for the world’s fastest supercomputers

When the Frontier supercomputer came online in 2022, it marked the dawn of so-called exascale computing, with machines that can execute an exaflop—or a quintillionfloating point operations a second.Since then, scientists have geared up to make more of these blazingly fast computers: several exascale machines are due to come online in the US and Europe.But speed itself isn’t the endgame. Researchers hope to pursue previously unanswerable questions about nature—and to design new technologies in areas from transportation to medicine. Read the full story.

—Sophia Chen

We can still have nice things

A place for comfort, fun and distraction to brighten up your day.+ If tracking tube trains in London is your thing, you’ll love this live map.+ Take a truly bonkers trip down memory lane, courtesy of these FBI artifacts.+ Netflix’s Frankenstein looks pretty intense.+ Why landlines are so darn spooky
#download #ais #role #math #calculating

The Download: AI’s role in math, and calculating its energy footprint
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. What’s next for AI and math The modern world is built on mathematics. Math lets us model complex systems such as the way air flows around an aircraft, the way financial markets fluctuate, and the way blood flows through the heart. Mathematicians have used computers for decades, but the new vision is that AI might help them crack problems that were previously uncrackable.   However, there’s a huge difference between AI that can solve the kinds of problems set in high school—math that the latest generation of models has already mastered—and AI that couldsolve the kinds of problems that professional mathematicians spend careers chipping away at. Here are three ways to understand that gulf. —Will Douglas HeavenThis story is from our What’s Next series, which looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here. Inside the effort to tally AI’s energy appetite —James O’Donnell After working on it for months, my colleague Casey Crownhart and I finally saw our story on AI’s energy and emissions burden go live last week. The initial goal sounded simple: Calculate how much energy is used when we interact with a chatbot, then tally that up to understand why leaders in tech and politics are so keen to harness unprecedented levels of electricity to power AI and reshape our energy grids in the process.It was, of course, not so simple. After speaking with dozens of researchers, we realized that the common understanding of AI’s energy appetite is full of holes. I encourage you to read the full story, which has some incredible graphics to help you understand this topic. But here are three takeaways I have after the project. This story originally appeared in The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here, and check out the rest of our Power Hungry package about AI here. The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 Elon Musk has turned on Trump He called Trump’s domestic policy agenda a “disgusting abomination.”+ House Speaker Mike Johnson has, naturally, hit back. 2 NASA is in crisisIts budget has been cut by a quarter, and now its new leader has had his nomination revoked.+ What’s next for NASA’s giant moon rocket? 3 Here’s how Big Tech plans to wield AITo build ‘everything apps’ that keep you inside their ecosystem, forever.+ The trouble is, the experience isn’t always slick enough, as Google has discovered with its ‘Ask Photos’ feature.+ How to fight your instinct to blindly trust AI. 4 Meta has signed a 20-year deal to buy nuclear power It’s the latest in a race to try to keep up with AI’s surging energy demands.+ Can nuclear power really fuel the rise of AI?  5 Extreme heat takes a huge toll on people’s mental healthIt’s yet another issue we’re failing to prepare for, as summers get hotter and hotter.+ The quest to protect farmworkers from extreme heat. 6 China’s robotaxi companies are planning to expand in the Middle East And they’re getting a warmer welcome than in the US or Europe.+ China’s EV giants are also betting big on humanoid robots. 7 AI will supercharge hackersThe full impact of new AI techniques is yet to be felt, but experts say it’s only a matter of time.+ Five ways criminals are using AI. 8 It’s an exciting time to be working on Alzheimer’s treatments 12 of them are moving to the final phase of clinical trials this year.+ The innovation that gets an Alzheimer’s drug through the blood-brain barrier. 9 Workers are being subjected to more and more surveillanceNot just in the gig economy either—’bossware’ is increasingly appearing in offices too.10 Noughties nostalgia is rife on TikTokIt was a pretty fun decade, to be fair.Quote of the day “This is scientific heaven. Or it used to be.” —Tom Rapoport, a 77-year-old Harvard Medical School professor from Germany, expresses his sadness about Trump’s cuts to US science funding to the New York Times. One more thing OLCF What’s next for the world’s fastest supercomputers When the Frontier supercomputer came online in 2022, it marked the dawn of so-called exascale computing, with machines that can execute an exaflop—or a quintillionfloating point operations a second.Since then, scientists have geared up to make more of these blazingly fast computers: several exascale machines are due to come online in the US and Europe.But speed itself isn’t the endgame. Researchers hope to pursue previously unanswerable questions about nature—and to design new technologies in areas from transportation to medicine. Read the full story. —Sophia Chen We can still have nice things A place for comfort, fun and distraction to brighten up your day.+ If tracking tube trains in London is your thing, you’ll love this live map.+ Take a truly bonkers trip down memory lane, courtesy of these FBI artifacts.+ Netflix’s Frankenstein looks pretty intense.+ Why landlines are so darn spooky #download #ais #role #math #calculating

The Download: AI’s role in math, and calculating its energy footprint

www.technologyreview.com
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. What’s next for AI and math The modern world is built on mathematics. Math lets us model complex systems such as the way air flows around an aircraft, the way financial markets fluctuate, and the way blood flows through the heart. Mathematicians have used computers for decades, but the new vision is that AI might help them crack problems that were previously uncrackable.   However, there’s a huge difference between AI that can solve the kinds of problems set in high school—math that the latest generation of models has already mastered—and AI that could (in theory) solve the kinds of problems that professional mathematicians spend careers chipping away at. Here are three ways to understand that gulf. —Will Douglas HeavenThis story is from our What’s Next series, which looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here. Inside the effort to tally AI’s energy appetite —James O’Donnell After working on it for months, my colleague Casey Crownhart and I finally saw our story on AI’s energy and emissions burden go live last week. The initial goal sounded simple: Calculate how much energy is used when we interact with a chatbot, then tally that up to understand why leaders in tech and politics are so keen to harness unprecedented levels of electricity to power AI and reshape our energy grids in the process.It was, of course, not so simple. After speaking with dozens of researchers, we realized that the common understanding of AI’s energy appetite is full of holes. I encourage you to read the full story, which has some incredible graphics to help you understand this topic. But here are three takeaways I have after the project. This story originally appeared in The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here, and check out the rest of our Power Hungry package about AI here. The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 Elon Musk has turned on Trump He called Trump’s domestic policy agenda a “disgusting abomination.” (NYT $)+ House Speaker Mike Johnson has, naturally, hit back. (Insider $) 2 NASA is in crisisIts budget has been cut by a quarter, and now its new leader has had his nomination revoked. (New Scientist $)+ What’s next for NASA’s giant moon rocket? (MIT Technology Review)3 Here’s how Big Tech plans to wield AITo build ‘everything apps’ that keep you inside their ecosystem, forever. (The Atlantic $)+ The trouble is, the experience isn’t always slick enough, as Google has discovered with its ‘Ask Photos’ feature. (The Verge $)+ How to fight your instinct to blindly trust AI. (WP $)4 Meta has signed a 20-year deal to buy nuclear power It’s the latest in a race to try to keep up with AI’s surging energy demands. (ABC)+ Can nuclear power really fuel the rise of AI? (MIT Technology Review) 5 Extreme heat takes a huge toll on people’s mental healthIt’s yet another issue we’re failing to prepare for, as summers get hotter and hotter. (Scientific American $)+ The quest to protect farmworkers from extreme heat. (MIT Technology Review) 6 China’s robotaxi companies are planning to expand in the Middle East And they’re getting a warmer welcome than in the US or Europe. (WSJ $)+ China’s EV giants are also betting big on humanoid robots. (MIT Technology Review)7 AI will supercharge hackersThe full impact of new AI techniques is yet to be felt, but experts say it’s only a matter of time. (Wired $)+ Five ways criminals are using AI. (MIT Technology Review)8 It’s an exciting time to be working on Alzheimer’s treatments 12 of them are moving to the final phase of clinical trials this year. (The Economist $)+ The innovation that gets an Alzheimer’s drug through the blood-brain barrier. (MIT Technology Review)9 Workers are being subjected to more and more surveillanceNot just in the gig economy either—’bossware’ is increasingly appearing in offices too. (Rest of World) 10 Noughties nostalgia is rife on TikTokIt was a pretty fun decade, to be fair. (The Guardian) Quote of the day “This is scientific heaven. Or it used to be.” —Tom Rapoport, a 77-year-old Harvard Medical School professor from Germany, expresses his sadness about Trump’s cuts to US science funding to the New York Times. One more thing OLCF What’s next for the world’s fastest supercomputers When the Frontier supercomputer came online in 2022, it marked the dawn of so-called exascale computing, with machines that can execute an exaflop—or a quintillion (1018) floating point operations a second.Since then, scientists have geared up to make more of these blazingly fast computers: several exascale machines are due to come online in the US and Europe.But speed itself isn’t the endgame. Researchers hope to pursue previously unanswerable questions about nature—and to design new technologies in areas from transportation to medicine. Read the full story. —Sophia Chen We can still have nice things A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.) + If tracking tube trains in London is your thing, you’ll love this live map.+ Take a truly bonkers trip down memory lane, courtesy of these FBI artifacts.+ Netflix’s Frankenstein looks pretty intense.+ Why landlines are so darn spooky

227

· 0 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!
Computerworld UK @ComputerworldUK compartilhou um link
2025-06-01 15:36:47 ·

AI and economic pressures reshape tech jobs amid layoffs

Tech layoffs have continued in 2025. Much of that is being blamed on a combination of a slower economy and the adoption of automation via artificial intelligence.

Nearly four in 10 Americans, for instance, believe generative AIcould diminish the number of available jobs as it advances, according to a study released in October by the New York Federal Reserve Bank.

And the World Economic Forum’s Jobs Initiative study found that close to halfof worker skills will be disrupted in the next five years — and 40% of tasks will be affected by the use of genAI tools and the large language models that underpin them.

In April, the US tech industry lost 214,000 positions as companies shifted toward AI roles and skills-based hiring amid economic uncertainty. Tech sector companies reduced staffing by a net 7,000 positions in April, an analysis of data released by the US Bureau of Labor Statistics showed.

This year, 137 tech companies have fired 62,114 tech employees, according to Layoffs.fyi. Efforts to reduce headcount at government agencies by the unofficial US Department of Government Efficiencysaw an additional 61,296 federal workers fired this year.

Kye Mitchell, president of tech workforce staffing firm Experis US, believes the IT employment market is undergoing a fundamental transformation rather than experiencing traditional cyclical layoffs. Although Experis is seeing a 13% month-over-month decline in traditional software developer postings, it doesn’t represent “job destruction, it’s market evolution,” Mitchell said.

“What we’re witnessing is the emergence of strategic technology orchestrators who harness AI to drive unprecedented business value,” she said.

For example, organizations that once deployed two scrum teams of ten people to develop high-quality software are now achieving superior results with a single team of five AI-empowered developers.

“This isn’t about cutting jobs; it’s about elevating roles,” Mitchell said.

Specialized roles in particular are surging. Database architect positions are up 2,312%, statistician roles have increased 382%, and jobs for mathematicians have increased 1,272%. “These aren’t replacements; they’re vital for an AI-driven future,” she said.

In fact, it’s an IT talent gap, not an employee surplus, that is now challenging organizations — and will continue to do so.

With 76% of IT employers already struggling to find skilled tech talent, the market fundamentals favor skilled professionals, according to Mitchell. “The question isn’t whether there will be IT jobs — it’s whether we can develop the right skills fast enough to meet demand,” she said.

For federal tech workers, outdated systems and slow procurement make it hard to attract and keep top tech talent. Agencies expect fast team deployment but operate with rigid, outdated processes, according to Justin Vianello, CEO of technology workforce development firm SkillStorm.

Long security clearance delays add cost and time, often forcing companies to hire expensive, already-cleared talent. Meanwhile, modern technologists want to use current tools and make an impact — something hard to do with legacy systems and decade-long modernization efforts, he added.

Many suggest turning to AI to will solve the tech talent shortage, but there is no evidence that AI will lead to a reduction in demand for tech talent, Vianello said. “On the contrary, companies see that the demand for tech talent has increased as they invest in preparing their workforce to properly use AI tools,” he said.

A shortage of qualified talent is a bigger barrier to hiring than AI automation, he said, because organizations struggle to find candidates with the right certifications, skills, and clearances — especially in cloud, cybersecurity, and AI. Tech workers often lack skills in these areas because technology evolves faster than education and training can keep up, Vianello said. And while AI helps automate routine tasks, it can’t replace the strategic roles filled by skilled professionals.

Seven out of 10 US organizations are struggling to find skilled workers to fill roles in an ever-evolving digital transformation landscape, and genAI has added to that headache, according to a ManpowerGroup survey released earlier this year.

Job postings for AI skills surged 2,000% in 2024, but education and training in this area haven’t kept pace, according to Kelly Stratman, global ecosystem relationships enablement leader at Ernst & Young.

“As formal education and training in AI skills still lag, it results in a shortage of AI talent that can effectively manage these technologies and demands,” she said in an earlier interview. “The AI talent shortage is most prominent among highly technical roles like data scientists/analysts, machine learning engineers, and software developers.”

Economic uncertainty is creating a cautious hiring environment, but it’s more complex than tariffs alone. Experis data shows employers adopting a “wait and watch” stance as they monitor economic signals, with job openings down 11% year-over-year, according to Mitchell.

“However, the bigger story is strategic workforce planning in an era of rapid technological change. Companies are being incredibly precise about where they allocate resources. Not because of economic pressure alone, but because the skills landscape is shifting so rapidly,” Mitchell said. “They’re prioritizing mission-critical roles while restructuring others around AI capabilities.”

Top organizations see AI as a strategic shift, not just cost-cutting. Cutting talent now risks weakening core areas like cybersecurity, according to Mitchell.

Skillstorm’s Vianello suggests that IT job hunters should begin to upgrade their skills with certifications that matter: AWS, Azure, CISSP, Security+, and AI/ML credentials open doors quickly, he said.

“Veterans, in particular, have an edge; they bring leadership, discipline, and security clearances. Apprenticeships and fellowships offer a fast track into full-time roles by giving you experience that actually counts. And don’t overlook the intangibles: soft skills and project leadership are what elevate technologists into impact-makers,” Vianello said.

Skills-based hiring has been on the rise for several years, as organizations seek to fill specific needs for big data analytics, programing, and AI prompt engineering. In fact, demand for genAI courses is surging, passing all other tech skills courses spanning fields from data science to cybersecurity, project management, and marketing.

“AI isn’t replacing jobs — it’s fundamentally redefining how work gets done. The break point where technology truly displaces a position is when roughly 80% of tasks can be fully automated,” Mitchell said. “We’re nowhere near that threshold for most roles. Instead, we’re seeing AI augment skill sets and make professionals more capable, faster, and able to focus on higher-value work.”

Leaders use AI as a strategic enabler — embedding it to enhance, not compete with, human developers, she said.

Some industry forecasts predict a 30% productivity boost from AI tools, potentially adding more than trillion to global GDP.

For example, AI tools are expected to perform the lion’s share of coding. Techniques where humans use AI-augmented coding tools, such as “vibe coding,” are set to revolutionize software development by creating source code, generating tests automatically, and freeing up developer time for innovation instead of debugging code.

With vibe coding, developers use natural language in a conversational way that prompts the AI model to offer contextual ideas and generate code based on the conversation.

By 2028, 75% of professional developers will be using vibe coding and other genAI-powered coding tools, up from less than 10% in September 2023, according to Gartner Research. And within three years, 80% of enterprises will have integrated AI-augmented testing tools into their software engineering tool chain — a significant increase from approximately 15% early last year, Gartner said.

A report from MIT Technology Review Insights found that 94% of business leaders now use genAI in software development, with 82% applying it in multiple stages — and 26% in four or more.

Some industry experts place genAI’s use in creating code much higher. “What we are finding is that we’re three to six months from a world where AI is writing 90% of the code. And then in 12 months, we may be in a world where AI is writing essentially all of the code,” Anthropic CEO Dario Amodei said in a recent report and video interview.

“The realtransformation is in role evolution. Developers are becoming strategic technology orchestrators,” Mitchell from Experis said. “Data professionals are becoming business problem solvers. The demand isn’t disappearing; it’s becoming more sophisticated and more valuable.

“In today’s economic climate, having the right tech talent with AI-enhanced capabilities isn’t a nice-to-have, it’s your competitive edge,” she said.
#economic #pressures #reshape #tech #jobs

AI and economic pressures reshape tech jobs amid layoffs
Tech layoffs have continued in 2025. Much of that is being blamed on a combination of a slower economy and the adoption of automation via artificial intelligence. Nearly four in 10 Americans, for instance, believe generative AIcould diminish the number of available jobs as it advances, according to a study released in October by the New York Federal Reserve Bank. And the World Economic Forum’s Jobs Initiative study found that close to halfof worker skills will be disrupted in the next five years — and 40% of tasks will be affected by the use of genAI tools and the large language models that underpin them. In April, the US tech industry lost 214,000 positions as companies shifted toward AI roles and skills-based hiring amid economic uncertainty. Tech sector companies reduced staffing by a net 7,000 positions in April, an analysis of data released by the US Bureau of Labor Statistics showed. This year, 137 tech companies have fired 62,114 tech employees, according to Layoffs.fyi. Efforts to reduce headcount at government agencies by the unofficial US Department of Government Efficiencysaw an additional 61,296 federal workers fired this year. Kye Mitchell, president of tech workforce staffing firm Experis US, believes the IT employment market is undergoing a fundamental transformation rather than experiencing traditional cyclical layoffs. Although Experis is seeing a 13% month-over-month decline in traditional software developer postings, it doesn’t represent “job destruction, it’s market evolution,” Mitchell said. “What we’re witnessing is the emergence of strategic technology orchestrators who harness AI to drive unprecedented business value,” she said. For example, organizations that once deployed two scrum teams of ten people to develop high-quality software are now achieving superior results with a single team of five AI-empowered developers. “This isn’t about cutting jobs; it’s about elevating roles,” Mitchell said. Specialized roles in particular are surging. Database architect positions are up 2,312%, statistician roles have increased 382%, and jobs for mathematicians have increased 1,272%. “These aren’t replacements; they’re vital for an AI-driven future,” she said. In fact, it’s an IT talent gap, not an employee surplus, that is now challenging organizations — and will continue to do so. With 76% of IT employers already struggling to find skilled tech talent, the market fundamentals favor skilled professionals, according to Mitchell. “The question isn’t whether there will be IT jobs — it’s whether we can develop the right skills fast enough to meet demand,” she said. For federal tech workers, outdated systems and slow procurement make it hard to attract and keep top tech talent. Agencies expect fast team deployment but operate with rigid, outdated processes, according to Justin Vianello, CEO of technology workforce development firm SkillStorm. Long security clearance delays add cost and time, often forcing companies to hire expensive, already-cleared talent. Meanwhile, modern technologists want to use current tools and make an impact — something hard to do with legacy systems and decade-long modernization efforts, he added. Many suggest turning to AI to will solve the tech talent shortage, but there is no evidence that AI will lead to a reduction in demand for tech talent, Vianello said. “On the contrary, companies see that the demand for tech talent has increased as they invest in preparing their workforce to properly use AI tools,” he said. A shortage of qualified talent is a bigger barrier to hiring than AI automation, he said, because organizations struggle to find candidates with the right certifications, skills, and clearances — especially in cloud, cybersecurity, and AI. Tech workers often lack skills in these areas because technology evolves faster than education and training can keep up, Vianello said. And while AI helps automate routine tasks, it can’t replace the strategic roles filled by skilled professionals. Seven out of 10 US organizations are struggling to find skilled workers to fill roles in an ever-evolving digital transformation landscape, and genAI has added to that headache, according to a ManpowerGroup survey released earlier this year. Job postings for AI skills surged 2,000% in 2024, but education and training in this area haven’t kept pace, according to Kelly Stratman, global ecosystem relationships enablement leader at Ernst & Young. “As formal education and training in AI skills still lag, it results in a shortage of AI talent that can effectively manage these technologies and demands,” she said in an earlier interview. “The AI talent shortage is most prominent among highly technical roles like data scientists/analysts, machine learning engineers, and software developers.” Economic uncertainty is creating a cautious hiring environment, but it’s more complex than tariffs alone. Experis data shows employers adopting a “wait and watch” stance as they monitor economic signals, with job openings down 11% year-over-year, according to Mitchell. “However, the bigger story is strategic workforce planning in an era of rapid technological change. Companies are being incredibly precise about where they allocate resources. Not because of economic pressure alone, but because the skills landscape is shifting so rapidly,” Mitchell said. “They’re prioritizing mission-critical roles while restructuring others around AI capabilities.” Top organizations see AI as a strategic shift, not just cost-cutting. Cutting talent now risks weakening core areas like cybersecurity, according to Mitchell. Skillstorm’s Vianello suggests that IT job hunters should begin to upgrade their skills with certifications that matter: AWS, Azure, CISSP, Security+, and AI/ML credentials open doors quickly, he said. “Veterans, in particular, have an edge; they bring leadership, discipline, and security clearances. Apprenticeships and fellowships offer a fast track into full-time roles by giving you experience that actually counts. And don’t overlook the intangibles: soft skills and project leadership are what elevate technologists into impact-makers,” Vianello said. Skills-based hiring has been on the rise for several years, as organizations seek to fill specific needs for big data analytics, programing, and AI prompt engineering. In fact, demand for genAI courses is surging, passing all other tech skills courses spanning fields from data science to cybersecurity, project management, and marketing. “AI isn’t replacing jobs — it’s fundamentally redefining how work gets done. The break point where technology truly displaces a position is when roughly 80% of tasks can be fully automated,” Mitchell said. “We’re nowhere near that threshold for most roles. Instead, we’re seeing AI augment skill sets and make professionals more capable, faster, and able to focus on higher-value work.” Leaders use AI as a strategic enabler — embedding it to enhance, not compete with, human developers, she said. Some industry forecasts predict a 30% productivity boost from AI tools, potentially adding more than trillion to global GDP. For example, AI tools are expected to perform the lion’s share of coding. Techniques where humans use AI-augmented coding tools, such as “vibe coding,” are set to revolutionize software development by creating source code, generating tests automatically, and freeing up developer time for innovation instead of debugging code. With vibe coding, developers use natural language in a conversational way that prompts the AI model to offer contextual ideas and generate code based on the conversation. By 2028, 75% of professional developers will be using vibe coding and other genAI-powered coding tools, up from less than 10% in September 2023, according to Gartner Research. And within three years, 80% of enterprises will have integrated AI-augmented testing tools into their software engineering tool chain — a significant increase from approximately 15% early last year, Gartner said. A report from MIT Technology Review Insights found that 94% of business leaders now use genAI in software development, with 82% applying it in multiple stages — and 26% in four or more. Some industry experts place genAI’s use in creating code much higher. “What we are finding is that we’re three to six months from a world where AI is writing 90% of the code. And then in 12 months, we may be in a world where AI is writing essentially all of the code,” Anthropic CEO Dario Amodei said in a recent report and video interview. “The realtransformation is in role evolution. Developers are becoming strategic technology orchestrators,” Mitchell from Experis said. “Data professionals are becoming business problem solvers. The demand isn’t disappearing; it’s becoming more sophisticated and more valuable. “In today’s economic climate, having the right tech talent with AI-enhanced capabilities isn’t a nice-to-have, it’s your competitive edge,” she said. #economic #pressures #reshape #tech #jobs

AI and economic pressures reshape tech jobs amid layoffs

www.computerworld.com
Tech layoffs have continued in 2025. Much of that is being blamed on a combination of a slower economy and the adoption of automation via artificial intelligence. Nearly four in 10 Americans, for instance, believe generative AI (genAI) could diminish the number of available jobs as it advances, according to a study released in October by the New York Federal Reserve Bank. And the World Economic Forum’s Jobs Initiative study found that close to half (44%) of worker skills will be disrupted in the next five years — and 40% of tasks will be affected by the use of genAI tools and the large language models (LLMs) that underpin them. In April, the US tech industry lost 214,000 positions as companies shifted toward AI roles and skills-based hiring amid economic uncertainty. Tech sector companies reduced staffing by a net 7,000 positions in April, an analysis of data released by the US Bureau of Labor Statistics (BLS) showed. This year, 137 tech companies have fired 62,114 tech employees, according to Layoffs.fyi. Efforts to reduce headcount at government agencies by the unofficial US Department of Government Efficiency (DOGE) saw an additional 61,296 federal workers fired this year. Kye Mitchell, president of tech workforce staffing firm Experis US, believes the IT employment market is undergoing a fundamental transformation rather than experiencing traditional cyclical layoffs. Although Experis is seeing a 13% month-over-month decline in traditional software developer postings, it doesn’t represent “job destruction, it’s market evolution,” Mitchell said. “What we’re witnessing is the emergence of strategic technology orchestrators who harness AI to drive unprecedented business value,” she said. For example, organizations that once deployed two scrum teams of ten people to develop high-quality software are now achieving superior results with a single team of five AI-empowered developers. “This isn’t about cutting jobs; it’s about elevating roles,” Mitchell said. Specialized roles in particular are surging. Database architect positions are up 2,312%, statistician roles have increased 382%, and jobs for mathematicians have increased 1,272%. “These aren’t replacements; they’re vital for an AI-driven future,” she said. In fact, it’s an IT talent gap, not an employee surplus, that is now challenging organizations — and will continue to do so. With 76% of IT employers already struggling to find skilled tech talent, the market fundamentals favor skilled professionals, according to Mitchell. “The question isn’t whether there will be IT jobs — it’s whether we can develop the right skills fast enough to meet demand,” she said. For federal tech workers, outdated systems and slow procurement make it hard to attract and keep top tech talent. Agencies expect fast team deployment but operate with rigid, outdated processes, according to Justin Vianello, CEO of technology workforce development firm SkillStorm. Long security clearance delays add cost and time, often forcing companies to hire expensive, already-cleared talent. Meanwhile, modern technologists want to use current tools and make an impact — something hard to do with legacy systems and decade-long modernization efforts, he added. Many suggest turning to AI to will solve the tech talent shortage, but there is no evidence that AI will lead to a reduction in demand for tech talent, Vianello said. “On the contrary, companies see that the demand for tech talent has increased as they invest in preparing their workforce to properly use AI tools,” he said. A shortage of qualified talent is a bigger barrier to hiring than AI automation, he said, because organizations struggle to find candidates with the right certifications, skills, and clearances — especially in cloud, cybersecurity, and AI. Tech workers often lack skills in these areas because technology evolves faster than education and training can keep up, Vianello said. And while AI helps automate routine tasks, it can’t replace the strategic roles filled by skilled professionals. Seven out of 10 US organizations are struggling to find skilled workers to fill roles in an ever-evolving digital transformation landscape, and genAI has added to that headache, according to a ManpowerGroup survey released earlier this year. Job postings for AI skills surged 2,000% in 2024, but education and training in this area haven’t kept pace, according to Kelly Stratman, global ecosystem relationships enablement leader at Ernst & Young. “As formal education and training in AI skills still lag, it results in a shortage of AI talent that can effectively manage these technologies and demands,” she said in an earlier interview. “The AI talent shortage is most prominent among highly technical roles like data scientists/analysts, machine learning engineers, and software developers.” Economic uncertainty is creating a cautious hiring environment, but it’s more complex than tariffs alone. Experis data shows employers adopting a “wait and watch” stance as they monitor economic signals, with job openings down 11% year-over-year, according to Mitchell. “However, the bigger story is strategic workforce planning in an era of rapid technological change. Companies are being incredibly precise about where they allocate resources. Not because of economic pressure alone, but because the skills landscape is shifting so rapidly,” Mitchell said. “They’re prioritizing mission-critical roles while restructuring others around AI capabilities.” Top organizations see AI as a strategic shift, not just cost-cutting. Cutting talent now risks weakening core areas like cybersecurity, according to Mitchell. Skillstorm’s Vianello suggests that IT job hunters should begin to upgrade their skills with certifications that matter: AWS, Azure, CISSP, Security+, and AI/ML credentials open doors quickly, he said. “Veterans, in particular, have an edge; they bring leadership, discipline, and security clearances. Apprenticeships and fellowships offer a fast track into full-time roles by giving you experience that actually counts. And don’t overlook the intangibles: soft skills and project leadership are what elevate technologists into impact-makers,” Vianello said. Skills-based hiring has been on the rise for several years, as organizations seek to fill specific needs for big data analytics, programing (such as Rust), and AI prompt engineering. In fact, demand for genAI courses is surging, passing all other tech skills courses spanning fields from data science to cybersecurity, project management, and marketing. “AI isn’t replacing jobs — it’s fundamentally redefining how work gets done. The break point where technology truly displaces a position is when roughly 80% of tasks can be fully automated,” Mitchell said. “We’re nowhere near that threshold for most roles. Instead, we’re seeing AI augment skill sets and make professionals more capable, faster, and able to focus on higher-value work.” Leaders use AI as a strategic enabler — embedding it to enhance, not compete with, human developers, she said. Some industry forecasts predict a 30% productivity boost from AI tools, potentially adding more than $1.5 trillion to global GDP. For example, AI tools are expected to perform the lion’s share of coding. Techniques where humans use AI-augmented coding tools, such as “vibe coding,” are set to revolutionize software development by creating source code, generating tests automatically, and freeing up developer time for innovation instead of debugging code. With vibe coding, developers use natural language in a conversational way that prompts the AI model to offer contextual ideas and generate code based on the conversation. By 2028, 75% of professional developers will be using vibe coding and other genAI-powered coding tools, up from less than 10% in September 2023, according to Gartner Research. And within three years, 80% of enterprises will have integrated AI-augmented testing tools into their software engineering tool chain — a significant increase from approximately 15% early last year, Gartner said. A report from MIT Technology Review Insights found that 94% of business leaders now use genAI in software development, with 82% applying it in multiple stages — and 26% in four or more. Some industry experts place genAI’s use in creating code much higher. “What we are finding is that we’re three to six months from a world where AI is writing 90% of the code. And then in 12 months, we may be in a world where AI is writing essentially all of the code,” Anthropic CEO Dario Amodei said in a recent report and video interview. “The real [AI] transformation is in role evolution. Developers are becoming strategic technology orchestrators,” Mitchell from Experis said. “Data professionals are becoming business problem solvers. The demand isn’t disappearing; it’s becoming more sophisticated and more valuable. “In today’s economic climate, having the right tech talent with AI-enhanced capabilities isn’t a nice-to-have, it’s your competitive edge,” she said.

0 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!
Scientific American @ScientificAmerican compartilhou um link
2025-05-23 14:37:15 ·

The Creepy Calculus of Measuring Death Risk

May 23, 20255 min readThe Creepy Calculus of Measuring Death RiskMeet micromorts and microlives, statistical units that help mathematicians to calculate riskBy Manon Bischoff edited by Daisy Yuhas M-SUR/Alamy Stock PhotoPeople are generally bad at assessing probabilities. That’s why we have irrational fears and why we overestimate our odds of winning the lottery.Whenever I have to travel by plane, for example, my palms sweat, my heart races and my thoughts take a gloomy turn. I should be much more worried when I get on my bike in Darmstadt, Germany, where I live. Statistically, I’m in much greater danger on the road than in the air. Yet my bike commute doesn’t cause me any stress at all.Recently, a friend told me about a concept within decision theory that is supposed to help people get a better sense of hazards and risks. In 1980 electrical engineer Ronald Arthur Howard coined the micromort unit to quantify life-threatening danger.On supporting science journalismIf you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today.One micromort corresponds to a one-in-a-million chance of dying during a certain activity. Do you want to run a marathon? The risk is seven micromorts. Are you going under general anesthesia? That’s 10 micromorts. To arrive at these figures, you first need detailed statistics. How many people engaged in these activities and died in the process? And the results depend heavily on the group of people being studied, as well as the geographic location.Better Living through StatisticsSurprisingly, the history of statistics doesn’t go back very far. In the 17th century, British demographer John Graunt pioneered mortality statistics by analyzing records of deaths and baptisms. But it would take another 200 years for society to recognize the social benefits of these approaches.Today the utility of this mathematical subfield is undisputed. Insurance companies and banks use statistics to carry out risk assessments. Statistical surveys make it possible to investigate psychological and sociological phenomena. Physical research would be unthinkable without statistics.Thanks to Howard and the micromort, the risks in our everyday lives can also be estimated with the help of statistics. By examining the proportion of people who die while undertaking a particular activity, he was able to create a general mortality risk for those activities.But more recently, mathematician David Spiegelhalter noticed something missing in Howard’s analysis: the micromort unit merely indicates how likely it is that a very specific action will kill us. This may make sense for a one-off activity such as climbing a mountain. But for long-term habits, such as regularly eating fast food, the measure is of only limited use.For example, smoking a cigarette causes just 0.21 micromort and would therefore be significantly less risky than getting out of bed in the morning at the age of 45. Smoking, however, has long-lasting negative consequences for the body that getting up in the morning does not. The long-term risk is therefore not recorded.So Spiegelhalter introduced the “microlife” measure to take into account the long-term effects of different activities. This quantifies how much life you lose on average by carrying out an activity. Each microlife that is lost reduces your life expectancy by half an hour. Two hours of watching TV each day might cost one microlife, for instance.One of the most significant differences between micromorts and microlives is that one of the two types of units compounds over time, and the other does not. If I survive my morning bike ride to the Darmstadt train station, my micromort count for that ride drops back to zero. The next day I start the journey again with the same risk.It’s different with microlife data: if I smoke a cigarette and then a second one an hour later, the time I’ve lost adds up. And of course, the mere ticking of the clock also shortens my available years of life. Every day 48 microlives are lost.But unlike micromorts, I can regain microlives. For example, a 20-minute walk provides me with around two microlives—that is, an extra hour of life expectancy. And eating a healthy diet with fruits and vegetables could gain you four microlives daily.Reality CheckAll these facts and figures are entertaining to read about and can make for interesting conversation starters—“Hey, did you know that this beer shortens your life by about 15 minutes?”—at least with the right crowd. But how do you calculate the microlives you lose as a result of an action?First, you have to compare the life expectancy of different people. For example: How does the life expectancy of smokers and nonsmokers differ? By taking this difference and dividing it by the average number of cigarettes smoked, we can calculate the average amount of time that each cigarette robs us of.This result is clearly inexact. The difference in life expectancy will also depend on factors such as a person’s gender, place of residence and age. These data can still be captured, but when it comes to general lifestyle factors, things get complicated. For example, studies show that many smokers generally have an unhealthier lifestyle and exercise less.Such correlations cannot always be calculated and accounted for. When it comes to smoking, however, there have been long-term studies that followed many people, some of whom stopped smoking at some point in their life, over several decades.These data make it easier to isolate the effect that smoking has on a person’s life expectancy. Such research suggests that a single cigarette is likely to rob a person of slightly less than the originally calculated 15 minutes of life if they have the other lifestyle habits of a nonsmoker. So should we be consulting statistics at the start of every day to maximize our lifespan? Perhaps we should be studying these analyses to engage in activities with as few micromorts as possible and try to gain, rather than lose, microlives?Not exactly. Micromorts and microlives can help you better assess risks. But you shouldn’t attach too much importance to them. After all, our world is complex. You may gain back two microlives during a walk, but you could also get in an unlucky accident along the way and be hit by a car. Ultimately, micromorts and microlives are just too simple a tool to evaluate the full range of consequences associated with an action. Exercise can improve your state of mind, which has a positive effect not only on your quality of life but also on your lifespan.That said, it can still be a source of comfort to turn to statistics—particularly when we want to understand if our fear is rational or not. For my part, I will try to remind myself of how few micromorts are associated with flying. Maybe that will help.This article originally appeared in Spektrum der Wissenschaft and was reproduced with permission.
#creepy #calculus #measuring #death #risk

The Creepy Calculus of Measuring Death Risk
May 23, 20255 min readThe Creepy Calculus of Measuring Death RiskMeet micromorts and microlives, statistical units that help mathematicians to calculate riskBy Manon Bischoff edited by Daisy Yuhas M-SUR/Alamy Stock PhotoPeople are generally bad at assessing probabilities. That’s why we have irrational fears and why we overestimate our odds of winning the lottery.Whenever I have to travel by plane, for example, my palms sweat, my heart races and my thoughts take a gloomy turn. I should be much more worried when I get on my bike in Darmstadt, Germany, where I live. Statistically, I’m in much greater danger on the road than in the air. Yet my bike commute doesn’t cause me any stress at all.Recently, a friend told me about a concept within decision theory that is supposed to help people get a better sense of hazards and risks. In 1980 electrical engineer Ronald Arthur Howard coined the micromort unit to quantify life-threatening danger.On supporting science journalismIf you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today.One micromort corresponds to a one-in-a-million chance of dying during a certain activity. Do you want to run a marathon? The risk is seven micromorts. Are you going under general anesthesia? That’s 10 micromorts. To arrive at these figures, you first need detailed statistics. How many people engaged in these activities and died in the process? And the results depend heavily on the group of people being studied, as well as the geographic location.Better Living through StatisticsSurprisingly, the history of statistics doesn’t go back very far. In the 17th century, British demographer John Graunt pioneered mortality statistics by analyzing records of deaths and baptisms. But it would take another 200 years for society to recognize the social benefits of these approaches.Today the utility of this mathematical subfield is undisputed. Insurance companies and banks use statistics to carry out risk assessments. Statistical surveys make it possible to investigate psychological and sociological phenomena. Physical research would be unthinkable without statistics.Thanks to Howard and the micromort, the risks in our everyday lives can also be estimated with the help of statistics. By examining the proportion of people who die while undertaking a particular activity, he was able to create a general mortality risk for those activities.But more recently, mathematician David Spiegelhalter noticed something missing in Howard’s analysis: the micromort unit merely indicates how likely it is that a very specific action will kill us. This may make sense for a one-off activity such as climbing a mountain. But for long-term habits, such as regularly eating fast food, the measure is of only limited use.For example, smoking a cigarette causes just 0.21 micromort and would therefore be significantly less risky than getting out of bed in the morning at the age of 45. Smoking, however, has long-lasting negative consequences for the body that getting up in the morning does not. The long-term risk is therefore not recorded.So Spiegelhalter introduced the “microlife” measure to take into account the long-term effects of different activities. This quantifies how much life you lose on average by carrying out an activity. Each microlife that is lost reduces your life expectancy by half an hour. Two hours of watching TV each day might cost one microlife, for instance.One of the most significant differences between micromorts and microlives is that one of the two types of units compounds over time, and the other does not. If I survive my morning bike ride to the Darmstadt train station, my micromort count for that ride drops back to zero. The next day I start the journey again with the same risk.It’s different with microlife data: if I smoke a cigarette and then a second one an hour later, the time I’ve lost adds up. And of course, the mere ticking of the clock also shortens my available years of life. Every day 48 microlives are lost.But unlike micromorts, I can regain microlives. For example, a 20-minute walk provides me with around two microlives—that is, an extra hour of life expectancy. And eating a healthy diet with fruits and vegetables could gain you four microlives daily.Reality CheckAll these facts and figures are entertaining to read about and can make for interesting conversation starters—“Hey, did you know that this beer shortens your life by about 15 minutes?”—at least with the right crowd. But how do you calculate the microlives you lose as a result of an action?First, you have to compare the life expectancy of different people. For example: How does the life expectancy of smokers and nonsmokers differ? By taking this difference and dividing it by the average number of cigarettes smoked, we can calculate the average amount of time that each cigarette robs us of.This result is clearly inexact. The difference in life expectancy will also depend on factors such as a person’s gender, place of residence and age. These data can still be captured, but when it comes to general lifestyle factors, things get complicated. For example, studies show that many smokers generally have an unhealthier lifestyle and exercise less.Such correlations cannot always be calculated and accounted for. When it comes to smoking, however, there have been long-term studies that followed many people, some of whom stopped smoking at some point in their life, over several decades.These data make it easier to isolate the effect that smoking has on a person’s life expectancy. Such research suggests that a single cigarette is likely to rob a person of slightly less than the originally calculated 15 minutes of life if they have the other lifestyle habits of a nonsmoker. So should we be consulting statistics at the start of every day to maximize our lifespan? Perhaps we should be studying these analyses to engage in activities with as few micromorts as possible and try to gain, rather than lose, microlives?Not exactly. Micromorts and microlives can help you better assess risks. But you shouldn’t attach too much importance to them. After all, our world is complex. You may gain back two microlives during a walk, but you could also get in an unlucky accident along the way and be hit by a car. Ultimately, micromorts and microlives are just too simple a tool to evaluate the full range of consequences associated with an action. Exercise can improve your state of mind, which has a positive effect not only on your quality of life but also on your lifespan.That said, it can still be a source of comfort to turn to statistics—particularly when we want to understand if our fear is rational or not. For my part, I will try to remind myself of how few micromorts are associated with flying. Maybe that will help.This article originally appeared in Spektrum der Wissenschaft and was reproduced with permission. #creepy #calculus #measuring #death #risk

The Creepy Calculus of Measuring Death Risk

www.scientificamerican.com
May 23, 20255 min readThe Creepy Calculus of Measuring Death RiskMeet micromorts and microlives, statistical units that help mathematicians to calculate riskBy Manon Bischoff edited by Daisy Yuhas M-SUR/Alamy Stock PhotoPeople are generally bad at assessing probabilities. That’s why we have irrational fears and why we overestimate our odds of winning the lottery.Whenever I have to travel by plane, for example, my palms sweat, my heart races and my thoughts take a gloomy turn. I should be much more worried when I get on my bike in Darmstadt, Germany, where I live. Statistically, I’m in much greater danger on the road than in the air. Yet my bike commute doesn’t cause me any stress at all.Recently, a friend told me about a concept within decision theory that is supposed to help people get a better sense of hazards and risks. In 1980 electrical engineer Ronald Arthur Howard coined the micromort unit to quantify life-threatening danger.On supporting science journalismIf you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today.One micromort corresponds to a one-in-a-million chance of dying during a certain activity. Do you want to run a marathon? The risk is seven micromorts. Are you going under general anesthesia? That’s 10 micromorts. To arrive at these figures, you first need detailed statistics. How many people engaged in these activities and died in the process? And the results depend heavily on the group of people being studied (their age, gender, and so on), as well as the geographic location.Better Living through StatisticsSurprisingly, the history of statistics doesn’t go back very far. In the 17th century, British demographer John Graunt pioneered mortality statistics by analyzing records of deaths and baptisms. But it would take another 200 years for society to recognize the social benefits of these approaches.Today the utility of this mathematical subfield is undisputed. Insurance companies and banks use statistics to carry out risk assessments. Statistical surveys make it possible to investigate psychological and sociological phenomena. Physical research would be unthinkable without statistics.Thanks to Howard and the micromort, the risks in our everyday lives can also be estimated with the help of statistics. By examining the proportion of people who die while undertaking a particular activity, he was able to create a general mortality risk for those activities.But more recently, mathematician David Spiegelhalter noticed something missing in Howard’s analysis: the micromort unit merely indicates how likely it is that a very specific action will kill us. This may make sense for a one-off activity such as climbing a mountain. But for long-term habits, such as regularly eating fast food, the measure is of only limited use.For example, smoking a cigarette causes just 0.21 micromort and would therefore be significantly less risky than getting out of bed in the morning at the age of 45 (which results in six micromorts). Smoking, however, has long-lasting negative consequences for the body that getting up in the morning does not. The long-term risk is therefore not recorded.So Spiegelhalter introduced the “microlife” measure to take into account the long-term effects of different activities. This quantifies how much life you lose on average by carrying out an activity. Each microlife that is lost reduces your life expectancy by half an hour. Two hours of watching TV each day might cost one microlife, for instance.One of the most significant differences between micromorts and microlives is that one of the two types of units compounds over time, and the other does not. If I survive my morning bike ride to the Darmstadt train station, my micromort count for that ride drops back to zero. The next day I start the journey again with the same risk.It’s different with microlife data: if I smoke a cigarette and then a second one an hour later, the time I’ve lost adds up. And of course, the mere ticking of the clock also shortens my available years of life. Every day 48 microlives are lost.But unlike micromorts, I can regain microlives. For example, a 20-minute walk provides me with around two microlives—that is, an extra hour of life expectancy. And eating a healthy diet with fruits and vegetables could gain you four microlives daily.Reality CheckAll these facts and figures are entertaining to read about and can make for interesting conversation starters—“Hey, did you know that this beer shortens your life by about 15 minutes?”—at least with the right crowd. But how do you calculate the microlives you lose as a result of an action?First, you have to compare the life expectancy of different people. For example: How does the life expectancy of smokers and nonsmokers differ? By taking this difference and dividing it by the average number of cigarettes smoked, we can calculate the average amount of time that each cigarette robs us of.This result is clearly inexact. The difference in life expectancy will also depend on factors such as a person’s gender, place of residence and age. These data can still be captured, but when it comes to general lifestyle factors, things get complicated. For example, studies show that many smokers generally have an unhealthier lifestyle and exercise less.Such correlations cannot always be calculated and accounted for. When it comes to smoking, however, there have been long-term studies that followed many people, some of whom stopped smoking at some point in their life, over several decades.These data make it easier to isolate the effect that smoking has on a person’s life expectancy. Such research suggests that a single cigarette is likely to rob a person of slightly less than the originally calculated 15 minutes of life if they have the other lifestyle habits of a nonsmoker. So should we be consulting statistics at the start of every day to maximize our lifespan? Perhaps we should be studying these analyses to engage in activities with as few micromorts as possible and try to gain, rather than lose, microlives?Not exactly. Micromorts and microlives can help you better assess risks. But you shouldn’t attach too much importance to them. After all, our world is complex. You may gain back two microlives during a walk, but you could also get in an unlucky accident along the way and be hit by a car. Ultimately, micromorts and microlives are just too simple a tool to evaluate the full range of consequences associated with an action. Exercise can improve your state of mind, which has a positive effect not only on your quality of life but also on your lifespan.That said, it can still be a source of comfort to turn to statistics—particularly when we want to understand if our fear is rational or not. For my part, I will try to remind myself of how few micromorts are associated with flying. Maybe that will help.This article originally appeared in Spektrum der Wissenschaft and was reproduced with permission.

0 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!
MIT Technology Review @MITT compartilhou um link
2025-05-19 22:07:22 ·

Inside the story that enraged OpenAI

In 2019, Karen Hao, a senior reporter with MIT Technology Review, pitched me on writing a story about a then little-known company, OpenAI. It was her biggest assignment to date. Hao’s feat of reporting took a series of twists and turns over the coming months, eventually revealing how OpenAI’s ambition had taken it far afield from its original mission. The finished story was a prescient look at a company at a tipping point—or already past it. And OpenAI was not happy with the result. Hao’s new book, Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, is an in-depth exploration of the company that kick-started the AI arms race, and what that race means for all of us. This excerpt is the origin story of that reporting. — Niall Firth, executive editor, MIT Technology Review

I arrived at OpenAI’s offices on August 7, 2019. Greg Brockman, then thirty‑one, OpenAI’s chief technology officer and soon‑to‑be company president, came down the staircase to greet me. He shook my hand with a tentative smile. “We’ve never given someone so much access before,” he said.

At the time, few people beyond the insular world of AI research knew about OpenAI. But as a reporter at MIT Technology Review covering the ever‑expanding boundaries of artificial intelligence, I had been following its movements closely.

Until that year, OpenAI had been something of a stepchild in AI research. It had an outlandish premise that AGI could be attained within a decade, when most non‑OpenAI experts doubted it could be attained at all. To much of the field, it had an obscene amount of funding despite little direction and spent too much of the money on marketing what other researchers frequently snubbed as unoriginal research. It was, for some, also an object of envy. As a nonprofit, it had said that it had no intention to chase commercialization. It was a rare intellectual playground without strings attached, a haven for fringe ideas.

But in the six months leading up to my visit, the rapid slew of changes at OpenAI signaled a major shift in its trajectory. First was its confusing decision to withhold GPT‑2 and brag about it. Then its announcement that Sam Altman, who had mysteriously departed his influential perch at YC, would step in as OpenAI’s CEO with the creation of its new “capped‑profit” structure. I had already made my arrangements to visit the office when it subsequently revealed its deal with Microsoft, which gave the tech giant priority for commercializing OpenAI’s technologies and locked it into exclusively using Azure, Microsoft’s cloud‑computing platform.

Each new announcement garnered fresh controversy, intense speculation, and growing attention, beginning to reach beyond the confines of the tech industry. As my colleagues and I covered the company’s progression, it was hard to grasp the full weight of what was happening. What was clear was that OpenAI was beginning to exert meaningful sway over AI research and the way policymakers were learning to understand the technology. The lab’s decision to revamp itself into a partially for‑profit business would have ripple effects across its spheres of influence in industry and government.

So late one night, with the urging of my editor, I dashed off an email to Jack Clark, OpenAI’s policy director, whom I had spoken with before: I would be in town for two weeks, and it felt like the right moment in OpenAI’s history. Could I interest them in a profile? Clark passed me on to the communications head, who came back with an answer. OpenAI was indeed ready to reintroduce itself to the public. I would have three days to interview leadership and embed inside the company.

Brockman and I settled into a glass meeting room with the company’s chief scientist, Ilya Sutskever. Sitting side by side at a long conference table, they each played their part. Brockman, the coder and doer, leaned forward, a little on edge, ready to make a good impression; Sutskever, the researcher and philosopher, settled back into his chair, relaxed and aloof.

I opened my laptop and scrolled through my questions. OpenAI’s mission is to ensure beneficial AGI, I began. Why spend billions of dollars on this problem and not something else?

Brockman nodded vigorously. He was used to defending OpenAI’s position. “The reason that we care so much about AGI and that we think it’s important to build is because we think it can help solve complex problems that are just out of reach of humans,” he said.

He offered two examples that had become dogma among AGI believers. Climate change. “It’s a super‑complex problem. How are you even supposed to solve it?” And medicine. “Look at how important health care is in the US as a political issue these days. How do we actually get better treatment for people at lower cost?”

On the latter, he began to recount the story of a friend who had a rare disorder and had recently gone through the exhausting rigmarole of bouncing between different specialists to figure out his problem. AGI would bring together all of these specialties. People like his friend would no longer spend so much energy and frustration on getting an answer.

Why did we need AGI to do that instead of AI? I asked.

This was an important distinction. The term AGI, once relegated to an unpopular section of the technology dictionary, had only recently begun to gain more mainstream usage—in large part because of OpenAI.

And as OpenAI defined it, AGI referred to a theoretical pinnacle of AI research: a piece of software that had just as much sophistication, agility, and creativity as the human mind to match or exceed its performance on mosttasks. The operative word was theoretical. Since the beginning of earnest research into AI several decades earlier, debates had raged about whether silicon chips encoding everything in their binary ones and zeros could ever simulate brains and the other biological processes that give rise to what we consider intelligence. There had yet to be definitive evidence that this was possible, which didn’t even touch on the normative discussion of whether people should develop it.

AI, on the other hand, was the term du jour for both the version of the technology currently available and the version that researchers could reasonably attain in the near future through refining existing capabilities. Those capabilities—rooted in powerful pattern matching known as machine learning—had already demonstrated exciting applications in climate change mitigation and health care.

Sutskever chimed in. When it comes to solving complex global challenges, “fundamentally the bottleneck is that you have a large number of humans and they don’t communicate as fast, they don’t work as fast, they have a lot of incentive problems.” AGI would be different, he said. “Imagine it’s a large computer network of intelligent computers—they’re all doing their medical diagnostics; they all communicate results between them extremely fast.”

This seemed to me like another way of saying that the goal of AGI was to replace humans. Is that what Sutskever meant? I asked Brockman a few hours later, once it was just the two of us.

“No,” Brockman replied quickly. “This is one thing that’s really important. What is the purpose of technology? Why is it here? Why do we build it? We’ve been building technologies for thousands of years now, right? We do it because they serve people. AGI is not going to be different—not the way that we envision it, not the way we want to build it, not the way we think it should play out.”

That said, he acknowledged a few minutes later, technology had always destroyed some jobs and created others. OpenAI’s challenge would be to build AGI that gave everyone “economic freedom” while allowing them to continue to “live meaningful lives” in that new reality. If it succeeded, it would decouple the need to work from survival.

“I actually think that’s a very beautiful thing,” he said.

In our meeting with Sutskever, Brockman reminded me of the bigger picture. “What we view our role as is not actually being a determiner of whether AGI gets built,” he said. This was a favorite argument in Silicon Valley—the inevitability card. If we don’t do it, somebody else will. “The trajectory is already there,” he emphasized, “but the thing we can influence is the initial conditions under which it’s born.

“What is OpenAI?” he continued. “What is our purpose? What are we really trying to do? Our mission is to ensure that AGI benefits all of humanity. And the way we want to do that is: Build AGI and distribute its economic benefits.”

His tone was matter‑of‑fact and final, as if he’d put my questions to rest. And yet we had somehow just arrived back to exactly where we’d started.

Our conversation continued on in circles until we ran out the clock after forty‑five minutes. I tried with little success to get more concrete details on what exactly they were trying to build—which by nature, they explained, they couldn’t know—and why, then, if they couldn’t know, they were so confident it would be beneficial. At one point, I tried a different approach, asking them instead to give examples of the downsides of the technology. This was a pillar of OpenAI’s founding mythology: The lab had to build good AGI before someone else built a bad one.

Brockman attempted an answer: deepfakes. “It’s not clear the world is better through its applications,” he said.

I offered my own example: Speaking of climate change, what about the environmental impact of AI itself? A recent study from the University of Massachusetts Amherst had placed alarming numbers on the huge and growing carbon emissions of training larger and larger AI models.

That was “undeniable,” Sutskever said, but the payoff was worth it because AGI would, “among other things, counteract the environmental cost specifically.” He stopped short of offering examples.

“It is unquestioningly very highly desirable that data centers be as green as possible,” he added.

“No question,” Brockman quipped.

“Data centers are the biggest consumer of energy, of electricity,” Sutskever continued, seeming intent now on proving that he was aware of and cared about this issue.

“It’s 2 percent globally,” I offered.

“Isn’t Bitcoin like 1 percent?” Brockman said.

“Wow!” Sutskever said, in a sudden burst of emotion that felt, at this point, forty minutes into the conversation, somewhat performative.

Sutskever would later sit down with New York Times reporter Cade Metz for his book Genius Makers, which recounts a narrative history of AI development, and say without a hint of satire, “I think that it’s fairly likely that it will not take too long of a time for the entire surface of the Earth to become covered with data centers and power stations.” There would be “a tsunami of computing . . . almost like a natural phenomenon.” AGI—and thus the data centers needed to support them—would be “too useful to not exist.”

I tried again to press for more details. “What you’re saying is OpenAI is making a huge gamble that you will successfully reach beneficial AGI to counteract global warming before the act of doing so might exacerbate it.”

“I wouldn’t go too far down that rabbit hole,” Brockman hastily cut in. “The way we think about it is the following: We’re on a ramp of AI progress. This is bigger than OpenAI, right? It’s the field. And I think society is actually getting benefit from it.”

“The day we announced the deal,” he said, referring to Microsoft’s new billion investment, “Microsoft’s market cap went up by billion. People believe there is a positive ROI even just on short‑term technology.”

OpenAI’s strategy was thus quite simple, he explained: to keep up with that progress. “That’s the standard we should really hold ourselves to. We should continue to make that progress. That’s how we know we’re on track.”

Later that day, Brockman reiterated that the central challenge of working at OpenAI was that no one really knew what AGI would look like. But as researchers and engineers, their task was to keep pushing forward, to unearth the shape of the technology step by step.

He spoke like Michelangelo, as though AGI already existed within the marble he was carving. All he had to do was chip away until it revealed itself.

There had been a change of plans. I had been scheduled to eat lunch with employees in the cafeteria, but something now required me to be outside the office. Brockman would be my chaperone. We headed two dozen steps across the street to an open‑air café that had become a favorite haunt for employees.

This would become a recurring theme throughout my visit: floors I couldn’t see, meetings I couldn’t attend, researchers stealing furtive glances at the communications head every few sentences to check that they hadn’t violated some disclosure policy. I would later learn that after my visit, Jack Clark would issue an unusually stern warning to employees on Slack not to speak with me beyond sanctioned conversations. The security guard would receive a photo of me with instructions to be on the lookout if I appeared unapproved on the premises. It was odd behavior in general, made odder by OpenAI’s commitment to transparency. What, I began to wonder, were they hiding, if everything was supposed to be beneficial research eventually made available to the public?

At lunch and through the following days, I probed deeper into why Brockman had cofounded OpenAI. He was a teen when he first grew obsessed with the idea that it could be possible to re‑create human intelligence. It was a famous paper from British mathematician Alan Turing that sparked his fascination. The name of its first section, “The Imitation Game,” which inspired the title of the 2014 Hollywood dramatization of Turing’s life, begins with the opening provocation, “Can machines think?” The paper goes on to define what would become known as the Turing test: a measure of the progression of machine intelligence based on whether a machine can talk to a human without giving away that it is a machine. It was a classic origin story among people working in AI. Enchanted, Brockman coded up a Turing test game and put it online, garnering some 1,500 hits. It made him feel amazing. “I just realized that was the kind of thing I wanted to pursue,” he said.

In 2015, as AI saw great leaps of advancement, Brockman says that he realized it was time to return to his original ambition and joined OpenAI as a cofounder. He wrote down in his notes that he would do anything to bring AGI to fruition, even if it meant being a janitor. When he got married four years later, he held a civil ceremony at OpenAI’s office in front of a custom flower wall emblazoned with the shape of the lab’s hexagonal logo. Sutskever officiated. The robotic hand they used for research stood in the aisle bearing the rings, like a sentinel from a post-apocalyptic future.

“Fundamentally, I want to work on AGI for the rest of my life,” Brockman told me.

What motivated him? I asked Brockman.

What are the chances that a transformative technology could arrive in your lifetime? he countered.

He was confident that he—and the team he assembled—was uniquely positioned to usher in that transformation. “What I’m really drawn to are problems that will not play out in the same way if I don’t participate,” he said.

Brockman did not in fact just want to be a janitor. He wanted to lead AGI. And he bristled with the anxious energy of someone who wanted history‑defining recognition. He wanted people to one day tell his story with the same mixture of awe and admiration that he used to recount the ones of the great innovators who came before him.

A year before we spoke, he had told a group of young tech entrepreneurs at an exclusive retreat in Lake Tahoe with a twinge of self‑pity that chief technology officers were never known. Name a famous CTO, he challenged the crowd. They struggled to do so. He had proved his point.

In 2022, he became OpenAI’s president.

During our conversations, Brockman insisted to me that none of OpenAI’s structural changes signaled a shift in its core mission. In fact, the capped profit and the new crop of funders enhanced it. “We managed to get these mission‑aligned investors who are willing to prioritize mission over returns. That’s a crazy thing,” he said.

OpenAI now had the long‑term resources it needed to scale its models and stay ahead of the competition. This was imperative, Brockman stressed. Failing to do so was the real threat that could undermine OpenAI’s mission. If the lab fell behind, it had no hope of bending the arc of history toward its vision of beneficial AGI. Only later would I realize the full implications of this assertion. It was this fundamental assumption—the need to be first or perish—that set in motion all of OpenAI’s actions and their far‑reaching consequences. It put a ticking clock on each of OpenAI’s research advancements, based not on the timescale of careful deliberation but on the relentless pace required to cross the finish line before anyone else. It justified OpenAI’s consumption of an unfathomable amount of resources: both compute, regardless of its impact on the environment; and data, the amassing of which couldn’t be slowed by getting consent or abiding by regulations.

Brockman pointed once again to the billion jump in Microsoft’s market cap. “What that really reflects is AI is delivering real value to the real world today,” he said. That value was currently being concentrated in an already wealthy corporation, he acknowledged, which was why OpenAI had the second part of its mission: to redistribute the benefits of AGI to everyone.

Was there a historical example of a technology’s benefits that had been successfully distributed? I asked.

“Well, I actually think that—it’s actually interesting to look even at the internet as an example,” he said, fumbling a bit before settling on his answer. “There’s problems, too, right?” he said as a caveat. “Anytime you have something super transformative, it’s not going to be easy to figure out how to maximize positive, minimize negative.

“Fire is another example,” he added. “It’s also got some real drawbacks to it. So we have to figure out how to keep it under control and have shared standards.

“Cars are a good example,” he followed. “Lots of people have cars, benefit a lot of people. They have some drawbacks to them as well. They have some externalities that are not necessarily good for the world,” he finished hesitantly.

“I guess I just view—the thing we want for AGI is not that different from the positive sides of the internet, positive sides of cars, positive sides of fire. The implementation is very different, though, because it’s a very different type of technology.”

His eyes lit up with a new idea. “Just look at utilities. Power companies, electric companies are very centralized entities that provide low‑cost, high‑quality things that meaningfully improve people’s lives.”

It was a nice analogy. But Brockman seemed once again unclear about how OpenAI would turn itself into a utility. Perhaps through distributing universal basic income, he wondered aloud, perhaps through something else.

He returned to the one thing he knew for certain. OpenAI was committed to redistributing AGI’s benefits and giving everyone economic freedom. “We actually really mean that,” he said.

“The way that we think about it is: Technology so far has been something that does rise all the boats, but it has this real concentrating effect,” he said. “AGI could be more extreme. What if all value gets locked up in one place? That is the trajectory we’re on as a society. And we’ve never seen that extreme of it. I don’t think that’s a good world. That’s not a world that I want to sign up for. That’s not a world that I want to help build.”

In February 2020, I published my profile for MIT Technology Review, drawing on my observations from my time in the office, nearly three dozen interviews, and a handful of internal documents. “There is a misalignment between what the company publicly espouses and how it operates behind closed doors,” I wrote. “Over time, it has allowed a fierce competitiveness and mounting pressure for ever more funding to erode its founding ideals of transparency, openness, and collaboration.”

Hours later, Elon Musk replied to the story with three tweets in rapid succession:

“OpenAI should be more open imo”

“I have no control & only very limited insight into OpenAI. Confidence in Dario for safety is not high,” he said, referring to Dario Amodei, the director of research.

“All orgs developing advanced AI should be regulated, including Tesla”

Afterward, Altman sent OpenAI employees an email.

“I wanted to share some thoughts about the Tech Review article,” he wrote. “While definitely not catastrophic, it was clearly bad.”

It was “a fair criticism,” he said that the piece had identified a disconnect between the perception of OpenAI and its reality. This could be smoothed over not with changes to its internal practices but some tuning of OpenAI’s public messaging. “It’s good, not bad, that we have figured out how to be flexible and adapt,” he said, including restructuring the organization and heightening confidentiality, “in order to achieve our mission as we learn more.” OpenAI should ignore my article for now and, in a few weeks’ time, start underscoring its continued commitment to its original principles under the new transformation. “This may also be a good opportunity to talk about the API as a strategy for openness and benefit sharing,” he added, referring to an application programming interface for delivering OpenAI’s models.

“The most serious issue of all, to me,” he continued, “is that someone leaked our internal documents.” They had already opened an investigation and would keep the company updated. He would also suggest that Amodei and Musk meet to work out Musk’s criticism, which was “mild relative to other things he’s said” but still “a bad thing to do.” For the avoidance of any doubt, Amodei’s work and AI safety were critical to the mission, he wrote. “I think we should at some point in the future find a way to publicly defend our team.”

OpenAI wouldn’t speak to me again for three years.

From the book Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, by Karen Hao, to be published on May 20, 2025, by Penguin Press, an imprint of Penguin Publishing Group, a division of Penguin Random House LLC. Copyright © 2025 by Karen Hao.
#inside #story #that #enraged #openai

Inside the story that enraged OpenAI
In 2019, Karen Hao, a senior reporter with MIT Technology Review, pitched me on writing a story about a then little-known company, OpenAI. It was her biggest assignment to date. Hao’s feat of reporting took a series of twists and turns over the coming months, eventually revealing how OpenAI’s ambition had taken it far afield from its original mission. The finished story was a prescient look at a company at a tipping point—or already past it. And OpenAI was not happy with the result. Hao’s new book, Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, is an in-depth exploration of the company that kick-started the AI arms race, and what that race means for all of us. This excerpt is the origin story of that reporting. — Niall Firth, executive editor, MIT Technology Review I arrived at OpenAI’s offices on August 7, 2019. Greg Brockman, then thirty‑one, OpenAI’s chief technology officer and soon‑to‑be company president, came down the staircase to greet me. He shook my hand with a tentative smile. “We’ve never given someone so much access before,” he said. At the time, few people beyond the insular world of AI research knew about OpenAI. But as a reporter at MIT Technology Review covering the ever‑expanding boundaries of artificial intelligence, I had been following its movements closely. Until that year, OpenAI had been something of a stepchild in AI research. It had an outlandish premise that AGI could be attained within a decade, when most non‑OpenAI experts doubted it could be attained at all. To much of the field, it had an obscene amount of funding despite little direction and spent too much of the money on marketing what other researchers frequently snubbed as unoriginal research. It was, for some, also an object of envy. As a nonprofit, it had said that it had no intention to chase commercialization. It was a rare intellectual playground without strings attached, a haven for fringe ideas. But in the six months leading up to my visit, the rapid slew of changes at OpenAI signaled a major shift in its trajectory. First was its confusing decision to withhold GPT‑2 and brag about it. Then its announcement that Sam Altman, who had mysteriously departed his influential perch at YC, would step in as OpenAI’s CEO with the creation of its new “capped‑profit” structure. I had already made my arrangements to visit the office when it subsequently revealed its deal with Microsoft, which gave the tech giant priority for commercializing OpenAI’s technologies and locked it into exclusively using Azure, Microsoft’s cloud‑computing platform. Each new announcement garnered fresh controversy, intense speculation, and growing attention, beginning to reach beyond the confines of the tech industry. As my colleagues and I covered the company’s progression, it was hard to grasp the full weight of what was happening. What was clear was that OpenAI was beginning to exert meaningful sway over AI research and the way policymakers were learning to understand the technology. The lab’s decision to revamp itself into a partially for‑profit business would have ripple effects across its spheres of influence in industry and government. So late one night, with the urging of my editor, I dashed off an email to Jack Clark, OpenAI’s policy director, whom I had spoken with before: I would be in town for two weeks, and it felt like the right moment in OpenAI’s history. Could I interest them in a profile? Clark passed me on to the communications head, who came back with an answer. OpenAI was indeed ready to reintroduce itself to the public. I would have three days to interview leadership and embed inside the company. Brockman and I settled into a glass meeting room with the company’s chief scientist, Ilya Sutskever. Sitting side by side at a long conference table, they each played their part. Brockman, the coder and doer, leaned forward, a little on edge, ready to make a good impression; Sutskever, the researcher and philosopher, settled back into his chair, relaxed and aloof. I opened my laptop and scrolled through my questions. OpenAI’s mission is to ensure beneficial AGI, I began. Why spend billions of dollars on this problem and not something else? Brockman nodded vigorously. He was used to defending OpenAI’s position. “The reason that we care so much about AGI and that we think it’s important to build is because we think it can help solve complex problems that are just out of reach of humans,” he said. He offered two examples that had become dogma among AGI believers. Climate change. “It’s a super‑complex problem. How are you even supposed to solve it?” And medicine. “Look at how important health care is in the US as a political issue these days. How do we actually get better treatment for people at lower cost?” On the latter, he began to recount the story of a friend who had a rare disorder and had recently gone through the exhausting rigmarole of bouncing between different specialists to figure out his problem. AGI would bring together all of these specialties. People like his friend would no longer spend so much energy and frustration on getting an answer. Why did we need AGI to do that instead of AI? I asked. This was an important distinction. The term AGI, once relegated to an unpopular section of the technology dictionary, had only recently begun to gain more mainstream usage—in large part because of OpenAI. And as OpenAI defined it, AGI referred to a theoretical pinnacle of AI research: a piece of software that had just as much sophistication, agility, and creativity as the human mind to match or exceed its performance on mosttasks. The operative word was theoretical. Since the beginning of earnest research into AI several decades earlier, debates had raged about whether silicon chips encoding everything in their binary ones and zeros could ever simulate brains and the other biological processes that give rise to what we consider intelligence. There had yet to be definitive evidence that this was possible, which didn’t even touch on the normative discussion of whether people should develop it. AI, on the other hand, was the term du jour for both the version of the technology currently available and the version that researchers could reasonably attain in the near future through refining existing capabilities. Those capabilities—rooted in powerful pattern matching known as machine learning—had already demonstrated exciting applications in climate change mitigation and health care. Sutskever chimed in. When it comes to solving complex global challenges, “fundamentally the bottleneck is that you have a large number of humans and they don’t communicate as fast, they don’t work as fast, they have a lot of incentive problems.” AGI would be different, he said. “Imagine it’s a large computer network of intelligent computers—they’re all doing their medical diagnostics; they all communicate results between them extremely fast.” This seemed to me like another way of saying that the goal of AGI was to replace humans. Is that what Sutskever meant? I asked Brockman a few hours later, once it was just the two of us. “No,” Brockman replied quickly. “This is one thing that’s really important. What is the purpose of technology? Why is it here? Why do we build it? We’ve been building technologies for thousands of years now, right? We do it because they serve people. AGI is not going to be different—not the way that we envision it, not the way we want to build it, not the way we think it should play out.” That said, he acknowledged a few minutes later, technology had always destroyed some jobs and created others. OpenAI’s challenge would be to build AGI that gave everyone “economic freedom” while allowing them to continue to “live meaningful lives” in that new reality. If it succeeded, it would decouple the need to work from survival. “I actually think that’s a very beautiful thing,” he said. In our meeting with Sutskever, Brockman reminded me of the bigger picture. “What we view our role as is not actually being a determiner of whether AGI gets built,” he said. This was a favorite argument in Silicon Valley—the inevitability card. If we don’t do it, somebody else will. “The trajectory is already there,” he emphasized, “but the thing we can influence is the initial conditions under which it’s born. “What is OpenAI?” he continued. “What is our purpose? What are we really trying to do? Our mission is to ensure that AGI benefits all of humanity. And the way we want to do that is: Build AGI and distribute its economic benefits.” His tone was matter‑of‑fact and final, as if he’d put my questions to rest. And yet we had somehow just arrived back to exactly where we’d started. Our conversation continued on in circles until we ran out the clock after forty‑five minutes. I tried with little success to get more concrete details on what exactly they were trying to build—which by nature, they explained, they couldn’t know—and why, then, if they couldn’t know, they were so confident it would be beneficial. At one point, I tried a different approach, asking them instead to give examples of the downsides of the technology. This was a pillar of OpenAI’s founding mythology: The lab had to build good AGI before someone else built a bad one. Brockman attempted an answer: deepfakes. “It’s not clear the world is better through its applications,” he said. I offered my own example: Speaking of climate change, what about the environmental impact of AI itself? A recent study from the University of Massachusetts Amherst had placed alarming numbers on the huge and growing carbon emissions of training larger and larger AI models. That was “undeniable,” Sutskever said, but the payoff was worth it because AGI would, “among other things, counteract the environmental cost specifically.” He stopped short of offering examples. “It is unquestioningly very highly desirable that data centers be as green as possible,” he added. “No question,” Brockman quipped. “Data centers are the biggest consumer of energy, of electricity,” Sutskever continued, seeming intent now on proving that he was aware of and cared about this issue. “It’s 2 percent globally,” I offered. “Isn’t Bitcoin like 1 percent?” Brockman said. “Wow!” Sutskever said, in a sudden burst of emotion that felt, at this point, forty minutes into the conversation, somewhat performative. Sutskever would later sit down with New York Times reporter Cade Metz for his book Genius Makers, which recounts a narrative history of AI development, and say without a hint of satire, “I think that it’s fairly likely that it will not take too long of a time for the entire surface of the Earth to become covered with data centers and power stations.” There would be “a tsunami of computing . . . almost like a natural phenomenon.” AGI—and thus the data centers needed to support them—would be “too useful to not exist.” I tried again to press for more details. “What you’re saying is OpenAI is making a huge gamble that you will successfully reach beneficial AGI to counteract global warming before the act of doing so might exacerbate it.” “I wouldn’t go too far down that rabbit hole,” Brockman hastily cut in. “The way we think about it is the following: We’re on a ramp of AI progress. This is bigger than OpenAI, right? It’s the field. And I think society is actually getting benefit from it.” “The day we announced the deal,” he said, referring to Microsoft’s new billion investment, “Microsoft’s market cap went up by billion. People believe there is a positive ROI even just on short‑term technology.” OpenAI’s strategy was thus quite simple, he explained: to keep up with that progress. “That’s the standard we should really hold ourselves to. We should continue to make that progress. That’s how we know we’re on track.” Later that day, Brockman reiterated that the central challenge of working at OpenAI was that no one really knew what AGI would look like. But as researchers and engineers, their task was to keep pushing forward, to unearth the shape of the technology step by step. He spoke like Michelangelo, as though AGI already existed within the marble he was carving. All he had to do was chip away until it revealed itself. There had been a change of plans. I had been scheduled to eat lunch with employees in the cafeteria, but something now required me to be outside the office. Brockman would be my chaperone. We headed two dozen steps across the street to an open‑air café that had become a favorite haunt for employees. This would become a recurring theme throughout my visit: floors I couldn’t see, meetings I couldn’t attend, researchers stealing furtive glances at the communications head every few sentences to check that they hadn’t violated some disclosure policy. I would later learn that after my visit, Jack Clark would issue an unusually stern warning to employees on Slack not to speak with me beyond sanctioned conversations. The security guard would receive a photo of me with instructions to be on the lookout if I appeared unapproved on the premises. It was odd behavior in general, made odder by OpenAI’s commitment to transparency. What, I began to wonder, were they hiding, if everything was supposed to be beneficial research eventually made available to the public? At lunch and through the following days, I probed deeper into why Brockman had cofounded OpenAI. He was a teen when he first grew obsessed with the idea that it could be possible to re‑create human intelligence. It was a famous paper from British mathematician Alan Turing that sparked his fascination. The name of its first section, “The Imitation Game,” which inspired the title of the 2014 Hollywood dramatization of Turing’s life, begins with the opening provocation, “Can machines think?” The paper goes on to define what would become known as the Turing test: a measure of the progression of machine intelligence based on whether a machine can talk to a human without giving away that it is a machine. It was a classic origin story among people working in AI. Enchanted, Brockman coded up a Turing test game and put it online, garnering some 1,500 hits. It made him feel amazing. “I just realized that was the kind of thing I wanted to pursue,” he said. In 2015, as AI saw great leaps of advancement, Brockman says that he realized it was time to return to his original ambition and joined OpenAI as a cofounder. He wrote down in his notes that he would do anything to bring AGI to fruition, even if it meant being a janitor. When he got married four years later, he held a civil ceremony at OpenAI’s office in front of a custom flower wall emblazoned with the shape of the lab’s hexagonal logo. Sutskever officiated. The robotic hand they used for research stood in the aisle bearing the rings, like a sentinel from a post-apocalyptic future. “Fundamentally, I want to work on AGI for the rest of my life,” Brockman told me. What motivated him? I asked Brockman. What are the chances that a transformative technology could arrive in your lifetime? he countered. He was confident that he—and the team he assembled—was uniquely positioned to usher in that transformation. “What I’m really drawn to are problems that will not play out in the same way if I don’t participate,” he said. Brockman did not in fact just want to be a janitor. He wanted to lead AGI. And he bristled with the anxious energy of someone who wanted history‑defining recognition. He wanted people to one day tell his story with the same mixture of awe and admiration that he used to recount the ones of the great innovators who came before him. A year before we spoke, he had told a group of young tech entrepreneurs at an exclusive retreat in Lake Tahoe with a twinge of self‑pity that chief technology officers were never known. Name a famous CTO, he challenged the crowd. They struggled to do so. He had proved his point. In 2022, he became OpenAI’s president. During our conversations, Brockman insisted to me that none of OpenAI’s structural changes signaled a shift in its core mission. In fact, the capped profit and the new crop of funders enhanced it. “We managed to get these mission‑aligned investors who are willing to prioritize mission over returns. That’s a crazy thing,” he said. OpenAI now had the long‑term resources it needed to scale its models and stay ahead of the competition. This was imperative, Brockman stressed. Failing to do so was the real threat that could undermine OpenAI’s mission. If the lab fell behind, it had no hope of bending the arc of history toward its vision of beneficial AGI. Only later would I realize the full implications of this assertion. It was this fundamental assumption—the need to be first or perish—that set in motion all of OpenAI’s actions and their far‑reaching consequences. It put a ticking clock on each of OpenAI’s research advancements, based not on the timescale of careful deliberation but on the relentless pace required to cross the finish line before anyone else. It justified OpenAI’s consumption of an unfathomable amount of resources: both compute, regardless of its impact on the environment; and data, the amassing of which couldn’t be slowed by getting consent or abiding by regulations. Brockman pointed once again to the billion jump in Microsoft’s market cap. “What that really reflects is AI is delivering real value to the real world today,” he said. That value was currently being concentrated in an already wealthy corporation, he acknowledged, which was why OpenAI had the second part of its mission: to redistribute the benefits of AGI to everyone. Was there a historical example of a technology’s benefits that had been successfully distributed? I asked. “Well, I actually think that—it’s actually interesting to look even at the internet as an example,” he said, fumbling a bit before settling on his answer. “There’s problems, too, right?” he said as a caveat. “Anytime you have something super transformative, it’s not going to be easy to figure out how to maximize positive, minimize negative. “Fire is another example,” he added. “It’s also got some real drawbacks to it. So we have to figure out how to keep it under control and have shared standards. “Cars are a good example,” he followed. “Lots of people have cars, benefit a lot of people. They have some drawbacks to them as well. They have some externalities that are not necessarily good for the world,” he finished hesitantly. “I guess I just view—the thing we want for AGI is not that different from the positive sides of the internet, positive sides of cars, positive sides of fire. The implementation is very different, though, because it’s a very different type of technology.” His eyes lit up with a new idea. “Just look at utilities. Power companies, electric companies are very centralized entities that provide low‑cost, high‑quality things that meaningfully improve people’s lives.” It was a nice analogy. But Brockman seemed once again unclear about how OpenAI would turn itself into a utility. Perhaps through distributing universal basic income, he wondered aloud, perhaps through something else. He returned to the one thing he knew for certain. OpenAI was committed to redistributing AGI’s benefits and giving everyone economic freedom. “We actually really mean that,” he said. “The way that we think about it is: Technology so far has been something that does rise all the boats, but it has this real concentrating effect,” he said. “AGI could be more extreme. What if all value gets locked up in one place? That is the trajectory we’re on as a society. And we’ve never seen that extreme of it. I don’t think that’s a good world. That’s not a world that I want to sign up for. That’s not a world that I want to help build.” In February 2020, I published my profile for MIT Technology Review, drawing on my observations from my time in the office, nearly three dozen interviews, and a handful of internal documents. “There is a misalignment between what the company publicly espouses and how it operates behind closed doors,” I wrote. “Over time, it has allowed a fierce competitiveness and mounting pressure for ever more funding to erode its founding ideals of transparency, openness, and collaboration.” Hours later, Elon Musk replied to the story with three tweets in rapid succession: “OpenAI should be more open imo” “I have no control & only very limited insight into OpenAI. Confidence in Dario for safety is not high,” he said, referring to Dario Amodei, the director of research. “All orgs developing advanced AI should be regulated, including Tesla” Afterward, Altman sent OpenAI employees an email. “I wanted to share some thoughts about the Tech Review article,” he wrote. “While definitely not catastrophic, it was clearly bad.” It was “a fair criticism,” he said that the piece had identified a disconnect between the perception of OpenAI and its reality. This could be smoothed over not with changes to its internal practices but some tuning of OpenAI’s public messaging. “It’s good, not bad, that we have figured out how to be flexible and adapt,” he said, including restructuring the organization and heightening confidentiality, “in order to achieve our mission as we learn more.” OpenAI should ignore my article for now and, in a few weeks’ time, start underscoring its continued commitment to its original principles under the new transformation. “This may also be a good opportunity to talk about the API as a strategy for openness and benefit sharing,” he added, referring to an application programming interface for delivering OpenAI’s models. “The most serious issue of all, to me,” he continued, “is that someone leaked our internal documents.” They had already opened an investigation and would keep the company updated. He would also suggest that Amodei and Musk meet to work out Musk’s criticism, which was “mild relative to other things he’s said” but still “a bad thing to do.” For the avoidance of any doubt, Amodei’s work and AI safety were critical to the mission, he wrote. “I think we should at some point in the future find a way to publicly defend our team.” OpenAI wouldn’t speak to me again for three years. From the book Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, by Karen Hao, to be published on May 20, 2025, by Penguin Press, an imprint of Penguin Publishing Group, a division of Penguin Random House LLC. Copyright © 2025 by Karen Hao. #inside #story #that #enraged #openai

Inside the story that enraged OpenAI

www.technologyreview.com
In 2019, Karen Hao, a senior reporter with MIT Technology Review, pitched me on writing a story about a then little-known company, OpenAI. It was her biggest assignment to date. Hao’s feat of reporting took a series of twists and turns over the coming months, eventually revealing how OpenAI’s ambition had taken it far afield from its original mission. The finished story was a prescient look at a company at a tipping point—or already past it. And OpenAI was not happy with the result. Hao’s new book, Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, is an in-depth exploration of the company that kick-started the AI arms race, and what that race means for all of us. This excerpt is the origin story of that reporting. — Niall Firth, executive editor, MIT Technology Review I arrived at OpenAI’s offices on August 7, 2019. Greg Brockman, then thirty‑one, OpenAI’s chief technology officer and soon‑to‑be company president, came down the staircase to greet me. He shook my hand with a tentative smile. “We’ve never given someone so much access before,” he said. At the time, few people beyond the insular world of AI research knew about OpenAI. But as a reporter at MIT Technology Review covering the ever‑expanding boundaries of artificial intelligence, I had been following its movements closely. Until that year, OpenAI had been something of a stepchild in AI research. It had an outlandish premise that AGI could be attained within a decade, when most non‑OpenAI experts doubted it could be attained at all. To much of the field, it had an obscene amount of funding despite little direction and spent too much of the money on marketing what other researchers frequently snubbed as unoriginal research. It was, for some, also an object of envy. As a nonprofit, it had said that it had no intention to chase commercialization. It was a rare intellectual playground without strings attached, a haven for fringe ideas. But in the six months leading up to my visit, the rapid slew of changes at OpenAI signaled a major shift in its trajectory. First was its confusing decision to withhold GPT‑2 and brag about it. Then its announcement that Sam Altman, who had mysteriously departed his influential perch at YC, would step in as OpenAI’s CEO with the creation of its new “capped‑profit” structure. I had already made my arrangements to visit the office when it subsequently revealed its deal with Microsoft, which gave the tech giant priority for commercializing OpenAI’s technologies and locked it into exclusively using Azure, Microsoft’s cloud‑computing platform. Each new announcement garnered fresh controversy, intense speculation, and growing attention, beginning to reach beyond the confines of the tech industry. As my colleagues and I covered the company’s progression, it was hard to grasp the full weight of what was happening. What was clear was that OpenAI was beginning to exert meaningful sway over AI research and the way policymakers were learning to understand the technology. The lab’s decision to revamp itself into a partially for‑profit business would have ripple effects across its spheres of influence in industry and government. So late one night, with the urging of my editor, I dashed off an email to Jack Clark, OpenAI’s policy director, whom I had spoken with before: I would be in town for two weeks, and it felt like the right moment in OpenAI’s history. Could I interest them in a profile? Clark passed me on to the communications head, who came back with an answer. OpenAI was indeed ready to reintroduce itself to the public. I would have three days to interview leadership and embed inside the company. Brockman and I settled into a glass meeting room with the company’s chief scientist, Ilya Sutskever. Sitting side by side at a long conference table, they each played their part. Brockman, the coder and doer, leaned forward, a little on edge, ready to make a good impression; Sutskever, the researcher and philosopher, settled back into his chair, relaxed and aloof. I opened my laptop and scrolled through my questions. OpenAI’s mission is to ensure beneficial AGI, I began. Why spend billions of dollars on this problem and not something else? Brockman nodded vigorously. He was used to defending OpenAI’s position. “The reason that we care so much about AGI and that we think it’s important to build is because we think it can help solve complex problems that are just out of reach of humans,” he said. He offered two examples that had become dogma among AGI believers. Climate change. “It’s a super‑complex problem. How are you even supposed to solve it?” And medicine. “Look at how important health care is in the US as a political issue these days. How do we actually get better treatment for people at lower cost?” On the latter, he began to recount the story of a friend who had a rare disorder and had recently gone through the exhausting rigmarole of bouncing between different specialists to figure out his problem. AGI would bring together all of these specialties. People like his friend would no longer spend so much energy and frustration on getting an answer. Why did we need AGI to do that instead of AI? I asked. This was an important distinction. The term AGI, once relegated to an unpopular section of the technology dictionary, had only recently begun to gain more mainstream usage—in large part because of OpenAI. And as OpenAI defined it, AGI referred to a theoretical pinnacle of AI research: a piece of software that had just as much sophistication, agility, and creativity as the human mind to match or exceed its performance on most (economically valuable) tasks. The operative word was theoretical. Since the beginning of earnest research into AI several decades earlier, debates had raged about whether silicon chips encoding everything in their binary ones and zeros could ever simulate brains and the other biological processes that give rise to what we consider intelligence. There had yet to be definitive evidence that this was possible, which didn’t even touch on the normative discussion of whether people should develop it. AI, on the other hand, was the term du jour for both the version of the technology currently available and the version that researchers could reasonably attain in the near future through refining existing capabilities. Those capabilities—rooted in powerful pattern matching known as machine learning—had already demonstrated exciting applications in climate change mitigation and health care. Sutskever chimed in. When it comes to solving complex global challenges, “fundamentally the bottleneck is that you have a large number of humans and they don’t communicate as fast, they don’t work as fast, they have a lot of incentive problems.” AGI would be different, he said. “Imagine it’s a large computer network of intelligent computers—they’re all doing their medical diagnostics; they all communicate results between them extremely fast.” This seemed to me like another way of saying that the goal of AGI was to replace humans. Is that what Sutskever meant? I asked Brockman a few hours later, once it was just the two of us. “No,” Brockman replied quickly. “This is one thing that’s really important. What is the purpose of technology? Why is it here? Why do we build it? We’ve been building technologies for thousands of years now, right? We do it because they serve people. AGI is not going to be different—not the way that we envision it, not the way we want to build it, not the way we think it should play out.” That said, he acknowledged a few minutes later, technology had always destroyed some jobs and created others. OpenAI’s challenge would be to build AGI that gave everyone “economic freedom” while allowing them to continue to “live meaningful lives” in that new reality. If it succeeded, it would decouple the need to work from survival. “I actually think that’s a very beautiful thing,” he said. In our meeting with Sutskever, Brockman reminded me of the bigger picture. “What we view our role as is not actually being a determiner of whether AGI gets built,” he said. This was a favorite argument in Silicon Valley—the inevitability card. If we don’t do it, somebody else will. “The trajectory is already there,” he emphasized, “but the thing we can influence is the initial conditions under which it’s born. “What is OpenAI?” he continued. “What is our purpose? What are we really trying to do? Our mission is to ensure that AGI benefits all of humanity. And the way we want to do that is: Build AGI and distribute its economic benefits.” His tone was matter‑of‑fact and final, as if he’d put my questions to rest. And yet we had somehow just arrived back to exactly where we’d started. Our conversation continued on in circles until we ran out the clock after forty‑five minutes. I tried with little success to get more concrete details on what exactly they were trying to build—which by nature, they explained, they couldn’t know—and why, then, if they couldn’t know, they were so confident it would be beneficial. At one point, I tried a different approach, asking them instead to give examples of the downsides of the technology. This was a pillar of OpenAI’s founding mythology: The lab had to build good AGI before someone else built a bad one. Brockman attempted an answer: deepfakes. “It’s not clear the world is better through its applications,” he said. I offered my own example: Speaking of climate change, what about the environmental impact of AI itself? A recent study from the University of Massachusetts Amherst had placed alarming numbers on the huge and growing carbon emissions of training larger and larger AI models. That was “undeniable,” Sutskever said, but the payoff was worth it because AGI would, “among other things, counteract the environmental cost specifically.” He stopped short of offering examples. “It is unquestioningly very highly desirable that data centers be as green as possible,” he added. “No question,” Brockman quipped. “Data centers are the biggest consumer of energy, of electricity,” Sutskever continued, seeming intent now on proving that he was aware of and cared about this issue. “It’s 2 percent globally,” I offered. “Isn’t Bitcoin like 1 percent?” Brockman said. “Wow!” Sutskever said, in a sudden burst of emotion that felt, at this point, forty minutes into the conversation, somewhat performative. Sutskever would later sit down with New York Times reporter Cade Metz for his book Genius Makers, which recounts a narrative history of AI development, and say without a hint of satire, “I think that it’s fairly likely that it will not take too long of a time for the entire surface of the Earth to become covered with data centers and power stations.” There would be “a tsunami of computing . . . almost like a natural phenomenon.” AGI—and thus the data centers needed to support them—would be “too useful to not exist.” I tried again to press for more details. “What you’re saying is OpenAI is making a huge gamble that you will successfully reach beneficial AGI to counteract global warming before the act of doing so might exacerbate it.” “I wouldn’t go too far down that rabbit hole,” Brockman hastily cut in. “The way we think about it is the following: We’re on a ramp of AI progress. This is bigger than OpenAI, right? It’s the field. And I think society is actually getting benefit from it.” “The day we announced the deal,” he said, referring to Microsoft’s new $1 billion investment, “Microsoft’s market cap went up by $10 billion. People believe there is a positive ROI even just on short‑term technology.” OpenAI’s strategy was thus quite simple, he explained: to keep up with that progress. “That’s the standard we should really hold ourselves to. We should continue to make that progress. That’s how we know we’re on track.” Later that day, Brockman reiterated that the central challenge of working at OpenAI was that no one really knew what AGI would look like. But as researchers and engineers, their task was to keep pushing forward, to unearth the shape of the technology step by step. He spoke like Michelangelo, as though AGI already existed within the marble he was carving. All he had to do was chip away until it revealed itself. There had been a change of plans. I had been scheduled to eat lunch with employees in the cafeteria, but something now required me to be outside the office. Brockman would be my chaperone. We headed two dozen steps across the street to an open‑air café that had become a favorite haunt for employees. This would become a recurring theme throughout my visit: floors I couldn’t see, meetings I couldn’t attend, researchers stealing furtive glances at the communications head every few sentences to check that they hadn’t violated some disclosure policy. I would later learn that after my visit, Jack Clark would issue an unusually stern warning to employees on Slack not to speak with me beyond sanctioned conversations. The security guard would receive a photo of me with instructions to be on the lookout if I appeared unapproved on the premises. It was odd behavior in general, made odder by OpenAI’s commitment to transparency. What, I began to wonder, were they hiding, if everything was supposed to be beneficial research eventually made available to the public? At lunch and through the following days, I probed deeper into why Brockman had cofounded OpenAI. He was a teen when he first grew obsessed with the idea that it could be possible to re‑create human intelligence. It was a famous paper from British mathematician Alan Turing that sparked his fascination. The name of its first section, “The Imitation Game,” which inspired the title of the 2014 Hollywood dramatization of Turing’s life, begins with the opening provocation, “Can machines think?” The paper goes on to define what would become known as the Turing test: a measure of the progression of machine intelligence based on whether a machine can talk to a human without giving away that it is a machine. It was a classic origin story among people working in AI. Enchanted, Brockman coded up a Turing test game and put it online, garnering some 1,500 hits. It made him feel amazing. “I just realized that was the kind of thing I wanted to pursue,” he said. In 2015, as AI saw great leaps of advancement, Brockman says that he realized it was time to return to his original ambition and joined OpenAI as a cofounder. He wrote down in his notes that he would do anything to bring AGI to fruition, even if it meant being a janitor. When he got married four years later, he held a civil ceremony at OpenAI’s office in front of a custom flower wall emblazoned with the shape of the lab’s hexagonal logo. Sutskever officiated. The robotic hand they used for research stood in the aisle bearing the rings, like a sentinel from a post-apocalyptic future. “Fundamentally, I want to work on AGI for the rest of my life,” Brockman told me. What motivated him? I asked Brockman. What are the chances that a transformative technology could arrive in your lifetime? he countered. He was confident that he—and the team he assembled—was uniquely positioned to usher in that transformation. “What I’m really drawn to are problems that will not play out in the same way if I don’t participate,” he said. Brockman did not in fact just want to be a janitor. He wanted to lead AGI. And he bristled with the anxious energy of someone who wanted history‑defining recognition. He wanted people to one day tell his story with the same mixture of awe and admiration that he used to recount the ones of the great innovators who came before him. A year before we spoke, he had told a group of young tech entrepreneurs at an exclusive retreat in Lake Tahoe with a twinge of self‑pity that chief technology officers were never known. Name a famous CTO, he challenged the crowd. They struggled to do so. He had proved his point. In 2022, he became OpenAI’s president. During our conversations, Brockman insisted to me that none of OpenAI’s structural changes signaled a shift in its core mission. In fact, the capped profit and the new crop of funders enhanced it. “We managed to get these mission‑aligned investors who are willing to prioritize mission over returns. That’s a crazy thing,” he said. OpenAI now had the long‑term resources it needed to scale its models and stay ahead of the competition. This was imperative, Brockman stressed. Failing to do so was the real threat that could undermine OpenAI’s mission. If the lab fell behind, it had no hope of bending the arc of history toward its vision of beneficial AGI. Only later would I realize the full implications of this assertion. It was this fundamental assumption—the need to be first or perish—that set in motion all of OpenAI’s actions and their far‑reaching consequences. It put a ticking clock on each of OpenAI’s research advancements, based not on the timescale of careful deliberation but on the relentless pace required to cross the finish line before anyone else. It justified OpenAI’s consumption of an unfathomable amount of resources: both compute, regardless of its impact on the environment; and data, the amassing of which couldn’t be slowed by getting consent or abiding by regulations. Brockman pointed once again to the $10 billion jump in Microsoft’s market cap. “What that really reflects is AI is delivering real value to the real world today,” he said. That value was currently being concentrated in an already wealthy corporation, he acknowledged, which was why OpenAI had the second part of its mission: to redistribute the benefits of AGI to everyone. Was there a historical example of a technology’s benefits that had been successfully distributed? I asked. “Well, I actually think that—it’s actually interesting to look even at the internet as an example,” he said, fumbling a bit before settling on his answer. “There’s problems, too, right?” he said as a caveat. “Anytime you have something super transformative, it’s not going to be easy to figure out how to maximize positive, minimize negative. “Fire is another example,” he added. “It’s also got some real drawbacks to it. So we have to figure out how to keep it under control and have shared standards. “Cars are a good example,” he followed. “Lots of people have cars, benefit a lot of people. They have some drawbacks to them as well. They have some externalities that are not necessarily good for the world,” he finished hesitantly. “I guess I just view—the thing we want for AGI is not that different from the positive sides of the internet, positive sides of cars, positive sides of fire. The implementation is very different, though, because it’s a very different type of technology.” His eyes lit up with a new idea. “Just look at utilities. Power companies, electric companies are very centralized entities that provide low‑cost, high‑quality things that meaningfully improve people’s lives.” It was a nice analogy. But Brockman seemed once again unclear about how OpenAI would turn itself into a utility. Perhaps through distributing universal basic income, he wondered aloud, perhaps through something else. He returned to the one thing he knew for certain. OpenAI was committed to redistributing AGI’s benefits and giving everyone economic freedom. “We actually really mean that,” he said. “The way that we think about it is: Technology so far has been something that does rise all the boats, but it has this real concentrating effect,” he said. “AGI could be more extreme. What if all value gets locked up in one place? That is the trajectory we’re on as a society. And we’ve never seen that extreme of it. I don’t think that’s a good world. That’s not a world that I want to sign up for. That’s not a world that I want to help build.” In February 2020, I published my profile for MIT Technology Review, drawing on my observations from my time in the office, nearly three dozen interviews, and a handful of internal documents. “There is a misalignment between what the company publicly espouses and how it operates behind closed doors,” I wrote. “Over time, it has allowed a fierce competitiveness and mounting pressure for ever more funding to erode its founding ideals of transparency, openness, and collaboration.” Hours later, Elon Musk replied to the story with three tweets in rapid succession: “OpenAI should be more open imo” “I have no control & only very limited insight into OpenAI. Confidence in Dario for safety is not high,” he said, referring to Dario Amodei, the director of research. “All orgs developing advanced AI should be regulated, including Tesla” Afterward, Altman sent OpenAI employees an email. “I wanted to share some thoughts about the Tech Review article,” he wrote. “While definitely not catastrophic, it was clearly bad.” It was “a fair criticism,” he said that the piece had identified a disconnect between the perception of OpenAI and its reality. This could be smoothed over not with changes to its internal practices but some tuning of OpenAI’s public messaging. “It’s good, not bad, that we have figured out how to be flexible and adapt,” he said, including restructuring the organization and heightening confidentiality, “in order to achieve our mission as we learn more.” OpenAI should ignore my article for now and, in a few weeks’ time, start underscoring its continued commitment to its original principles under the new transformation. “This may also be a good opportunity to talk about the API as a strategy for openness and benefit sharing,” he added, referring to an application programming interface for delivering OpenAI’s models. “The most serious issue of all, to me,” he continued, “is that someone leaked our internal documents.” They had already opened an investigation and would keep the company updated. He would also suggest that Amodei and Musk meet to work out Musk’s criticism, which was “mild relative to other things he’s said” but still “a bad thing to do.” For the avoidance of any doubt, Amodei’s work and AI safety were critical to the mission, he wrote. “I think we should at some point in the future find a way to publicly defend our team (but not give the press the public fight they’d love right now).” OpenAI wouldn’t speak to me again for three years. From the book Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, by Karen Hao, to be published on May 20, 2025, by Penguin Press, an imprint of Penguin Publishing Group, a division of Penguin Random House LLC. Copyright © 2025 by Karen Hao.

0 Comentários ·0 Compartilhamentos ·0 Anterior

Faça Login para curtir, compartilhar e comentar!

Atualize para o Pro