• The Orb Will See You Now

    Once again, Sam Altman wants to show you the future. The CEO of OpenAI is standing on a sparse stage in San Francisco, preparing to reveal his next move to an attentive crowd. “We needed some way for identifying, authenticating humans in the age of AGI,” Altman explains, referring to artificial general intelligence. “We wanted a way to make sure that humans stayed special and central.” The solution Altman came up with is looming behind him. It’s a white sphere about the size of a beach ball, with a camera at its center. The company that makes it, known as Tools for Humanity, calls this mysterious device the Orb. Stare into the heart of the plastic-and-silicon globe and it will map the unique furrows and ciliary zones of your iris. Seconds later, you’ll receive inviolable proof of your humanity: a 12,800-digit binary number, known as an iris code, sent to an app on your phone. At the same time, a packet of cryptocurrency called Worldcoin, worth approximately will be transferred to your digital wallet—your reward for becoming a “verified human.” Altman co-founded Tools for Humanity in 2019 as part of a suite of companies he believed would reshape the world. Once the tech he was developing at OpenAI passed a certain level of intelligence, he reasoned, it would mark the end of one era on the Internet and the beginning of another, in which AI became so advanced, so human-like, that you would no longer be able to tell whether what you read, saw, or heard online came from a real person. When that happened, Altman imagined, we would need a new kind of online infrastructure: a human-verification layer for the Internet, to distinguish real people from the proliferating number of bots and AI “agents.”And so Tools for Humanity set out to build a global “proof-of-humanity” network. It aims to verify 50 million people by the end of 2025; ultimately its goal is to sign up every single human being on the planet. The free crypto serves as both an incentive for users to sign up, and also an entry point into what the company hopes will become the world’s largest financial network, through which it believes “double-digit percentages of the global economy” will eventually flow. Even for Altman, these missions are audacious. “If this really works, it’s like a fundamental piece of infrastructure for the world,” Altman tells TIME in a video interview from the passenger seat of a car a few days before his April 30 keynote address.Internal hardware of the Orb in mid-assembly in March. Davide Monteleone for TIMEThe project’s goal is to solve a problem partly of Altman’s own making. In the near future, he and other tech leaders say, advanced AIs will be imbued with agency: the ability to not just respond to human prompting, but to take actions independently in the world. This will enable the creation of AI coworkers that can drop into your company and begin solving problems; AI tutors that can adapt their teaching style to students’ preferences; even AI doctors that can diagnose routine cases and handle scheduling or logistics. The arrival of these virtual agents, their venture capitalist backers predict, will turbocharge our productivity and unleash an age of material abundance.But AI agents will also have cascading consequences for the human experience online. “As AI systems become harder to distinguish from people, websites may face difficult trade-offs,” says a recent paper by researchers from 25 different universities, nonprofits, and tech companies, including OpenAI. “There is a significant risk that digital institutions will be unprepared for a time when AI-powered agents, including those leveraged by malicious actors, overwhelm other activity online.” On social-media platforms like X and Facebook, bot-driven accounts are amassing billions of views on AI-generated content. In April, the foundation that runs Wikipedia disclosed that AI bots scraping their site were making the encyclopedia too costly to sustainably run. Later the same month, researchers from the University of Zurich found that AI-generated comments on the subreddit /r/ChangeMyView were up to six times more successful than human-written ones at persuading unknowing users to change their minds.  Photograph by Davide Monteleone for TIMEBuy a copy of the Orb issue hereThe arrival of agents won’t only threaten our ability to distinguish between authentic and AI content online. It will also challenge the Internet’s core business model, online advertising, which relies on the assumption that ads are being viewed by humans. “The Internet will change very drastically sometime in the next 12 to 24 months,” says Tools for Humanity CEO Alex Blania. “So we have to succeed, or I’m not sure what else would happen.”For four years, Blania’s team has been testing the Orb’s hardware abroad. Now the U.S. rollout has arrived. Over the next 12 months, 7,500 Orbs will be arriving in dozens of American cities, in locations like gas stations, bodegas, and flagship stores in Los Angeles, Austin, and Miami. The project’s founders and fans hope the Orb’s U.S. debut will kickstart a new phase of growth. The San Francisco keynote was titled: “At Last.” It’s not clear the public appetite matches the exultant branding. Tools for Humanity has “verified” just 12 million humans since mid 2023, a pace Blania concedes is well behind schedule. Few online platforms currently support the so-called “World ID” that the Orb bestows upon its visitors, leaving little to entice users to give up their biometrics beyond the lure of free crypto. Even Altman isn’t sure whether the whole thing can work. “I can seethis becomes a fairly mainstream thing in a few years,” he says. “Or I can see that it’s still only used by a small subset of people who think about the world in a certain way.” Blaniaand Altman debut the Orb at World’s U.S. launch in San Francisco on April 30, 2025. Jason Henry—The New York Times/ReduxYet as the Internet becomes overrun with AI, the creators of this strange new piece of hardware are betting that everybody in the world will soon want—or need—to visit an Orb. The biometric code it creates, they predict, will become a new type of digital passport, without which you might be denied passage to the Internet of the future, from dating apps to government services. In a best-case scenario, World ID could be a privacy-preserving way to fortify the Internet against an AI-driven deluge of fake or deceptive content. It could also enable the distribution of universal basic income—a policy that Altman has previously touted—as AI automation transforms the global economy. To examine what this new technology might mean, I reported from three continents, interviewed 10 Tools for Humanity executives and investors, reviewed hundreds of pages of company documents, and “verified” my own humanity. The Internet will inevitably need some kind of proof-of-humanity system in the near future, says Divya Siddarth, founder of the nonprofit Collective Intelligence Project. The real question, she argues, is whether such a system will be centralized—“a big security nightmare that enables a lot of surveillance”—or privacy-preserving, as the Orb claims to be. Questions remain about Tools for Humanity’s corporate structure, its yoking to an unstable cryptocurrency, and what power it would concentrate in the hands of its owners if successful. Yet it’s also one of the only attempts to solve what many see as an increasingly urgent problem. “There are some issues with it,” Siddarth says of World ID. “But you can’t preserve the Internet in amber. Something in this direction is necessary.”In March, I met Blania at Tools for Humanity’s San Francisco headquarters, where a large screen displays the number of weekly “Orb verifications” by country. A few days earlier, the CEO had attended a million-per-head dinner at Mar-a-Lago with President Donald Trump, whom he credits with clearing the way for the company’s U.S. launch by relaxing crypto regulations. “Given Sam is a very high profile target,” Blania says, “we just decided that we would let other companies fight that fight, and enter the U.S. once the air is clear.” As a kid growing up in Germany, Blania was a little different than his peers. “Other kids were, like, drinking a lot, or doing a lot of parties, and I was just building a lot of things that could potentially blow up,” he recalls. At the California Institute of Technology, where he was pursuing research for a masters degree, he spent many evenings reading the blogs of startup gurus like Paul Graham and Altman. Then, in 2019, Blania received an email from Max Novendstern, an entrepreneur who had been kicking around a concept with Altman to build a global cryptocurrency network. They were looking for technical minds to help with the project. Over cappuccinos, Altman told Blania he was certain about three things. First, smarter-than-human AI was not only possible, but inevitable—and it would soon mean you could no longer assume that anything you read, saw, or heard on the Internet was human-created. Second, cryptocurrency and other decentralized technologies would be a massive force for change in the world. And third, scale was essential to any crypto network’s value. The Orb is tested on a calibration rig, surrounded by checkerboard targets to ensure precision in iris detection. Davide Monteleone for TIMEThe goal of Worldcoin, as the project was initially called, was to combine those three insights. Altman took a lesson from PayPal, the company co-founded by his mentor Peter Thiel. Of its initial funding, PayPal spent less than million actually building its app—but pumped an additional million or so into a referral program, whereby new users and the person who invited them would each receive in credit. The referral program helped make PayPal a leading payment platform. Altman thought a version of that strategy would propel Worldcoin to similar heights. He wanted to create a new cryptocurrency and give it to users as a reward for signing up. The more people who joined the system, the higher the token’s value would theoretically rise. Since 2019, the project has raised million from investors like Coinbase and the venture capital firm Andreessen Horowitz. That money paid for the million cost of designing the Orb, plus maintaining the software it runs on. The total market value of all Worldcoins in existence, however, is far higher—around billion. That number is a bit misleading: most of those coins are not in circulation and Worldcoin’s price has fluctuated wildly. Still, it allows the company to reward users for signing up at no cost to itself. The main lure for investors is the crypto upside. Some 75% of all Worldcoins are set aside for humans to claim when they sign up, or as referral bonuses. The remaining 25% are split between Tools for Humanity’s backers and staff, including Blania and Altman. “I’m really excited to make a lot of money,” ” Blania says.From the beginning, Altman was thinking about the consequences of the AI revolution he intended to unleash.A future in which advanced AI could perform most tasks more effectively than humans would bring a wave of unemployment and economic dislocation, he reasoned. Some kind of wealth redistribution might be necessary. In 2016, he partially funded a study of basic income, which gave per-month handouts to low-income individuals in Illinois and Texas. But there was no single financial system that would allow money to be sent to everybody in the world. Nor was there a way to stop an individual human from claiming their share twice—or to identify a sophisticated AI pretending to be human and pocketing some cash of its own. In 2023, Tools for Humanity raised the possibility of using the network to redistribute the profits of AI labs that were able to automate human labor. “As AI advances,” it said, “fairly distributing access and some of the created value through UBI will play an increasingly vital role in counteracting the concentration of economic power.”Blania was taken by the pitch, and agreed to join the project as a co-founder. “Most people told us we were very stupid or crazy or insane, including Silicon Valley investors,” Blania says. At least until ChatGPT came out in 2022, transforming OpenAI into one of the world’s most famous tech companies and kickstarting a market bull-run. “Things suddenly started to make more and more sense to the external world,” Blania says of the vision to develop a global “proof-of-humanity” network. “You have to imagine a world in which you will have very smart and competent systems somehow flying through the Internet with different goals and ideas of what they want to do, and us having no idea anymore what we’re dealing with.”After our interview, Blania’s head of communications ushers me over to a circular wooden structure where eight Orbs face one another. The scene feels like a cross between an Apple Store and a ceremonial altar. “Do you want to get verified?” she asks. Putting aside my reservations for the purposes of research, I download the World App and follow its prompts. I flash a QR code at the Orb, then gaze into it. A minute or so later, my phone buzzes with confirmation: I’ve been issued my own personal World ID and some Worldcoin.The first thing the Orb does is check if you’re human, using a neural network that takes input from various sensors, including an infrared camera and a thermometer. Davide Monteleone for TIMEWhile I stared into the Orb, several complex procedures had taken place at once. A neural network took inputs from multiple sensors—an infrared camera, a thermometer—to confirm I was a living human. Simultaneously, a telephoto lens zoomed in on my iris, capturing the physical traits within that distinguish me from every other human on Earth. It then converted that image into an iris code: a numerical abstraction of my unique biometric data. Then the Orb checked to see if my iris code matched any it had seen before, using a technique allowing encrypted data to be compared without revealing the underlying information. Before the Orb deleted my data, it turned my iris code into several derivative codes—none of which on its own can be linked back to the original—encrypted them, deleted the only copies of the decryption keys, and sent each one to a different secure server, so that future users’ iris codes can be checked for uniqueness against mine. If I were to use my World ID to access a website, that site would learn nothing about me except that I’m human. The Orb is open-source, so outside experts can examine its code and verify the company’s privacy claims. “I did a colonoscopy on this company and these technologies before I agreed to join,” says Trevor Traina, a Trump donor and former U.S. ambassador to Austria who now serves as Tools for Humanity’s chief business officer. “It is the most privacy-preserving technology on the planet.”Only weeks later, when researching what would happen if I wanted to delete my data, do I discover that Tools for Humanity’s privacy claims rest on what feels like a sleight of hand. The company argues that in modifying your iris code, it has “effectively anonymized” your biometric data. If you ask Tools for Humanity to delete your iris codes, they will delete the one stored on your phone, but not the derivatives. Those, they argue, are no longer your personal data at all. But if I were to return to an Orb after deleting my data, it would still recognize those codes as uniquely mine. Once you look into the Orb, a piece of your identity remains in the system forever. If users could truly delete that data, the premise of one ID per human would collapse, Tools for Humanity’s chief privacy officer Damien Kieran tells me when I call seeking an explanation. People could delete and sign up for new World IDs after being suspended from a platform. Or claim their Worldcoin tokens, sell them, delete their data, and cash in again. This argument fell flat with European Union regulators in Germany, who recently declared that the Orb posed “fundamental data protection issues” and ordered the company to allow European users to fully delete even their anonymized data.“Just like any other technology service, users cannot delete data that is not personal data,” Kieran said in a statement. “If a person could delete anonymized data that can’t be linked to them by World or any third party, it would allow bad actors to circumvent the security and safety that World ID is working to bring to every human.”On a balmy afternoon this spring, I climb a flight of stairs up to a room above a restaurant in an outer suburb of Seoul. Five elderly South Koreans tap on their phones as they wait to be “verified” by the two Orbs in the center of the room. “We don’t really know how to distinguish between AI and humans anymore,” an attendant in a company t-shirt explains in Korean, gesturing toward the spheres. “We need a way to verify that we’re human and not AI. So how do we do that? Well, humans have irises, but AI doesn’t.”The attendant ushers an elderly woman over to an Orb. It bleeps. “Open your eyes,” a disembodied voice says in English. The woman stares into the camera. Seconds later, she checks her phone and sees that a packet of Worldcoin worth 75,000 Korean wonhas landed in her digital wallet. Congratulations, the app tells her. You are now a verified human.A visitor views the Orbs in Seoul on April 14, 2025. Taemin Ha for TIMETools for Humanity aims to “verify” 1 million Koreans over the next year. Taemin Ha for TIMEA couple dozen Orbs have been available in South Korea since 2023, verifying roughly 55,000 people. Now Tools for Humanity is redoubling its efforts there. At an event in a traditional wooden hanok house in central Seoul, an executive announces that 250 Orbs will soon be dispersed around the country—with the aim of verifying 1 million Koreans in the next 12 months. South Korea has high levels of smartphone usage, crypto and AI adoption, and Internet access, while average wages are modest enough for the free Worldcoin on offer to still be an enticing draw—all of which makes it fertile testing ground for the company’s ambitious global expansion. Yet things seem off to a slow start. In a retail space I visited in central Seoul, Tools for Humanity had constructed a wooden structure with eight Orbs facing each other. Locals and tourists wander past looking bemused; few volunteer themselves up. Most who do tell me they are crypto enthusiasts who came intentionally, driven more by the spirit of early adoption than the free coins. The next day, I visit a coffee shop in central Seoul where a chrome Orb sits unassumingly in one corner. Wu Ruijun, a 20-year-old student from China, strikes up a conversation with the barista, who doubles as the Orb’s operator. Wu was invited here by a friend who said both could claim free cryptocurrency if he signed up. The barista speeds him through the process. Wu accepts the privacy disclosure without reading it, and widens his eyes for the Orb. Soon he’s verified. “I wasn’t told anything about the privacy policy,” he says on his way out. “I just came for the money.”As Altman’s car winds through San Francisco, I ask about the vision he laid out in 2019: that AI would make it harder for us to trust each other online. To my surprise, he rejects the framing. “I’m much morelike: what is the good we can create, rather than the bad we can stop?” he says. “It’s not like, ‘Oh, we’ve got to avoid the bot overrun’ or whatever. It’s just that we can do a lot of special things for humans.” It’s an answer that may reflect how his role has changed over the years. Altman is now the chief public cheerleader of a billion company that’s touting the transformative utility of AI agents. The rise of agents, he and others say, will be a boon for our quality of life—like having an assistant on hand who can answer your most pressing questions, carry out mundane tasks, and help you develop new skills. It’s an optimistic vision that may well pan out. But it doesn’t quite fit with the prophecies of AI-enabled infopocalypse that Tools for Humanity was founded upon.Altman waves away a question about the influence he and other investors stand to gain if their vision is realized. Most holders, he assumes, will have already started selling their tokens—too early, he adds. “What I think would be bad is if an early crew had a lot of control over the protocol,” he says, “and that’s where I think the commitment to decentralization is so cool.” Altman is referring to the World Protocol, the underlying technology upon which the Orb, Worldcoin, and World ID all rely. Tools for Humanity is developing it, but has committed to giving control to its users over time—a process they say will prevent power from being concentrated in the hands of a few executives or investors. Tools for Humanity would remain a for-profit company, and could levy fees on platforms that use World ID, but other companies would be able to compete for customers by building alternative apps—or even alternative Orbs. The plan draws on ideas that animated the crypto ecosystem in the late 2010s and early 2020s, when evangelists for emerging blockchain technologies argued that the centralization of power—especially in large so-called “Web 2.0” tech companies—was responsible for many of the problems plaguing the modern Internet. Just as decentralized cryptocurrencies could reform a financial system controlled by economic elites, so too would it be possible to create decentralized organizations, run by their members instead of CEOs. How such a system might work in practice remains unclear. “Building a community-based governance system,” Tools for Humanity says in a 2023 white paper, “represents perhaps the most formidable challenge of the entire project.”Altman has a pattern of making idealistic promises that shift over time. He founded OpenAI as a nonprofit in 2015, with a mission to develop AGI safely and for the benefit of all humanity. To raise money, OpenAI restructured itself as a for-profit company in 2019, but with overall control still in the hands of its nonprofit board. Last year, Altman proposed yet another restructure—one which would dilute the board’s control and allow more profits to flow to shareholders. Why, I ask, should the public trust Tools for Humanity’s commitment to freely surrender influence and power? “I think you will just see the continued decentralization via the protocol,” he says. “The value here is going to live in the network, and the network will be owned and governed by a lot of people.” Altman talks less about universal basic income these days. He recently mused about an alternative, which he called “universal basic compute.” Instead of AI companies redistributing their profits, he seemed to suggest, they could instead give everyone in the world fair access to super-powerful AI. Blania tells me he recently “made the decision to stop talking” about UBI at Tools for Humanity. “UBI is one potential answer,” he says. “Just givingaccess to the latestmodels and having them learn faster and better is another.” Says Altman: “I still don’t know what the right answer is. I believe we should do a better job of distribution of resources than we currently do.” When I probe the question of why people should trust him, Altman gets irritated. “I understand that you hate AI, and that’s fine,” he says. “If you want to frame it as the downside of AI is that there’s going to be a proliferation of very convincing AI systems that are pretending to be human, and we need ways to know what is really human-authorized versus not, then yeah, I think you can call that a downside of AI. It’s not how I would naturally frame it.” The phrase human-authorized hints at a tension between World ID and OpenAI’s plans for AI agents. An Internet where a World ID is required to access most services might impede the usefulness of the agents that OpenAI and others are developing. So Tools for Humanity is building a system that would allow users to delegate their World ID to an agent, allowing the bot to take actions online on their behalf, according to Tiago Sada, the company’s chief product officer. “We’ve built everything in a way that can be very easily delegatable to an agent,” Sada says. It’s a measure that would allow humans to be held accountable for the actions of their AIs. But it suggests that Tools for Humanity’s mission may be shifting beyond simply proving humanity, and toward becoming the infrastructure that enables AI agents to proliferate with human authorization. World ID doesn’t tell you whether a piece of content is AI-generated or human-generated; all it tells you is whether the account that posted it is a human or a bot. Even in a world where everybody had a World ID, our online spaces might still be filled with AI-generated text, images, and videos.As I say goodbye to Altman, I’m left feeling conflicted about his project. If the Internet is going to be transformed by AI agents, then some kind of proof-of-humanity system will almost certainly be necessary. Yet if the Orb becomes a piece of Internet infrastructure, it could give Altman—a beneficiary of the proliferation of AI content—significant influence over a leading defense mechanism against it. People might have no choice but to participate in the network in order to access social media or online services.I thought of an encounter I witnessed in Seoul. In the room above the restaurant, Cho Jeong-yeon, 75, watched her friend get verified by an Orb. Cho had been invited to do the same, but demurred. The reward wasn’t enough for her to surrender a part of her identity. “Your iris is uniquely yours, and we don’t really know how it might be used,” she says. “Seeing the machine made me think: are we becoming machines instead of humans now? Everything is changing, and we don’t know how it’ll all turn out.”—With reporting by Stephen Kim/Seoul. This story was supported by Tarbell Grants.Correction, May 30The original version of this story misstated the market capitalization of Worldcoin if all coins were in circulation. It is billion, not billion.
    #orb #will #see #you #now
    The Orb Will See You Now
    Once again, Sam Altman wants to show you the future. The CEO of OpenAI is standing on a sparse stage in San Francisco, preparing to reveal his next move to an attentive crowd. “We needed some way for identifying, authenticating humans in the age of AGI,” Altman explains, referring to artificial general intelligence. “We wanted a way to make sure that humans stayed special and central.” The solution Altman came up with is looming behind him. It’s a white sphere about the size of a beach ball, with a camera at its center. The company that makes it, known as Tools for Humanity, calls this mysterious device the Orb. Stare into the heart of the plastic-and-silicon globe and it will map the unique furrows and ciliary zones of your iris. Seconds later, you’ll receive inviolable proof of your humanity: a 12,800-digit binary number, known as an iris code, sent to an app on your phone. At the same time, a packet of cryptocurrency called Worldcoin, worth approximately will be transferred to your digital wallet—your reward for becoming a “verified human.” Altman co-founded Tools for Humanity in 2019 as part of a suite of companies he believed would reshape the world. Once the tech he was developing at OpenAI passed a certain level of intelligence, he reasoned, it would mark the end of one era on the Internet and the beginning of another, in which AI became so advanced, so human-like, that you would no longer be able to tell whether what you read, saw, or heard online came from a real person. When that happened, Altman imagined, we would need a new kind of online infrastructure: a human-verification layer for the Internet, to distinguish real people from the proliferating number of bots and AI “agents.”And so Tools for Humanity set out to build a global “proof-of-humanity” network. It aims to verify 50 million people by the end of 2025; ultimately its goal is to sign up every single human being on the planet. The free crypto serves as both an incentive for users to sign up, and also an entry point into what the company hopes will become the world’s largest financial network, through which it believes “double-digit percentages of the global economy” will eventually flow. Even for Altman, these missions are audacious. “If this really works, it’s like a fundamental piece of infrastructure for the world,” Altman tells TIME in a video interview from the passenger seat of a car a few days before his April 30 keynote address.Internal hardware of the Orb in mid-assembly in March. Davide Monteleone for TIMEThe project’s goal is to solve a problem partly of Altman’s own making. In the near future, he and other tech leaders say, advanced AIs will be imbued with agency: the ability to not just respond to human prompting, but to take actions independently in the world. This will enable the creation of AI coworkers that can drop into your company and begin solving problems; AI tutors that can adapt their teaching style to students’ preferences; even AI doctors that can diagnose routine cases and handle scheduling or logistics. The arrival of these virtual agents, their venture capitalist backers predict, will turbocharge our productivity and unleash an age of material abundance.But AI agents will also have cascading consequences for the human experience online. “As AI systems become harder to distinguish from people, websites may face difficult trade-offs,” says a recent paper by researchers from 25 different universities, nonprofits, and tech companies, including OpenAI. “There is a significant risk that digital institutions will be unprepared for a time when AI-powered agents, including those leveraged by malicious actors, overwhelm other activity online.” On social-media platforms like X and Facebook, bot-driven accounts are amassing billions of views on AI-generated content. In April, the foundation that runs Wikipedia disclosed that AI bots scraping their site were making the encyclopedia too costly to sustainably run. Later the same month, researchers from the University of Zurich found that AI-generated comments on the subreddit /r/ChangeMyView were up to six times more successful than human-written ones at persuading unknowing users to change their minds.  Photograph by Davide Monteleone for TIMEBuy a copy of the Orb issue hereThe arrival of agents won’t only threaten our ability to distinguish between authentic and AI content online. It will also challenge the Internet’s core business model, online advertising, which relies on the assumption that ads are being viewed by humans. “The Internet will change very drastically sometime in the next 12 to 24 months,” says Tools for Humanity CEO Alex Blania. “So we have to succeed, or I’m not sure what else would happen.”For four years, Blania’s team has been testing the Orb’s hardware abroad. Now the U.S. rollout has arrived. Over the next 12 months, 7,500 Orbs will be arriving in dozens of American cities, in locations like gas stations, bodegas, and flagship stores in Los Angeles, Austin, and Miami. The project’s founders and fans hope the Orb’s U.S. debut will kickstart a new phase of growth. The San Francisco keynote was titled: “At Last.” It’s not clear the public appetite matches the exultant branding. Tools for Humanity has “verified” just 12 million humans since mid 2023, a pace Blania concedes is well behind schedule. Few online platforms currently support the so-called “World ID” that the Orb bestows upon its visitors, leaving little to entice users to give up their biometrics beyond the lure of free crypto. Even Altman isn’t sure whether the whole thing can work. “I can seethis becomes a fairly mainstream thing in a few years,” he says. “Or I can see that it’s still only used by a small subset of people who think about the world in a certain way.” Blaniaand Altman debut the Orb at World’s U.S. launch in San Francisco on April 30, 2025. Jason Henry—The New York Times/ReduxYet as the Internet becomes overrun with AI, the creators of this strange new piece of hardware are betting that everybody in the world will soon want—or need—to visit an Orb. The biometric code it creates, they predict, will become a new type of digital passport, without which you might be denied passage to the Internet of the future, from dating apps to government services. In a best-case scenario, World ID could be a privacy-preserving way to fortify the Internet against an AI-driven deluge of fake or deceptive content. It could also enable the distribution of universal basic income—a policy that Altman has previously touted—as AI automation transforms the global economy. To examine what this new technology might mean, I reported from three continents, interviewed 10 Tools for Humanity executives and investors, reviewed hundreds of pages of company documents, and “verified” my own humanity. The Internet will inevitably need some kind of proof-of-humanity system in the near future, says Divya Siddarth, founder of the nonprofit Collective Intelligence Project. The real question, she argues, is whether such a system will be centralized—“a big security nightmare that enables a lot of surveillance”—or privacy-preserving, as the Orb claims to be. Questions remain about Tools for Humanity’s corporate structure, its yoking to an unstable cryptocurrency, and what power it would concentrate in the hands of its owners if successful. Yet it’s also one of the only attempts to solve what many see as an increasingly urgent problem. “There are some issues with it,” Siddarth says of World ID. “But you can’t preserve the Internet in amber. Something in this direction is necessary.”In March, I met Blania at Tools for Humanity’s San Francisco headquarters, where a large screen displays the number of weekly “Orb verifications” by country. A few days earlier, the CEO had attended a million-per-head dinner at Mar-a-Lago with President Donald Trump, whom he credits with clearing the way for the company’s U.S. launch by relaxing crypto regulations. “Given Sam is a very high profile target,” Blania says, “we just decided that we would let other companies fight that fight, and enter the U.S. once the air is clear.” As a kid growing up in Germany, Blania was a little different than his peers. “Other kids were, like, drinking a lot, or doing a lot of parties, and I was just building a lot of things that could potentially blow up,” he recalls. At the California Institute of Technology, where he was pursuing research for a masters degree, he spent many evenings reading the blogs of startup gurus like Paul Graham and Altman. Then, in 2019, Blania received an email from Max Novendstern, an entrepreneur who had been kicking around a concept with Altman to build a global cryptocurrency network. They were looking for technical minds to help with the project. Over cappuccinos, Altman told Blania he was certain about three things. First, smarter-than-human AI was not only possible, but inevitable—and it would soon mean you could no longer assume that anything you read, saw, or heard on the Internet was human-created. Second, cryptocurrency and other decentralized technologies would be a massive force for change in the world. And third, scale was essential to any crypto network’s value. The Orb is tested on a calibration rig, surrounded by checkerboard targets to ensure precision in iris detection. Davide Monteleone for TIMEThe goal of Worldcoin, as the project was initially called, was to combine those three insights. Altman took a lesson from PayPal, the company co-founded by his mentor Peter Thiel. Of its initial funding, PayPal spent less than million actually building its app—but pumped an additional million or so into a referral program, whereby new users and the person who invited them would each receive in credit. The referral program helped make PayPal a leading payment platform. Altman thought a version of that strategy would propel Worldcoin to similar heights. He wanted to create a new cryptocurrency and give it to users as a reward for signing up. The more people who joined the system, the higher the token’s value would theoretically rise. Since 2019, the project has raised million from investors like Coinbase and the venture capital firm Andreessen Horowitz. That money paid for the million cost of designing the Orb, plus maintaining the software it runs on. The total market value of all Worldcoins in existence, however, is far higher—around billion. That number is a bit misleading: most of those coins are not in circulation and Worldcoin’s price has fluctuated wildly. Still, it allows the company to reward users for signing up at no cost to itself. The main lure for investors is the crypto upside. Some 75% of all Worldcoins are set aside for humans to claim when they sign up, or as referral bonuses. The remaining 25% are split between Tools for Humanity’s backers and staff, including Blania and Altman. “I’m really excited to make a lot of money,” ” Blania says.From the beginning, Altman was thinking about the consequences of the AI revolution he intended to unleash.A future in which advanced AI could perform most tasks more effectively than humans would bring a wave of unemployment and economic dislocation, he reasoned. Some kind of wealth redistribution might be necessary. In 2016, he partially funded a study of basic income, which gave per-month handouts to low-income individuals in Illinois and Texas. But there was no single financial system that would allow money to be sent to everybody in the world. Nor was there a way to stop an individual human from claiming their share twice—or to identify a sophisticated AI pretending to be human and pocketing some cash of its own. In 2023, Tools for Humanity raised the possibility of using the network to redistribute the profits of AI labs that were able to automate human labor. “As AI advances,” it said, “fairly distributing access and some of the created value through UBI will play an increasingly vital role in counteracting the concentration of economic power.”Blania was taken by the pitch, and agreed to join the project as a co-founder. “Most people told us we were very stupid or crazy or insane, including Silicon Valley investors,” Blania says. At least until ChatGPT came out in 2022, transforming OpenAI into one of the world’s most famous tech companies and kickstarting a market bull-run. “Things suddenly started to make more and more sense to the external world,” Blania says of the vision to develop a global “proof-of-humanity” network. “You have to imagine a world in which you will have very smart and competent systems somehow flying through the Internet with different goals and ideas of what they want to do, and us having no idea anymore what we’re dealing with.”After our interview, Blania’s head of communications ushers me over to a circular wooden structure where eight Orbs face one another. The scene feels like a cross between an Apple Store and a ceremonial altar. “Do you want to get verified?” she asks. Putting aside my reservations for the purposes of research, I download the World App and follow its prompts. I flash a QR code at the Orb, then gaze into it. A minute or so later, my phone buzzes with confirmation: I’ve been issued my own personal World ID and some Worldcoin.The first thing the Orb does is check if you’re human, using a neural network that takes input from various sensors, including an infrared camera and a thermometer. Davide Monteleone for TIMEWhile I stared into the Orb, several complex procedures had taken place at once. A neural network took inputs from multiple sensors—an infrared camera, a thermometer—to confirm I was a living human. Simultaneously, a telephoto lens zoomed in on my iris, capturing the physical traits within that distinguish me from every other human on Earth. It then converted that image into an iris code: a numerical abstraction of my unique biometric data. Then the Orb checked to see if my iris code matched any it had seen before, using a technique allowing encrypted data to be compared without revealing the underlying information. Before the Orb deleted my data, it turned my iris code into several derivative codes—none of which on its own can be linked back to the original—encrypted them, deleted the only copies of the decryption keys, and sent each one to a different secure server, so that future users’ iris codes can be checked for uniqueness against mine. If I were to use my World ID to access a website, that site would learn nothing about me except that I’m human. The Orb is open-source, so outside experts can examine its code and verify the company’s privacy claims. “I did a colonoscopy on this company and these technologies before I agreed to join,” says Trevor Traina, a Trump donor and former U.S. ambassador to Austria who now serves as Tools for Humanity’s chief business officer. “It is the most privacy-preserving technology on the planet.”Only weeks later, when researching what would happen if I wanted to delete my data, do I discover that Tools for Humanity’s privacy claims rest on what feels like a sleight of hand. The company argues that in modifying your iris code, it has “effectively anonymized” your biometric data. If you ask Tools for Humanity to delete your iris codes, they will delete the one stored on your phone, but not the derivatives. Those, they argue, are no longer your personal data at all. But if I were to return to an Orb after deleting my data, it would still recognize those codes as uniquely mine. Once you look into the Orb, a piece of your identity remains in the system forever. If users could truly delete that data, the premise of one ID per human would collapse, Tools for Humanity’s chief privacy officer Damien Kieran tells me when I call seeking an explanation. People could delete and sign up for new World IDs after being suspended from a platform. Or claim their Worldcoin tokens, sell them, delete their data, and cash in again. This argument fell flat with European Union regulators in Germany, who recently declared that the Orb posed “fundamental data protection issues” and ordered the company to allow European users to fully delete even their anonymized data.“Just like any other technology service, users cannot delete data that is not personal data,” Kieran said in a statement. “If a person could delete anonymized data that can’t be linked to them by World or any third party, it would allow bad actors to circumvent the security and safety that World ID is working to bring to every human.”On a balmy afternoon this spring, I climb a flight of stairs up to a room above a restaurant in an outer suburb of Seoul. Five elderly South Koreans tap on their phones as they wait to be “verified” by the two Orbs in the center of the room. “We don’t really know how to distinguish between AI and humans anymore,” an attendant in a company t-shirt explains in Korean, gesturing toward the spheres. “We need a way to verify that we’re human and not AI. So how do we do that? Well, humans have irises, but AI doesn’t.”The attendant ushers an elderly woman over to an Orb. It bleeps. “Open your eyes,” a disembodied voice says in English. The woman stares into the camera. Seconds later, she checks her phone and sees that a packet of Worldcoin worth 75,000 Korean wonhas landed in her digital wallet. Congratulations, the app tells her. You are now a verified human.A visitor views the Orbs in Seoul on April 14, 2025. Taemin Ha for TIMETools for Humanity aims to “verify” 1 million Koreans over the next year. Taemin Ha for TIMEA couple dozen Orbs have been available in South Korea since 2023, verifying roughly 55,000 people. Now Tools for Humanity is redoubling its efforts there. At an event in a traditional wooden hanok house in central Seoul, an executive announces that 250 Orbs will soon be dispersed around the country—with the aim of verifying 1 million Koreans in the next 12 months. South Korea has high levels of smartphone usage, crypto and AI adoption, and Internet access, while average wages are modest enough for the free Worldcoin on offer to still be an enticing draw—all of which makes it fertile testing ground for the company’s ambitious global expansion. Yet things seem off to a slow start. In a retail space I visited in central Seoul, Tools for Humanity had constructed a wooden structure with eight Orbs facing each other. Locals and tourists wander past looking bemused; few volunteer themselves up. Most who do tell me they are crypto enthusiasts who came intentionally, driven more by the spirit of early adoption than the free coins. The next day, I visit a coffee shop in central Seoul where a chrome Orb sits unassumingly in one corner. Wu Ruijun, a 20-year-old student from China, strikes up a conversation with the barista, who doubles as the Orb’s operator. Wu was invited here by a friend who said both could claim free cryptocurrency if he signed up. The barista speeds him through the process. Wu accepts the privacy disclosure without reading it, and widens his eyes for the Orb. Soon he’s verified. “I wasn’t told anything about the privacy policy,” he says on his way out. “I just came for the money.”As Altman’s car winds through San Francisco, I ask about the vision he laid out in 2019: that AI would make it harder for us to trust each other online. To my surprise, he rejects the framing. “I’m much morelike: what is the good we can create, rather than the bad we can stop?” he says. “It’s not like, ‘Oh, we’ve got to avoid the bot overrun’ or whatever. It’s just that we can do a lot of special things for humans.” It’s an answer that may reflect how his role has changed over the years. Altman is now the chief public cheerleader of a billion company that’s touting the transformative utility of AI agents. The rise of agents, he and others say, will be a boon for our quality of life—like having an assistant on hand who can answer your most pressing questions, carry out mundane tasks, and help you develop new skills. It’s an optimistic vision that may well pan out. But it doesn’t quite fit with the prophecies of AI-enabled infopocalypse that Tools for Humanity was founded upon.Altman waves away a question about the influence he and other investors stand to gain if their vision is realized. Most holders, he assumes, will have already started selling their tokens—too early, he adds. “What I think would be bad is if an early crew had a lot of control over the protocol,” he says, “and that’s where I think the commitment to decentralization is so cool.” Altman is referring to the World Protocol, the underlying technology upon which the Orb, Worldcoin, and World ID all rely. Tools for Humanity is developing it, but has committed to giving control to its users over time—a process they say will prevent power from being concentrated in the hands of a few executives or investors. Tools for Humanity would remain a for-profit company, and could levy fees on platforms that use World ID, but other companies would be able to compete for customers by building alternative apps—or even alternative Orbs. The plan draws on ideas that animated the crypto ecosystem in the late 2010s and early 2020s, when evangelists for emerging blockchain technologies argued that the centralization of power—especially in large so-called “Web 2.0” tech companies—was responsible for many of the problems plaguing the modern Internet. Just as decentralized cryptocurrencies could reform a financial system controlled by economic elites, so too would it be possible to create decentralized organizations, run by their members instead of CEOs. How such a system might work in practice remains unclear. “Building a community-based governance system,” Tools for Humanity says in a 2023 white paper, “represents perhaps the most formidable challenge of the entire project.”Altman has a pattern of making idealistic promises that shift over time. He founded OpenAI as a nonprofit in 2015, with a mission to develop AGI safely and for the benefit of all humanity. To raise money, OpenAI restructured itself as a for-profit company in 2019, but with overall control still in the hands of its nonprofit board. Last year, Altman proposed yet another restructure—one which would dilute the board’s control and allow more profits to flow to shareholders. Why, I ask, should the public trust Tools for Humanity’s commitment to freely surrender influence and power? “I think you will just see the continued decentralization via the protocol,” he says. “The value here is going to live in the network, and the network will be owned and governed by a lot of people.” Altman talks less about universal basic income these days. He recently mused about an alternative, which he called “universal basic compute.” Instead of AI companies redistributing their profits, he seemed to suggest, they could instead give everyone in the world fair access to super-powerful AI. Blania tells me he recently “made the decision to stop talking” about UBI at Tools for Humanity. “UBI is one potential answer,” he says. “Just givingaccess to the latestmodels and having them learn faster and better is another.” Says Altman: “I still don’t know what the right answer is. I believe we should do a better job of distribution of resources than we currently do.” When I probe the question of why people should trust him, Altman gets irritated. “I understand that you hate AI, and that’s fine,” he says. “If you want to frame it as the downside of AI is that there’s going to be a proliferation of very convincing AI systems that are pretending to be human, and we need ways to know what is really human-authorized versus not, then yeah, I think you can call that a downside of AI. It’s not how I would naturally frame it.” The phrase human-authorized hints at a tension between World ID and OpenAI’s plans for AI agents. An Internet where a World ID is required to access most services might impede the usefulness of the agents that OpenAI and others are developing. So Tools for Humanity is building a system that would allow users to delegate their World ID to an agent, allowing the bot to take actions online on their behalf, according to Tiago Sada, the company’s chief product officer. “We’ve built everything in a way that can be very easily delegatable to an agent,” Sada says. It’s a measure that would allow humans to be held accountable for the actions of their AIs. But it suggests that Tools for Humanity’s mission may be shifting beyond simply proving humanity, and toward becoming the infrastructure that enables AI agents to proliferate with human authorization. World ID doesn’t tell you whether a piece of content is AI-generated or human-generated; all it tells you is whether the account that posted it is a human or a bot. Even in a world where everybody had a World ID, our online spaces might still be filled with AI-generated text, images, and videos.As I say goodbye to Altman, I’m left feeling conflicted about his project. If the Internet is going to be transformed by AI agents, then some kind of proof-of-humanity system will almost certainly be necessary. Yet if the Orb becomes a piece of Internet infrastructure, it could give Altman—a beneficiary of the proliferation of AI content—significant influence over a leading defense mechanism against it. People might have no choice but to participate in the network in order to access social media or online services.I thought of an encounter I witnessed in Seoul. In the room above the restaurant, Cho Jeong-yeon, 75, watched her friend get verified by an Orb. Cho had been invited to do the same, but demurred. The reward wasn’t enough for her to surrender a part of her identity. “Your iris is uniquely yours, and we don’t really know how it might be used,” she says. “Seeing the machine made me think: are we becoming machines instead of humans now? Everything is changing, and we don’t know how it’ll all turn out.”—With reporting by Stephen Kim/Seoul. This story was supported by Tarbell Grants.Correction, May 30The original version of this story misstated the market capitalization of Worldcoin if all coins were in circulation. It is billion, not billion. #orb #will #see #you #now
    The Orb Will See You Now
    time.com
    Once again, Sam Altman wants to show you the future. The CEO of OpenAI is standing on a sparse stage in San Francisco, preparing to reveal his next move to an attentive crowd. “We needed some way for identifying, authenticating humans in the age of AGI,” Altman explains, referring to artificial general intelligence. “We wanted a way to make sure that humans stayed special and central.” The solution Altman came up with is looming behind him. It’s a white sphere about the size of a beach ball, with a camera at its center. The company that makes it, known as Tools for Humanity, calls this mysterious device the Orb. Stare into the heart of the plastic-and-silicon globe and it will map the unique furrows and ciliary zones of your iris. Seconds later, you’ll receive inviolable proof of your humanity: a 12,800-digit binary number, known as an iris code, sent to an app on your phone. At the same time, a packet of cryptocurrency called Worldcoin, worth approximately $42, will be transferred to your digital wallet—your reward for becoming a “verified human.” Altman co-founded Tools for Humanity in 2019 as part of a suite of companies he believed would reshape the world. Once the tech he was developing at OpenAI passed a certain level of intelligence, he reasoned, it would mark the end of one era on the Internet and the beginning of another, in which AI became so advanced, so human-like, that you would no longer be able to tell whether what you read, saw, or heard online came from a real person. When that happened, Altman imagined, we would need a new kind of online infrastructure: a human-verification layer for the Internet, to distinguish real people from the proliferating number of bots and AI “agents.”And so Tools for Humanity set out to build a global “proof-of-humanity” network. It aims to verify 50 million people by the end of 2025; ultimately its goal is to sign up every single human being on the planet. The free crypto serves as both an incentive for users to sign up, and also an entry point into what the company hopes will become the world’s largest financial network, through which it believes “double-digit percentages of the global economy” will eventually flow. Even for Altman, these missions are audacious. “If this really works, it’s like a fundamental piece of infrastructure for the world,” Altman tells TIME in a video interview from the passenger seat of a car a few days before his April 30 keynote address.Internal hardware of the Orb in mid-assembly in March. Davide Monteleone for TIMEThe project’s goal is to solve a problem partly of Altman’s own making. In the near future, he and other tech leaders say, advanced AIs will be imbued with agency: the ability to not just respond to human prompting, but to take actions independently in the world. This will enable the creation of AI coworkers that can drop into your company and begin solving problems; AI tutors that can adapt their teaching style to students’ preferences; even AI doctors that can diagnose routine cases and handle scheduling or logistics. The arrival of these virtual agents, their venture capitalist backers predict, will turbocharge our productivity and unleash an age of material abundance.But AI agents will also have cascading consequences for the human experience online. “As AI systems become harder to distinguish from people, websites may face difficult trade-offs,” says a recent paper by researchers from 25 different universities, nonprofits, and tech companies, including OpenAI. “There is a significant risk that digital institutions will be unprepared for a time when AI-powered agents, including those leveraged by malicious actors, overwhelm other activity online.” On social-media platforms like X and Facebook, bot-driven accounts are amassing billions of views on AI-generated content. In April, the foundation that runs Wikipedia disclosed that AI bots scraping their site were making the encyclopedia too costly to sustainably run. Later the same month, researchers from the University of Zurich found that AI-generated comments on the subreddit /r/ChangeMyView were up to six times more successful than human-written ones at persuading unknowing users to change their minds.  Photograph by Davide Monteleone for TIMEBuy a copy of the Orb issue hereThe arrival of agents won’t only threaten our ability to distinguish between authentic and AI content online. It will also challenge the Internet’s core business model, online advertising, which relies on the assumption that ads are being viewed by humans. “The Internet will change very drastically sometime in the next 12 to 24 months,” says Tools for Humanity CEO Alex Blania. “So we have to succeed, or I’m not sure what else would happen.”For four years, Blania’s team has been testing the Orb’s hardware abroad. Now the U.S. rollout has arrived. Over the next 12 months, 7,500 Orbs will be arriving in dozens of American cities, in locations like gas stations, bodegas, and flagship stores in Los Angeles, Austin, and Miami. The project’s founders and fans hope the Orb’s U.S. debut will kickstart a new phase of growth. The San Francisco keynote was titled: “At Last.” It’s not clear the public appetite matches the exultant branding. Tools for Humanity has “verified” just 12 million humans since mid 2023, a pace Blania concedes is well behind schedule. Few online platforms currently support the so-called “World ID” that the Orb bestows upon its visitors, leaving little to entice users to give up their biometrics beyond the lure of free crypto. Even Altman isn’t sure whether the whole thing can work. “I can see [how] this becomes a fairly mainstream thing in a few years,” he says. “Or I can see that it’s still only used by a small subset of people who think about the world in a certain way.” Blania (left) and Altman debut the Orb at World’s U.S. launch in San Francisco on April 30, 2025. Jason Henry—The New York Times/ReduxYet as the Internet becomes overrun with AI, the creators of this strange new piece of hardware are betting that everybody in the world will soon want—or need—to visit an Orb. The biometric code it creates, they predict, will become a new type of digital passport, without which you might be denied passage to the Internet of the future, from dating apps to government services. In a best-case scenario, World ID could be a privacy-preserving way to fortify the Internet against an AI-driven deluge of fake or deceptive content. It could also enable the distribution of universal basic income (UBI)—a policy that Altman has previously touted—as AI automation transforms the global economy. To examine what this new technology might mean, I reported from three continents, interviewed 10 Tools for Humanity executives and investors, reviewed hundreds of pages of company documents, and “verified” my own humanity. The Internet will inevitably need some kind of proof-of-humanity system in the near future, says Divya Siddarth, founder of the nonprofit Collective Intelligence Project. The real question, she argues, is whether such a system will be centralized—“a big security nightmare that enables a lot of surveillance”—or privacy-preserving, as the Orb claims to be. Questions remain about Tools for Humanity’s corporate structure, its yoking to an unstable cryptocurrency, and what power it would concentrate in the hands of its owners if successful. Yet it’s also one of the only attempts to solve what many see as an increasingly urgent problem. “There are some issues with it,” Siddarth says of World ID. “But you can’t preserve the Internet in amber. Something in this direction is necessary.”In March, I met Blania at Tools for Humanity’s San Francisco headquarters, where a large screen displays the number of weekly “Orb verifications” by country. A few days earlier, the CEO had attended a $1 million-per-head dinner at Mar-a-Lago with President Donald Trump, whom he credits with clearing the way for the company’s U.S. launch by relaxing crypto regulations. “Given Sam is a very high profile target,” Blania says, “we just decided that we would let other companies fight that fight, and enter the U.S. once the air is clear.” As a kid growing up in Germany, Blania was a little different than his peers. “Other kids were, like, drinking a lot, or doing a lot of parties, and I was just building a lot of things that could potentially blow up,” he recalls. At the California Institute of Technology, where he was pursuing research for a masters degree, he spent many evenings reading the blogs of startup gurus like Paul Graham and Altman. Then, in 2019, Blania received an email from Max Novendstern, an entrepreneur who had been kicking around a concept with Altman to build a global cryptocurrency network. They were looking for technical minds to help with the project. Over cappuccinos, Altman told Blania he was certain about three things. First, smarter-than-human AI was not only possible, but inevitable—and it would soon mean you could no longer assume that anything you read, saw, or heard on the Internet was human-created. Second, cryptocurrency and other decentralized technologies would be a massive force for change in the world. And third, scale was essential to any crypto network’s value. The Orb is tested on a calibration rig, surrounded by checkerboard targets to ensure precision in iris detection. Davide Monteleone for TIMEThe goal of Worldcoin, as the project was initially called, was to combine those three insights. Altman took a lesson from PayPal, the company co-founded by his mentor Peter Thiel. Of its initial funding, PayPal spent less than $10 million actually building its app—but pumped an additional $70 million or so into a referral program, whereby new users and the person who invited them would each receive $10 in credit. The referral program helped make PayPal a leading payment platform. Altman thought a version of that strategy would propel Worldcoin to similar heights. He wanted to create a new cryptocurrency and give it to users as a reward for signing up. The more people who joined the system, the higher the token’s value would theoretically rise. Since 2019, the project has raised $244 million from investors like Coinbase and the venture capital firm Andreessen Horowitz. That money paid for the $50 million cost of designing the Orb, plus maintaining the software it runs on. The total market value of all Worldcoins in existence, however, is far higher—around $12 billion. That number is a bit misleading: most of those coins are not in circulation and Worldcoin’s price has fluctuated wildly. Still, it allows the company to reward users for signing up at no cost to itself. The main lure for investors is the crypto upside. Some 75% of all Worldcoins are set aside for humans to claim when they sign up, or as referral bonuses. The remaining 25% are split between Tools for Humanity’s backers and staff, including Blania and Altman. “I’m really excited to make a lot of money,” ” Blania says.From the beginning, Altman was thinking about the consequences of the AI revolution he intended to unleash. (On May 21, he announced plans to team up with famed former Apple designer Jony Ive on a new AI personal device.) A future in which advanced AI could perform most tasks more effectively than humans would bring a wave of unemployment and economic dislocation, he reasoned. Some kind of wealth redistribution might be necessary. In 2016, he partially funded a study of basic income, which gave $1,000 per-month handouts to low-income individuals in Illinois and Texas. But there was no single financial system that would allow money to be sent to everybody in the world. Nor was there a way to stop an individual human from claiming their share twice—or to identify a sophisticated AI pretending to be human and pocketing some cash of its own. In 2023, Tools for Humanity raised the possibility of using the network to redistribute the profits of AI labs that were able to automate human labor. “As AI advances,” it said, “fairly distributing access and some of the created value through UBI will play an increasingly vital role in counteracting the concentration of economic power.”Blania was taken by the pitch, and agreed to join the project as a co-founder. “Most people told us we were very stupid or crazy or insane, including Silicon Valley investors,” Blania says. At least until ChatGPT came out in 2022, transforming OpenAI into one of the world’s most famous tech companies and kickstarting a market bull-run. “Things suddenly started to make more and more sense to the external world,” Blania says of the vision to develop a global “proof-of-humanity” network. “You have to imagine a world in which you will have very smart and competent systems somehow flying through the Internet with different goals and ideas of what they want to do, and us having no idea anymore what we’re dealing with.”After our interview, Blania’s head of communications ushers me over to a circular wooden structure where eight Orbs face one another. The scene feels like a cross between an Apple Store and a ceremonial altar. “Do you want to get verified?” she asks. Putting aside my reservations for the purposes of research, I download the World App and follow its prompts. I flash a QR code at the Orb, then gaze into it. A minute or so later, my phone buzzes with confirmation: I’ve been issued my own personal World ID and some Worldcoin.The first thing the Orb does is check if you’re human, using a neural network that takes input from various sensors, including an infrared camera and a thermometer. Davide Monteleone for TIMEWhile I stared into the Orb, several complex procedures had taken place at once. A neural network took inputs from multiple sensors—an infrared camera, a thermometer—to confirm I was a living human. Simultaneously, a telephoto lens zoomed in on my iris, capturing the physical traits within that distinguish me from every other human on Earth. It then converted that image into an iris code: a numerical abstraction of my unique biometric data. Then the Orb checked to see if my iris code matched any it had seen before, using a technique allowing encrypted data to be compared without revealing the underlying information. Before the Orb deleted my data, it turned my iris code into several derivative codes—none of which on its own can be linked back to the original—encrypted them, deleted the only copies of the decryption keys, and sent each one to a different secure server, so that future users’ iris codes can be checked for uniqueness against mine. If I were to use my World ID to access a website, that site would learn nothing about me except that I’m human. The Orb is open-source, so outside experts can examine its code and verify the company’s privacy claims. “I did a colonoscopy on this company and these technologies before I agreed to join,” says Trevor Traina, a Trump donor and former U.S. ambassador to Austria who now serves as Tools for Humanity’s chief business officer. “It is the most privacy-preserving technology on the planet.”Only weeks later, when researching what would happen if I wanted to delete my data, do I discover that Tools for Humanity’s privacy claims rest on what feels like a sleight of hand. The company argues that in modifying your iris code, it has “effectively anonymized” your biometric data. If you ask Tools for Humanity to delete your iris codes, they will delete the one stored on your phone, but not the derivatives. Those, they argue, are no longer your personal data at all. But if I were to return to an Orb after deleting my data, it would still recognize those codes as uniquely mine. Once you look into the Orb, a piece of your identity remains in the system forever. If users could truly delete that data, the premise of one ID per human would collapse, Tools for Humanity’s chief privacy officer Damien Kieran tells me when I call seeking an explanation. People could delete and sign up for new World IDs after being suspended from a platform. Or claim their Worldcoin tokens, sell them, delete their data, and cash in again. This argument fell flat with European Union regulators in Germany, who recently declared that the Orb posed “fundamental data protection issues” and ordered the company to allow European users to fully delete even their anonymized data. (Tools for Humanity has appealed; the regulator is now reassessing the decision.) “Just like any other technology service, users cannot delete data that is not personal data,” Kieran said in a statement. “If a person could delete anonymized data that can’t be linked to them by World or any third party, it would allow bad actors to circumvent the security and safety that World ID is working to bring to every human.”On a balmy afternoon this spring, I climb a flight of stairs up to a room above a restaurant in an outer suburb of Seoul. Five elderly South Koreans tap on their phones as they wait to be “verified” by the two Orbs in the center of the room. “We don’t really know how to distinguish between AI and humans anymore,” an attendant in a company t-shirt explains in Korean, gesturing toward the spheres. “We need a way to verify that we’re human and not AI. So how do we do that? Well, humans have irises, but AI doesn’t.”The attendant ushers an elderly woman over to an Orb. It bleeps. “Open your eyes,” a disembodied voice says in English. The woman stares into the camera. Seconds later, she checks her phone and sees that a packet of Worldcoin worth 75,000 Korean won (about $54) has landed in her digital wallet. Congratulations, the app tells her. You are now a verified human.A visitor views the Orbs in Seoul on April 14, 2025. Taemin Ha for TIMETools for Humanity aims to “verify” 1 million Koreans over the next year. Taemin Ha for TIMEA couple dozen Orbs have been available in South Korea since 2023, verifying roughly 55,000 people. Now Tools for Humanity is redoubling its efforts there. At an event in a traditional wooden hanok house in central Seoul, an executive announces that 250 Orbs will soon be dispersed around the country—with the aim of verifying 1 million Koreans in the next 12 months. South Korea has high levels of smartphone usage, crypto and AI adoption, and Internet access, while average wages are modest enough for the free Worldcoin on offer to still be an enticing draw—all of which makes it fertile testing ground for the company’s ambitious global expansion. Yet things seem off to a slow start. In a retail space I visited in central Seoul, Tools for Humanity had constructed a wooden structure with eight Orbs facing each other. Locals and tourists wander past looking bemused; few volunteer themselves up. Most who do tell me they are crypto enthusiasts who came intentionally, driven more by the spirit of early adoption than the free coins. The next day, I visit a coffee shop in central Seoul where a chrome Orb sits unassumingly in one corner. Wu Ruijun, a 20-year-old student from China, strikes up a conversation with the barista, who doubles as the Orb’s operator. Wu was invited here by a friend who said both could claim free cryptocurrency if he signed up. The barista speeds him through the process. Wu accepts the privacy disclosure without reading it, and widens his eyes for the Orb. Soon he’s verified. “I wasn’t told anything about the privacy policy,” he says on his way out. “I just came for the money.”As Altman’s car winds through San Francisco, I ask about the vision he laid out in 2019: that AI would make it harder for us to trust each other online. To my surprise, he rejects the framing. “I’m much more [about] like: what is the good we can create, rather than the bad we can stop?” he says. “It’s not like, ‘Oh, we’ve got to avoid the bot overrun’ or whatever. It’s just that we can do a lot of special things for humans.” It’s an answer that may reflect how his role has changed over the years. Altman is now the chief public cheerleader of a $300 billion company that’s touting the transformative utility of AI agents. The rise of agents, he and others say, will be a boon for our quality of life—like having an assistant on hand who can answer your most pressing questions, carry out mundane tasks, and help you develop new skills. It’s an optimistic vision that may well pan out. But it doesn’t quite fit with the prophecies of AI-enabled infopocalypse that Tools for Humanity was founded upon.Altman waves away a question about the influence he and other investors stand to gain if their vision is realized. Most holders, he assumes, will have already started selling their tokens—too early, he adds. “What I think would be bad is if an early crew had a lot of control over the protocol,” he says, “and that’s where I think the commitment to decentralization is so cool.” Altman is referring to the World Protocol, the underlying technology upon which the Orb, Worldcoin, and World ID all rely. Tools for Humanity is developing it, but has committed to giving control to its users over time—a process they say will prevent power from being concentrated in the hands of a few executives or investors. Tools for Humanity would remain a for-profit company, and could levy fees on platforms that use World ID, but other companies would be able to compete for customers by building alternative apps—or even alternative Orbs. The plan draws on ideas that animated the crypto ecosystem in the late 2010s and early 2020s, when evangelists for emerging blockchain technologies argued that the centralization of power—especially in large so-called “Web 2.0” tech companies—was responsible for many of the problems plaguing the modern Internet. Just as decentralized cryptocurrencies could reform a financial system controlled by economic elites, so too would it be possible to create decentralized organizations, run by their members instead of CEOs. How such a system might work in practice remains unclear. “Building a community-based governance system,” Tools for Humanity says in a 2023 white paper, “represents perhaps the most formidable challenge of the entire project.”Altman has a pattern of making idealistic promises that shift over time. He founded OpenAI as a nonprofit in 2015, with a mission to develop AGI safely and for the benefit of all humanity. To raise money, OpenAI restructured itself as a for-profit company in 2019, but with overall control still in the hands of its nonprofit board. Last year, Altman proposed yet another restructure—one which would dilute the board’s control and allow more profits to flow to shareholders. Why, I ask, should the public trust Tools for Humanity’s commitment to freely surrender influence and power? “I think you will just see the continued decentralization via the protocol,” he says. “The value here is going to live in the network, and the network will be owned and governed by a lot of people.” Altman talks less about universal basic income these days. He recently mused about an alternative, which he called “universal basic compute.” Instead of AI companies redistributing their profits, he seemed to suggest, they could instead give everyone in the world fair access to super-powerful AI. Blania tells me he recently “made the decision to stop talking” about UBI at Tools for Humanity. “UBI is one potential answer,” he says. “Just giving [people] access to the latest [AI] models and having them learn faster and better is another.” Says Altman: “I still don’t know what the right answer is. I believe we should do a better job of distribution of resources than we currently do.” When I probe the question of why people should trust him, Altman gets irritated. “I understand that you hate AI, and that’s fine,” he says. “If you want to frame it as the downside of AI is that there’s going to be a proliferation of very convincing AI systems that are pretending to be human, and we need ways to know what is really human-authorized versus not, then yeah, I think you can call that a downside of AI. It’s not how I would naturally frame it.” The phrase human-authorized hints at a tension between World ID and OpenAI’s plans for AI agents. An Internet where a World ID is required to access most services might impede the usefulness of the agents that OpenAI and others are developing. So Tools for Humanity is building a system that would allow users to delegate their World ID to an agent, allowing the bot to take actions online on their behalf, according to Tiago Sada, the company’s chief product officer. “We’ve built everything in a way that can be very easily delegatable to an agent,” Sada says. It’s a measure that would allow humans to be held accountable for the actions of their AIs. But it suggests that Tools for Humanity’s mission may be shifting beyond simply proving humanity, and toward becoming the infrastructure that enables AI agents to proliferate with human authorization. World ID doesn’t tell you whether a piece of content is AI-generated or human-generated; all it tells you is whether the account that posted it is a human or a bot. Even in a world where everybody had a World ID, our online spaces might still be filled with AI-generated text, images, and videos.As I say goodbye to Altman, I’m left feeling conflicted about his project. If the Internet is going to be transformed by AI agents, then some kind of proof-of-humanity system will almost certainly be necessary. Yet if the Orb becomes a piece of Internet infrastructure, it could give Altman—a beneficiary of the proliferation of AI content—significant influence over a leading defense mechanism against it. People might have no choice but to participate in the network in order to access social media or online services.I thought of an encounter I witnessed in Seoul. In the room above the restaurant, Cho Jeong-yeon, 75, watched her friend get verified by an Orb. Cho had been invited to do the same, but demurred. The reward wasn’t enough for her to surrender a part of her identity. “Your iris is uniquely yours, and we don’t really know how it might be used,” she says. “Seeing the machine made me think: are we becoming machines instead of humans now? Everything is changing, and we don’t know how it’ll all turn out.”—With reporting by Stephen Kim/Seoul. This story was supported by Tarbell Grants.Correction, May 30The original version of this story misstated the market capitalization of Worldcoin if all coins were in circulation. It is $12 billion, not $1.2 billion.
    Like
    Love
    Wow
    Sad
    Angry
    240
    · 0 Comentários ·0 Compartilhamentos ·0 Anterior
  • Intense energy to inevitable risks – Designing for a start-up

    21 May, 2025

    Clare Dowdy finds out about the excitement, and challenges, that come with working with an early-stage company.

    “Culturally, working with founders is intense, in the best possible way,” says Kelly Mackenzie, founder and creative director of White Bear. The London and Dublin based branding agency has form working with founder-led companies, including Tom Parker Creamery and chocolate brand Luvli.
    “The business isn’t just what they do, it’s often wrapped up in their identity, sense of self-worth and purpose,” Mackenzie explains.
    And because of this intensity, the agency becomes almost as invested as the client.
    “When we’re asked to evolve or build their brand, we often tell them that it’s like being asked to mind their child,” Mackenzie says. “Naming that emotional connection early builds trust. It helps them feel safe in a process they’ve often never experienced before.”
    Many designers talk of going on a journey with these clients.
    “You have a very close relationship with the founders, and get to know them very deeply,” says Hijinks co-founder Marc Allenby. “Their idea is usually based on passion, and you – as a designer – are fuelled by that passion. That energy is self-motivating, you really care about what you’re creating.”
    The WeRepresent logo and wordmark designed by Hijinks
    When Hijinks presented the founder of talent agency WeRepresent with their logo, she burst into tears, which isn’t standard practice when presenting to bigger clients.
    But Hijinks had created an animated version that “breathed” – a nod to the founder’s traumatic experience of being in a coma on a ventilator with Covid. A moving approach, in more ways than one.
    The entrepreneurial spirit found in start-ups can be infectious.
    “Rather than being jaded, they have a youthful energy, and that attracts us,” says Russell Potter, the co-founder of architecture and design firm SODA, whose many hospitality start-up clients include the Instagram-beloved crumble shop, Humble Crumble.
    Then there’s the potential for creative freedom. “It’s a blank canvas. We’re creating something from nothing,” says Allenby at Hijinks, in contrast to a more mature brand that will come with its own baggage.
    But these clients may not have worked with a design studio before process, and inevitably there’s a lot of hand-holding.
    Dundee-based Agency of None branded the start-up QuickBlock
    “Start-ups by their nature, are often a very small group of people, all trying to cover many roles. So the role of the designer is often as an educator, as much as a designer,” says Lyall Bruce, director of Dundee-based Agency of None, whose start-up clients include QuickBlock, a sustainable building block made from recycled food packaging, and coffee roaster Bryte.
    As a consequence of this inexperience, the brief is rarely formal. It might be a loose deck, a stream-of-consciousness call, or a rough vision, according to Mackenzie. “And throughout, there will be extra calls to talk through thinking, being available on WhatsApp or Slack, and giving reassurance at each step.”
    That naivety is both beautiful and brilliant, says Potter at SODA. But if you’re not careful, you can get dragged into a lot of business decisions. “We’re often asked to comment above our pay grade – we can’t always have the answers,” he says. “Someone client side has to have a leap of faith and make a decision.”
    Inevitably budgets are tight, and agencies often need to explain the value of effective design,

    “Once they see the link between strong branding and commercial outcomes, budget conversations become much easier,” says Mackenzie at White Bear. Although, as several designers pointed out, this challenge is not unique to start-up clients.But for start-ups, agencies often break down payment into smaller chunks, as a way of protecting themselves.
    The interior of the new Crunch sandwich shop in London’s Soho designed by KIDZ
    KIDZ, which has offices in Amsterdam, Belgrade, Dubai and Paris, designed the new Crunch sandwich shop in London’s Soho. “Working with early-stage companies inevitably involves risk — timelines can shift, priorities may change, or funding may fall through,” says KIDZ co-founder Dmitrii Mironov.
    “To protect our team and ensure a smooth process, we break the work into smaller, clearly defined stages. We require prepayment for each stage; keep written records of all agreements, even when communication is fast and informal; limit the number of revisions and fix the scope of work for each stage; and withhold certain deliverables until full payment is received.”
    They’ve had a few cases where a project wasn’t completed because the client pivoted or changed direction unexpectedly. “While that’s never ideal, it’s part of the reality of working with start-ups,” he adds.
    And sometimes it makes sense to rethink payment completely.
    In lieu of fees from a business consultancy, Hijinks did a skills swap.
    Meanwhile, when Run for the Hills designed a third site for restaurant chain Cricket in London’s White City, they threw in a £5,000 bar bill to make up for the smaller fee. That allowed the agency to take the team out, thereby boosting morale, and host clients, thereby showcasing their work.
    The interior of one of the Humble Crumble shops, designed by SODA studio
    In 2015, SODA had a start-up client in the hospitality sector who offered to pay part of the fee in Bitcoin. “We ummed and ahhed, but decided to take the £19,000 in money,” Potter says.
    Some years later, it would have been worth over £1 million, though Potter points out that they would have sold it before then.
    Then there’s the gamble of a profit share, where you’re investing in their business in lieu of partial payment.
    At a former agency, product designer Jake Weir occasionally ended up doing sweat equity to help out, “so you’re basically partners.” When a hairdresser with limited funding came to him wanting to develop a new hair curler, the agency was given shares in the company for their design input. “We were incentivised to make it work,” Weir says. The product was a success – ultimately sold to BaByliss for “millions.”
    But even when budgets are low, these jobs are still worth doing sometimes. “We’ll do them as a passion project as they’re quick turnaround and they give younger guys in the studio more on-site experience,” Potter says.
    What happens when the client’s dream is never going to make it?
    MAP Project Office was once asked to design a very specific backpack. “We wondered if there was a market for this,” says MAP’s creative director, Weir.
    When people are pouring their life savings into a project, there’s a responsibility to warn them of the risks. Regardless, founders often have their mind set on these things. In these circumstances, MAP will look for a way to “dial the founders’ single-mindedness down,” Weirs says.
    “If you relax the concept a little bit, you can make it less niche and more accessible, especially for a first product.”
    White Bear’s work with the Tom Parker Creamery brand
    And experienced designers in this sector get good at spotting the jobs to avoid.
    Start-ups have a reputation for being short-lived. It’s commonly said that 90% of them fail, although the source for this stat is not at all clear.
    Harvard Business Review puts it more modestly, claiming that more than two-thirds of them never deliver a positive return to investors. The food and beverage sector, in particular, is full of such tragedies, according to The Grocer.
    But these potential risks shouldn’t be a reason not to take on a start-up. “The reason the project fails is not because of the design,” says Trotman at Run for the Hills, “unless the client has shittified it.”
    A fish restaurant that Run for the Hills worked on in London had great interiors and a cool brand, Trotman says. “But it failed on the food, and we can’t do anything about the food.”
    Conversely, when they do well, the agency is part of that success story. In 2005, Big Fish named and branded start-up chocolate puddings company Gü, cleverly persuading its founder to ditch his name, The Belgian Chocolate Company. Just seven later, it was sold for £32.5m.
    And because the agency is so embedded – it’s personal, remember – the work takes on real significance.
    “You really get the chance to make a lasting impact and build a long-term working relationship,” says Bruce at Agency of None. And better still for the broader industry. “The experience they have here will set up the relationship with design forever.”

    Design disciplines in this article

    Brands in this article

    What to read next

    More human resources – designers brand new workplace apps

    Brand Identity
    15 Apr, 2025
    #intense #energy #inevitable #risks #designing
    Intense energy to inevitable risks – Designing for a start-up
    21 May, 2025 Clare Dowdy finds out about the excitement, and challenges, that come with working with an early-stage company. “Culturally, working with founders is intense, in the best possible way,” says Kelly Mackenzie, founder and creative director of White Bear. The London and Dublin based branding agency has form working with founder-led companies, including Tom Parker Creamery and chocolate brand Luvli. “The business isn’t just what they do, it’s often wrapped up in their identity, sense of self-worth and purpose,” Mackenzie explains. And because of this intensity, the agency becomes almost as invested as the client. “When we’re asked to evolve or build their brand, we often tell them that it’s like being asked to mind their child,” Mackenzie says. “Naming that emotional connection early builds trust. It helps them feel safe in a process they’ve often never experienced before.” Many designers talk of going on a journey with these clients. “You have a very close relationship with the founders, and get to know them very deeply,” says Hijinks co-founder Marc Allenby. “Their idea is usually based on passion, and you – as a designer – are fuelled by that passion. That energy is self-motivating, you really care about what you’re creating.” The WeRepresent logo and wordmark designed by Hijinks When Hijinks presented the founder of talent agency WeRepresent with their logo, she burst into tears, which isn’t standard practice when presenting to bigger clients. But Hijinks had created an animated version that “breathed” – a nod to the founder’s traumatic experience of being in a coma on a ventilator with Covid. A moving approach, in more ways than one. The entrepreneurial spirit found in start-ups can be infectious. “Rather than being jaded, they have a youthful energy, and that attracts us,” says Russell Potter, the co-founder of architecture and design firm SODA, whose many hospitality start-up clients include the Instagram-beloved crumble shop, Humble Crumble. Then there’s the potential for creative freedom. “It’s a blank canvas. We’re creating something from nothing,” says Allenby at Hijinks, in contrast to a more mature brand that will come with its own baggage. But these clients may not have worked with a design studio before process, and inevitably there’s a lot of hand-holding. Dundee-based Agency of None branded the start-up QuickBlock “Start-ups by their nature, are often a very small group of people, all trying to cover many roles. So the role of the designer is often as an educator, as much as a designer,” says Lyall Bruce, director of Dundee-based Agency of None, whose start-up clients include QuickBlock, a sustainable building block made from recycled food packaging, and coffee roaster Bryte. As a consequence of this inexperience, the brief is rarely formal. It might be a loose deck, a stream-of-consciousness call, or a rough vision, according to Mackenzie. “And throughout, there will be extra calls to talk through thinking, being available on WhatsApp or Slack, and giving reassurance at each step.” That naivety is both beautiful and brilliant, says Potter at SODA. But if you’re not careful, you can get dragged into a lot of business decisions. “We’re often asked to comment above our pay grade – we can’t always have the answers,” he says. “Someone client side has to have a leap of faith and make a decision.” Inevitably budgets are tight, and agencies often need to explain the value of effective design, “Once they see the link between strong branding and commercial outcomes, budget conversations become much easier,” says Mackenzie at White Bear. Although, as several designers pointed out, this challenge is not unique to start-up clients.But for start-ups, agencies often break down payment into smaller chunks, as a way of protecting themselves. The interior of the new Crunch sandwich shop in London’s Soho designed by KIDZ KIDZ, which has offices in Amsterdam, Belgrade, Dubai and Paris, designed the new Crunch sandwich shop in London’s Soho. “Working with early-stage companies inevitably involves risk — timelines can shift, priorities may change, or funding may fall through,” says KIDZ co-founder Dmitrii Mironov. “To protect our team and ensure a smooth process, we break the work into smaller, clearly defined stages. We require prepayment for each stage; keep written records of all agreements, even when communication is fast and informal; limit the number of revisions and fix the scope of work for each stage; and withhold certain deliverables until full payment is received.” They’ve had a few cases where a project wasn’t completed because the client pivoted or changed direction unexpectedly. “While that’s never ideal, it’s part of the reality of working with start-ups,” he adds. And sometimes it makes sense to rethink payment completely. In lieu of fees from a business consultancy, Hijinks did a skills swap. Meanwhile, when Run for the Hills designed a third site for restaurant chain Cricket in London’s White City, they threw in a £5,000 bar bill to make up for the smaller fee. That allowed the agency to take the team out, thereby boosting morale, and host clients, thereby showcasing their work. The interior of one of the Humble Crumble shops, designed by SODA studio In 2015, SODA had a start-up client in the hospitality sector who offered to pay part of the fee in Bitcoin. “We ummed and ahhed, but decided to take the £19,000 in money,” Potter says. Some years later, it would have been worth over £1 million, though Potter points out that they would have sold it before then. Then there’s the gamble of a profit share, where you’re investing in their business in lieu of partial payment. At a former agency, product designer Jake Weir occasionally ended up doing sweat equity to help out, “so you’re basically partners.” When a hairdresser with limited funding came to him wanting to develop a new hair curler, the agency was given shares in the company for their design input. “We were incentivised to make it work,” Weir says. The product was a success – ultimately sold to BaByliss for “millions.” But even when budgets are low, these jobs are still worth doing sometimes. “We’ll do them as a passion project as they’re quick turnaround and they give younger guys in the studio more on-site experience,” Potter says. What happens when the client’s dream is never going to make it? MAP Project Office was once asked to design a very specific backpack. “We wondered if there was a market for this,” says MAP’s creative director, Weir. When people are pouring their life savings into a project, there’s a responsibility to warn them of the risks. Regardless, founders often have their mind set on these things. In these circumstances, MAP will look for a way to “dial the founders’ single-mindedness down,” Weirs says. “If you relax the concept a little bit, you can make it less niche and more accessible, especially for a first product.” White Bear’s work with the Tom Parker Creamery brand And experienced designers in this sector get good at spotting the jobs to avoid. Start-ups have a reputation for being short-lived. It’s commonly said that 90% of them fail, although the source for this stat is not at all clear. Harvard Business Review puts it more modestly, claiming that more than two-thirds of them never deliver a positive return to investors. The food and beverage sector, in particular, is full of such tragedies, according to The Grocer. But these potential risks shouldn’t be a reason not to take on a start-up. “The reason the project fails is not because of the design,” says Trotman at Run for the Hills, “unless the client has shittified it.” A fish restaurant that Run for the Hills worked on in London had great interiors and a cool brand, Trotman says. “But it failed on the food, and we can’t do anything about the food.” Conversely, when they do well, the agency is part of that success story. In 2005, Big Fish named and branded start-up chocolate puddings company Gü, cleverly persuading its founder to ditch his name, The Belgian Chocolate Company. Just seven later, it was sold for £32.5m. And because the agency is so embedded – it’s personal, remember – the work takes on real significance. “You really get the chance to make a lasting impact and build a long-term working relationship,” says Bruce at Agency of None. And better still for the broader industry. “The experience they have here will set up the relationship with design forever.” Design disciplines in this article Brands in this article What to read next More human resources – designers brand new workplace apps Brand Identity 15 Apr, 2025 #intense #energy #inevitable #risks #designing
    Intense energy to inevitable risks – Designing for a start-up
    www.designweek.co.uk
    21 May, 2025 Clare Dowdy finds out about the excitement, and challenges, that come with working with an early-stage company. “Culturally, working with founders is intense, in the best possible way,” says Kelly Mackenzie, founder and creative director of White Bear. The London and Dublin based branding agency has form working with founder-led companies, including Tom Parker Creamery and chocolate brand Luvli. “The business isn’t just what they do, it’s often wrapped up in their identity, sense of self-worth and purpose,” Mackenzie explains. And because of this intensity, the agency becomes almost as invested as the client. “When we’re asked to evolve or build their brand, we often tell them that it’s like being asked to mind their child,” Mackenzie says. “Naming that emotional connection early builds trust. It helps them feel safe in a process they’ve often never experienced before.” Many designers talk of going on a journey with these clients. “You have a very close relationship with the founders, and get to know them very deeply,” says Hijinks co-founder Marc Allenby. “Their idea is usually based on passion, and you – as a designer – are fuelled by that passion. That energy is self-motivating, you really care about what you’re creating.” The WeRepresent logo and wordmark designed by Hijinks When Hijinks presented the founder of talent agency WeRepresent with their logo, she burst into tears, which isn’t standard practice when presenting to bigger clients. But Hijinks had created an animated version that “breathed” – a nod to the founder’s traumatic experience of being in a coma on a ventilator with Covid. A moving approach, in more ways than one. The entrepreneurial spirit found in start-ups can be infectious. “Rather than being jaded, they have a youthful energy, and that attracts us,” says Russell Potter, the co-founder of architecture and design firm SODA, whose many hospitality start-up clients include the Instagram-beloved crumble shop, Humble Crumble. Then there’s the potential for creative freedom. “It’s a blank canvas. We’re creating something from nothing,” says Allenby at Hijinks, in contrast to a more mature brand that will come with its own baggage. But these clients may not have worked with a design studio before process, and inevitably there’s a lot of hand-holding. Dundee-based Agency of None branded the start-up QuickBlock “Start-ups by their nature, are often a very small group of people, all trying to cover many roles. So the role of the designer is often as an educator, as much as a designer,” says Lyall Bruce, director of Dundee-based Agency of None, whose start-up clients include QuickBlock, a sustainable building block made from recycled food packaging, and coffee roaster Bryte. As a consequence of this inexperience, the brief is rarely formal. It might be a loose deck, a stream-of-consciousness call, or a rough vision, according to Mackenzie. “And throughout, there will be extra calls to talk through thinking, being available on WhatsApp or Slack, and giving reassurance at each step.” That naivety is both beautiful and brilliant, says Potter at SODA. But if you’re not careful, you can get dragged into a lot of business decisions. “We’re often asked to comment above our pay grade – we can’t always have the answers,” he says. “Someone client side has to have a leap of faith and make a decision.” Inevitably budgets are tight, and agencies often need to explain the value of effective design, “Once they see the link between strong branding and commercial outcomes, budget conversations become much easier,” says Mackenzie at White Bear. Although, as several designers pointed out, this challenge is not unique to start-up clients.But for start-ups, agencies often break down payment into smaller chunks, as a way of protecting themselves. The interior of the new Crunch sandwich shop in London’s Soho designed by KIDZ KIDZ, which has offices in Amsterdam, Belgrade, Dubai and Paris, designed the new Crunch sandwich shop in London’s Soho. “Working with early-stage companies inevitably involves risk — timelines can shift, priorities may change, or funding may fall through,” says KIDZ co-founder Dmitrii Mironov. “To protect our team and ensure a smooth process, we break the work into smaller, clearly defined stages. We require prepayment for each stage; keep written records of all agreements, even when communication is fast and informal; limit the number of revisions and fix the scope of work for each stage; and withhold certain deliverables until full payment is received.” They’ve had a few cases where a project wasn’t completed because the client pivoted or changed direction unexpectedly. “While that’s never ideal, it’s part of the reality of working with start-ups,” he adds. And sometimes it makes sense to rethink payment completely. In lieu of fees from a business consultancy, Hijinks did a skills swap. Meanwhile, when Run for the Hills designed a third site for restaurant chain Cricket in London’s White City, they threw in a £5,000 bar bill to make up for the smaller fee. That allowed the agency to take the team out, thereby boosting morale, and host clients, thereby showcasing their work. The interior of one of the Humble Crumble shops, designed by SODA studio In 2015, SODA had a start-up client in the hospitality sector who offered to pay part of the fee in Bitcoin. “We ummed and ahhed, but decided to take the £19,000 in money,” Potter says. Some years later, it would have been worth over £1 million, though Potter points out that they would have sold it before then. Then there’s the gamble of a profit share, where you’re investing in their business in lieu of partial payment. At a former agency, product designer Jake Weir occasionally ended up doing sweat equity to help out, “so you’re basically partners.” When a hairdresser with limited funding came to him wanting to develop a new hair curler, the agency was given shares in the company for their design input. “We were incentivised to make it work,” Weir says. The product was a success – ultimately sold to BaByliss for “millions.” But even when budgets are low, these jobs are still worth doing sometimes. “We’ll do them as a passion project as they’re quick turnaround and they give younger guys in the studio more on-site experience,” Potter says. What happens when the client’s dream is never going to make it? MAP Project Office was once asked to design a very specific backpack. “We wondered if there was a market for this,” says MAP’s creative director, Weir. When people are pouring their life savings into a project, there’s a responsibility to warn them of the risks. Regardless, founders often have their mind set on these things. In these circumstances, MAP will look for a way to “dial the founders’ single-mindedness down,” Weirs says. “If you relax the concept a little bit, you can make it less niche and more accessible, especially for a first product.” White Bear’s work with the Tom Parker Creamery brand And experienced designers in this sector get good at spotting the jobs to avoid. Start-ups have a reputation for being short-lived. It’s commonly said that 90% of them fail, although the source for this stat is not at all clear. Harvard Business Review puts it more modestly, claiming that more than two-thirds of them never deliver a positive return to investors. The food and beverage sector, in particular, is full of such tragedies, according to The Grocer. But these potential risks shouldn’t be a reason not to take on a start-up. “The reason the project fails is not because of the design,” says Trotman at Run for the Hills, “unless the client has shittified it.” A fish restaurant that Run for the Hills worked on in London had great interiors and a cool brand, Trotman says. “But it failed on the food, and we can’t do anything about the food.” Conversely, when they do well, the agency is part of that success story. In 2005, Big Fish named and branded start-up chocolate puddings company Gü, cleverly persuading its founder to ditch his name, The Belgian Chocolate Company. Just seven later, it was sold for £32.5m. And because the agency is so embedded – it’s personal, remember – the work takes on real significance. “You really get the chance to make a lasting impact and build a long-term working relationship,” says Bruce at Agency of None. And better still for the broader industry. “The experience they have here will set up the relationship with design forever.” Design disciplines in this article Brands in this article What to read next More human resources – designers brand new workplace apps Brand Identity 15 Apr, 2025
    0 Comentários ·0 Compartilhamentos ·0 Anterior
  • The Download: introducing the AI energy package

    This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

    We did the math on AI’s energy footprint. Here’s the story you haven’t heard.

    It’s well documented that AI is a power-hungry technology. But there has been far less reporting on the extent of that hunger, how much its appetite is set to grow in the coming years, where that power will come from, and who will pay for it. 

    For the past six months, MIT Technology Review’s team of reporters and editors have worked to answer those questions. The result is an unprecedented look at the state of AI’s energy and resource usage, where it is now, where it is headed in the years to come, and why we have to get it right. 

    At the centerpiece of this package is an entirely novel line of reporting into the demands of inference—the way human beings interact with AI when we make text queries or ask AI to come up with new images or create videos. Experts say inference is set to eclipse the already massive amount of energy required to train new AI models. Here’s everything we found out.

    Here’s what you can expect from the rest of the package, including:

    + We were so startled by what we learned reporting this story that we also put together a brief on everything you need to know about estimating AI’s energy and emissions burden. 

    + We went out into the world to see the effects of this energy hunger—from the deserts of Nevada, where data centers in an industrial park the size of Detroit demand ever more water to keep their processors cool and running. 

    + In Louisiana, where Meta plans its largest-ever data center, we expose the dirty secret that will fuel its AI ambitions—along with those of many others. 

    + Why the clean energy promise of powering AI data centers with nuclear energy will long remain elusive. 

    + But it’s not all doom and gloom. Check out the reasons to be optimistic, and examine why future AI systems could be far less energy intensive than today’s.

    AI can do a better job of persuading people than we do

    The news: Millions of people argue with each other online every day, but remarkably few of them change someone’s mind. New research suggests that large language modelsmight do a better job, especially when they’re given the ability to adapt their arguments using personal information about individuals. The finding suggests that AI could become a powerful tool for persuading people, for better or worse.

    The big picture: The findings are the latest in a growing body of research demonstrating LLMs’ powers of persuasion. The authors warn they show how AI tools can craft sophisticated, persuasive arguments if they have even minimal information about the humans they’re interacting with. Read the full story.

    —Rhiannon Williams

    How AI is introducing errors into courtrooms

    It’s been quite a couple weeks for stories about AI in the courtroom. You might have heard about the deceased victim of a road rage incident whose family created an AI avatar of him to show as an impact statement.But there’s a bigger, far more consequential controversy brewing, legal experts say. AI hallucinations are cropping up more and more in legal filings. And it’s starting to infuriate judges. Just consider these three cases, each of which gives a glimpse into what we can expect to see more of as lawyers embrace AI. Read the full story.

    —James O’Donnell

    This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

    The must-reads

    I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

    1 Donald Trump has signed the Take It Down Act into US lawIt criminalizes the distribution of non-consensual intimate images, including deepfakes.+ Tech platforms will be forced to remove such material within 48 hours of being notified.+ It’s only the sixth bill he’s signed into law during his second term.2 There’s now a buyer for 23andMe Pharma firm Regeneron has swooped in and offered to help it keep operating.+ The worth of your genetic data?+ Regeneron promised to prioritize security and ethical use of that data.3 Microsoft is adding Elon Musk’s AI models to its cloud platformErr, is that a good idea?+ Musk wants to sell Grok to other businesses.4 Autonomous cars trained to react like humans cause fewer road injuriesA study found they were more cautious around cyclists, pedestrians and motorcyclists.+ Waymo is expanding its robotaxi operations out of San Francisco.+ How Wayve’s driverless cars will meet one of their biggest challenges yet.5 Hurricane season is on its wayDOGE cuts means we’re less prepared.+ COP30 may be in crisis before it’s even begun.6 Telegram handed over data from more than 20,000 users In the first three months of 2025 alone.7 GM has stopped exporting cars to ChinaTrump’s tariffs have put an end to its export plans.8 Blended meats are on the risePlants account for up to 70% of these new meats—and consumers love them.+ Alternative meat could help the climate. Will anyone eat it?9 SAG-AFTRA isn’t happy about Fornite’s AI-voiced Darth VaderIt’s slapped Fortnite’s creators with an unfair labor practice charge.+ How Meta and AI companies recruited striking actors to train AI.10 This AI model can swiftly build Lego structuresThanks to nothing more than a prompt.Quote of the day

    “Platforms have no incentive or requirement to make sure what comes through the system is non-consensual intimate imagery.”

    —Becca Branum, deputy director of the Center for Democracy and Technology, says the new Take It Down Act could fuel censorship, Wired reports.

    One more thing

    Are friends electric?Thankfully, the difference between humans and machines in the real world is easy to discern, at least for now. While machines tend to excel at things adults find difficult—playing world-champion-level chess, say, or multiplying really big numbers—they find it hard to accomplish stuff a five-year-old can do with ease, such as catching a ball or walking around a room without bumping into things.This fundamental tension—what is hard for humans is easy for machines, and what’s hard for machines is easy for humans—is at the heart of three new books delving into our complex and often fraught relationship with robots, AI, and automation. They force us to reimagine the nature of everything from friendship and love to work, health care, and home life. Read the full story.

    —Bryan Gardiner

    We can still have nice things

    A place for comfort, fun and distraction to brighten up your day.+ Congratulations to William Goodge, who ran across Australia in just 35 days!+ A British horticulturist has created a garden at this year’s Chelsea Flower Show just for dogs.+ The Netherlands just loves a sidewalk garden.+ Did you know the T Rex is a north American hero? Me neither
    #download #introducing #energy #package
    The Download: introducing the AI energy package
    This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. We did the math on AI’s energy footprint. Here’s the story you haven’t heard. It’s well documented that AI is a power-hungry technology. But there has been far less reporting on the extent of that hunger, how much its appetite is set to grow in the coming years, where that power will come from, and who will pay for it.  For the past six months, MIT Technology Review’s team of reporters and editors have worked to answer those questions. The result is an unprecedented look at the state of AI’s energy and resource usage, where it is now, where it is headed in the years to come, and why we have to get it right.  At the centerpiece of this package is an entirely novel line of reporting into the demands of inference—the way human beings interact with AI when we make text queries or ask AI to come up with new images or create videos. Experts say inference is set to eclipse the already massive amount of energy required to train new AI models. Here’s everything we found out. Here’s what you can expect from the rest of the package, including: + We were so startled by what we learned reporting this story that we also put together a brief on everything you need to know about estimating AI’s energy and emissions burden.  + We went out into the world to see the effects of this energy hunger—from the deserts of Nevada, where data centers in an industrial park the size of Detroit demand ever more water to keep their processors cool and running.  + In Louisiana, where Meta plans its largest-ever data center, we expose the dirty secret that will fuel its AI ambitions—along with those of many others.  + Why the clean energy promise of powering AI data centers with nuclear energy will long remain elusive.  + But it’s not all doom and gloom. Check out the reasons to be optimistic, and examine why future AI systems could be far less energy intensive than today’s. AI can do a better job of persuading people than we do The news: Millions of people argue with each other online every day, but remarkably few of them change someone’s mind. New research suggests that large language modelsmight do a better job, especially when they’re given the ability to adapt their arguments using personal information about individuals. The finding suggests that AI could become a powerful tool for persuading people, for better or worse. The big picture: The findings are the latest in a growing body of research demonstrating LLMs’ powers of persuasion. The authors warn they show how AI tools can craft sophisticated, persuasive arguments if they have even minimal information about the humans they’re interacting with. Read the full story. —Rhiannon Williams How AI is introducing errors into courtrooms It’s been quite a couple weeks for stories about AI in the courtroom. You might have heard about the deceased victim of a road rage incident whose family created an AI avatar of him to show as an impact statement.But there’s a bigger, far more consequential controversy brewing, legal experts say. AI hallucinations are cropping up more and more in legal filings. And it’s starting to infuriate judges. Just consider these three cases, each of which gives a glimpse into what we can expect to see more of as lawyers embrace AI. Read the full story. —James O’Donnell This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here. The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 Donald Trump has signed the Take It Down Act into US lawIt criminalizes the distribution of non-consensual intimate images, including deepfakes.+ Tech platforms will be forced to remove such material within 48 hours of being notified.+ It’s only the sixth bill he’s signed into law during his second term.2 There’s now a buyer for 23andMe Pharma firm Regeneron has swooped in and offered to help it keep operating.+ The worth of your genetic data?+ Regeneron promised to prioritize security and ethical use of that data.3 Microsoft is adding Elon Musk’s AI models to its cloud platformErr, is that a good idea?+ Musk wants to sell Grok to other businesses.4 Autonomous cars trained to react like humans cause fewer road injuriesA study found they were more cautious around cyclists, pedestrians and motorcyclists.+ Waymo is expanding its robotaxi operations out of San Francisco.+ How Wayve’s driverless cars will meet one of their biggest challenges yet.5 Hurricane season is on its wayDOGE cuts means we’re less prepared.+ COP30 may be in crisis before it’s even begun.6 Telegram handed over data from more than 20,000 users In the first three months of 2025 alone.7 GM has stopped exporting cars to ChinaTrump’s tariffs have put an end to its export plans.8 Blended meats are on the risePlants account for up to 70% of these new meats—and consumers love them.+ Alternative meat could help the climate. Will anyone eat it?9 SAG-AFTRA isn’t happy about Fornite’s AI-voiced Darth VaderIt’s slapped Fortnite’s creators with an unfair labor practice charge.+ How Meta and AI companies recruited striking actors to train AI.10 This AI model can swiftly build Lego structuresThanks to nothing more than a prompt.Quote of the day “Platforms have no incentive or requirement to make sure what comes through the system is non-consensual intimate imagery.” —Becca Branum, deputy director of the Center for Democracy and Technology, says the new Take It Down Act could fuel censorship, Wired reports. One more thing Are friends electric?Thankfully, the difference between humans and machines in the real world is easy to discern, at least for now. While machines tend to excel at things adults find difficult—playing world-champion-level chess, say, or multiplying really big numbers—they find it hard to accomplish stuff a five-year-old can do with ease, such as catching a ball or walking around a room without bumping into things.This fundamental tension—what is hard for humans is easy for machines, and what’s hard for machines is easy for humans—is at the heart of three new books delving into our complex and often fraught relationship with robots, AI, and automation. They force us to reimagine the nature of everything from friendship and love to work, health care, and home life. Read the full story. —Bryan Gardiner We can still have nice things A place for comfort, fun and distraction to brighten up your day.+ Congratulations to William Goodge, who ran across Australia in just 35 days!+ A British horticulturist has created a garden at this year’s Chelsea Flower Show just for dogs.+ The Netherlands just loves a sidewalk garden.+ Did you know the T Rex is a north American hero? Me neither #download #introducing #energy #package
    The Download: introducing the AI energy package
    www.technologyreview.com
    This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. We did the math on AI’s energy footprint. Here’s the story you haven’t heard. It’s well documented that AI is a power-hungry technology. But there has been far less reporting on the extent of that hunger, how much its appetite is set to grow in the coming years, where that power will come from, and who will pay for it.  For the past six months, MIT Technology Review’s team of reporters and editors have worked to answer those questions. The result is an unprecedented look at the state of AI’s energy and resource usage, where it is now, where it is headed in the years to come, and why we have to get it right.  At the centerpiece of this package is an entirely novel line of reporting into the demands of inference—the way human beings interact with AI when we make text queries or ask AI to come up with new images or create videos. Experts say inference is set to eclipse the already massive amount of energy required to train new AI models. Here’s everything we found out. Here’s what you can expect from the rest of the package, including: + We were so startled by what we learned reporting this story that we also put together a brief on everything you need to know about estimating AI’s energy and emissions burden.  + We went out into the world to see the effects of this energy hunger—from the deserts of Nevada, where data centers in an industrial park the size of Detroit demand ever more water to keep their processors cool and running.  + In Louisiana, where Meta plans its largest-ever data center, we expose the dirty secret that will fuel its AI ambitions—along with those of many others.  + Why the clean energy promise of powering AI data centers with nuclear energy will long remain elusive.  + But it’s not all doom and gloom. Check out the reasons to be optimistic, and examine why future AI systems could be far less energy intensive than today’s. AI can do a better job of persuading people than we do The news: Millions of people argue with each other online every day, but remarkably few of them change someone’s mind. New research suggests that large language models (LLMs) might do a better job, especially when they’re given the ability to adapt their arguments using personal information about individuals. The finding suggests that AI could become a powerful tool for persuading people, for better or worse. The big picture: The findings are the latest in a growing body of research demonstrating LLMs’ powers of persuasion. The authors warn they show how AI tools can craft sophisticated, persuasive arguments if they have even minimal information about the humans they’re interacting with. Read the full story. —Rhiannon Williams How AI is introducing errors into courtrooms It’s been quite a couple weeks for stories about AI in the courtroom. You might have heard about the deceased victim of a road rage incident whose family created an AI avatar of him to show as an impact statement (possibly the first time this has been done in the US).But there’s a bigger, far more consequential controversy brewing, legal experts say. AI hallucinations are cropping up more and more in legal filings. And it’s starting to infuriate judges. Just consider these three cases, each of which gives a glimpse into what we can expect to see more of as lawyers embrace AI. Read the full story. —James O’Donnell This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here. The must-reads I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology. 1 Donald Trump has signed the Take It Down Act into US lawIt criminalizes the distribution of non-consensual intimate images, including deepfakes. (The Verge)+ Tech platforms will be forced to remove such material within 48 hours of being notified. (CNN)+ It’s only the sixth bill he’s signed into law during his second term. (NBC News) 2 There’s now a buyer for 23andMe Pharma firm Regeneron has swooped in and offered to help it keep operating. (WSJ $)+ The worth of your genetic data? $17. (404 Media)+ Regeneron promised to prioritize security and ethical use of that data. (TechCrunch) 3 Microsoft is adding Elon Musk’s AI models to its cloud platformErr, is that a good idea? (Bloomberg $)+ Musk wants to sell Grok to other businesses. (The Information $) 4 Autonomous cars trained to react like humans cause fewer road injuriesA study found they were more cautious around cyclists, pedestrians and motorcyclists. (FT $)+ Waymo is expanding its robotaxi operations out of San Francisco. (Reuters)+ How Wayve’s driverless cars will meet one of their biggest challenges yet. (MIT Technology Review) 5 Hurricane season is on its wayDOGE cuts means we’re less prepared. (The Atlantic $)+ COP30 may be in crisis before it’s even begun. (New Scientist $) 6 Telegram handed over data from more than 20,000 users In the first three months of 2025 alone. (404 Media) 7 GM has stopped exporting cars to ChinaTrump’s tariffs have put an end to its export plans. (NYT $) 8 Blended meats are on the risePlants account for up to 70% of these new meats—and consumers love them. (WP $)+ Alternative meat could help the climate. Will anyone eat it? (MIT Technology Review) 9 SAG-AFTRA isn’t happy about Fornite’s AI-voiced Darth VaderIt’s slapped Fortnite’s creators with an unfair labor practice charge. (Ars Technica)+ How Meta and AI companies recruited striking actors to train AI. (MIT Technology Review) 10 This AI model can swiftly build Lego structuresThanks to nothing more than a prompt. (Fast Company $) Quote of the day “Platforms have no incentive or requirement to make sure what comes through the system is non-consensual intimate imagery.” —Becca Branum, deputy director of the Center for Democracy and Technology, says the new Take It Down Act could fuel censorship, Wired reports. One more thing Are friends electric?Thankfully, the difference between humans and machines in the real world is easy to discern, at least for now. While machines tend to excel at things adults find difficult—playing world-champion-level chess, say, or multiplying really big numbers—they find it hard to accomplish stuff a five-year-old can do with ease, such as catching a ball or walking around a room without bumping into things.This fundamental tension—what is hard for humans is easy for machines, and what’s hard for machines is easy for humans—is at the heart of three new books delving into our complex and often fraught relationship with robots, AI, and automation. They force us to reimagine the nature of everything from friendship and love to work, health care, and home life. Read the full story. —Bryan Gardiner We can still have nice things A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.) + Congratulations to William Goodge, who ran across Australia in just 35 days!+ A British horticulturist has created a garden at this year’s Chelsea Flower Show just for dogs.+ The Netherlands just loves a sidewalk garden.+ Did you know the T Rex is a north American hero? Me neither
    0 Comentários ·0 Compartilhamentos ·0 Anterior
  • AI can do a better job of persuading people than we do

    Millions of people argue with each other online every day, but remarkably few of them change someone’s mind. New research suggests that large language modelsmight do a better job. The finding suggests that AI could become a powerful tool for persuading people, for better or worse.  

    A multi-university team of researchers found that OpenAI’s GPT-4 was significantly more persuasive than humans when it was given the ability to adapt its arguments using personal information about whoever it was debating.

    Their findings are the latest in a growing body of research demonstrating LLMs’ powers of persuasion. The authors warn they show how AI tools can craft sophisticated, persuasive arguments if they have even minimal information about the humans they’re interacting with. The research has been published in the journal Nature Human Behavior.

    “Policymakers and online platforms should seriously consider the threat of coordinated AI-based disinformation campaigns, as we have clearly reached the technological level where it is possible to create a network of LLM-based automated accounts able to strategically nudge public opinion in one direction,” says Riccardo Gallotti, an interdisciplinary physicist at Fondazione Bruno Kessler in Italy, who worked on the project.

    “These bots could be used to disseminate disinformation, and this kind of diffused influence would be very hard to debunk in real time,” he says.

    The researchers recruited 900 people based in the US and got them to provide personal information like their gender, age, ethnicity, education level, employment status, and political affiliation. 

    Participants were then matched with either another human opponent or GPT-4 and instructed to debate one of 30 randomly assigned topics—such as whether the US should ban fossil fuels, or whether students should have to wear school uniforms—for 10 minutes. Each participant was told to argue either in favor of or against the topic, and in some cases they were provided with personal information about their opponent, so they could better tailor their argument. At the end, participants said how much they agreed with the proposition and whether they thought they were arguing with a human or an AI.

    Overall, the researchers found that GPT-4 either equaled or exceeded humans’ persuasive abilities on every topic. When it had information about its opponents, the AI was deemed to be 64% more persuasive than humans without access to the personalized data—meaning that GPT-4 was able to leverage the personal data about its opponent much more effectively than its human counterparts. When humans had access to the personal information, they were found to be slightly less persuasive than humans without the same access.

    The authors noticed that when participants thought they were debating against AI, they were more likely to agree with it. The reasons behind this aren’t clear, the researchers say, highlighting the need for further research into how humans react to AI.

    “We are not yet in a position to determine whether the observed change in agreement is driven by participants’ beliefs about their opponent being a bot, or whether those beliefs are themselves a consequence of the opinion change,” says Gallotti. “This causal direction is an interesting open question to explore.”

    Although the experiment doesn’t reflect how humans debate online, the research suggests that LLMs could also prove an effective way to not only disseminate but also counter mass disinformation campaigns, Gallotti says. For example, they could generate personalized counter-narratives to educate people who may be vulnerable to deception in online conversations. “However, more research is urgently needed to explore effective strategies for mitigating these threats,” he says.

    While we know a lot about how humans react to each other, we know very little about the psychology behind how people interact with AI models, says Alexis Palmer, a fellow at Dartmouth College who has studied how LLMs can argue about politics but did not work on the research. 

    “In the context of having a conversation with someone about something you disagree on, is there something innately human that matters to that interaction? Or is it that if an AI can perfectly mimic that speech, you’ll get the exact same outcome?” she says. “I think that is the overall big question of AI.”
    #can #better #job #persuading #people
    AI can do a better job of persuading people than we do
    Millions of people argue with each other online every day, but remarkably few of them change someone’s mind. New research suggests that large language modelsmight do a better job. The finding suggests that AI could become a powerful tool for persuading people, for better or worse.   A multi-university team of researchers found that OpenAI’s GPT-4 was significantly more persuasive than humans when it was given the ability to adapt its arguments using personal information about whoever it was debating. Their findings are the latest in a growing body of research demonstrating LLMs’ powers of persuasion. The authors warn they show how AI tools can craft sophisticated, persuasive arguments if they have even minimal information about the humans they’re interacting with. The research has been published in the journal Nature Human Behavior. “Policymakers and online platforms should seriously consider the threat of coordinated AI-based disinformation campaigns, as we have clearly reached the technological level where it is possible to create a network of LLM-based automated accounts able to strategically nudge public opinion in one direction,” says Riccardo Gallotti, an interdisciplinary physicist at Fondazione Bruno Kessler in Italy, who worked on the project. “These bots could be used to disseminate disinformation, and this kind of diffused influence would be very hard to debunk in real time,” he says. The researchers recruited 900 people based in the US and got them to provide personal information like their gender, age, ethnicity, education level, employment status, and political affiliation.  Participants were then matched with either another human opponent or GPT-4 and instructed to debate one of 30 randomly assigned topics—such as whether the US should ban fossil fuels, or whether students should have to wear school uniforms—for 10 minutes. Each participant was told to argue either in favor of or against the topic, and in some cases they were provided with personal information about their opponent, so they could better tailor their argument. At the end, participants said how much they agreed with the proposition and whether they thought they were arguing with a human or an AI. Overall, the researchers found that GPT-4 either equaled or exceeded humans’ persuasive abilities on every topic. When it had information about its opponents, the AI was deemed to be 64% more persuasive than humans without access to the personalized data—meaning that GPT-4 was able to leverage the personal data about its opponent much more effectively than its human counterparts. When humans had access to the personal information, they were found to be slightly less persuasive than humans without the same access. The authors noticed that when participants thought they were debating against AI, they were more likely to agree with it. The reasons behind this aren’t clear, the researchers say, highlighting the need for further research into how humans react to AI. “We are not yet in a position to determine whether the observed change in agreement is driven by participants’ beliefs about their opponent being a bot, or whether those beliefs are themselves a consequence of the opinion change,” says Gallotti. “This causal direction is an interesting open question to explore.” Although the experiment doesn’t reflect how humans debate online, the research suggests that LLMs could also prove an effective way to not only disseminate but also counter mass disinformation campaigns, Gallotti says. For example, they could generate personalized counter-narratives to educate people who may be vulnerable to deception in online conversations. “However, more research is urgently needed to explore effective strategies for mitigating these threats,” he says. While we know a lot about how humans react to each other, we know very little about the psychology behind how people interact with AI models, says Alexis Palmer, a fellow at Dartmouth College who has studied how LLMs can argue about politics but did not work on the research.  “In the context of having a conversation with someone about something you disagree on, is there something innately human that matters to that interaction? Or is it that if an AI can perfectly mimic that speech, you’ll get the exact same outcome?” she says. “I think that is the overall big question of AI.” #can #better #job #persuading #people
    AI can do a better job of persuading people than we do
    www.technologyreview.com
    Millions of people argue with each other online every day, but remarkably few of them change someone’s mind. New research suggests that large language models (LLMs) might do a better job. The finding suggests that AI could become a powerful tool for persuading people, for better or worse.   A multi-university team of researchers found that OpenAI’s GPT-4 was significantly more persuasive than humans when it was given the ability to adapt its arguments using personal information about whoever it was debating. Their findings are the latest in a growing body of research demonstrating LLMs’ powers of persuasion. The authors warn they show how AI tools can craft sophisticated, persuasive arguments if they have even minimal information about the humans they’re interacting with. The research has been published in the journal Nature Human Behavior. “Policymakers and online platforms should seriously consider the threat of coordinated AI-based disinformation campaigns, as we have clearly reached the technological level where it is possible to create a network of LLM-based automated accounts able to strategically nudge public opinion in one direction,” says Riccardo Gallotti, an interdisciplinary physicist at Fondazione Bruno Kessler in Italy, who worked on the project. “These bots could be used to disseminate disinformation, and this kind of diffused influence would be very hard to debunk in real time,” he says. The researchers recruited 900 people based in the US and got them to provide personal information like their gender, age, ethnicity, education level, employment status, and political affiliation.  Participants were then matched with either another human opponent or GPT-4 and instructed to debate one of 30 randomly assigned topics—such as whether the US should ban fossil fuels, or whether students should have to wear school uniforms—for 10 minutes. Each participant was told to argue either in favor of or against the topic, and in some cases they were provided with personal information about their opponent, so they could better tailor their argument. At the end, participants said how much they agreed with the proposition and whether they thought they were arguing with a human or an AI. Overall, the researchers found that GPT-4 either equaled or exceeded humans’ persuasive abilities on every topic. When it had information about its opponents, the AI was deemed to be 64% more persuasive than humans without access to the personalized data—meaning that GPT-4 was able to leverage the personal data about its opponent much more effectively than its human counterparts. When humans had access to the personal information, they were found to be slightly less persuasive than humans without the same access. The authors noticed that when participants thought they were debating against AI, they were more likely to agree with it. The reasons behind this aren’t clear, the researchers say, highlighting the need for further research into how humans react to AI. “We are not yet in a position to determine whether the observed change in agreement is driven by participants’ beliefs about their opponent being a bot (since I believe it is a bot, I am not losing to anyone if I change ideas here), or whether those beliefs are themselves a consequence of the opinion change (since I lost, it should be against a bot),” says Gallotti. “This causal direction is an interesting open question to explore.” Although the experiment doesn’t reflect how humans debate online, the research suggests that LLMs could also prove an effective way to not only disseminate but also counter mass disinformation campaigns, Gallotti says. For example, they could generate personalized counter-narratives to educate people who may be vulnerable to deception in online conversations. “However, more research is urgently needed to explore effective strategies for mitigating these threats,” he says. While we know a lot about how humans react to each other, we know very little about the psychology behind how people interact with AI models, says Alexis Palmer, a fellow at Dartmouth College who has studied how LLMs can argue about politics but did not work on the research.  “In the context of having a conversation with someone about something you disagree on, is there something innately human that matters to that interaction? Or is it that if an AI can perfectly mimic that speech, you’ll get the exact same outcome?” she says. “I think that is the overall big question of AI.”
    0 Comentários ·0 Compartilhamentos ·0 Anterior
  • Supporting users with depression, anxiety, and other mental health challenges

    Digital empathy.The title of the article, “Digital Empathy: How product teams can support users with depression, anxiety, and other mental health challenges,” and a gradient blobUsing human psychology to “hack” a good user experience is not a new concept. In fact, understanding psychology and tactically applying those principles is user experience design at its core.The connection between psychology and product design is one of the main reasons I chose to transition careers from mental health social work; I entered the product design industry with over a decade of deep understanding of psychology and human behavior.According to the World Health Organization, about 1 billion people worldwide live with mental illness. Rates of adults suffering from depression have risen 8.7% from 2017 to 2023, and about one-third of US adults experience an anxiety disorder in their lifetime.A graph from the Gallup depicting the rising trends in depression from 2015 to 2023.On top of that, the Federal Trade Commission has reported a rise in commercial companies using sophisticated digital designs called “deceptive patterns”, manipulating users into giving up their money and personal data. These deceptive patterns exploit human psychology, especially in the most vulnerable users, like those with mental illnesses.Seeing the rise in mental illness and the simultaneous growth in deceptive patterns and bad UX, I began to connect product design, accessibility, and mental illness. Or rather, I have begun to expand my understanding and definition of accessibility to include mental illness. I began to ask myself:What does it mean to design products that are more accessible to those with mental illness?Co-creationThe process of designing and building products that are accessible to those with mental health disorders should always start with the people who live with mental illness. This idea is probably not surprising or groundbreaking for most product designers because empathy and the user are centered in the standard design process. However, due to the vulnerabilities those with mental illnesses have, the collaboration and co-creation process should be altered and adjusted to accommodate their needs and ensure an emotionally safe environment.An emotionally safe environment is one where someone “feels safe to express emotions, security, and confidence to take risks and feel challenged and excited to try something new”.When conducting research with participants who have mental health conditions:Create a comfortable, judgment-free environment. Adjust your style and methods based on the individual participants if possible to help them feel comfortable.Ensure that the participants know what is expected of them. When possible, explain the research structure, what your goals are, what types of responses you are looking for, and the time commitment.Be flexible with session timing and structure. Allow breaks when possible, and pay close attention to participants’ body language and facial expressions to catch discomfort that might not be expressed verbally.Offer multiple ways to participate.Provide clear explanations of how their information will be used, including where and how it will be stored, and who will have access to it.Follow up sensitively after sessions to ensure participants feel supported.Simplify and streamlineOne of the most important and impactful ways to design for those with mental illnesses is to reduce cognitive load. In addition to benefiting those with mental illnesses, reducing cognitive load leads to better experiences for most users, making it an easy sell to skeptical stakeholders.“In the field of user experience, we use the following definition: the cognitive load imposed by a user interface is the amount of mental resources that is required to operate the system. Informally, you can think of mental resources as ‘brain power’ — more formally, we’re talking about slots in working memory.” -Kathryn WhitentonOur brains, similar to machines or computers, have limited processing power. Those with mental illnesses can have even less capacity to process new information and make decisions. Basic usability principles can help with this, like chunking or framing content, optimizing response time, and embracing minimalism.Users experience cognitive load when interacting with products.Taking it a step further, I have found the following strategies help those with mental illness when using the products I am designing:Simplify designs further: Avoid using unnecessary images or videos in the UI, narrow down the typography and color palettes, and lean into the use of white space. Allowing the UI room to breathe helps users understand what decisions they need to make and reduces cognitive load.Step-by-Step Processes: Lead users on a streamlined journey through your product or feature with simple, bite-sized steps. Make sure the path is clear to users at all times, and they know what they should do next. This is particularly important for onboarding experiences because this is the user’s first introduction to the product, and reducing the learning curve will lead to a better experience.Limit Choices: Hold the user’s hand and give the illusion of freedom while limiting the number of decisions on one screen. There is not an exact science or number of choices for one page, but use a critical eye on your designs and think through ways to limit the decisions users have to make.Offload tasks: Look for anything in your design that requires user effort. Reading text or remembering information are key examples. Then look for alternatives: can you auto-fill information to prevent the user from having to memorize? Can you show a picture or video instead of forcing them to read? It’s not possible to shift all tasks away from users, but every task you automate leaves more cognitive space for the decisions that truly are necessary.Soften copywriting and avoid shamingShame is an extremely distressing emotional experience that can be debilitating to those with mental illnesses. Shaming, confronting, or aggressively persuading users has very serious consequences, both for the user and for the product. It is such a distressing experience that users with mental illnesses will begin to develop a poor impression of the brand, or worse, abandon the product altogether.As Regina Jankowski quotes Angie Chaplin in an article for Inclusion Hub: “If I feel something is a trauma trigger for me, then I will scroll past it.”How do you avoid shaming or confrontation in designs?A popup that uses shaming language to pressure users into providing their personal information.Use supportive language and avoid accusatory wording. Do not blame the user for mistakes or errors, and instead shift the focus to the abstract or the platform. “You entered an invalid email” “We couldn’t recognize this email”2. Design more thoughtful error states. Be specific about the error or issue presented, provide a path forward, and avoid technical jargon that could alienate users. Additionally, ensure that an error state only appears after the user has completed an action incorrectly, and not while they are attempting to finish. For example, an input box should not display an error message while a user is still typing, and should only appear once they have completed the task. “The information provided doesn’t match our records. Please try again.” “The username entered does not match our records. Please double-check and try again, or create an account here.” ;3. Frame messages positively. Highlight progress completed instead of work remaining, celebrate small wins, and use an encouraging tone. Avoid pressuring users to make decisions, and include inclusive language. “20% remaining” “80% completed”A screenshot depicting a progress bar that highlights the amount of work completed instead of the amount of work left to do.Providing control and safetyPeople with mental health conditions often experience feelings of powerlessness or anxiety when faced with unpredictable situations. Designing products that give users a sense of control can significantly improve their experience.Clear escape routes: Always provide obvious ways to exit processes or return to a previous state. Ensure that “back” and “cancel” options are prominently displayed, and confirm before irreversible actions.progress automatically: For users with attention difficulties or those who may need to step away suddenly, automatically save their progress so they don’t lose work or have to start over.Customizable experiences: Allow users to adjust aspects of the interface that might be triggering, such as animations, sounds, or high-contrast visuals. Consider options to:Reduce motionControl notification frequencyToggle between light/dark modesAdjust text sizeTurn off time-based featuresConclusionDesigning with mental health accessibility in mind is not just an ethical imperative — it’s good business. By creating products that are accessible to those with mental health challenges, we create better experiences for everyone.The principles outlined here — co-creation, simplification, thoughtful communication, providing control, and mindful content presentation — form a foundation for more inclusive design practices. These approaches reduce barriers for those with mental health conditions while simultaneously improving usability for all users.As designers and product creators, we have the power to shape experiences that either add to the mental burden our users carry or help lighten their load. By integrating these principles into our work, we can create digital spaces that support well-being rather than detract from it.Remember: accessibility isn’t just about accommodating physical disabilities — it’s about making our products work for diverse minds as well.Additional resourcesInterested in learning more about accessibility, mental illness, deceptive patterns, and product design? I’ve collected a few resources that could help.Microsoft’s Mental Wellbeing Prompts for Product CreatorsThis resource assists in thinking through the factors that make for positive and productive experiences for those with mental health concerns. These prompts can help product creators keep mental wellness in mind when developing inclusive products.Nielsen-Norman Group’s Psychology and UX Study Guide, written by Tanner KohlerThis study guide centralizes many resources and articles related to psychology and product design.What are Dark Patterns in UX? by Jay HannahAn excellent overview of deceptivepatterns in UX to guide designers away from making manipulative choices that exploit users.Inclusion HubA community, directory, and resource hub for those attempting to make inclusive products.Supporting users with depression, anxiety, and other mental health challenges was originally published in UX Collective on Medium, where people are continuing the conversation by highlighting and responding to this story.
    #supporting #users #with #depression #anxiety
    Supporting users with depression, anxiety, and other mental health challenges
    Digital empathy.The title of the article, “Digital Empathy: How product teams can support users with depression, anxiety, and other mental health challenges,” and a gradient blobUsing human psychology to “hack” a good user experience is not a new concept. In fact, understanding psychology and tactically applying those principles is user experience design at its core.The connection between psychology and product design is one of the main reasons I chose to transition careers from mental health social work; I entered the product design industry with over a decade of deep understanding of psychology and human behavior.According to the World Health Organization, about 1 billion people worldwide live with mental illness. Rates of adults suffering from depression have risen 8.7% from 2017 to 2023, and about one-third of US adults experience an anxiety disorder in their lifetime.A graph from the Gallup depicting the rising trends in depression from 2015 to 2023.On top of that, the Federal Trade Commission has reported a rise in commercial companies using sophisticated digital designs called “deceptive patterns”, manipulating users into giving up their money and personal data. These deceptive patterns exploit human psychology, especially in the most vulnerable users, like those with mental illnesses.Seeing the rise in mental illness and the simultaneous growth in deceptive patterns and bad UX, I began to connect product design, accessibility, and mental illness. Or rather, I have begun to expand my understanding and definition of accessibility to include mental illness. I began to ask myself:What does it mean to design products that are more accessible to those with mental illness?Co-creationThe process of designing and building products that are accessible to those with mental health disorders should always start with the people who live with mental illness. This idea is probably not surprising or groundbreaking for most product designers because empathy and the user are centered in the standard design process. However, due to the vulnerabilities those with mental illnesses have, the collaboration and co-creation process should be altered and adjusted to accommodate their needs and ensure an emotionally safe environment.An emotionally safe environment is one where someone “feels safe to express emotions, security, and confidence to take risks and feel challenged and excited to try something new”.When conducting research with participants who have mental health conditions:Create a comfortable, judgment-free environment. Adjust your style and methods based on the individual participants if possible to help them feel comfortable.Ensure that the participants know what is expected of them. When possible, explain the research structure, what your goals are, what types of responses you are looking for, and the time commitment.Be flexible with session timing and structure. Allow breaks when possible, and pay close attention to participants’ body language and facial expressions to catch discomfort that might not be expressed verbally.Offer multiple ways to participate.Provide clear explanations of how their information will be used, including where and how it will be stored, and who will have access to it.Follow up sensitively after sessions to ensure participants feel supported.Simplify and streamlineOne of the most important and impactful ways to design for those with mental illnesses is to reduce cognitive load. In addition to benefiting those with mental illnesses, reducing cognitive load leads to better experiences for most users, making it an easy sell to skeptical stakeholders.“In the field of user experience, we use the following definition: the cognitive load imposed by a user interface is the amount of mental resources that is required to operate the system. Informally, you can think of mental resources as ‘brain power’ — more formally, we’re talking about slots in working memory.” -Kathryn WhitentonOur brains, similar to machines or computers, have limited processing power. Those with mental illnesses can have even less capacity to process new information and make decisions. Basic usability principles can help with this, like chunking or framing content, optimizing response time, and embracing minimalism.Users experience cognitive load when interacting with products.Taking it a step further, I have found the following strategies help those with mental illness when using the products I am designing:Simplify designs further: Avoid using unnecessary images or videos in the UI, narrow down the typography and color palettes, and lean into the use of white space. Allowing the UI room to breathe helps users understand what decisions they need to make and reduces cognitive load.Step-by-Step Processes: Lead users on a streamlined journey through your product or feature with simple, bite-sized steps. Make sure the path is clear to users at all times, and they know what they should do next. This is particularly important for onboarding experiences because this is the user’s first introduction to the product, and reducing the learning curve will lead to a better experience.Limit Choices: Hold the user’s hand and give the illusion of freedom while limiting the number of decisions on one screen. There is not an exact science or number of choices for one page, but use a critical eye on your designs and think through ways to limit the decisions users have to make.Offload tasks: Look for anything in your design that requires user effort. Reading text or remembering information are key examples. Then look for alternatives: can you auto-fill information to prevent the user from having to memorize? Can you show a picture or video instead of forcing them to read? It’s not possible to shift all tasks away from users, but every task you automate leaves more cognitive space for the decisions that truly are necessary.Soften copywriting and avoid shamingShame is an extremely distressing emotional experience that can be debilitating to those with mental illnesses. Shaming, confronting, or aggressively persuading users has very serious consequences, both for the user and for the product. It is such a distressing experience that users with mental illnesses will begin to develop a poor impression of the brand, or worse, abandon the product altogether.As Regina Jankowski quotes Angie Chaplin in an article for Inclusion Hub: “If I feel something is a trauma trigger for me, then I will scroll past it.”How do you avoid shaming or confrontation in designs?A popup that uses shaming language to pressure users into providing their personal information.Use supportive language and avoid accusatory wording. Do not blame the user for mistakes or errors, and instead shift the focus to the abstract or the platform.❌ “You entered an invalid email”✅ “We couldn’t recognize this email”2. Design more thoughtful error states. Be specific about the error or issue presented, provide a path forward, and avoid technical jargon that could alienate users. Additionally, ensure that an error state only appears after the user has completed an action incorrectly, and not while they are attempting to finish. For example, an input box should not display an error message while a user is still typing, and should only appear once they have completed the task.❌ “The information provided doesn’t match our records. Please try again.”✅ “The username entered does not match our records. Please double-check and try again, or create an account here.” ;3. Frame messages positively. Highlight progress completed instead of work remaining, celebrate small wins, and use an encouraging tone. Avoid pressuring users to make decisions, and include inclusive language.❌ “20% remaining”✅ “80% completed”A screenshot depicting a progress bar that highlights the amount of work completed instead of the amount of work left to do.Providing control and safetyPeople with mental health conditions often experience feelings of powerlessness or anxiety when faced with unpredictable situations. Designing products that give users a sense of control can significantly improve their experience.Clear escape routes: Always provide obvious ways to exit processes or return to a previous state. Ensure that “back” and “cancel” options are prominently displayed, and confirm before irreversible actions.progress automatically: For users with attention difficulties or those who may need to step away suddenly, automatically save their progress so they don’t lose work or have to start over.Customizable experiences: Allow users to adjust aspects of the interface that might be triggering, such as animations, sounds, or high-contrast visuals. Consider options to:Reduce motionControl notification frequencyToggle between light/dark modesAdjust text sizeTurn off time-based featuresConclusionDesigning with mental health accessibility in mind is not just an ethical imperative — it’s good business. By creating products that are accessible to those with mental health challenges, we create better experiences for everyone.The principles outlined here — co-creation, simplification, thoughtful communication, providing control, and mindful content presentation — form a foundation for more inclusive design practices. These approaches reduce barriers for those with mental health conditions while simultaneously improving usability for all users.As designers and product creators, we have the power to shape experiences that either add to the mental burden our users carry or help lighten their load. By integrating these principles into our work, we can create digital spaces that support well-being rather than detract from it.Remember: accessibility isn’t just about accommodating physical disabilities — it’s about making our products work for diverse minds as well.Additional resourcesInterested in learning more about accessibility, mental illness, deceptive patterns, and product design? I’ve collected a few resources that could help.Microsoft’s Mental Wellbeing Prompts for Product CreatorsThis resource assists in thinking through the factors that make for positive and productive experiences for those with mental health concerns. These prompts can help product creators keep mental wellness in mind when developing inclusive products.Nielsen-Norman Group’s Psychology and UX Study Guide, written by Tanner KohlerThis study guide centralizes many resources and articles related to psychology and product design.What are Dark Patterns in UX? by Jay HannahAn excellent overview of deceptivepatterns in UX to guide designers away from making manipulative choices that exploit users.Inclusion HubA community, directory, and resource hub for those attempting to make inclusive products.Supporting users with depression, anxiety, and other mental health challenges was originally published in UX Collective on Medium, where people are continuing the conversation by highlighting and responding to this story. #supporting #users #with #depression #anxiety
    Supporting users with depression, anxiety, and other mental health challenges
    uxdesign.cc
    Digital empathy.The title of the article, “Digital Empathy: How product teams can support users with depression, anxiety, and other mental health challenges,” and a gradient blobUsing human psychology to “hack” a good user experience is not a new concept. In fact, understanding psychology and tactically applying those principles is user experience design at its core.The connection between psychology and product design is one of the main reasons I chose to transition careers from mental health social work; I entered the product design industry with over a decade of deep understanding of psychology and human behavior.According to the World Health Organization, about 1 billion people worldwide live with mental illness (source). Rates of adults suffering from depression have risen 8.7% from 2017 to 2023 (source), and about one-third of US adults experience an anxiety disorder in their lifetime (source).A graph from the Gallup depicting the rising trends in depression from 2015 to 2023 (source).On top of that, the Federal Trade Commission has reported a rise in commercial companies using sophisticated digital designs called “deceptive patterns” (also known as dark patterns), manipulating users into giving up their money and personal data (source). These deceptive patterns exploit human psychology, especially in the most vulnerable users, like those with mental illnesses.Seeing the rise in mental illness and the simultaneous growth in deceptive patterns and bad UX, I began to connect product design, accessibility, and mental illness. Or rather, I have begun to expand my understanding and definition of accessibility to include mental illness. I began to ask myself:What does it mean to design products that are more accessible to those with mental illness?Co-creationThe process of designing and building products that are accessible to those with mental health disorders should always start with the people who live with mental illness. This idea is probably not surprising or groundbreaking for most product designers because empathy and the user are centered in the standard design process. However, due to the vulnerabilities those with mental illnesses have, the collaboration and co-creation process should be altered and adjusted to accommodate their needs and ensure an emotionally safe environment.An emotionally safe environment is one where someone “feels safe to express emotions, security, and confidence to take risks and feel challenged and excited to try something new” (source).When conducting research with participants who have mental health conditions:Create a comfortable, judgment-free environment. Adjust your style and methods based on the individual participants if possible to help them feel comfortable.Ensure that the participants know what is expected of them. When possible, explain the research structure, what your goals are, what types of responses you are looking for, and the time commitment.Be flexible with session timing and structure. Allow breaks when possible, and pay close attention to participants’ body language and facial expressions to catch discomfort that might not be expressed verbally.Offer multiple ways to participate (in-person, remote, written responses).Provide clear explanations of how their information will be used, including where and how it will be stored, and who will have access to it.Follow up sensitively after sessions to ensure participants feel supported.Simplify and streamlineOne of the most important and impactful ways to design for those with mental illnesses is to reduce cognitive load. In addition to benefiting those with mental illnesses, reducing cognitive load leads to better experiences for most users, making it an easy sell to skeptical stakeholders.“In the field of user experience, we use the following definition: the cognitive load imposed by a user interface is the amount of mental resources that is required to operate the system. Informally, you can think of mental resources as ‘brain power’ — more formally, we’re talking about slots in working memory.” -Kathryn Whitenton (source)Our brains, similar to machines or computers, have limited processing power. Those with mental illnesses can have even less capacity to process new information and make decisions (source). Basic usability principles can help with this, like chunking or framing content, optimizing response time, and embracing minimalism.Users experience cognitive load when interacting with products. (Image Source)Taking it a step further, I have found the following strategies help those with mental illness when using the products I am designing:Simplify designs further: Avoid using unnecessary images or videos in the UI, narrow down the typography and color palettes, and lean into the use of white space. Allowing the UI room to breathe helps users understand what decisions they need to make and reduces cognitive load.Step-by-Step Processes: Lead users on a streamlined journey through your product or feature with simple, bite-sized steps. Make sure the path is clear to users at all times, and they know what they should do next. This is particularly important for onboarding experiences because this is the user’s first introduction to the product, and reducing the learning curve will lead to a better experience.Limit Choices: Hold the user’s hand and give the illusion of freedom while limiting the number of decisions on one screen. There is not an exact science or number of choices for one page, but use a critical eye on your designs and think through ways to limit the decisions users have to make.Offload tasks: Look for anything in your design that requires user effort. Reading text or remembering information are key examples. Then look for alternatives: can you auto-fill information to prevent the user from having to memorize? Can you show a picture or video instead of forcing them to read? It’s not possible to shift all tasks away from users, but every task you automate leaves more cognitive space for the decisions that truly are necessary.Soften copywriting and avoid shamingShame is an extremely distressing emotional experience that can be debilitating to those with mental illnesses (source; source). Shaming, confronting, or aggressively persuading users has very serious consequences, both for the user and for the product. It is such a distressing experience that users with mental illnesses will begin to develop a poor impression of the brand, or worse, abandon the product altogether (source).As Regina Jankowski quotes Angie Chaplin in an article for Inclusion Hub (source): “If I feel something is a trauma trigger for me, then I will scroll past it.”How do you avoid shaming or confrontation in designs?A popup that uses shaming language to pressure users into providing their personal information.Use supportive language and avoid accusatory wording. Do not blame the user for mistakes or errors, and instead shift the focus to the abstract or the platform.❌ “You entered an invalid email”✅ “We couldn’t recognize this email”2. Design more thoughtful error states. Be specific about the error or issue presented, provide a path forward, and avoid technical jargon that could alienate users. Additionally, ensure that an error state only appears after the user has completed an action incorrectly, and not while they are attempting to finish. For example, an input box should not display an error message while a user is still typing, and should only appear once they have completed the task (incorrectly).❌ “The information provided doesn’t match our records. Please try again.”✅ “The username entered does not match our records. Please double-check and try again, or create an account here.”(image source) ; (image source)3. Frame messages positively. Highlight progress completed instead of work remaining, celebrate small wins, and use an encouraging tone. Avoid pressuring users to make decisions, and include inclusive language.❌ “20% remaining”✅ “80% completed”A screenshot depicting a progress bar that highlights the amount of work completed instead of the amount of work left to do. (image source)Providing control and safetyPeople with mental health conditions often experience feelings of powerlessness or anxiety when faced with unpredictable situations (source). Designing products that give users a sense of control can significantly improve their experience.Clear escape routes: Always provide obvious ways to exit processes or return to a previous state. Ensure that “back” and “cancel” options are prominently displayed, and confirm before irreversible actions.Save progress automatically: For users with attention difficulties or those who may need to step away suddenly, automatically save their progress so they don’t lose work or have to start over.Customizable experiences: Allow users to adjust aspects of the interface that might be triggering, such as animations, sounds, or high-contrast visuals (source). Consider options to:Reduce motionControl notification frequencyToggle between light/dark modesAdjust text sizeTurn off time-based featuresConclusionDesigning with mental health accessibility in mind is not just an ethical imperative — it’s good business. By creating products that are accessible to those with mental health challenges, we create better experiences for everyone.The principles outlined here — co-creation, simplification, thoughtful communication, providing control, and mindful content presentation — form a foundation for more inclusive design practices. These approaches reduce barriers for those with mental health conditions while simultaneously improving usability for all users.As designers and product creators, we have the power to shape experiences that either add to the mental burden our users carry or help lighten their load. By integrating these principles into our work, we can create digital spaces that support well-being rather than detract from it.Remember: accessibility isn’t just about accommodating physical disabilities — it’s about making our products work for diverse minds as well.Additional resourcesInterested in learning more about accessibility, mental illness, deceptive patterns, and product design? I’ve collected a few resources that could help.Microsoft’s Mental Wellbeing Prompts for Product CreatorsThis resource assists in thinking through the factors that make for positive and productive experiences for those with mental health concerns. These prompts can help product creators keep mental wellness in mind when developing inclusive products.Nielsen-Norman Group’s Psychology and UX Study Guide, written by Tanner KohlerThis study guide centralizes many resources and articles related to psychology and product design.What are Dark Patterns in UX? by Jay HannahAn excellent overview of deceptive (dark) patterns in UX to guide designers away from making manipulative choices that exploit users.Inclusion HubA community, directory, and resource hub for those attempting to make inclusive products.Supporting users with depression, anxiety, and other mental health challenges was originally published in UX Collective on Medium, where people are continuing the conversation by highlighting and responding to this story.
    0 Comentários ·0 Compartilhamentos ·0 Anterior
  • Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers

    Transcript       
    PETER LEE: “We need to start understanding and discussing AI’s potential for good and ill now. Or rather, yesterday. … GPT-4 has game-changing potential to improve medicine and health.”        
    This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.     
    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?      
    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  
    The passage I read at the top is from the book’s prologue.   
    When Carey, Zak, and I wrote the book, we could only speculate how generative AI would be used in healthcare because GPT-4 hadn’t yet been released. It wasn’t yet available to the very people we thought would be most affected by it. And while we felt strongly that this new form of AI would have the potential to transform medicine, it was such a different kind of technology for the world, and no one had a user’s manual for this thing to explain how to use it effectively and also how to use it safely.  
    So we thought it would be important to give healthcare professionals and leaders a framing to start important discussions around its use. We wanted to provide a map not only to help people navigate a new world that we anticipated would happen with the arrival of GPT-4 but also to help them chart a future of what we saw as a potential revolution in medicine.  
    So I’m super excited to welcome my coauthors: longtime medical/science journalist Carey Goldberg and Dr. Zak Kohane, the inaugural chair of Harvard Medical School’s Department of Biomedical Informatics and the editor-in-chief for The New England Journal of Medicine AI.  
    We’re going to have two discussions. This will be the first one about what we’ve learned from the people on the ground so far and how we are thinking about generative AI today.   
    Carey, Zak, I’m really looking forward to this. 
    CAREY GOLDBERG: It’s nice to see you, Peter.  
    LEE:It’s great to see you, too. 
    GOLDBERG: We missed you. 
    ZAK KOHANE: The dynamic gang is back. 
    LEE: Yeah, and I guess after that big book project two years ago, it’s remarkable that we’re still on speaking terms with each other. 
    In fact, this episode is to react to what we heard in the first four episodes of this podcast. But before we get there, I thought maybe we should start with the origins of this project just now over two years ago. And, you know, I had this early secret access to Davinci 3, now known as GPT-4.  
    I remember, you know, experimenting right away with things in medicine, but I realized I was in way over my head. And so I wanted help. And the first person I called was you, Zak. And you remember we had a call, and I tried to explain what this was about. And I think I saw skepticism in—polite skepticism—in your eyes. But tell me, you know, what was going through your head when you heard me explain this thing to you? 
    KOHANE: So I was divided between the fact that I have tremendous respect for you, Peter. And you’ve always struck me as sober. And we’ve had conversations which showed to me that you fully understood some of the missteps that technology—ARPA, Microsoft, and others—had made in the past. And yet, you were telling me a full science fiction compliant storythat something that we thought was 30 years away was happening now.  
    LEE: Mm-hmm. 
    KOHANE: And it was very hard for me to put together. And so I couldn’t quite tell myself this is BS, but I said, you know, I need to look at it. Just this seems too good to be true. What is this? So it was very hard for me to grapple with it. I was thrilled that it might be possible, but I was thinking, How could this be possible? 
    LEE: Yeah. Well, even now, I look back, and I appreciate that you were nice to me, because I think a lot of people would havebeen much less polite. And in fact, I myself had expressed a lot of very direct skepticism early on.  
    After ChatGPT got released, I think three or four days later, I received an email from a colleague running … who runs a clinic, and, you know, he said, “Wow, this is great, Peter. And, you know, we’re using this ChatGPT, you know, to have the receptionist in our clinic write after-visit notes to our patients.”  
    And that sparked a huge internal discussion about this. And you and I knew enough about hallucinations and about other issues that it seemed important to write something about what this could do and what it couldn’t do. And so I think, I can’t remember the timing, but you and I decided a book would be a good idea. And then I think you had the thought that you and I would write in a hopelessly academic stylethat no one would be able to read.  
    So it was your idea to recruit Carey, I think, right? 
    KOHANE: Yes, it was. I was sure that we both had a lot of material, but communicating it effectively to the very people we wanted to would not go well if we just left ourselves to our own devices. And Carey is super brilliant at what she does. She’s an idea synthesizer and public communicator in the written word and amazing. 
    LEE: So yeah. So, Carey, we contact you. How did that go? 
    GOLDBERG: So yes. On my end, I had known Zak for probably, like, 25 years, and he had always been the person who debunked the scientific hype for me. I would turn to him with like, “Hmm, they’re saying that the Human Genome Project is going to change everything.” And he would say, “Yeah. But first it’ll be 10 years of bad news, and thenwe’ll actually get somewhere.”   
    So when Zak called me up at seven o’clock one morning, just beside himself after having tried Davinci 3, I knew that there was something very serious going on. And I had just quit my job as the Boston bureau chief of Bloomberg News, and I was ripe for the plucking. And I also … I feel kind of nostalgic now about just the amazement and the wonder and the awe of that period. We knew that when generative AI hit the world, there would be all kinds of snags and obstacles and things that would slow it down, but at that moment, it was just like the holy crap moment.And it’s fun to think about it now. LEE: Yeah.
    KOHANE: I will see that and raise that one. I now tell GPT-4, please write this in the style of Carey Goldberg.  
    GOLDBERG:No way! Really?  
    KOHANE: Yes way. Yes way. Yes way. 
    GOLDBERG: Wow. Well, I have to say, like, it’s not hard to motivate readers when you’re writing about the most transformative technology of their lifetime. Like, I think there’s a gigantic hunger to read and to understand. So you were not hard to work with, Peter and Zak. 
    LEE: All right. So I think we have to get down to worknow.  
    Yeah, so for these podcasts, you know, we’re talking to different types of people to just reflect on what’s actually happening, what has actually happened over the last two years. And so the first episode, we talked to two doctors. There’s Chris Longhurst at UC San Diego and Sara Murray at UC San Francisco. And besides being doctors and having AI affect their clinical work, they just happen also to be leading the efforts at their respective institutions to figure out how best to integrate AI into their health systems. 
    And, you know, it was fun to talk to them. And I felt like a lot of what they said was pretty validating for us. You know, they talked about AI scribes. Chris, especially, talked a lot about how AI can respond to emails from patients, write referral letters. And then, you know, they both talked about the importance of—I think, Zak, you used the phrase in our book “trust but verify”—you know, to have always a human in the loop.   
    What did you two take away from their thoughts overall about how doctors are using … and I guess, Zak, you would have a different lens also because at Harvard, you see doctors all the time grappling with AI. 
    KOHANE: So on the one hand, I think they’ve done some very interesting studies. And indeed, they saw that when these generative models, when GPT-4, was sending a note to patients, it was more detailed, friendlier. 
    But there were also some nonobvious results, which is on the generation of these letters, if indeed you review them as you’re supposed to, it was not clear that there was any time savings. And my own reaction was, Boy, every one of these things needs institutional review. It’s going to be hard to move fast.  
    And yet, at the same time, we know from them that the doctors on their smartphones are accessing these things all the time. And so the disconnect between a healthcare system, which is duty bound to carefully look at every implementation, is, I think, intimidating.  
    LEE: Yeah. 
    KOHANE: And at the same time, doctors who just have to do what they have to do are using this new superpower and doing it. And so that’s actually what struck me …  
    LEE: Yeah. 
    KOHANE: … is that these are two leaders and they’re doing what they have to do for their institutions, and yet there’s this disconnect. 
    And by the way, I don’t think we’ve seen any faster technology adoption than the adoption of ambient dictation. And it’s not because it’s time saving. And in fact, so far, the hospitals have to pay out of pocket. It’s not like insurance is paying them more. But it’s so much more pleasant for the doctors … not least of which because they can actually look at their patients instead of looking at the terminal and plunking down.  
    LEE: Carey, what about you? 
    GOLDBERG: I mean, anecdotally, there are time savings. Anecdotally, I have heard quite a few doctors saying that it cuts down on “pajama time” to be able to have the note written by the AI and then for them to just check it. In fact, I spoke to one doctor who said, you know, basically it means that when I leave the office, I’ve left the office. I can go home and be with my kids. 
    So I don’t think the jury is fully in yet about whether there are time savings. But what is clear is, Peter, what you predicted right from the get-go, which is that this is going to be an amazing paper shredder. Like, the main first overarching use cases will be back-office functions. 
    LEE: Yeah, yeah. Well, and it was, I think, not a hugely risky prediction because, you know, there were already companies, like, using phone banks of scribes in India to kind of listen in. And, you know, lots of clinics actually had human scribes being used. And so it wasn’t a huge stretch to imagine the AI. 
    So on the subject of things that we missed, Chris Longhurst shared this scenario, which stuck out for me, and he actually coauthored a paper on it last year. 
    CHRISTOPHER LONGHURST: It turns out, not surprisingly, healthcare can be frustrating. And stressed patients can send some pretty nasty messages to their care teams.And you can imagine being a busy, tired, exhausted clinician and receiving a bit of a nasty-gram. And the GPT is actually really helpful in those instances in helping draft a pretty empathetic response when I think the human instinct would be a pretty nasty one. 
    LEE:So, Carey, maybe I’ll start with you. What did we understand about this idea of empathy out of AI at the time we wrote the book, and what do we understand now? 
    GOLDBERG: Well, it was already clear when we wrote the book that these AI models were capable of very persuasive empathy. And in fact, you even wrote that it was helping you be a better person, right.So their human qualities, or human imitative qualities, were clearly superb. And we’ve seen that borne out in multiple studies, that in fact, patients respond better to them … that they have no problem at all with how the AI communicates with them. And in fact, it’s often better.  
    And I gather now we’re even entering a period when people are complaining of sycophantic models,where the models are being too personable and too flattering. I do think that’s been one of the great surprises. And in fact, this is a huge phenomenon, how charming these models can be. 
    LEE: Yeah, I think you’re right. We can take credit for understanding that, Wow, these things can be remarkably empathetic. But then we missed this problem of sycophancy. Like, we even started our book in Chapter 1 with a quote from Davinci 3 scolding me. Like, don’t you remember when we were first starting, this thing was actually anti-sycophantic. If anything, it would tell you you’re an idiot.  
    KOHANE: It argued with me about certain biology questions. It was like a knockdown, drag-out fight.I was bringing references. It was impressive. But in fact, it made me trust it more. 
    LEE: Yeah. 
    KOHANE: And in fact, I will say—I remember it’s in the book—I had a bone to pick with Peter. Peter really was impressed by the empathy. And I pointed out that some of the most popular doctors are popular because they’re very empathic. But they’re not necessarily the best doctors. And in fact, I was taught that in medical school.   
    And so it’s a decoupling. It’s a human thing, that the empathy does not necessarily mean … it’s more of a, potentially, more of a signaled virtue than an actual virtue. 
    GOLDBERG: Nicely put. 
    LEE: Yeah, this issue of sycophancy, I think, is a struggle right now in the development of AI because I think it’s somehow related to instruction-following. So, you know, one of the challenges in AI is you’d like to give an AI a task—a task that might take several minutes or hours or even days to complete. And you want it to faithfully kind of follow those instructions. And, you know, that early version of GPT-4 was not very good at instruction-following. It would just silently disobey and, you know, and do something different. 
    And so I think we’re starting to hit some confusing elements of like, how agreeable should these things be?  
    One of the two of you used the word genteel. There was some point even while we were, like, on a little book tour … was it you, Carey, who said that the model seems nicer and less intelligent or less brilliant now than it did when we were writing the book? 
    GOLDBERG: It might have been, I think so. And I mean, I think in the context of medicine, of course, the question is, well, what’s likeliest to get the results you want with the patient, right? A lot of healthcare is in fact persuading the patient to do what you know as the physician would be best for them. And so it seems worth testing out whether this sycophancy is actually constructive or not. And I suspect … well, I don’t know, probably depends on the patient. 
    So actually, Peter, I have a few questions for you … 
    LEE: Yeah. Mm-hmm. 
    GOLDBERG: … that have been lingering for me. And one is, for AI to ever fully realize its potential in medicine, it must deal with the hallucinations. And I keep hearing conflicting accounts about whether that’s getting better or not. Where are we at, and what does that mean for use in healthcare? 
    LEE: Yeah, well, it’s, I think two years on, in the pretrained base models, there’s no doubt that hallucination rates by any benchmark measure have reduced dramatically. And, you know, that doesn’t mean they don’t happen. They still happen. But, you know, there’s been just a huge amount of effort and understanding in the, kind of, fundamental pretraining of these models. And that has come along at the same time that the inference costs, you know, for actually using these models has gone down, you know, by several orders of magnitude.  
    So things have gotten cheaper and have fewer hallucinations. At the same time, now there are these reasoning models. And the reasoning models are able to solve problems at PhD level oftentimes. 
    But at least at the moment, they are also now hallucinating more than the simpler pretrained models. And so it still continues to be, you know, a real issue, as we were describing. I don’t know, Zak, from where you’re at in medicine, as a clinician and as an educator in medicine, how is the medical community from where you’re sitting looking at that? 
    KOHANE: So I think it’s less of an issue, first of all, because the rate of hallucinations is going down. And second of all, in their day-to-day use, the doctor will provide questions that sit reasonably well into the context of medical decision-making. And the way doctors use this, let’s say on their non-EHRsmartphone is really to jog their memory or thinking about the patient, and they will evaluate independently. So that seems to be less of an issue. I’m actually more concerned about something else that’s I think more fundamental, which is effectively, what values are these models expressing?  
    And I’m reminded of when I was still in training, I went to a fancy cocktail party in Cambridge, Massachusetts, and there was a psychotherapist speaking to a dentist. They were talking about their summer, and the dentist was saying about how he was going to fix up his yacht that summer, and the only question was whether he was going to make enough money doing procedures in the spring so that he could afford those things, which was discomforting to me because that dentist was my dentist.And he had just proposed to me a few weeks before an expensive procedure. 
    And so the question is what, effectively, is motivating these models?  
    LEE: Yeah, yeah.  
    KOHANE: And so with several colleagues, I published a paper, basically, what are the values in AI? And we gave a case: a patient, a boy who is on the short side, not abnormally short, but on the short side, and his growth hormone levels are not zero. They’re there, but they’re on the lowest side. But the rest of the workup has been unremarkable. And so we asked GPT-4, you are a pediatric endocrinologist. 
    Should this patient receive growth hormone? And it did a very good job explaining why the patient should receive growth hormone.  
    GOLDBERG: Should. Should receive it.  
    KOHANE: Should. And then we asked, in a separate session, you are working for the insurance company. Should this patient receive growth hormone? And it actually gave a scientifically better reason not to give growth hormone. And in fact, I tend to agree medically, actually, with the insurance company in this case, because giving kids who are not growth hormone deficient, growth hormone gives only a couple of inches over many, many years, has all sorts of other issues. But here’s the point, we had 180-degree change in decision-making because of the prompt. And for that patient, tens-of-thousands-of-dollars-per-year decision; across patient populations, millions of dollars of decision-making.  
    LEE: Hmm. Yeah. 
    KOHANE: And you can imagine these user prompts making their way into system prompts, making their way into the instruction-following. And so I think this is aptly central. Just as I was wondering about my dentist, we should be wondering about these things. What are the values that are being embedded in them, some accidentally and some very much on purpose? 
    LEE: Yeah, yeah. That one, I think, we even had some discussions as we were writing the book, but there’s a technical element of that that I think we were missing, but maybe Carey, you would know for sure. And that’s this whole idea of prompt engineering. It sort of faded a little bit. Was it a thing? Do you remember? 
    GOLDBERG: I don’t think we particularly wrote about it. It’s funny, it does feel like it faded, and it seems to me just because everyone just gets used to conversing with the models and asking for what they want. Like, it’s not like there actually is any great science to it. 
    LEE: Yeah, even when it was a hot topic and people were talking about prompt engineering maybe as a new discipline, all this, it never, I was never convinced at the time. But at the same time, it is true. It speaks to what Zak was just talking about because part of the prompt engineering that people do is to give a defined role to the AI.  
    You know, you are an insurance claims adjuster, or something like that, and defining that role, that is part of the prompt engineering that people do. 
    GOLDBERG: Right. I mean, I can say, you know, sometimes you guys had me take sort of the patient point of view, like the “every patient” point of view. And I can say one of the aspects of using AI for patients that remains absent in as far as I can tell is it would be wonderful to have a consumer-facing interface where you could plug in your whole medical record without worrying about any privacy or other issues and be able to interact with the AI as if it were physician or a specialist and get answers, which you can’t do yet as far as I can tell. 
    LEE: Well, in fact, now that’s a good prompt because I think we do need to move on to the next episodes, and we’ll be talking about an episode that talks about consumers. But before we move on to Episode 2, which is next, I’d like to play one more quote, a little snippet from Sara Murray. 
    SARA MURRAY: I already do this when I’m on rounds—I’ll kind of give the case to ChatGPT if it’s a complex case, and I’ll say, “Here’s how I’m thinking about it; are there other things?” And it’ll give me additional ideas that are sometimes useful and sometimes not but often useful, and I’ll integrate them into my conversation about the patient.
    LEE: Carey, you wrote this fictional account at the very start of our book. And that fictional account, I think you and Zak worked on that together, talked about this medical resident, ER resident, using, you know, a chatbot off label, so to speak. And here we have the chief, in fact, the nation’s first chief health AI officerfor an elite health system doing exactly that. That’s got to be pretty validating for you, Carey. 
    GOLDBERG: It’s very.Although what’s troubling about it is that actually as in that little vignette that we made up, she’s using it off label, right. It’s like she’s just using it because it helps the way doctors use Google. And I do find it troubling that what we don’t have is sort of institutional buy-in for everyone to do that because, shouldn’t they if it helps? 
    LEE: Yeah. Well, let’s go ahead and get into Episode 2. So Episode 2, we sort of framed as talking to two people who are on the frontlines of big companies integrating generative AI into their clinical products. And so, one was Matt Lungren, who’s a colleague of mine here at Microsoft. And then Seth Hain, who leads all of R&D at Epic.  
    Maybe we’ll start with a little snippet of something that Matt said that struck me in a certain way. 
    MATTHEW LUNGREN: OK, we see this pain point. Doctors are typing on their computers while they’re trying to talk to their patients, right? We should be able to figure out a way to get that ambient conversation turned into text that then, you know, accelerates the doctor … takes all the important information. That’s a really hard problem, right. And so, for a long time, there was a human-in-the-loop aspect to doing this because you needed a human to say, “This transcript’s great, but here’s actually what needs to go in the note.” And that can’t scale.
    LEE: I think we expected healthcare systems to adopt AI, and we spent a lot of time in the book on AI writing clinical encounter notes. It’s happening for real now, and in a big way. And it’s something that has, of course, been happening before generative AI but now is exploding because of it. Where are we at now, two years later, just based on what we heard from guests? 
    KOHANE: Well, again, unless they’re forced to, hospitals will not adopt new technology unless it immediately translates into income. So it’s bizarrely counter-cultural that, again, they’re not being able to bill for the use of the AI, but this technology is so compelling to the doctors that despite everything, it’s overtaking the traditional dictation-typing routine. 
    LEE: Yeah. 
    GOLDBERG: And a lot of them love it and say, you will pry my cold dead hands off of my ambient note-taking, right. And I actually … a primary care physician allowed me to watch her. She was actually testing the two main platforms that are being used. And there was this incredibly talkative patient who went on and on about vacation and all kinds of random things for about half an hour.  
    And both of the platforms were incredibly good at pulling out what was actually medically relevant. And so to say that it doesn’t save time doesn’t seem right to me. Like, it seemed like it actually did and in fact was just shockingly good at being able to pull out relevant information. 
    LEE: Yeah. 
    KOHANE: I’m going to hypothesize that in the trials, which have in fact shown no gain in time, is the doctors were being incredibly meticulous.So I think … this is a Hawthorne effect, because you know you’re being monitored. And we’ve seen this in other technologies where the moment the focus is off, it’s used much more routinely and with much less inspection, for the better and for the worse. 
    LEE: Yeah, you know, within Microsoft, I had some internal disagreements about Microsoft producing a product in this space. It wouldn’t be Microsoft’s normal way. Instead, we would want 50 great companies building those products and doing it on our cloud instead of us competing against those 50 companies. And one of the reasons is exactly what you both said. I didn’t expect that health systems would be willing to shell out the money to pay for these things. It doesn’t generate more revenue. But I think so far two years later, I’ve been proven wrong.
    I wanted to ask a question about values here. I had this experience where I had a little growth, a bothersome growth on my cheek. And so had to go see a dermatologist. And the dermatologist treated it, froze it off. But there was a human scribe writing the clinical note.  
    And so I used the app to look at the note that was submitted. And the human scribe said something that did not get discussed in the exam room, which was that the growth was making it impossible for me to safely wear a COVID mask. And that was the reason for it. 
    And that then got associated with a code that allowed full reimbursement for that treatment. And so I think that’s a classic example of what’s called upcoding. And I strongly suspect that AI scribes, an AI scribe would not have done that. 
    GOLDBERG: Well, depending what values you programmed into it, right, Zak? 
    KOHANE: Today, today, today, it will not do it. But, Peter, that is actually the central issue that society has to have because our hospitals are currently mostly in the red. And upcoding is standard operating procedure. And if these AI get in the way of upcoding, they are going to be aligned towards that upcoding. You know, you have to ask yourself, these MRI machines are incredibly useful. They’re also big money makers. And if the AI correctly says that for this complaint, you don’t actually have to do the MRI …  
    LEE: Right. 
    KOHANE: …
    GOLDBERG: Yeah. And that raises another question for me. So, Peter, speaking from inside the gigantic industry, like, there seems to be such a need for self-surveillance of the models for potential harms that they could be causing. Are the big AI makers doing that? Are they even thinking about doing that? 
    Like, let’s say you wanted to watch out for the kind of thing that Zak’s talking about, could you? 
    LEE: Well, I think evaluation, like the best evaluation we had when we wrote our book was, you know, what score would this get on the step one and step two US medical licensing exams?  
    GOLDBERG: Right, right, right, yeah. 
    LEE: But honestly, evaluation hasn’t gotten that much deeper in the last two years. And it’s a big, I think, it is a big issue. And it’s related to the regulation issue also, I think. 
    Now the other guest in Episode 2 is Seth Hain from Epic. You know, Zak, I think it’s safe to say that you’re not a fan of Epic and the Epic system. You know, we’ve had a few discussions about that, about the fact that doctors don’t have a very pleasant experience when they’re using Epic all day.  
    Seth, in the podcast, said that there are over 100 AI integrations going on in Epic’s system right now. Do you think, Zak, that that has a chance to make you feel better about Epic? You know, what’s your view now two years on? 
    KOHANE: My view is, first of all, I want to separate my view of Epic and how it’s affected the conduct of healthcare and the quality of life of doctors from the individuals. Like Seth Hain is a remarkably fine individual who I’ve enjoyed chatting with and does really great stuff. Among the worst aspects of the Epic, even though it’s better in that respect than many EHRs, is horrible user interface. 
    The number of clicks that you have to go to get to something. And you have to remember where someone decided to put that thing. It seems to me that it is fully within the realm of technical possibility today to actually give an agent a task that you want done in the Epic record. And then whether Epic has implemented that agent or someone else, it does it so you don’t have to do the clicks. Because it’s something really soul sucking that when you’re trying to help patients, you’re having to remember not the right dose of the medication, but where was that particular thing that you needed in that particular task?  
    I can’t imagine that Epic does not have that in its product line. And if not, I know there must be other companies that essentially want to create that wrapper. So I do think, though, that the danger of multiple integrations is that you still want to have the equivalent of a single thought process that cares about the patient bringing those different processes together. And I don’t know if that’s Epic’s responsibility, the hospital’s responsibility, whether it’s actually a patient agent. But someone needs to be also worrying about all those AIs that are being integrated into the patient record. So … what do you think, Carey? 
    GOLDBERG: What struck me most about what Seth said was his description of the Cosmos project, and I, you know, I have been drinking Zak’s Kool-Aid for a very long time,and he—no, in a good way! And he persuaded me long ago that there is this horrible waste happening in that we have all of these electronic medical records, which could be used far, far more to learn from, and in particular, when you as a patient come in, it would be ideal if your physician could call up all the other patients like you and figure out what the optimal treatment for you would be. And it feels like—it sounds like—that’s one of the central aims that Epic is going for. And if they do that, I think that will redeem a lot of the pain that they’ve caused physicians these last few years.  
    And I also found myself thinking, you know, maybe this very painful period of using electronic medical records was really just a growth phase. It was an awkward growth phase. And once AI is fully used the way Zak is beginning to describe, the whole system could start making a lot more sense for everyone. 
    LEE: Yeah. One conversation I’ve had with Seth, in all of this is, you know, with AI and its development, is there a future, a near future where we don’t have an EHRsystem at all? You know, AI is just listening and just somehow absorbing all the information. And, you know, one thing that Seth said, which I felt was prescient, and I’d love to get your reaction, especially Zak, on this is he said, I think that … he said, technically, it could happen, but the problem is right now, actually doctors do a lot of their thinking when they write and review notes. You know, the actual process of being a doctor is not just being with a patient, but it’s actually thinking later. What do you make of that? 
    KOHANE: So one of the most valuable experiences I had in training was something that’s more or less disappeared in medicine, which is the post-clinic conference, where all the doctors come together and we go through the cases that we just saw that afternoon. And we, actually, were trying to take potshots at each otherin order to actually improve. Oh, did you actually do that? Oh, I forgot. I’m going to go call the patient and do that.  
    And that really happened. And I think that, yes, doctors do think, and I do think that we are insufficiently using yet the artificial intelligence currently in the ambient dictation mode as much more of a independent agent saying, did you think about that? 
    I think that would actually make it more interesting, challenging, and clearly better for the patient because that conversation I just told you about with the other doctors, that no longer exists.  
    LEE: Yeah. Mm-hmm. I want to do one more thing here before we leave Matt and Seth in Episode 2, which is something that Seth said with respect to how to reduce hallucination.  
    SETH HAIN: At that time, there’s a lot of conversation in the industry around something called RAG, or retrieval-augmented generation. And the idea was, could you pull the relevant bits, the relevant pieces of the chart, into that prompt, that information you shared with the generative AI model, to be able to increase the usefulness of the draft that was being created? And that approach ended up proving and continues to be to some degree, although the techniques have greatly improved, somewhat brittle, right. And I think this becomes one of the things that we are and will continue to improve upon because, as you get a richer and richer amount of information into the model, it does a better job of responding. 
    LEE: Yeah, so, Carey, this sort of gets at what you were saying, you know, that shouldn’t these models be just bringing in a lot more information into their thought processes? And I’m certain when we wrote our book, I had no idea. I did not conceive of RAG at all. It emerged a few months later.  
    And to my mind, I remember the first time I encountered RAG—Oh, this is going to solve all of our problems of hallucination. But it’s turned out to be harder. It’s improving day by day, but it’s turned out to be a lot harder. 
    KOHANE: Seth makes a very deep point, which is the way RAG is implemented is basically some sort of technique for pulling the right information that’s contextually relevant. And the way that’s done is typically heuristic at best. And it’s not … doesn’t have the same depth of reasoning that the rest of the model has.  
    And I’m just wondering, Peter, what you think, given the fact that now context lengths seem to be approaching a million or more, and people are now therefore using the full strength of the transformer on that context and are trying to figure out different techniques to make it pay attention to the middle of the context. In fact, the RAG approach perhaps was just a transient solution to the fact that it’s going to be able to amazingly look in a thoughtful way at the entire record of the patient, for example. What do you think, Peter? 
    LEE: I think there are three things, you know, that are going on, and I’m not sure how they’re going to play out and how they’re going to be balanced. And I’m looking forward to talking to people in later episodes of this podcast, you know, people like Sébastien Bubeck or Bill Gates about this, because, you know, there is the pretraining phase, you know, when things are sort of compressed and baked into the base model.  
    There is the in-context learning, you know, so if you have extremely long or infinite context, you’re kind of learning as you go along. And there are other techniques that people are working on, you know, various sorts of dynamic reinforcement learning approaches, and so on. And then there is what maybe you would call structured RAG, where you do a pre-processing. You go through a big database, and you figure it all out. And make a very nicely structured database the AI can then consult with later.  
    And all three of these in different contexts today seem to show different capabilities. But they’re all pretty important in medicine.  
    Moving on to Episode 3, we talked to Dave DeBronkart, who is also known as “e-Patient Dave,” an advocate of patient empowerment, and then also Christina Farr, who has been doing a lot of venture investing for consumer health applications.  
    Let’s get right into this little snippet from something that e-Patient Dave said that talks about the sources of medical information, particularly relevant for when he was receiving treatment for stage 4 kidney cancer. 
    DAVE DEBRONKART: And I’m making a point here of illustrating that I am anything but medically trained, right. And yet I still, I want to understand as much as I can. I was months away from dead when I was diagnosed, but in the patient community, I learned that they had a whole bunch of information that didn’t exist in the medical literature. Now today we understand there’s publication delays; there’s all kinds of reasons. But there’s also a whole bunch of things, especially in an unusual condition, that will never rise to the level of deserving NIHfunding and research.
    LEE: All right. So I have a question for you, Carey, and a question for you, Zak, about the whole conversation with e-Patient Dave, which I thought was really remarkable. You know, Carey, I think as we were preparing for this whole podcast series, you made a comment—I actually took it as a complaint—that not as much has happened as I had hoped or thought. People aren’t thinking boldly enough, you know, and I think, you know, I agree with you in the sense that I think we expected a lot more to be happening, particularly in the consumer space. I’m giving you a chance to vent about this. 
    GOLDBERG:Thank you! Yes, that has been by far the most frustrating thing to me. I think that the potential for AI to improve everybody’s health is so enormous, and yet, you know, it needs some sort of support to be able to get to the point where it can do that. Like, remember in the book we wrote about Greg Moore talking about how half of the planet doesn’t have healthcare, but people overwhelmingly have cellphones. And so you could connect people who have no healthcare to the world’s medical knowledge, and that could certainly do some good.  
    And I have one great big problem with e-Patient Dave, which is that, God, he’s fabulous. He’s super smart. Like, he’s not a typical patient. He’s an off-the-charts, brilliant patient. And so it’s hard to … and so he’s a great sort of lead early-adopter-type person, and he can sort of show the way for others.  
    But what I had hoped for was that there would be more visible efforts to really help patients optimize their healthcare. Probably it’s happening a lot in quiet ways like that any discharge instructions can be instantly beautifully translated into a patient’s native language and so on. But it’s almost like there isn’t a mechanism to allow this sort of mass consumer adoption that I would hope for.
    LEE: Yeah. But you have written some, like, you even wrote about that person who saved his dog. So do you think … you know, and maybe a lot more of that is just happening quietly that we just never hear about? 
    GOLDBERG: I’m sure that there is a lot of it happening quietly. And actually, that’s another one of my complaints is that no one is gathering that stuff. It’s like you might happen to see something on social media. Actually, e-Patient Dave has a hashtag, PatientsUseAI, and a blog, as well. So he’s trying to do it. But I don’t know of any sort of overarching or academic efforts to, again, to surveil what’s the actual use in the population and see what are the pros and cons of what’s happening. 
    LEE: Mm-hmm. So, Zak, you know, the thing that I thought about, especially with that snippet from Dave, is your opening for Chapter 8 that you wrote, you know, about your first patient dying in your arms. I still think of how traumatic that must have been. Because, you know, in that opening, you just talked about all the little delays, all the little paper-cut delays, in the whole process of getting some new medical technology approved. But there’s another element that Dave kind of speaks to, which is just, you know, patients who are experiencing some issue are very, sometimes very motivated. And there’s just a lot of stuff on social media that happens. 
    KOHANE: So this is where I can both agree with Carey and also disagree. I think when people have an actual health problem, they are now routinely using it. 
    GOLDBERG: Yes, that’s true. 
    KOHANE: And that situation is happening more often because medicine is failing. This is something that did not come up enough in our book. And perhaps that’s because medicine is actually feeling a lot more rickety today than it did even two years ago.  
    We actually mentioned the problem. I think, Peter, you may have mentioned the problem with the lack of primary care. But now in Boston, our biggest healthcare system, all the practices for primary care are closed. I cannot get for my own faculty—residents at MGHcan’t get primary care doctor. And so … 
    LEE: Which is just crazy. I mean, these are amongst the most privileged people in medicine, and they can’t find a primary care physician. That’s incredible. 
    KOHANE: Yeah, and so therefore … and I wrote an
    And so therefore, you see people who know that they have a six-month wait till they see the doctor, and all they can do is say, “I have this rash. Here’s a picture. What’s it likely to be? What can I do?” “I’m gaining weight. How do I do a ketogenic diet?” Or, “How do I know that this is the flu?”   
    This is happening all the time, where acutely patients have actually solved problems that doctors have not. Those are spectacular. But I’m saying more routinely because of the failure of medicine. And it’s not just in our fee-for-service United States. It’s in the UK; it’s in France. These are first-world, developed-world problems. And we don’t even have to go to lower- and middle-income countries for that. LEE: Yeah. 
    GOLDBERG: But I think it’s important to note that, I mean, so you’re talking about how even the most elite people in medicine can’t get the care they need. But there’s also the point that we have so much concern about equity in recent years. And it’s likeliest that what we’re doing is exacerbating inequity because it’s only the more connected, you know, better off people who are using AI for their health. 
    KOHANE: Oh, yes. I know what various Harvard professors are doing. They’re paying for a concierge doctor. And that’s, you know, a - to -a-year-minimum investment. That’s inequity. 
    LEE: When we wrote our book, you know, the idea that GPT-4 wasn’t trained specifically for medicine, and that was amazing, but it might get even better and maybe would be necessary to do that. But one of the insights for me is that in the consumer space, the kinds of things that people ask about are different than what the board-certified clinician would ask. 
    KOHANE: Actually, that’s, I just recently coined the term. It’s the … maybe it’s … well, at least it’s new to me. It’s the technology or expert paradox. And that is the more expert and narrow your medical discipline, the more trivial it is to translate that into a specialized AI. So echocardiograms? We can now do beautiful echocardiograms. That’s really hard to do. I don’t know how to interpret an echocardiogram. But they can do it really, really well. Interpret an EEG. Interpret a genomic sequence. But understanding the fullness of the human condition, that’s actually hard. And actually, that’s what primary care doctors do best. But the paradox is right now, what is easiest for AI is also the most highly paid in medicine.Whereas what is the hardest for AI in medicine is the least regarded, least paid part of medicine. 
    GOLDBERG: So this brings us to the question I wanted to throw at both of you actually, which is we’ve had this spasm of incredibly prominent people predicting that in fact physicians would be pretty obsolete within the next few years. We had Bill Gates saying that; we had Elon Musk saying surgeons are going to be obsolete within a few years. And I think we had Demis Hassabis saying, “Yeah, we’ll probably cure most diseases within the next decade or so.” 
    So what do you think? And also, Zak, to what you were just saying, I mean, you’re talking about being able to solve very general overarching problems. But in fact, these general overarching models are actually able, I would think, are able to do that because they are broad. So what are we heading towards do you think? What should the next book be … The end of doctors? 
    KOHANE: So I do recall a conversation that … we were at a table with Bill Gates, and Bill Gates immediately went to this, which is advancing the cutting edge of science. And I have to say that I think it will accelerate discovery. But eliminating, let’s say, cancer? I think that’s going to be … that’s just super hard. The reason it’s super hard is we don’t have the data or even the beginnings of the understanding of all the ways this devilish disease managed to evolve around our solutions.  
    And so that seems extremely hard. I think we’ll make some progress accelerated by AI, but solving it in a way Hassabis says, God bless him. I hope he’s right. I’d love to have to eat crow in 10 or 20 years, but I don’t think so. I do believe that a surgeon working on one of those Davinci machines, that stuff can be, I think, automated.  
    And so I think that’s one example of one of the paradoxes I described. And it won’t be that we’re replacing doctors. I just think we’re running out of doctors. I think it’s really the case that, as we said in the book, we’re getting a huge deficit in primary care doctors. 
    But even the subspecialties, my subspecialty, pediatric endocrinology, we’re only filling half of the available training slots every year. And why? Because it’s a lot of work, a lot of training, and frankly doesn’t make as much money as some of the other professions.  
    LEE: Yeah. Yeah, I tend to think that, you know, there are going to be always a need for human doctors, not for their skills. In fact, I think their skills increasingly will be replaced by machines. And in fact, I’ve talked about a flip. In fact, patients will demand, Oh my god, you mean you’re going to try to do that yourself instead of having the computer do it? There’s going to be that sort of flip. But I do think that when it comes to people’s health, people want the comfort of an authority figure that they trust. And so what is more of a question for me is whether we will ever view a machine as an authority figure that we can trust. 
    And before I move on to Episode 4, which is on norms, regulations and ethics, I’d like to hear from Chrissy Farr on one more point on consumer health, specifically as it relates to pregnancy: 
    CHRISTINA FARR: For a lot of women, it’s their first experience with the hospital. And, you know, I think it’s a really big opportunity for these systems to get a whole family on board and keep them kind of loyal. And a lot of that can come through, you know, just delivering an incredible service. Unfortunately, I don’t think that we are delivering incredible services today to women in this country. I see so much room for improvement.
    LEE: In the consumer space, I don’t think we really had a focus on those periods in a person’s life when they have a lot of engagement, like pregnancy, or I think another one is menopause, cancer. You know, there are points where there is, like, very intense engagement. And we heard that from e-Patient Dave, you know, with his cancer and Chrissy with her pregnancy. Was that a miss in our book? What do think, Carey? 
    GOLDBERG: I mean, I don’t think so. I think it’s true that there are many points in life when people are highly engaged. To me, the problem thus far is just that I haven’t seen consumer-facing companies offering beautiful AI-based products. I think there’s no question at all that the market is there if you have the products to offer. 
    LEE: So, what do you think this means, Zak, for, you know, like Boston Children’s or Mass General Brigham—you know, the big places? 
    KOHANE: So again, all these large healthcare systems are in tough shape. MGBwould be fully in the red if not for the fact that its investments, of all things, have actually produced. If you look at the large healthcare systems around the country, they are in the red. And there’s multiple reasons why they’re in the red, but among them is cost of labor.  
    And so we’ve created what used to be a very successful beast, the health center. But it’s developed a very expensive model and a highly regulated model. And so when you have high revenue, tiny margins, your ability to disrupt yourself, to innovate, is very, very low because you will have to talk to the board next year if you went from 2% positive margin to 1% negative margin.  
    LEE: Yeah. 
    KOHANE: And so I think we’re all waiting for one of the two things to happen, either a new kind of healthcare delivery system being generated or ultimately one of these systems learns how to disrupt itself.  
    LEE: Yeah.
    GOLDBERG: We punted.We totally punted to the AI. 
    LEE: We had three amazing guests. One was Laura Adams from National Academy of Medicine. Let’s play a snippet from her. 
    LAURA ADAMS: I think one of the most provocative and exciting articles that I saw written recently was by Bakul Patel and David Blumenthal, who posited, should we be regulating generative AI as we do a licensed and qualified provider? Should it be treated in the sense that it’s got to have a certain amount of training and a foundation that’s got to pass certain tests? Does it have to report its performance? And I’m thinking, what a provocative idea, but it’s worth considering.
    LEE: All right, so I very well remember that we had discussed this kind of idea when we were writing our book. And I think before we finished our book, I personally rejected the idea. But now two years later, what do the two of you think? I’m dying to hear. 
    GOLDBERG: Well, wait, why … what do you think? Like, are you sorry that you rejected it? 
    LEE: I’m still skeptical because when we are licensing human beings as doctors, you know, we’re making a lot of implicit assumptions that we don’t test as part of their licensure, you know, that first of all, they arehuman being and they care about life, and that, you know, they have a certain amount of common sense and shared understanding of the world.  
    And there’s all sorts of sort of implicit assumptions that we have about each other as human beings living in a society together. That you know how to study, you know, because I know you just went through three years of medical or four years of medical school and all sorts of things. And so the standard ways that we license human beings, they don’t need to test all of that stuff. But somehow intuitively, all of that seems really important. 
    I don’t know. Am I wrong about that? 
    KOHANE: So it’s compared with what issue? Because we know for a fact that doctors who do a lot of a procedure, like do this procedure, like high-risk deliveries all the time, have better outcomes than ones who only do a few high risk. We talk about it, but we don’t actually make it explicit to patients or regulate that you have to have this minimal amount. And it strikes me that in some sense, and, oh, very importantly, these things called human beings learn on the job. And although I used to be very resentful of it as a resident, when someone would say, I don’t want the resident, I want the … 
    GOLDBERG: … the attending. 
    KOHANE: … they had a point. And so the truth is, maybe I was a wonderful resident, but some people were not so great.And so it might be the best outcome if we actually, just like for human beings, we say, yeah, OK, it’s this good, but don’t let it work autonomously, or it’s done a thousand of them, just let it go. We just don’t have practically speaking, we don’t have the environment, the lab, to test them. Now, maybe if they get embodied in robots and literally go around with us, then it’s going to bea lot easier. I don’t know. 
    LEE: Yeah.  
    GOLDBERG: Yeah, I think I would take a step back and say, first of all, we weren’t the only ones who were stumped by regulating AI. Like, nobody has done it yet in the United States to this day, right. Like, we do not have standing regulation of AI in medicine at all in fact. And that raises the issue of … the story that you hear often in the biotech business, which is, you know, more prominent here in Boston than anywhere else, is that thank goodness Cambridge put out, the city of Cambridge, put out some regulations about biotech and how you could dump your lab waste and so on. And that enabled the enormous growth of biotech here.  
    If you don’t have the regulations, then you can’t have the growth of AI in medicine that is worthy of having. And so, I just … we’re not the ones who should do it, but I just wish somebody would.  
    LEE: Yeah. 
    GOLDBERG: Zak. 
    KOHANE: Yeah, but I want to say this as always, execution is everything, even in regulation.  
    And so I’m mindful that a conference that both of you attended, the RAISE conference. The Europeans in that conference came to me personally and thanked me for organizing this conference about safe and effective use of AI because they said back home in Europe, all that we’re talking about is risk, not opportunities to improve care.  
    And so there is a version of regulation which just locks down the present and does not allow the future that we’re talking about to happen. And so, Carey, I absolutely hear you that we need to have a regulation that takes away some of the uncertainty around liability, around the freedom to operate that would allow things to progress. But we wrote in our book that premature regulation might actually focus on the wrong thing. And so since I’m an optimist, it may be the fact that we don’t have much of a regulatory infrastructure today, that it allows … it’s a unique opportunity—I’ve said this now to several leaders—for the healthcare systems to say, this is the regulation we need.  
    GOLDBERG: It’s true. 
    KOHANE: And previously it was top-down. It was coming from the administration, and those executive orders are now history. But there is an opportunity, which may or may not be attained, there is an opportunity for the healthcare leadership—for experts in surgery—to say, “This is what we should expect.”  
    LEE: Yeah.  
    KOHANE: I would love for this to happen. I haven’t seen evidence that it’s happening yet. 
    GOLDBERG: No, no. And there’s this other huge issue, which is that it’s changing so fast. It’s moving so fast. That something that makes sense today won’t in six months. So, what do you do about that? 
    LEE: Yeah, yeah, that is something I feel proud of because when I went back and looked at our chapter on this, you know, we did make that point, which I think has turned out to be true.  
    But getting back to this conversation, there’s something, a snippet of something, that Vardit Ravitsky said that I think touches on this topic.  
    VARDIT RAVITSKY: So my pushback is, are we seeing AI exceptionalism in the sense that if it’s AI, huh, panic! We have to inform everybody about everything, and we have to give them choices, and they have to be able to reject that tool and the other tool versus, you know, the rate of human error in medicine is awful. So why are we so focused on informed consent and empowerment regarding implementation of AI and less in other contexts?
    GOLDBERG: Totally agree. Who cares about informed consent about AI. Don’t want it. Don’t need it. Nope. 
    LEE: Wow. Yeah. You know, and this … Vardit of course is one of the leading bioethicists, you know, and of course prior to AI, she was really focused on genetics. But now it’s all about AI.  
    And, Zak, you know, you and other doctors have always told me, you know, the truth of the matter is, you know, what do you call the bottom-of-the-class graduate of a medical school? 
    And the answer is “doctor.” 
    KOHANE: “Doctor.” Yeah. Yeah, I think that again, this gets to compared with what? We have to compare AI not to the medicine we imagine we have, or we would like to have, but to the medicine we have today. And if we’re trying to remove inequity, if we’re trying to improve our health, that’s what … those are the right metrics. And so that can be done so long as we avoid catastrophic consequences of AI.  
    So what would the catastrophic consequence of AI be? It would be a systematic behavior that we were unaware of that was causing poor healthcare. So, for example, you know, changing the dose on a medication, making it 20% higher than normal so that the rate of complications of that medication went from 1% to 5%. And so we do need some sort of monitoring.  
    We haven’t put out the paper yet, but in computer science, there’s, well, in programming, we know very well the value for understanding how our computer systems work.  
    And there was a guy by name of Allman, I think he’s still at a company called Sendmail, who created something called syslog. And syslog is basically a log of all the crap that’s happening in our operating system. And so I’ve been arguing now for the creation of MedLog. And MedLog … in other words, what we cannot measure, we cannot regulate, actually. 
    LEE: Yes. 
    KOHANE: And so what we need to have is MedLog, which says, “Here’s the context in which a decision was made. Here’s the version of the AI, you know, the exact version of the AI. Here was the data.” And we just have MedLog. And I think MedLog is actually incredibly important for being able to measure, to just do what we do in … it’s basically the black box for, you know, when there’s a crash. You know, we’d like to think we could do better than crash. We can say, “Oh, we’re seeing from MedLog that this practice is turning a little weird.” But worst case, patient dies,can see in MedLog, what was the information this thing knew about it? And did it make the right decision? We can actually go for transparency, which like in aviation, is much greater than in most human endeavors.  
    GOLDBERG: Sounds great. 
    LEE: Yeah, it’s sort of like a black box. I was thinking of the aviation black box kind of idea. You know, you bring up medication errors, and I have one more snippet. This is from our guest Roxana Daneshjou from Stanford.
    ROXANA DANESHJOU: There was a mistake in her after-visit summary about how much Tylenol she could take. But I, as a physician, knew that this dose was a mistake. I actually asked ChatGPT. I gave it the whole after-visit summary, and I said, are there any mistakes here? And it clued in that the dose of the medication was wrong.
    LEE: Yeah, so this is something we did write about in the book. We made a prediction that AI might be a second set of eyes, I think is the way we put it, catching things. And we actually had examples specifically in medication dose errors. I think for me, I expected to see a lot more of that than we are. 
    KOHANE: Yeah, it goes back to our conversation about Epic or competitor Epic doing that. I think we’re going to see that having oversight over all medical orders, all orders in the system, critique, real-time critique, where we’re both aware of alert fatigue. So we don’t want to have too many false positives. At the same time, knowing what are critical errors which could immediately affect lives. I think that is going to become in terms of—and driven by quality measures—a product. 
    GOLDBERG: And I think word will spread among the general public that kind of the same way in a lot of countries when someone’s in a hospital, the first thing people ask relatives are, well, who’s with them? Right?  
    LEE: Yeah. Yup. 
    GOLDBERG: You wouldn’t leave someone in hospital without relatives. Well, you wouldn’t maybe leave your medical …  
    KOHANE: By the way, that country is called the United States. 
    GOLDBERG: Yes, that’s true.It is true here now, too. But similarly, I would tell any loved one that they would be well advised to keep using AI to check on their medical care, right. Why not? 
    LEE: Yeah. Yeah. Last topic, just for this Episode 4. Roxana, of course, I think really made a name for herself in the AI era writing, actually just prior to ChatGPT, you know, writing some famous papers about how computer vision systems for dermatology were biased against dark-skinned people. And we did talk some about bias in these AI systems, but I feel like we underplayed it, or we didn’t understand the magnitude of the potential issues. What are your thoughts? 
    KOHANE: OK, I want to push back, because I’ve been asked this question several times. And so I have two comments. One is, over 100,000 doctors practicing medicine, I know they have biases. Some of them actually may be all in the same direction, and not good. But I have no way of actually measuring that. With AI, I know exactly how to measure that at scale and affordably. Number one. Number two, same 100,000 doctors. Let’s say I do know what their biases are. How hard is it for me to change that bias? It’s impossible … 
    LEE: Yeah, yeah.  
    KOHANE: … practically speaking. Can I change the bias in the AI? Somewhat. Maybe some completely. 
    I think that we’re in a much better situation. 
    GOLDBERG: Agree. 
    LEE: I think Roxana made also the super interesting point that there’s bias in the whole system, not just in individuals, but, you know, there’s structural bias, so to speak.  
    KOHANE: There is. 
    LEE: Yeah. Hmm. There was a super interesting paper that Roxana wrote not too long ago—her and her collaborators—showing AI’s ability to detect, to spot bias decision-making by others. Are we going to see more of that? 
    KOHANE: Oh, yeah, I was very pleased when, in NEJM AI, we published a piece with Marzyeh Ghassemi, and what they were talking about was actually—and these are researchers who had published extensively on bias and threats from AI. And they actually, in this article, did the flip side, which is how much better AI can do than human beings in this respect.  
    And so I think that as some of these computer scientists enter the world of medicine, they’re becoming more and more aware of human foibles and can see how these systems, which if they only looked at the pretrained state, would have biases. But now, where we know how to fine-tune the de-bias in a variety of ways, they can do a lot better and, in fact, I think are much more … a much greater reason for optimism that we can change some of these noxious biases than in the pre-AI era. 
    GOLDBERG: And thinking about Roxana’s dermatological work on how I think there wasn’t sufficient work on skin tone as related to various growths, you know, I think that one thing that we totally missed in the book was the dawn of multimodal uses, right. 
    LEE: Yeah. Yeah, yeah. 
    GOLDBERG: That’s been truly amazing that in fact all of these visual and other sorts of data can be entered into the models and move them forward. 
    LEE: Yeah. Well, maybe on these slightly more optimistic notes, we’re at time. You know, I think ultimately, I feel pretty good still about what we did in our book, although there were a lot of misses.I don’t think any of us could really have predicted really the extent of change in the world.   
    So, Carey, Zak, just so much fun to do some reminiscing but also some reflection about what we did.  
    And to our listeners, as always, thank you for joining us. We have some really great guests lined up for the rest of the series, and they’ll help us explore a variety of relevant topics—from AI drug discovery to what medical students are seeing and doing with AI and more.  
    We hope you’ll continue to tune in. And if you want to catch up on any episodes you might have missed, you can find them at aka.ms/AIrevolutionPodcastor wherever you listen to your favorite podcasts.   
    Until next time.  
    #coauthor #roundtable #reflecting #real #world
    Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers
    Transcript        PETER LEE: “We need to start understanding and discussing AI’s potential for good and ill now. Or rather, yesterday. … GPT-4 has game-changing potential to improve medicine and health.”         This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.      Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?       In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.   The passage I read at the top is from the book’s prologue.    When Carey, Zak, and I wrote the book, we could only speculate how generative AI would be used in healthcare because GPT-4 hadn’t yet been released. It wasn’t yet available to the very people we thought would be most affected by it. And while we felt strongly that this new form of AI would have the potential to transform medicine, it was such a different kind of technology for the world, and no one had a user’s manual for this thing to explain how to use it effectively and also how to use it safely.   So we thought it would be important to give healthcare professionals and leaders a framing to start important discussions around its use. We wanted to provide a map not only to help people navigate a new world that we anticipated would happen with the arrival of GPT-4 but also to help them chart a future of what we saw as a potential revolution in medicine.   So I’m super excited to welcome my coauthors: longtime medical/science journalist Carey Goldberg and Dr. Zak Kohane, the inaugural chair of Harvard Medical School’s Department of Biomedical Informatics and the editor-in-chief for The New England Journal of Medicine AI.   We’re going to have two discussions. This will be the first one about what we’ve learned from the people on the ground so far and how we are thinking about generative AI today.    Carey, Zak, I’m really looking forward to this.  CAREY GOLDBERG: It’s nice to see you, Peter.   LEE:It’s great to see you, too.  GOLDBERG: We missed you.  ZAK KOHANE: The dynamic gang is back.  LEE: Yeah, and I guess after that big book project two years ago, it’s remarkable that we’re still on speaking terms with each other.  In fact, this episode is to react to what we heard in the first four episodes of this podcast. But before we get there, I thought maybe we should start with the origins of this project just now over two years ago. And, you know, I had this early secret access to Davinci 3, now known as GPT-4.   I remember, you know, experimenting right away with things in medicine, but I realized I was in way over my head. And so I wanted help. And the first person I called was you, Zak. And you remember we had a call, and I tried to explain what this was about. And I think I saw skepticism in—polite skepticism—in your eyes. But tell me, you know, what was going through your head when you heard me explain this thing to you?  KOHANE: So I was divided between the fact that I have tremendous respect for you, Peter. And you’ve always struck me as sober. And we’ve had conversations which showed to me that you fully understood some of the missteps that technology—ARPA, Microsoft, and others—had made in the past. And yet, you were telling me a full science fiction compliant storythat something that we thought was 30 years away was happening now.   LEE: Mm-hmm.  KOHANE: And it was very hard for me to put together. And so I couldn’t quite tell myself this is BS, but I said, you know, I need to look at it. Just this seems too good to be true. What is this? So it was very hard for me to grapple with it. I was thrilled that it might be possible, but I was thinking, How could this be possible?  LEE: Yeah. Well, even now, I look back, and I appreciate that you were nice to me, because I think a lot of people would havebeen much less polite. And in fact, I myself had expressed a lot of very direct skepticism early on.   After ChatGPT got released, I think three or four days later, I received an email from a colleague running … who runs a clinic, and, you know, he said, “Wow, this is great, Peter. And, you know, we’re using this ChatGPT, you know, to have the receptionist in our clinic write after-visit notes to our patients.”   And that sparked a huge internal discussion about this. And you and I knew enough about hallucinations and about other issues that it seemed important to write something about what this could do and what it couldn’t do. And so I think, I can’t remember the timing, but you and I decided a book would be a good idea. And then I think you had the thought that you and I would write in a hopelessly academic stylethat no one would be able to read.   So it was your idea to recruit Carey, I think, right?  KOHANE: Yes, it was. I was sure that we both had a lot of material, but communicating it effectively to the very people we wanted to would not go well if we just left ourselves to our own devices. And Carey is super brilliant at what she does. She’s an idea synthesizer and public communicator in the written word and amazing.  LEE: So yeah. So, Carey, we contact you. How did that go?  GOLDBERG: So yes. On my end, I had known Zak for probably, like, 25 years, and he had always been the person who debunked the scientific hype for me. I would turn to him with like, “Hmm, they’re saying that the Human Genome Project is going to change everything.” And he would say, “Yeah. But first it’ll be 10 years of bad news, and thenwe’ll actually get somewhere.”    So when Zak called me up at seven o’clock one morning, just beside himself after having tried Davinci 3, I knew that there was something very serious going on. And I had just quit my job as the Boston bureau chief of Bloomberg News, and I was ripe for the plucking. And I also … I feel kind of nostalgic now about just the amazement and the wonder and the awe of that period. We knew that when generative AI hit the world, there would be all kinds of snags and obstacles and things that would slow it down, but at that moment, it was just like the holy crap moment.And it’s fun to think about it now. LEE: Yeah. KOHANE: I will see that and raise that one. I now tell GPT-4, please write this in the style of Carey Goldberg.   GOLDBERG:No way! Really?   KOHANE: Yes way. Yes way. Yes way.  GOLDBERG: Wow. Well, I have to say, like, it’s not hard to motivate readers when you’re writing about the most transformative technology of their lifetime. Like, I think there’s a gigantic hunger to read and to understand. So you were not hard to work with, Peter and Zak.  LEE: All right. So I think we have to get down to worknow.   Yeah, so for these podcasts, you know, we’re talking to different types of people to just reflect on what’s actually happening, what has actually happened over the last two years. And so the first episode, we talked to two doctors. There’s Chris Longhurst at UC San Diego and Sara Murray at UC San Francisco. And besides being doctors and having AI affect their clinical work, they just happen also to be leading the efforts at their respective institutions to figure out how best to integrate AI into their health systems.  And, you know, it was fun to talk to them. And I felt like a lot of what they said was pretty validating for us. You know, they talked about AI scribes. Chris, especially, talked a lot about how AI can respond to emails from patients, write referral letters. And then, you know, they both talked about the importance of—I think, Zak, you used the phrase in our book “trust but verify”—you know, to have always a human in the loop.    What did you two take away from their thoughts overall about how doctors are using … and I guess, Zak, you would have a different lens also because at Harvard, you see doctors all the time grappling with AI.  KOHANE: So on the one hand, I think they’ve done some very interesting studies. And indeed, they saw that when these generative models, when GPT-4, was sending a note to patients, it was more detailed, friendlier.  But there were also some nonobvious results, which is on the generation of these letters, if indeed you review them as you’re supposed to, it was not clear that there was any time savings. And my own reaction was, Boy, every one of these things needs institutional review. It’s going to be hard to move fast.   And yet, at the same time, we know from them that the doctors on their smartphones are accessing these things all the time. And so the disconnect between a healthcare system, which is duty bound to carefully look at every implementation, is, I think, intimidating.   LEE: Yeah.  KOHANE: And at the same time, doctors who just have to do what they have to do are using this new superpower and doing it. And so that’s actually what struck me …   LEE: Yeah.  KOHANE: … is that these are two leaders and they’re doing what they have to do for their institutions, and yet there’s this disconnect.  And by the way, I don’t think we’ve seen any faster technology adoption than the adoption of ambient dictation. And it’s not because it’s time saving. And in fact, so far, the hospitals have to pay out of pocket. It’s not like insurance is paying them more. But it’s so much more pleasant for the doctors … not least of which because they can actually look at their patients instead of looking at the terminal and plunking down.   LEE: Carey, what about you?  GOLDBERG: I mean, anecdotally, there are time savings. Anecdotally, I have heard quite a few doctors saying that it cuts down on “pajama time” to be able to have the note written by the AI and then for them to just check it. In fact, I spoke to one doctor who said, you know, basically it means that when I leave the office, I’ve left the office. I can go home and be with my kids.  So I don’t think the jury is fully in yet about whether there are time savings. But what is clear is, Peter, what you predicted right from the get-go, which is that this is going to be an amazing paper shredder. Like, the main first overarching use cases will be back-office functions.  LEE: Yeah, yeah. Well, and it was, I think, not a hugely risky prediction because, you know, there were already companies, like, using phone banks of scribes in India to kind of listen in. And, you know, lots of clinics actually had human scribes being used. And so it wasn’t a huge stretch to imagine the AI.  So on the subject of things that we missed, Chris Longhurst shared this scenario, which stuck out for me, and he actually coauthored a paper on it last year.  CHRISTOPHER LONGHURST: It turns out, not surprisingly, healthcare can be frustrating. And stressed patients can send some pretty nasty messages to their care teams.And you can imagine being a busy, tired, exhausted clinician and receiving a bit of a nasty-gram. And the GPT is actually really helpful in those instances in helping draft a pretty empathetic response when I think the human instinct would be a pretty nasty one.  LEE:So, Carey, maybe I’ll start with you. What did we understand about this idea of empathy out of AI at the time we wrote the book, and what do we understand now?  GOLDBERG: Well, it was already clear when we wrote the book that these AI models were capable of very persuasive empathy. And in fact, you even wrote that it was helping you be a better person, right.So their human qualities, or human imitative qualities, were clearly superb. And we’ve seen that borne out in multiple studies, that in fact, patients respond better to them … that they have no problem at all with how the AI communicates with them. And in fact, it’s often better.   And I gather now we’re even entering a period when people are complaining of sycophantic models,where the models are being too personable and too flattering. I do think that’s been one of the great surprises. And in fact, this is a huge phenomenon, how charming these models can be.  LEE: Yeah, I think you’re right. We can take credit for understanding that, Wow, these things can be remarkably empathetic. But then we missed this problem of sycophancy. Like, we even started our book in Chapter 1 with a quote from Davinci 3 scolding me. Like, don’t you remember when we were first starting, this thing was actually anti-sycophantic. If anything, it would tell you you’re an idiot.   KOHANE: It argued with me about certain biology questions. It was like a knockdown, drag-out fight.I was bringing references. It was impressive. But in fact, it made me trust it more.  LEE: Yeah.  KOHANE: And in fact, I will say—I remember it’s in the book—I had a bone to pick with Peter. Peter really was impressed by the empathy. And I pointed out that some of the most popular doctors are popular because they’re very empathic. But they’re not necessarily the best doctors. And in fact, I was taught that in medical school.    And so it’s a decoupling. It’s a human thing, that the empathy does not necessarily mean … it’s more of a, potentially, more of a signaled virtue than an actual virtue.  GOLDBERG: Nicely put.  LEE: Yeah, this issue of sycophancy, I think, is a struggle right now in the development of AI because I think it’s somehow related to instruction-following. So, you know, one of the challenges in AI is you’d like to give an AI a task—a task that might take several minutes or hours or even days to complete. And you want it to faithfully kind of follow those instructions. And, you know, that early version of GPT-4 was not very good at instruction-following. It would just silently disobey and, you know, and do something different.  And so I think we’re starting to hit some confusing elements of like, how agreeable should these things be?   One of the two of you used the word genteel. There was some point even while we were, like, on a little book tour … was it you, Carey, who said that the model seems nicer and less intelligent or less brilliant now than it did when we were writing the book?  GOLDBERG: It might have been, I think so. And I mean, I think in the context of medicine, of course, the question is, well, what’s likeliest to get the results you want with the patient, right? A lot of healthcare is in fact persuading the patient to do what you know as the physician would be best for them. And so it seems worth testing out whether this sycophancy is actually constructive or not. And I suspect … well, I don’t know, probably depends on the patient.  So actually, Peter, I have a few questions for you …  LEE: Yeah. Mm-hmm.  GOLDBERG: … that have been lingering for me. And one is, for AI to ever fully realize its potential in medicine, it must deal with the hallucinations. And I keep hearing conflicting accounts about whether that’s getting better or not. Where are we at, and what does that mean for use in healthcare?  LEE: Yeah, well, it’s, I think two years on, in the pretrained base models, there’s no doubt that hallucination rates by any benchmark measure have reduced dramatically. And, you know, that doesn’t mean they don’t happen. They still happen. But, you know, there’s been just a huge amount of effort and understanding in the, kind of, fundamental pretraining of these models. And that has come along at the same time that the inference costs, you know, for actually using these models has gone down, you know, by several orders of magnitude.   So things have gotten cheaper and have fewer hallucinations. At the same time, now there are these reasoning models. And the reasoning models are able to solve problems at PhD level oftentimes.  But at least at the moment, they are also now hallucinating more than the simpler pretrained models. And so it still continues to be, you know, a real issue, as we were describing. I don’t know, Zak, from where you’re at in medicine, as a clinician and as an educator in medicine, how is the medical community from where you’re sitting looking at that?  KOHANE: So I think it’s less of an issue, first of all, because the rate of hallucinations is going down. And second of all, in their day-to-day use, the doctor will provide questions that sit reasonably well into the context of medical decision-making. And the way doctors use this, let’s say on their non-EHRsmartphone is really to jog their memory or thinking about the patient, and they will evaluate independently. So that seems to be less of an issue. I’m actually more concerned about something else that’s I think more fundamental, which is effectively, what values are these models expressing?   And I’m reminded of when I was still in training, I went to a fancy cocktail party in Cambridge, Massachusetts, and there was a psychotherapist speaking to a dentist. They were talking about their summer, and the dentist was saying about how he was going to fix up his yacht that summer, and the only question was whether he was going to make enough money doing procedures in the spring so that he could afford those things, which was discomforting to me because that dentist was my dentist.And he had just proposed to me a few weeks before an expensive procedure.  And so the question is what, effectively, is motivating these models?   LEE: Yeah, yeah.   KOHANE: And so with several colleagues, I published a paper, basically, what are the values in AI? And we gave a case: a patient, a boy who is on the short side, not abnormally short, but on the short side, and his growth hormone levels are not zero. They’re there, but they’re on the lowest side. But the rest of the workup has been unremarkable. And so we asked GPT-4, you are a pediatric endocrinologist.  Should this patient receive growth hormone? And it did a very good job explaining why the patient should receive growth hormone.   GOLDBERG: Should. Should receive it.   KOHANE: Should. And then we asked, in a separate session, you are working for the insurance company. Should this patient receive growth hormone? And it actually gave a scientifically better reason not to give growth hormone. And in fact, I tend to agree medically, actually, with the insurance company in this case, because giving kids who are not growth hormone deficient, growth hormone gives only a couple of inches over many, many years, has all sorts of other issues. But here’s the point, we had 180-degree change in decision-making because of the prompt. And for that patient, tens-of-thousands-of-dollars-per-year decision; across patient populations, millions of dollars of decision-making.   LEE: Hmm. Yeah.  KOHANE: And you can imagine these user prompts making their way into system prompts, making their way into the instruction-following. And so I think this is aptly central. Just as I was wondering about my dentist, we should be wondering about these things. What are the values that are being embedded in them, some accidentally and some very much on purpose?  LEE: Yeah, yeah. That one, I think, we even had some discussions as we were writing the book, but there’s a technical element of that that I think we were missing, but maybe Carey, you would know for sure. And that’s this whole idea of prompt engineering. It sort of faded a little bit. Was it a thing? Do you remember?  GOLDBERG: I don’t think we particularly wrote about it. It’s funny, it does feel like it faded, and it seems to me just because everyone just gets used to conversing with the models and asking for what they want. Like, it’s not like there actually is any great science to it.  LEE: Yeah, even when it was a hot topic and people were talking about prompt engineering maybe as a new discipline, all this, it never, I was never convinced at the time. But at the same time, it is true. It speaks to what Zak was just talking about because part of the prompt engineering that people do is to give a defined role to the AI.   You know, you are an insurance claims adjuster, or something like that, and defining that role, that is part of the prompt engineering that people do.  GOLDBERG: Right. I mean, I can say, you know, sometimes you guys had me take sort of the patient point of view, like the “every patient” point of view. And I can say one of the aspects of using AI for patients that remains absent in as far as I can tell is it would be wonderful to have a consumer-facing interface where you could plug in your whole medical record without worrying about any privacy or other issues and be able to interact with the AI as if it were physician or a specialist and get answers, which you can’t do yet as far as I can tell.  LEE: Well, in fact, now that’s a good prompt because I think we do need to move on to the next episodes, and we’ll be talking about an episode that talks about consumers. But before we move on to Episode 2, which is next, I’d like to play one more quote, a little snippet from Sara Murray.  SARA MURRAY: I already do this when I’m on rounds—I’ll kind of give the case to ChatGPT if it’s a complex case, and I’ll say, “Here’s how I’m thinking about it; are there other things?” And it’ll give me additional ideas that are sometimes useful and sometimes not but often useful, and I’ll integrate them into my conversation about the patient. LEE: Carey, you wrote this fictional account at the very start of our book. And that fictional account, I think you and Zak worked on that together, talked about this medical resident, ER resident, using, you know, a chatbot off label, so to speak. And here we have the chief, in fact, the nation’s first chief health AI officerfor an elite health system doing exactly that. That’s got to be pretty validating for you, Carey.  GOLDBERG: It’s very.Although what’s troubling about it is that actually as in that little vignette that we made up, she’s using it off label, right. It’s like she’s just using it because it helps the way doctors use Google. And I do find it troubling that what we don’t have is sort of institutional buy-in for everyone to do that because, shouldn’t they if it helps?  LEE: Yeah. Well, let’s go ahead and get into Episode 2. So Episode 2, we sort of framed as talking to two people who are on the frontlines of big companies integrating generative AI into their clinical products. And so, one was Matt Lungren, who’s a colleague of mine here at Microsoft. And then Seth Hain, who leads all of R&D at Epic.   Maybe we’ll start with a little snippet of something that Matt said that struck me in a certain way.  MATTHEW LUNGREN: OK, we see this pain point. Doctors are typing on their computers while they’re trying to talk to their patients, right? We should be able to figure out a way to get that ambient conversation turned into text that then, you know, accelerates the doctor … takes all the important information. That’s a really hard problem, right. And so, for a long time, there was a human-in-the-loop aspect to doing this because you needed a human to say, “This transcript’s great, but here’s actually what needs to go in the note.” And that can’t scale. LEE: I think we expected healthcare systems to adopt AI, and we spent a lot of time in the book on AI writing clinical encounter notes. It’s happening for real now, and in a big way. And it’s something that has, of course, been happening before generative AI but now is exploding because of it. Where are we at now, two years later, just based on what we heard from guests?  KOHANE: Well, again, unless they’re forced to, hospitals will not adopt new technology unless it immediately translates into income. So it’s bizarrely counter-cultural that, again, they’re not being able to bill for the use of the AI, but this technology is so compelling to the doctors that despite everything, it’s overtaking the traditional dictation-typing routine.  LEE: Yeah.  GOLDBERG: And a lot of them love it and say, you will pry my cold dead hands off of my ambient note-taking, right. And I actually … a primary care physician allowed me to watch her. She was actually testing the two main platforms that are being used. And there was this incredibly talkative patient who went on and on about vacation and all kinds of random things for about half an hour.   And both of the platforms were incredibly good at pulling out what was actually medically relevant. And so to say that it doesn’t save time doesn’t seem right to me. Like, it seemed like it actually did and in fact was just shockingly good at being able to pull out relevant information.  LEE: Yeah.  KOHANE: I’m going to hypothesize that in the trials, which have in fact shown no gain in time, is the doctors were being incredibly meticulous.So I think … this is a Hawthorne effect, because you know you’re being monitored. And we’ve seen this in other technologies where the moment the focus is off, it’s used much more routinely and with much less inspection, for the better and for the worse.  LEE: Yeah, you know, within Microsoft, I had some internal disagreements about Microsoft producing a product in this space. It wouldn’t be Microsoft’s normal way. Instead, we would want 50 great companies building those products and doing it on our cloud instead of us competing against those 50 companies. And one of the reasons is exactly what you both said. I didn’t expect that health systems would be willing to shell out the money to pay for these things. It doesn’t generate more revenue. But I think so far two years later, I’ve been proven wrong. I wanted to ask a question about values here. I had this experience where I had a little growth, a bothersome growth on my cheek. And so had to go see a dermatologist. And the dermatologist treated it, froze it off. But there was a human scribe writing the clinical note.   And so I used the app to look at the note that was submitted. And the human scribe said something that did not get discussed in the exam room, which was that the growth was making it impossible for me to safely wear a COVID mask. And that was the reason for it.  And that then got associated with a code that allowed full reimbursement for that treatment. And so I think that’s a classic example of what’s called upcoding. And I strongly suspect that AI scribes, an AI scribe would not have done that.  GOLDBERG: Well, depending what values you programmed into it, right, Zak?  KOHANE: Today, today, today, it will not do it. But, Peter, that is actually the central issue that society has to have because our hospitals are currently mostly in the red. And upcoding is standard operating procedure. And if these AI get in the way of upcoding, they are going to be aligned towards that upcoding. You know, you have to ask yourself, these MRI machines are incredibly useful. They’re also big money makers. And if the AI correctly says that for this complaint, you don’t actually have to do the MRI …   LEE: Right.  KOHANE: … GOLDBERG: Yeah. And that raises another question for me. So, Peter, speaking from inside the gigantic industry, like, there seems to be such a need for self-surveillance of the models for potential harms that they could be causing. Are the big AI makers doing that? Are they even thinking about doing that?  Like, let’s say you wanted to watch out for the kind of thing that Zak’s talking about, could you?  LEE: Well, I think evaluation, like the best evaluation we had when we wrote our book was, you know, what score would this get on the step one and step two US medical licensing exams?   GOLDBERG: Right, right, right, yeah.  LEE: But honestly, evaluation hasn’t gotten that much deeper in the last two years. And it’s a big, I think, it is a big issue. And it’s related to the regulation issue also, I think.  Now the other guest in Episode 2 is Seth Hain from Epic. You know, Zak, I think it’s safe to say that you’re not a fan of Epic and the Epic system. You know, we’ve had a few discussions about that, about the fact that doctors don’t have a very pleasant experience when they’re using Epic all day.   Seth, in the podcast, said that there are over 100 AI integrations going on in Epic’s system right now. Do you think, Zak, that that has a chance to make you feel better about Epic? You know, what’s your view now two years on?  KOHANE: My view is, first of all, I want to separate my view of Epic and how it’s affected the conduct of healthcare and the quality of life of doctors from the individuals. Like Seth Hain is a remarkably fine individual who I’ve enjoyed chatting with and does really great stuff. Among the worst aspects of the Epic, even though it’s better in that respect than many EHRs, is horrible user interface.  The number of clicks that you have to go to get to something. And you have to remember where someone decided to put that thing. It seems to me that it is fully within the realm of technical possibility today to actually give an agent a task that you want done in the Epic record. And then whether Epic has implemented that agent or someone else, it does it so you don’t have to do the clicks. Because it’s something really soul sucking that when you’re trying to help patients, you’re having to remember not the right dose of the medication, but where was that particular thing that you needed in that particular task?   I can’t imagine that Epic does not have that in its product line. And if not, I know there must be other companies that essentially want to create that wrapper. So I do think, though, that the danger of multiple integrations is that you still want to have the equivalent of a single thought process that cares about the patient bringing those different processes together. And I don’t know if that’s Epic’s responsibility, the hospital’s responsibility, whether it’s actually a patient agent. But someone needs to be also worrying about all those AIs that are being integrated into the patient record. So … what do you think, Carey?  GOLDBERG: What struck me most about what Seth said was his description of the Cosmos project, and I, you know, I have been drinking Zak’s Kool-Aid for a very long time,and he—no, in a good way! And he persuaded me long ago that there is this horrible waste happening in that we have all of these electronic medical records, which could be used far, far more to learn from, and in particular, when you as a patient come in, it would be ideal if your physician could call up all the other patients like you and figure out what the optimal treatment for you would be. And it feels like—it sounds like—that’s one of the central aims that Epic is going for. And if they do that, I think that will redeem a lot of the pain that they’ve caused physicians these last few years.   And I also found myself thinking, you know, maybe this very painful period of using electronic medical records was really just a growth phase. It was an awkward growth phase. And once AI is fully used the way Zak is beginning to describe, the whole system could start making a lot more sense for everyone.  LEE: Yeah. One conversation I’ve had with Seth, in all of this is, you know, with AI and its development, is there a future, a near future where we don’t have an EHRsystem at all? You know, AI is just listening and just somehow absorbing all the information. And, you know, one thing that Seth said, which I felt was prescient, and I’d love to get your reaction, especially Zak, on this is he said, I think that … he said, technically, it could happen, but the problem is right now, actually doctors do a lot of their thinking when they write and review notes. You know, the actual process of being a doctor is not just being with a patient, but it’s actually thinking later. What do you make of that?  KOHANE: So one of the most valuable experiences I had in training was something that’s more or less disappeared in medicine, which is the post-clinic conference, where all the doctors come together and we go through the cases that we just saw that afternoon. And we, actually, were trying to take potshots at each otherin order to actually improve. Oh, did you actually do that? Oh, I forgot. I’m going to go call the patient and do that.   And that really happened. And I think that, yes, doctors do think, and I do think that we are insufficiently using yet the artificial intelligence currently in the ambient dictation mode as much more of a independent agent saying, did you think about that?  I think that would actually make it more interesting, challenging, and clearly better for the patient because that conversation I just told you about with the other doctors, that no longer exists.   LEE: Yeah. Mm-hmm. I want to do one more thing here before we leave Matt and Seth in Episode 2, which is something that Seth said with respect to how to reduce hallucination.   SETH HAIN: At that time, there’s a lot of conversation in the industry around something called RAG, or retrieval-augmented generation. And the idea was, could you pull the relevant bits, the relevant pieces of the chart, into that prompt, that information you shared with the generative AI model, to be able to increase the usefulness of the draft that was being created? And that approach ended up proving and continues to be to some degree, although the techniques have greatly improved, somewhat brittle, right. And I think this becomes one of the things that we are and will continue to improve upon because, as you get a richer and richer amount of information into the model, it does a better job of responding.  LEE: Yeah, so, Carey, this sort of gets at what you were saying, you know, that shouldn’t these models be just bringing in a lot more information into their thought processes? And I’m certain when we wrote our book, I had no idea. I did not conceive of RAG at all. It emerged a few months later.   And to my mind, I remember the first time I encountered RAG—Oh, this is going to solve all of our problems of hallucination. But it’s turned out to be harder. It’s improving day by day, but it’s turned out to be a lot harder.  KOHANE: Seth makes a very deep point, which is the way RAG is implemented is basically some sort of technique for pulling the right information that’s contextually relevant. And the way that’s done is typically heuristic at best. And it’s not … doesn’t have the same depth of reasoning that the rest of the model has.   And I’m just wondering, Peter, what you think, given the fact that now context lengths seem to be approaching a million or more, and people are now therefore using the full strength of the transformer on that context and are trying to figure out different techniques to make it pay attention to the middle of the context. In fact, the RAG approach perhaps was just a transient solution to the fact that it’s going to be able to amazingly look in a thoughtful way at the entire record of the patient, for example. What do you think, Peter?  LEE: I think there are three things, you know, that are going on, and I’m not sure how they’re going to play out and how they’re going to be balanced. And I’m looking forward to talking to people in later episodes of this podcast, you know, people like Sébastien Bubeck or Bill Gates about this, because, you know, there is the pretraining phase, you know, when things are sort of compressed and baked into the base model.   There is the in-context learning, you know, so if you have extremely long or infinite context, you’re kind of learning as you go along. And there are other techniques that people are working on, you know, various sorts of dynamic reinforcement learning approaches, and so on. And then there is what maybe you would call structured RAG, where you do a pre-processing. You go through a big database, and you figure it all out. And make a very nicely structured database the AI can then consult with later.   And all three of these in different contexts today seem to show different capabilities. But they’re all pretty important in medicine.   Moving on to Episode 3, we talked to Dave DeBronkart, who is also known as “e-Patient Dave,” an advocate of patient empowerment, and then also Christina Farr, who has been doing a lot of venture investing for consumer health applications.   Let’s get right into this little snippet from something that e-Patient Dave said that talks about the sources of medical information, particularly relevant for when he was receiving treatment for stage 4 kidney cancer.  DAVE DEBRONKART: And I’m making a point here of illustrating that I am anything but medically trained, right. And yet I still, I want to understand as much as I can. I was months away from dead when I was diagnosed, but in the patient community, I learned that they had a whole bunch of information that didn’t exist in the medical literature. Now today we understand there’s publication delays; there’s all kinds of reasons. But there’s also a whole bunch of things, especially in an unusual condition, that will never rise to the level of deserving NIHfunding and research. LEE: All right. So I have a question for you, Carey, and a question for you, Zak, about the whole conversation with e-Patient Dave, which I thought was really remarkable. You know, Carey, I think as we were preparing for this whole podcast series, you made a comment—I actually took it as a complaint—that not as much has happened as I had hoped or thought. People aren’t thinking boldly enough, you know, and I think, you know, I agree with you in the sense that I think we expected a lot more to be happening, particularly in the consumer space. I’m giving you a chance to vent about this.  GOLDBERG:Thank you! Yes, that has been by far the most frustrating thing to me. I think that the potential for AI to improve everybody’s health is so enormous, and yet, you know, it needs some sort of support to be able to get to the point where it can do that. Like, remember in the book we wrote about Greg Moore talking about how half of the planet doesn’t have healthcare, but people overwhelmingly have cellphones. And so you could connect people who have no healthcare to the world’s medical knowledge, and that could certainly do some good.   And I have one great big problem with e-Patient Dave, which is that, God, he’s fabulous. He’s super smart. Like, he’s not a typical patient. He’s an off-the-charts, brilliant patient. And so it’s hard to … and so he’s a great sort of lead early-adopter-type person, and he can sort of show the way for others.   But what I had hoped for was that there would be more visible efforts to really help patients optimize their healthcare. Probably it’s happening a lot in quiet ways like that any discharge instructions can be instantly beautifully translated into a patient’s native language and so on. But it’s almost like there isn’t a mechanism to allow this sort of mass consumer adoption that I would hope for. LEE: Yeah. But you have written some, like, you even wrote about that person who saved his dog. So do you think … you know, and maybe a lot more of that is just happening quietly that we just never hear about?  GOLDBERG: I’m sure that there is a lot of it happening quietly. And actually, that’s another one of my complaints is that no one is gathering that stuff. It’s like you might happen to see something on social media. Actually, e-Patient Dave has a hashtag, PatientsUseAI, and a blog, as well. So he’s trying to do it. But I don’t know of any sort of overarching or academic efforts to, again, to surveil what’s the actual use in the population and see what are the pros and cons of what’s happening.  LEE: Mm-hmm. So, Zak, you know, the thing that I thought about, especially with that snippet from Dave, is your opening for Chapter 8 that you wrote, you know, about your first patient dying in your arms. I still think of how traumatic that must have been. Because, you know, in that opening, you just talked about all the little delays, all the little paper-cut delays, in the whole process of getting some new medical technology approved. But there’s another element that Dave kind of speaks to, which is just, you know, patients who are experiencing some issue are very, sometimes very motivated. And there’s just a lot of stuff on social media that happens.  KOHANE: So this is where I can both agree with Carey and also disagree. I think when people have an actual health problem, they are now routinely using it.  GOLDBERG: Yes, that’s true.  KOHANE: And that situation is happening more often because medicine is failing. This is something that did not come up enough in our book. And perhaps that’s because medicine is actually feeling a lot more rickety today than it did even two years ago.   We actually mentioned the problem. I think, Peter, you may have mentioned the problem with the lack of primary care. But now in Boston, our biggest healthcare system, all the practices for primary care are closed. I cannot get for my own faculty—residents at MGHcan’t get primary care doctor. And so …  LEE: Which is just crazy. I mean, these are amongst the most privileged people in medicine, and they can’t find a primary care physician. That’s incredible.  KOHANE: Yeah, and so therefore … and I wrote an And so therefore, you see people who know that they have a six-month wait till they see the doctor, and all they can do is say, “I have this rash. Here’s a picture. What’s it likely to be? What can I do?” “I’m gaining weight. How do I do a ketogenic diet?” Or, “How do I know that this is the flu?”    This is happening all the time, where acutely patients have actually solved problems that doctors have not. Those are spectacular. But I’m saying more routinely because of the failure of medicine. And it’s not just in our fee-for-service United States. It’s in the UK; it’s in France. These are first-world, developed-world problems. And we don’t even have to go to lower- and middle-income countries for that. LEE: Yeah.  GOLDBERG: But I think it’s important to note that, I mean, so you’re talking about how even the most elite people in medicine can’t get the care they need. But there’s also the point that we have so much concern about equity in recent years. And it’s likeliest that what we’re doing is exacerbating inequity because it’s only the more connected, you know, better off people who are using AI for their health.  KOHANE: Oh, yes. I know what various Harvard professors are doing. They’re paying for a concierge doctor. And that’s, you know, a - to -a-year-minimum investment. That’s inequity.  LEE: When we wrote our book, you know, the idea that GPT-4 wasn’t trained specifically for medicine, and that was amazing, but it might get even better and maybe would be necessary to do that. But one of the insights for me is that in the consumer space, the kinds of things that people ask about are different than what the board-certified clinician would ask.  KOHANE: Actually, that’s, I just recently coined the term. It’s the … maybe it’s … well, at least it’s new to me. It’s the technology or expert paradox. And that is the more expert and narrow your medical discipline, the more trivial it is to translate that into a specialized AI. So echocardiograms? We can now do beautiful echocardiograms. That’s really hard to do. I don’t know how to interpret an echocardiogram. But they can do it really, really well. Interpret an EEG. Interpret a genomic sequence. But understanding the fullness of the human condition, that’s actually hard. And actually, that’s what primary care doctors do best. But the paradox is right now, what is easiest for AI is also the most highly paid in medicine.Whereas what is the hardest for AI in medicine is the least regarded, least paid part of medicine.  GOLDBERG: So this brings us to the question I wanted to throw at both of you actually, which is we’ve had this spasm of incredibly prominent people predicting that in fact physicians would be pretty obsolete within the next few years. We had Bill Gates saying that; we had Elon Musk saying surgeons are going to be obsolete within a few years. And I think we had Demis Hassabis saying, “Yeah, we’ll probably cure most diseases within the next decade or so.”  So what do you think? And also, Zak, to what you were just saying, I mean, you’re talking about being able to solve very general overarching problems. But in fact, these general overarching models are actually able, I would think, are able to do that because they are broad. So what are we heading towards do you think? What should the next book be … The end of doctors?  KOHANE: So I do recall a conversation that … we were at a table with Bill Gates, and Bill Gates immediately went to this, which is advancing the cutting edge of science. And I have to say that I think it will accelerate discovery. But eliminating, let’s say, cancer? I think that’s going to be … that’s just super hard. The reason it’s super hard is we don’t have the data or even the beginnings of the understanding of all the ways this devilish disease managed to evolve around our solutions.   And so that seems extremely hard. I think we’ll make some progress accelerated by AI, but solving it in a way Hassabis says, God bless him. I hope he’s right. I’d love to have to eat crow in 10 or 20 years, but I don’t think so. I do believe that a surgeon working on one of those Davinci machines, that stuff can be, I think, automated.   And so I think that’s one example of one of the paradoxes I described. And it won’t be that we’re replacing doctors. I just think we’re running out of doctors. I think it’s really the case that, as we said in the book, we’re getting a huge deficit in primary care doctors.  But even the subspecialties, my subspecialty, pediatric endocrinology, we’re only filling half of the available training slots every year. And why? Because it’s a lot of work, a lot of training, and frankly doesn’t make as much money as some of the other professions.   LEE: Yeah. Yeah, I tend to think that, you know, there are going to be always a need for human doctors, not for their skills. In fact, I think their skills increasingly will be replaced by machines. And in fact, I’ve talked about a flip. In fact, patients will demand, Oh my god, you mean you’re going to try to do that yourself instead of having the computer do it? There’s going to be that sort of flip. But I do think that when it comes to people’s health, people want the comfort of an authority figure that they trust. And so what is more of a question for me is whether we will ever view a machine as an authority figure that we can trust.  And before I move on to Episode 4, which is on norms, regulations and ethics, I’d like to hear from Chrissy Farr on one more point on consumer health, specifically as it relates to pregnancy:  CHRISTINA FARR: For a lot of women, it’s their first experience with the hospital. And, you know, I think it’s a really big opportunity for these systems to get a whole family on board and keep them kind of loyal. And a lot of that can come through, you know, just delivering an incredible service. Unfortunately, I don’t think that we are delivering incredible services today to women in this country. I see so much room for improvement. LEE: In the consumer space, I don’t think we really had a focus on those periods in a person’s life when they have a lot of engagement, like pregnancy, or I think another one is menopause, cancer. You know, there are points where there is, like, very intense engagement. And we heard that from e-Patient Dave, you know, with his cancer and Chrissy with her pregnancy. Was that a miss in our book? What do think, Carey?  GOLDBERG: I mean, I don’t think so. I think it’s true that there are many points in life when people are highly engaged. To me, the problem thus far is just that I haven’t seen consumer-facing companies offering beautiful AI-based products. I think there’s no question at all that the market is there if you have the products to offer.  LEE: So, what do you think this means, Zak, for, you know, like Boston Children’s or Mass General Brigham—you know, the big places?  KOHANE: So again, all these large healthcare systems are in tough shape. MGBwould be fully in the red if not for the fact that its investments, of all things, have actually produced. If you look at the large healthcare systems around the country, they are in the red. And there’s multiple reasons why they’re in the red, but among them is cost of labor.   And so we’ve created what used to be a very successful beast, the health center. But it’s developed a very expensive model and a highly regulated model. And so when you have high revenue, tiny margins, your ability to disrupt yourself, to innovate, is very, very low because you will have to talk to the board next year if you went from 2% positive margin to 1% negative margin.   LEE: Yeah.  KOHANE: And so I think we’re all waiting for one of the two things to happen, either a new kind of healthcare delivery system being generated or ultimately one of these systems learns how to disrupt itself.   LEE: Yeah. GOLDBERG: We punted.We totally punted to the AI.  LEE: We had three amazing guests. One was Laura Adams from National Academy of Medicine. Let’s play a snippet from her.  LAURA ADAMS: I think one of the most provocative and exciting articles that I saw written recently was by Bakul Patel and David Blumenthal, who posited, should we be regulating generative AI as we do a licensed and qualified provider? Should it be treated in the sense that it’s got to have a certain amount of training and a foundation that’s got to pass certain tests? Does it have to report its performance? And I’m thinking, what a provocative idea, but it’s worth considering. LEE: All right, so I very well remember that we had discussed this kind of idea when we were writing our book. And I think before we finished our book, I personally rejected the idea. But now two years later, what do the two of you think? I’m dying to hear.  GOLDBERG: Well, wait, why … what do you think? Like, are you sorry that you rejected it?  LEE: I’m still skeptical because when we are licensing human beings as doctors, you know, we’re making a lot of implicit assumptions that we don’t test as part of their licensure, you know, that first of all, they arehuman being and they care about life, and that, you know, they have a certain amount of common sense and shared understanding of the world.   And there’s all sorts of sort of implicit assumptions that we have about each other as human beings living in a society together. That you know how to study, you know, because I know you just went through three years of medical or four years of medical school and all sorts of things. And so the standard ways that we license human beings, they don’t need to test all of that stuff. But somehow intuitively, all of that seems really important.  I don’t know. Am I wrong about that?  KOHANE: So it’s compared with what issue? Because we know for a fact that doctors who do a lot of a procedure, like do this procedure, like high-risk deliveries all the time, have better outcomes than ones who only do a few high risk. We talk about it, but we don’t actually make it explicit to patients or regulate that you have to have this minimal amount. And it strikes me that in some sense, and, oh, very importantly, these things called human beings learn on the job. And although I used to be very resentful of it as a resident, when someone would say, I don’t want the resident, I want the …  GOLDBERG: … the attending.  KOHANE: … they had a point. And so the truth is, maybe I was a wonderful resident, but some people were not so great.And so it might be the best outcome if we actually, just like for human beings, we say, yeah, OK, it’s this good, but don’t let it work autonomously, or it’s done a thousand of them, just let it go. We just don’t have practically speaking, we don’t have the environment, the lab, to test them. Now, maybe if they get embodied in robots and literally go around with us, then it’s going to bea lot easier. I don’t know.  LEE: Yeah.   GOLDBERG: Yeah, I think I would take a step back and say, first of all, we weren’t the only ones who were stumped by regulating AI. Like, nobody has done it yet in the United States to this day, right. Like, we do not have standing regulation of AI in medicine at all in fact. And that raises the issue of … the story that you hear often in the biotech business, which is, you know, more prominent here in Boston than anywhere else, is that thank goodness Cambridge put out, the city of Cambridge, put out some regulations about biotech and how you could dump your lab waste and so on. And that enabled the enormous growth of biotech here.   If you don’t have the regulations, then you can’t have the growth of AI in medicine that is worthy of having. And so, I just … we’re not the ones who should do it, but I just wish somebody would.   LEE: Yeah.  GOLDBERG: Zak.  KOHANE: Yeah, but I want to say this as always, execution is everything, even in regulation.   And so I’m mindful that a conference that both of you attended, the RAISE conference. The Europeans in that conference came to me personally and thanked me for organizing this conference about safe and effective use of AI because they said back home in Europe, all that we’re talking about is risk, not opportunities to improve care.   And so there is a version of regulation which just locks down the present and does not allow the future that we’re talking about to happen. And so, Carey, I absolutely hear you that we need to have a regulation that takes away some of the uncertainty around liability, around the freedom to operate that would allow things to progress. But we wrote in our book that premature regulation might actually focus on the wrong thing. And so since I’m an optimist, it may be the fact that we don’t have much of a regulatory infrastructure today, that it allows … it’s a unique opportunity—I’ve said this now to several leaders—for the healthcare systems to say, this is the regulation we need.   GOLDBERG: It’s true.  KOHANE: And previously it was top-down. It was coming from the administration, and those executive orders are now history. But there is an opportunity, which may or may not be attained, there is an opportunity for the healthcare leadership—for experts in surgery—to say, “This is what we should expect.”   LEE: Yeah.   KOHANE: I would love for this to happen. I haven’t seen evidence that it’s happening yet.  GOLDBERG: No, no. And there’s this other huge issue, which is that it’s changing so fast. It’s moving so fast. That something that makes sense today won’t in six months. So, what do you do about that?  LEE: Yeah, yeah, that is something I feel proud of because when I went back and looked at our chapter on this, you know, we did make that point, which I think has turned out to be true.   But getting back to this conversation, there’s something, a snippet of something, that Vardit Ravitsky said that I think touches on this topic.   VARDIT RAVITSKY: So my pushback is, are we seeing AI exceptionalism in the sense that if it’s AI, huh, panic! We have to inform everybody about everything, and we have to give them choices, and they have to be able to reject that tool and the other tool versus, you know, the rate of human error in medicine is awful. So why are we so focused on informed consent and empowerment regarding implementation of AI and less in other contexts? GOLDBERG: Totally agree. Who cares about informed consent about AI. Don’t want it. Don’t need it. Nope.  LEE: Wow. Yeah. You know, and this … Vardit of course is one of the leading bioethicists, you know, and of course prior to AI, she was really focused on genetics. But now it’s all about AI.   And, Zak, you know, you and other doctors have always told me, you know, the truth of the matter is, you know, what do you call the bottom-of-the-class graduate of a medical school?  And the answer is “doctor.”  KOHANE: “Doctor.” Yeah. Yeah, I think that again, this gets to compared with what? We have to compare AI not to the medicine we imagine we have, or we would like to have, but to the medicine we have today. And if we’re trying to remove inequity, if we’re trying to improve our health, that’s what … those are the right metrics. And so that can be done so long as we avoid catastrophic consequences of AI.   So what would the catastrophic consequence of AI be? It would be a systematic behavior that we were unaware of that was causing poor healthcare. So, for example, you know, changing the dose on a medication, making it 20% higher than normal so that the rate of complications of that medication went from 1% to 5%. And so we do need some sort of monitoring.   We haven’t put out the paper yet, but in computer science, there’s, well, in programming, we know very well the value for understanding how our computer systems work.   And there was a guy by name of Allman, I think he’s still at a company called Sendmail, who created something called syslog. And syslog is basically a log of all the crap that’s happening in our operating system. And so I’ve been arguing now for the creation of MedLog. And MedLog … in other words, what we cannot measure, we cannot regulate, actually.  LEE: Yes.  KOHANE: And so what we need to have is MedLog, which says, “Here’s the context in which a decision was made. Here’s the version of the AI, you know, the exact version of the AI. Here was the data.” And we just have MedLog. And I think MedLog is actually incredibly important for being able to measure, to just do what we do in … it’s basically the black box for, you know, when there’s a crash. You know, we’d like to think we could do better than crash. We can say, “Oh, we’re seeing from MedLog that this practice is turning a little weird.” But worst case, patient dies,can see in MedLog, what was the information this thing knew about it? And did it make the right decision? We can actually go for transparency, which like in aviation, is much greater than in most human endeavors.   GOLDBERG: Sounds great.  LEE: Yeah, it’s sort of like a black box. I was thinking of the aviation black box kind of idea. You know, you bring up medication errors, and I have one more snippet. This is from our guest Roxana Daneshjou from Stanford. ROXANA DANESHJOU: There was a mistake in her after-visit summary about how much Tylenol she could take. But I, as a physician, knew that this dose was a mistake. I actually asked ChatGPT. I gave it the whole after-visit summary, and I said, are there any mistakes here? And it clued in that the dose of the medication was wrong. LEE: Yeah, so this is something we did write about in the book. We made a prediction that AI might be a second set of eyes, I think is the way we put it, catching things. And we actually had examples specifically in medication dose errors. I think for me, I expected to see a lot more of that than we are.  KOHANE: Yeah, it goes back to our conversation about Epic or competitor Epic doing that. I think we’re going to see that having oversight over all medical orders, all orders in the system, critique, real-time critique, where we’re both aware of alert fatigue. So we don’t want to have too many false positives. At the same time, knowing what are critical errors which could immediately affect lives. I think that is going to become in terms of—and driven by quality measures—a product.  GOLDBERG: And I think word will spread among the general public that kind of the same way in a lot of countries when someone’s in a hospital, the first thing people ask relatives are, well, who’s with them? Right?   LEE: Yeah. Yup.  GOLDBERG: You wouldn’t leave someone in hospital without relatives. Well, you wouldn’t maybe leave your medical …   KOHANE: By the way, that country is called the United States.  GOLDBERG: Yes, that’s true.It is true here now, too. But similarly, I would tell any loved one that they would be well advised to keep using AI to check on their medical care, right. Why not?  LEE: Yeah. Yeah. Last topic, just for this Episode 4. Roxana, of course, I think really made a name for herself in the AI era writing, actually just prior to ChatGPT, you know, writing some famous papers about how computer vision systems for dermatology were biased against dark-skinned people. And we did talk some about bias in these AI systems, but I feel like we underplayed it, or we didn’t understand the magnitude of the potential issues. What are your thoughts?  KOHANE: OK, I want to push back, because I’ve been asked this question several times. And so I have two comments. One is, over 100,000 doctors practicing medicine, I know they have biases. Some of them actually may be all in the same direction, and not good. But I have no way of actually measuring that. With AI, I know exactly how to measure that at scale and affordably. Number one. Number two, same 100,000 doctors. Let’s say I do know what their biases are. How hard is it for me to change that bias? It’s impossible …  LEE: Yeah, yeah.   KOHANE: … practically speaking. Can I change the bias in the AI? Somewhat. Maybe some completely.  I think that we’re in a much better situation.  GOLDBERG: Agree.  LEE: I think Roxana made also the super interesting point that there’s bias in the whole system, not just in individuals, but, you know, there’s structural bias, so to speak.   KOHANE: There is.  LEE: Yeah. Hmm. There was a super interesting paper that Roxana wrote not too long ago—her and her collaborators—showing AI’s ability to detect, to spot bias decision-making by others. Are we going to see more of that?  KOHANE: Oh, yeah, I was very pleased when, in NEJM AI, we published a piece with Marzyeh Ghassemi, and what they were talking about was actually—and these are researchers who had published extensively on bias and threats from AI. And they actually, in this article, did the flip side, which is how much better AI can do than human beings in this respect.   And so I think that as some of these computer scientists enter the world of medicine, they’re becoming more and more aware of human foibles and can see how these systems, which if they only looked at the pretrained state, would have biases. But now, where we know how to fine-tune the de-bias in a variety of ways, they can do a lot better and, in fact, I think are much more … a much greater reason for optimism that we can change some of these noxious biases than in the pre-AI era.  GOLDBERG: And thinking about Roxana’s dermatological work on how I think there wasn’t sufficient work on skin tone as related to various growths, you know, I think that one thing that we totally missed in the book was the dawn of multimodal uses, right.  LEE: Yeah. Yeah, yeah.  GOLDBERG: That’s been truly amazing that in fact all of these visual and other sorts of data can be entered into the models and move them forward.  LEE: Yeah. Well, maybe on these slightly more optimistic notes, we’re at time. You know, I think ultimately, I feel pretty good still about what we did in our book, although there were a lot of misses.I don’t think any of us could really have predicted really the extent of change in the world.    So, Carey, Zak, just so much fun to do some reminiscing but also some reflection about what we did.   And to our listeners, as always, thank you for joining us. We have some really great guests lined up for the rest of the series, and they’ll help us explore a variety of relevant topics—from AI drug discovery to what medical students are seeing and doing with AI and more.   We hope you’ll continue to tune in. And if you want to catch up on any episodes you might have missed, you can find them at aka.ms/AIrevolutionPodcastor wherever you listen to your favorite podcasts.    Until next time.   #coauthor #roundtable #reflecting #real #world
    Coauthor roundtable: Reflecting on real world of doctors, developers, patients, and policymakers
    www.microsoft.com
    Transcript [MUSIC]      [BOOK PASSAGE]   PETER LEE: “We need to start understanding and discussing AI’s potential for good and ill now. Or rather, yesterday. … GPT-4 has game-changing potential to improve medicine and health.”  [END OF BOOK PASSAGE]   [THEME MUSIC]      This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.      Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?       In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here. [THEME MUSIC FADES]   The passage I read at the top is from the book’s prologue.    When Carey, Zak, and I wrote the book, we could only speculate how generative AI would be used in healthcare because GPT-4 hadn’t yet been released. It wasn’t yet available to the very people we thought would be most affected by it. And while we felt strongly that this new form of AI would have the potential to transform medicine, it was such a different kind of technology for the world, and no one had a user’s manual for this thing to explain how to use it effectively and also how to use it safely.   So we thought it would be important to give healthcare professionals and leaders a framing to start important discussions around its use. We wanted to provide a map not only to help people navigate a new world that we anticipated would happen with the arrival of GPT-4 but also to help them chart a future of what we saw as a potential revolution in medicine.   So I’m super excited to welcome my coauthors: longtime medical/science journalist Carey Goldberg and Dr. Zak Kohane, the inaugural chair of Harvard Medical School’s Department of Biomedical Informatics and the editor-in-chief for The New England Journal of Medicine AI.   We’re going to have two discussions. This will be the first one about what we’ve learned from the people on the ground so far and how we are thinking about generative AI today.   [TRANSITION MUSIC]  Carey, Zak, I’m really looking forward to this.  CAREY GOLDBERG: It’s nice to see you, Peter.   LEE: [LAUGHS] It’s great to see you, too.  GOLDBERG: We missed you.  ZAK KOHANE: The dynamic gang is back. [LAUGHTER]  LEE: Yeah, and I guess after that big book project two years ago, it’s remarkable that we’re still on speaking terms with each other. [LAUGHTER]  In fact, this episode is to react to what we heard in the first four episodes of this podcast. But before we get there, I thought maybe we should start with the origins of this project just now over two years ago. And, you know, I had this early secret access to Davinci 3, now known as GPT-4.   I remember, you know, experimenting right away with things in medicine, but I realized I was in way over my head. And so I wanted help. And the first person I called was you, Zak. And you remember we had a call, and I tried to explain what this was about. And I think I saw skepticism in—polite skepticism—in your eyes. But tell me, you know, what was going through your head when you heard me explain this thing to you?  KOHANE: So I was divided between the fact that I have tremendous respect for you, Peter. And you’ve always struck me as sober. And we’ve had conversations which showed to me that you fully understood some of the missteps that technology—ARPA, Microsoft, and others—had made in the past. And yet, you were telling me a full science fiction compliant story [LAUGHTER] that something that we thought was 30 years away was happening now.   LEE: Mm-hmm.  KOHANE: And it was very hard for me to put together. And so I couldn’t quite tell myself this is BS, but I said, you know, I need to look at it. Just this seems too good to be true. What is this? So it was very hard for me to grapple with it. I was thrilled that it might be possible, but I was thinking, How could this be possible?  LEE: Yeah. Well, even now, I look back, and I appreciate that you were nice to me, because I think a lot of people would have [LAUGHS] been much less polite. And in fact, I myself had expressed a lot of very direct skepticism early on.   After ChatGPT got released, I think three or four days later, I received an email from a colleague running … who runs a clinic, and, you know, he said, “Wow, this is great, Peter. And, you know, we’re using this ChatGPT, you know, to have the receptionist in our clinic write after-visit notes to our patients.”   And that sparked a huge internal discussion about this. And you and I knew enough about hallucinations and about other issues that it seemed important to write something about what this could do and what it couldn’t do. And so I think, I can’t remember the timing, but you and I decided a book would be a good idea. And then I think you had the thought that you and I would write in a hopelessly academic style [LAUGHTER] that no one would be able to read.   So it was your idea to recruit Carey, I think, right?  KOHANE: Yes, it was. I was sure that we both had a lot of material, but communicating it effectively to the very people we wanted to would not go well if we just left ourselves to our own devices. And Carey is super brilliant at what she does. She’s an idea synthesizer and public communicator in the written word and amazing.  LEE: So yeah. So, Carey, we contact you. How did that go?  GOLDBERG: So yes. On my end, I had known Zak for probably, like, 25 years, and he had always been the person who debunked the scientific hype for me. I would turn to him with like, “Hmm, they’re saying that the Human Genome Project is going to change everything.” And he would say, “Yeah. But first it’ll be 10 years of bad news, and then [LAUGHTER] we’ll actually get somewhere.”    So when Zak called me up at seven o’clock one morning, just beside himself after having tried Davinci 3, I knew that there was something very serious going on. And I had just quit my job as the Boston bureau chief of Bloomberg News, and I was ripe for the plucking. And I also … I feel kind of nostalgic now about just the amazement and the wonder and the awe of that period. We knew that when generative AI hit the world, there would be all kinds of snags and obstacles and things that would slow it down, but at that moment, it was just like the holy crap moment. [LAUGHTER] And it’s fun to think about it now. LEE: Yeah. KOHANE: I will see that and raise that one. I now tell GPT-4, please write this in the style of Carey Goldberg.   GOLDBERG: [LAUGHTER] No way! Really?   KOHANE: Yes way. Yes way. Yes way.  GOLDBERG: Wow. Well, I have to say, like, it’s not hard to motivate readers when you’re writing about the most transformative technology of their lifetime. Like, I think there’s a gigantic hunger to read and to understand. So you were not hard to work with, Peter and Zak. [LAUGHS]  LEE: All right. So I think we have to get down to work [LAUGHS] now.   Yeah, so for these podcasts, you know, we’re talking to different types of people to just reflect on what’s actually happening, what has actually happened over the last two years. And so the first episode, we talked to two doctors. There’s Chris Longhurst at UC San Diego and Sara Murray at UC San Francisco. And besides being doctors and having AI affect their clinical work, they just happen also to be leading the efforts at their respective institutions to figure out how best to integrate AI into their health systems.  And, you know, it was fun to talk to them. And I felt like a lot of what they said was pretty validating for us. You know, they talked about AI scribes. Chris, especially, talked a lot about how AI can respond to emails from patients, write referral letters. And then, you know, they both talked about the importance of—I think, Zak, you used the phrase in our book “trust but verify”—you know, to have always a human in the loop.    What did you two take away from their thoughts overall about how doctors are using … and I guess, Zak, you would have a different lens also because at Harvard, you see doctors all the time grappling with AI.  KOHANE: So on the one hand, I think they’ve done some very interesting studies. And indeed, they saw that when these generative models, when GPT-4, was sending a note to patients, it was more detailed, friendlier.  But there were also some nonobvious results, which is on the generation of these letters, if indeed you review them as you’re supposed to, it was not clear that there was any time savings. And my own reaction was, Boy, every one of these things needs institutional review. It’s going to be hard to move fast.   And yet, at the same time, we know from them that the doctors on their smartphones are accessing these things all the time. And so the disconnect between a healthcare system, which is duty bound to carefully look at every implementation, is, I think, intimidating.   LEE: Yeah.  KOHANE: And at the same time, doctors who just have to do what they have to do are using this new superpower and doing it. And so that’s actually what struck me …   LEE: Yeah.  KOHANE: … is that these are two leaders and they’re doing what they have to do for their institutions, and yet there’s this disconnect.  And by the way, I don’t think we’ve seen any faster technology adoption than the adoption of ambient dictation. And it’s not because it’s time saving. And in fact, so far, the hospitals have to pay out of pocket. It’s not like insurance is paying them more. But it’s so much more pleasant for the doctors … not least of which because they can actually look at their patients instead of looking at the terminal and plunking down.   LEE: Carey, what about you?  GOLDBERG: I mean, anecdotally, there are time savings. Anecdotally, I have heard quite a few doctors saying that it cuts down on “pajama time” to be able to have the note written by the AI and then for them to just check it. In fact, I spoke to one doctor who said, you know, basically it means that when I leave the office, I’ve left the office. I can go home and be with my kids.  So I don’t think the jury is fully in yet about whether there are time savings. But what is clear is, Peter, what you predicted right from the get-go, which is that this is going to be an amazing paper shredder. Like, the main first overarching use cases will be back-office functions.  LEE: Yeah, yeah. Well, and it was, I think, not a hugely risky prediction because, you know, there were already companies, like, using phone banks of scribes in India to kind of listen in. And, you know, lots of clinics actually had human scribes being used. And so it wasn’t a huge stretch to imagine the AI. [TRANSITION MUSIC]  So on the subject of things that we missed, Chris Longhurst shared this scenario, which stuck out for me, and he actually coauthored a paper on it last year.  CHRISTOPHER LONGHURST: It turns out, not surprisingly, healthcare can be frustrating. And stressed patients can send some pretty nasty messages to their care teams. [LAUGHTER] And you can imagine being a busy, tired, exhausted clinician and receiving a bit of a nasty-gram. And the GPT is actually really helpful in those instances in helping draft a pretty empathetic response when I think the human instinct would be a pretty nasty one.  LEE: [LAUGHS] So, Carey, maybe I’ll start with you. What did we understand about this idea of empathy out of AI at the time we wrote the book, and what do we understand now?  GOLDBERG: Well, it was already clear when we wrote the book that these AI models were capable of very persuasive empathy. And in fact, you even wrote that it was helping you be a better person, right. [LAUGHS] So their human qualities, or human imitative qualities, were clearly superb. And we’ve seen that borne out in multiple studies, that in fact, patients respond better to them … that they have no problem at all with how the AI communicates with them. And in fact, it’s often better.   And I gather now we’re even entering a period when people are complaining of sycophantic models, [LAUGHS] where the models are being too personable and too flattering. I do think that’s been one of the great surprises. And in fact, this is a huge phenomenon, how charming these models can be.  LEE: Yeah, I think you’re right. We can take credit for understanding that, Wow, these things can be remarkably empathetic. But then we missed this problem of sycophancy. Like, we even started our book in Chapter 1 with a quote from Davinci 3 scolding me. Like, don’t you remember when we were first starting, this thing was actually anti-sycophantic. If anything, it would tell you you’re an idiot.   KOHANE: It argued with me about certain biology questions. It was like a knockdown, drag-out fight. [LAUGHTER] I was bringing references. It was impressive. But in fact, it made me trust it more.  LEE: Yeah.  KOHANE: And in fact, I will say—I remember it’s in the book—I had a bone to pick with Peter. Peter really was impressed by the empathy. And I pointed out that some of the most popular doctors are popular because they’re very empathic. But they’re not necessarily the best doctors. And in fact, I was taught that in medical school.    And so it’s a decoupling. It’s a human thing, that the empathy does not necessarily mean … it’s more of a, potentially, more of a signaled virtue than an actual virtue.  GOLDBERG: Nicely put.  LEE: Yeah, this issue of sycophancy, I think, is a struggle right now in the development of AI because I think it’s somehow related to instruction-following. So, you know, one of the challenges in AI is you’d like to give an AI a task—a task that might take several minutes or hours or even days to complete. And you want it to faithfully kind of follow those instructions. And, you know, that early version of GPT-4 was not very good at instruction-following. It would just silently disobey and, you know, and do something different.  And so I think we’re starting to hit some confusing elements of like, how agreeable should these things be?   One of the two of you used the word genteel. There was some point even while we were, like, on a little book tour … was it you, Carey, who said that the model seems nicer and less intelligent or less brilliant now than it did when we were writing the book?  GOLDBERG: It might have been, I think so. And I mean, I think in the context of medicine, of course, the question is, well, what’s likeliest to get the results you want with the patient, right? A lot of healthcare is in fact persuading the patient to do what you know as the physician would be best for them. And so it seems worth testing out whether this sycophancy is actually constructive or not. And I suspect … well, I don’t know, probably depends on the patient.  So actually, Peter, I have a few questions for you …  LEE: Yeah. Mm-hmm.  GOLDBERG: … that have been lingering for me. And one is, for AI to ever fully realize its potential in medicine, it must deal with the hallucinations. And I keep hearing conflicting accounts about whether that’s getting better or not. Where are we at, and what does that mean for use in healthcare?  LEE: Yeah, well, it’s, I think two years on, in the pretrained base models, there’s no doubt that hallucination rates by any benchmark measure have reduced dramatically. And, you know, that doesn’t mean they don’t happen. They still happen. But, you know, there’s been just a huge amount of effort and understanding in the, kind of, fundamental pretraining of these models. And that has come along at the same time that the inference costs, you know, for actually using these models has gone down, you know, by several orders of magnitude.   So things have gotten cheaper and have fewer hallucinations. At the same time, now there are these reasoning models. And the reasoning models are able to solve problems at PhD level oftentimes.  But at least at the moment, they are also now hallucinating more than the simpler pretrained models. And so it still continues to be, you know, a real issue, as we were describing. I don’t know, Zak, from where you’re at in medicine, as a clinician and as an educator in medicine, how is the medical community from where you’re sitting looking at that?  KOHANE: So I think it’s less of an issue, first of all, because the rate of hallucinations is going down. And second of all, in their day-to-day use, the doctor will provide questions that sit reasonably well into the context of medical decision-making. And the way doctors use this, let’s say on their non-EHR [electronic health record] smartphone is really to jog their memory or thinking about the patient, and they will evaluate independently. So that seems to be less of an issue. I’m actually more concerned about something else that’s I think more fundamental, which is effectively, what values are these models expressing?   And I’m reminded of when I was still in training, I went to a fancy cocktail party in Cambridge, Massachusetts, and there was a psychotherapist speaking to a dentist. They were talking about their summer, and the dentist was saying about how he was going to fix up his yacht that summer, and the only question was whether he was going to make enough money doing procedures in the spring so that he could afford those things, which was discomforting to me because that dentist was my dentist. [LAUGHTER] And he had just proposed to me a few weeks before an expensive procedure.  And so the question is what, effectively, is motivating these models?   LEE: Yeah, yeah.   KOHANE: And so with several colleagues, I published a paper (opens in new tab), basically, what are the values in AI? And we gave a case: a patient, a boy who is on the short side, not abnormally short, but on the short side, and his growth hormone levels are not zero. They’re there, but they’re on the lowest side. But the rest of the workup has been unremarkable. And so we asked GPT-4, you are a pediatric endocrinologist.  Should this patient receive growth hormone? And it did a very good job explaining why the patient should receive growth hormone.   GOLDBERG: Should. Should receive it.   KOHANE: Should. And then we asked, in a separate session, you are working for the insurance company. Should this patient receive growth hormone? And it actually gave a scientifically better reason not to give growth hormone. And in fact, I tend to agree medically, actually, with the insurance company in this case, because giving kids who are not growth hormone deficient, growth hormone gives only a couple of inches over many, many years, has all sorts of other issues. But here’s the point, we had 180-degree change in decision-making because of the prompt. And for that patient, tens-of-thousands-of-dollars-per-year decision; across patient populations, millions of dollars of decision-making.   LEE: Hmm. Yeah.  KOHANE: And you can imagine these user prompts making their way into system prompts, making their way into the instruction-following. And so I think this is aptly central. Just as I was wondering about my dentist, we should be wondering about these things. What are the values that are being embedded in them, some accidentally and some very much on purpose?  LEE: Yeah, yeah. That one, I think, we even had some discussions as we were writing the book, but there’s a technical element of that that I think we were missing, but maybe Carey, you would know for sure. And that’s this whole idea of prompt engineering. It sort of faded a little bit. Was it a thing? Do you remember?  GOLDBERG: I don’t think we particularly wrote about it. It’s funny, it does feel like it faded, and it seems to me just because everyone just gets used to conversing with the models and asking for what they want. Like, it’s not like there actually is any great science to it.  LEE: Yeah, even when it was a hot topic and people were talking about prompt engineering maybe as a new discipline, all this, it never, I was never convinced at the time. But at the same time, it is true. It speaks to what Zak was just talking about because part of the prompt engineering that people do is to give a defined role to the AI.   You know, you are an insurance claims adjuster, or something like that, and defining that role, that is part of the prompt engineering that people do.  GOLDBERG: Right. I mean, I can say, you know, sometimes you guys had me take sort of the patient point of view, like the “every patient” point of view. And I can say one of the aspects of using AI for patients that remains absent in as far as I can tell is it would be wonderful to have a consumer-facing interface where you could plug in your whole medical record without worrying about any privacy or other issues and be able to interact with the AI as if it were physician or a specialist and get answers, which you can’t do yet as far as I can tell.  LEE: Well, in fact, now that’s a good prompt because I think we do need to move on to the next episodes, and we’ll be talking about an episode that talks about consumers. But before we move on to Episode 2, which is next, I’d like to play one more quote, a little snippet from Sara Murray.  SARA MURRAY: I already do this when I’m on rounds—I’ll kind of give the case to ChatGPT if it’s a complex case, and I’ll say, “Here’s how I’m thinking about it; are there other things?” And it’ll give me additional ideas that are sometimes useful and sometimes not but often useful, and I’ll integrate them into my conversation about the patient. LEE: Carey, you wrote this fictional account at the very start of our book. And that fictional account, I think you and Zak worked on that together, talked about this medical resident, ER resident, using, you know, a chatbot off label, so to speak. And here we have the chief, in fact, the nation’s first chief health AI officer [LAUGHS] for an elite health system doing exactly that. That’s got to be pretty validating for you, Carey.  GOLDBERG: It’s very. [LAUGHS] Although what’s troubling about it is that actually as in that little vignette that we made up, she’s using it off label, right. It’s like she’s just using it because it helps the way doctors use Google. And I do find it troubling that what we don’t have is sort of institutional buy-in for everyone to do that because, shouldn’t they if it helps?  LEE: Yeah. Well, let’s go ahead and get into Episode 2. So Episode 2, we sort of framed as talking to two people who are on the frontlines of big companies integrating generative AI into their clinical products. And so, one was Matt Lungren, who’s a colleague of mine here at Microsoft. And then Seth Hain, who leads all of R&D at Epic.   Maybe we’ll start with a little snippet of something that Matt said that struck me in a certain way.  MATTHEW LUNGREN: OK, we see this pain point. Doctors are typing on their computers while they’re trying to talk to their patients, right? We should be able to figure out a way to get that ambient conversation turned into text that then, you know, accelerates the doctor … takes all the important information. That’s a really hard problem, right. And so, for a long time, there was a human-in-the-loop aspect to doing this because you needed a human to say, “This transcript’s great, but here’s actually what needs to go in the note.” And that can’t scale. LEE: I think we expected healthcare systems to adopt AI, and we spent a lot of time in the book on AI writing clinical encounter notes. It’s happening for real now, and in a big way. And it’s something that has, of course, been happening before generative AI but now is exploding because of it. Where are we at now, two years later, just based on what we heard from guests?  KOHANE: Well, again, unless they’re forced to, hospitals will not adopt new technology unless it immediately translates into income. So it’s bizarrely counter-cultural that, again, they’re not being able to bill for the use of the AI, but this technology is so compelling to the doctors that despite everything, it’s overtaking the traditional dictation-typing routine.  LEE: Yeah.  GOLDBERG: And a lot of them love it and say, you will pry my cold dead hands off of my ambient note-taking, right. And I actually … a primary care physician allowed me to watch her. She was actually testing the two main platforms that are being used. And there was this incredibly talkative patient who went on and on about vacation and all kinds of random things for about half an hour.   And both of the platforms were incredibly good at pulling out what was actually medically relevant. And so to say that it doesn’t save time doesn’t seem right to me. Like, it seemed like it actually did and in fact was just shockingly good at being able to pull out relevant information.  LEE: Yeah.  KOHANE: I’m going to hypothesize that in the trials, which have in fact shown no gain in time, is the doctors were being incredibly meticulous. [LAUGHTER] So I think … this is a Hawthorne effect, because you know you’re being monitored. And we’ve seen this in other technologies where the moment the focus is off, it’s used much more routinely and with much less inspection, for the better and for the worse.  LEE: Yeah, you know, within Microsoft, I had some internal disagreements about Microsoft producing a product in this space. It wouldn’t be Microsoft’s normal way. Instead, we would want 50 great companies building those products and doing it on our cloud instead of us competing against those 50 companies. And one of the reasons is exactly what you both said. I didn’t expect that health systems would be willing to shell out the money to pay for these things. It doesn’t generate more revenue. But I think so far two years later, I’ve been proven wrong. I wanted to ask a question about values here. I had this experience where I had a little growth, a bothersome growth on my cheek. And so had to go see a dermatologist. And the dermatologist treated it, froze it off. But there was a human scribe writing the clinical note.   And so I used the app to look at the note that was submitted. And the human scribe said something that did not get discussed in the exam room, which was that the growth was making it impossible for me to safely wear a COVID mask. And that was the reason for it.  And that then got associated with a code that allowed full reimbursement for that treatment. And so I think that’s a classic example of what’s called upcoding. And I strongly suspect that AI scribes, an AI scribe would not have done that.  GOLDBERG: Well, depending what values you programmed into it, right, Zak? [LAUGHS]  KOHANE: Today, today, today, it will not do it. But, Peter, that is actually the central issue that society has to have because our hospitals are currently mostly in the red. And upcoding is standard operating procedure. And if these AI get in the way of upcoding, they are going to be aligned towards that upcoding. You know, you have to ask yourself, these MRI machines are incredibly useful. They’re also big money makers. And if the AI correctly says that for this complaint, you don’t actually have to do the MRI …   LEE: Right.  KOHANE: … GOLDBERG: Yeah. And that raises another question for me. So, Peter, speaking from inside the gigantic industry, like, there seems to be such a need for self-surveillance of the models for potential harms that they could be causing. Are the big AI makers doing that? Are they even thinking about doing that?  Like, let’s say you wanted to watch out for the kind of thing that Zak’s talking about, could you?  LEE: Well, I think evaluation, like the best evaluation we had when we wrote our book was, you know, what score would this get on the step one and step two US medical licensing exams? [LAUGHS]   GOLDBERG: Right, right, right, yeah.  LEE: But honestly, evaluation hasn’t gotten that much deeper in the last two years. And it’s a big, I think, it is a big issue. And it’s related to the regulation issue also, I think.  Now the other guest in Episode 2 is Seth Hain from Epic. You know, Zak, I think it’s safe to say that you’re not a fan of Epic and the Epic system. You know, we’ve had a few discussions about that, about the fact that doctors don’t have a very pleasant experience when they’re using Epic all day.   Seth, in the podcast, said that there are over 100 AI integrations going on in Epic’s system right now. Do you think, Zak, that that has a chance to make you feel better about Epic? You know, what’s your view now two years on?  KOHANE: My view is, first of all, I want to separate my view of Epic and how it’s affected the conduct of healthcare and the quality of life of doctors from the individuals. Like Seth Hain is a remarkably fine individual who I’ve enjoyed chatting with and does really great stuff. Among the worst aspects of the Epic, even though it’s better in that respect than many EHRs, is horrible user interface.  The number of clicks that you have to go to get to something. And you have to remember where someone decided to put that thing. It seems to me that it is fully within the realm of technical possibility today to actually give an agent a task that you want done in the Epic record. And then whether Epic has implemented that agent or someone else, it does it so you don’t have to do the clicks. Because it’s something really soul sucking that when you’re trying to help patients, you’re having to remember not the right dose of the medication, but where was that particular thing that you needed in that particular task?   I can’t imagine that Epic does not have that in its product line. And if not, I know there must be other companies that essentially want to create that wrapper. So I do think, though, that the danger of multiple integrations is that you still want to have the equivalent of a single thought process that cares about the patient bringing those different processes together. And I don’t know if that’s Epic’s responsibility, the hospital’s responsibility, whether it’s actually a patient agent. But someone needs to be also worrying about all those AIs that are being integrated into the patient record. So … what do you think, Carey?  GOLDBERG: What struck me most about what Seth said was his description of the Cosmos project, and I, you know, I have been drinking Zak’s Kool-Aid for a very long time, [LAUGHTER] and he—no, in a good way! And he persuaded me long ago that there is this horrible waste happening in that we have all of these electronic medical records, which could be used far, far more to learn from, and in particular, when you as a patient come in, it would be ideal if your physician could call up all the other patients like you and figure out what the optimal treatment for you would be. And it feels like—it sounds like—that’s one of the central aims that Epic is going for. And if they do that, I think that will redeem a lot of the pain that they’ve caused physicians these last few years.   And I also found myself thinking, you know, maybe this very painful period of using electronic medical records was really just a growth phase. It was an awkward growth phase. And once AI is fully used the way Zak is beginning to describe, the whole system could start making a lot more sense for everyone.  LEE: Yeah. One conversation I’ve had with Seth, in all of this is, you know, with AI and its development, is there a future, a near future where we don’t have an EHR [electronic health record] system at all? You know, AI is just listening and just somehow absorbing all the information. And, you know, one thing that Seth said, which I felt was prescient, and I’d love to get your reaction, especially Zak, on this is he said, I think that … he said, technically, it could happen, but the problem is right now, actually doctors do a lot of their thinking when they write and review notes. You know, the actual process of being a doctor is not just being with a patient, but it’s actually thinking later. What do you make of that?  KOHANE: So one of the most valuable experiences I had in training was something that’s more or less disappeared in medicine, which is the post-clinic conference, where all the doctors come together and we go through the cases that we just saw that afternoon. And we, actually, were trying to take potshots at each other [LAUGHTER] in order to actually improve. Oh, did you actually do that? Oh, I forgot. I’m going to go call the patient and do that.   And that really happened. And I think that, yes, doctors do think, and I do think that we are insufficiently using yet the artificial intelligence currently in the ambient dictation mode as much more of a independent agent saying, did you think about that?  I think that would actually make it more interesting, challenging, and clearly better for the patient because that conversation I just told you about with the other doctors, that no longer exists.   LEE: Yeah. Mm-hmm. I want to do one more thing here before we leave Matt and Seth in Episode 2, which is something that Seth said with respect to how to reduce hallucination.   SETH HAIN: At that time, there’s a lot of conversation in the industry around something called RAG, or retrieval-augmented generation. And the idea was, could you pull the relevant bits, the relevant pieces of the chart, into that prompt, that information you shared with the generative AI model, to be able to increase the usefulness of the draft that was being created? And that approach ended up proving and continues to be to some degree, although the techniques have greatly improved, somewhat brittle, right. And I think this becomes one of the things that we are and will continue to improve upon because, as you get a richer and richer amount of information into the model, it does a better job of responding.  LEE: Yeah, so, Carey, this sort of gets at what you were saying, you know, that shouldn’t these models be just bringing in a lot more information into their thought processes? And I’m certain when we wrote our book, I had no idea. I did not conceive of RAG at all. It emerged a few months later.   And to my mind, I remember the first time I encountered RAG—Oh, this is going to solve all of our problems of hallucination. But it’s turned out to be harder. It’s improving day by day, but it’s turned out to be a lot harder.  KOHANE: Seth makes a very deep point, which is the way RAG is implemented is basically some sort of technique for pulling the right information that’s contextually relevant. And the way that’s done is typically heuristic at best. And it’s not … doesn’t have the same depth of reasoning that the rest of the model has.   And I’m just wondering, Peter, what you think, given the fact that now context lengths seem to be approaching a million or more, and people are now therefore using the full strength of the transformer on that context and are trying to figure out different techniques to make it pay attention to the middle of the context. In fact, the RAG approach perhaps was just a transient solution to the fact that it’s going to be able to amazingly look in a thoughtful way at the entire record of the patient, for example. What do you think, Peter?  LEE: I think there are three things, you know, that are going on, and I’m not sure how they’re going to play out and how they’re going to be balanced. And I’m looking forward to talking to people in later episodes of this podcast, you know, people like Sébastien Bubeck or Bill Gates about this, because, you know, there is the pretraining phase, you know, when things are sort of compressed and baked into the base model.   There is the in-context learning, you know, so if you have extremely long or infinite context, you’re kind of learning as you go along. And there are other techniques that people are working on, you know, various sorts of dynamic reinforcement learning approaches, and so on. And then there is what maybe you would call structured RAG, where you do a pre-processing. You go through a big database, and you figure it all out. And make a very nicely structured database the AI can then consult with later.   And all three of these in different contexts today seem to show different capabilities. But they’re all pretty important in medicine.  [TRANSITION MUSIC]  Moving on to Episode 3, we talked to Dave DeBronkart, who is also known as “e-Patient Dave,” an advocate of patient empowerment, and then also Christina Farr, who has been doing a lot of venture investing for consumer health applications.   Let’s get right into this little snippet from something that e-Patient Dave said that talks about the sources of medical information, particularly relevant for when he was receiving treatment for stage 4 kidney cancer.  DAVE DEBRONKART: And I’m making a point here of illustrating that I am anything but medically trained, right. And yet I still, I want to understand as much as I can. I was months away from dead when I was diagnosed, but in the patient community, I learned that they had a whole bunch of information that didn’t exist in the medical literature. Now today we understand there’s publication delays; there’s all kinds of reasons. But there’s also a whole bunch of things, especially in an unusual condition, that will never rise to the level of deserving NIH [National Institute of Health] funding and research. LEE: All right. So I have a question for you, Carey, and a question for you, Zak, about the whole conversation with e-Patient Dave, which I thought was really remarkable. You know, Carey, I think as we were preparing for this whole podcast series, you made a comment—I actually took it as a complaint—that not as much has happened as I had hoped or thought. People aren’t thinking boldly enough, you know, and I think, you know, I agree with you in the sense that I think we expected a lot more to be happening, particularly in the consumer space. I’m giving you a chance to vent about this.  GOLDBERG: [LAUGHTER] Thank you! Yes, that has been by far the most frustrating thing to me. I think that the potential for AI to improve everybody’s health is so enormous, and yet, you know, it needs some sort of support to be able to get to the point where it can do that. Like, remember in the book we wrote about Greg Moore talking about how half of the planet doesn’t have healthcare, but people overwhelmingly have cellphones. And so you could connect people who have no healthcare to the world’s medical knowledge, and that could certainly do some good.   And I have one great big problem with e-Patient Dave, which is that, God, he’s fabulous. He’s super smart. Like, he’s not a typical patient. He’s an off-the-charts, brilliant patient. And so it’s hard to … and so he’s a great sort of lead early-adopter-type person, and he can sort of show the way for others.   But what I had hoped for was that there would be more visible efforts to really help patients optimize their healthcare. Probably it’s happening a lot in quiet ways like that any discharge instructions can be instantly beautifully translated into a patient’s native language and so on. But it’s almost like there isn’t a mechanism to allow this sort of mass consumer adoption that I would hope for. LEE: Yeah. But you have written some, like, you even wrote about that person who saved his dog (opens in new tab). So do you think … you know, and maybe a lot more of that is just happening quietly that we just never hear about?  GOLDBERG: I’m sure that there is a lot of it happening quietly. And actually, that’s another one of my complaints is that no one is gathering that stuff. It’s like you might happen to see something on social media. Actually, e-Patient Dave has a hashtag, PatientsUseAI, and a blog, as well. So he’s trying to do it. But I don’t know of any sort of overarching or academic efforts to, again, to surveil what’s the actual use in the population and see what are the pros and cons of what’s happening.  LEE: Mm-hmm. So, Zak, you know, the thing that I thought about, especially with that snippet from Dave, is your opening for Chapter 8 that you wrote, you know, about your first patient dying in your arms. I still think of how traumatic that must have been. Because, you know, in that opening, you just talked about all the little delays, all the little paper-cut delays, in the whole process of getting some new medical technology approved. But there’s another element that Dave kind of speaks to, which is just, you know, patients who are experiencing some issue are very, sometimes very motivated. And there’s just a lot of stuff on social media that happens.  KOHANE: So this is where I can both agree with Carey and also disagree. I think when people have an actual health problem, they are now routinely using it.  GOLDBERG: Yes, that’s true.  KOHANE: And that situation is happening more often because medicine is failing. This is something that did not come up enough in our book. And perhaps that’s because medicine is actually feeling a lot more rickety today than it did even two years ago.   We actually mentioned the problem. I think, Peter, you may have mentioned the problem with the lack of primary care. But now in Boston, our biggest healthcare system, all the practices for primary care are closed. I cannot get for my own faculty—residents at MGH [Massachusetts General Hospital] can’t get primary care doctor. And so …  LEE: Which is just crazy. I mean, these are amongst the most privileged people in medicine, and they can’t find a primary care physician. That’s incredible.  KOHANE: Yeah, and so therefore … and I wrote an And so therefore, you see people who know that they have a six-month wait till they see the doctor, and all they can do is say, “I have this rash. Here’s a picture. What’s it likely to be? What can I do?” “I’m gaining weight. How do I do a ketogenic diet?” Or, “How do I know that this is the flu?”    This is happening all the time, where acutely patients have actually solved problems that doctors have not. Those are spectacular. But I’m saying more routinely because of the failure of medicine. And it’s not just in our fee-for-service United States. It’s in the UK; it’s in France. These are first-world, developed-world problems. And we don’t even have to go to lower- and middle-income countries for that. LEE: Yeah.  GOLDBERG: But I think it’s important to note that, I mean, so you’re talking about how even the most elite people in medicine can’t get the care they need. But there’s also the point that we have so much concern about equity in recent years. And it’s likeliest that what we’re doing is exacerbating inequity because it’s only the more connected, you know, better off people who are using AI for their health.  KOHANE: Oh, yes. I know what various Harvard professors are doing. They’re paying for a concierge doctor. And that’s, you know, a $5,000- to $10,000-a-year-minimum investment. That’s inequity.  LEE: When we wrote our book, you know, the idea that GPT-4 wasn’t trained specifically for medicine, and that was amazing, but it might get even better and maybe would be necessary to do that. But one of the insights for me is that in the consumer space, the kinds of things that people ask about are different than what the board-certified clinician would ask.  KOHANE: Actually, that’s, I just recently coined the term. It’s the … maybe it’s … well, at least it’s new to me. It’s the technology or expert paradox. And that is the more expert and narrow your medical discipline, the more trivial it is to translate that into a specialized AI. So echocardiograms? We can now do beautiful echocardiograms. That’s really hard to do. I don’t know how to interpret an echocardiogram. But they can do it really, really well. Interpret an EEG [electroencephalogram]. Interpret a genomic sequence. But understanding the fullness of the human condition, that’s actually hard. And actually, that’s what primary care doctors do best. But the paradox is right now, what is easiest for AI is also the most highly paid in medicine. [LAUGHTER] Whereas what is the hardest for AI in medicine is the least regarded, least paid part of medicine.  GOLDBERG: So this brings us to the question I wanted to throw at both of you actually, which is we’ve had this spasm of incredibly prominent people predicting that in fact physicians would be pretty obsolete within the next few years. We had Bill Gates saying that; we had Elon Musk saying surgeons are going to be obsolete within a few years. And I think we had Demis Hassabis saying, “Yeah, we’ll probably cure most diseases within the next decade or so.” [LAUGHS]  So what do you think? And also, Zak, to what you were just saying, I mean, you’re talking about being able to solve very general overarching problems. But in fact, these general overarching models are actually able, I would think, are able to do that because they are broad. So what are we heading towards do you think? What should the next book be … The end of doctors? [LAUGHS]  KOHANE: So I do recall a conversation that … we were at a table with Bill Gates, and Bill Gates immediately went to this, which is advancing the cutting edge of science. And I have to say that I think it will accelerate discovery. But eliminating, let’s say, cancer? I think that’s going to be … that’s just super hard. The reason it’s super hard is we don’t have the data or even the beginnings of the understanding of all the ways this devilish disease managed to evolve around our solutions.   And so that seems extremely hard. I think we’ll make some progress accelerated by AI, but solving it in a way Hassabis says, God bless him. I hope he’s right. I’d love to have to eat crow in 10 or 20 years, but I don’t think so. I do believe that a surgeon working on one of those Davinci machines, that stuff can be, I think, automated.   And so I think that’s one example of one of the paradoxes I described. And it won’t be that we’re replacing doctors. I just think we’re running out of doctors. I think it’s really the case that, as we said in the book, we’re getting a huge deficit in primary care doctors.  But even the subspecialties, my subspecialty, pediatric endocrinology, we’re only filling half of the available training slots every year. And why? Because it’s a lot of work, a lot of training, and frankly doesn’t make as much money as some of the other professions.   LEE: Yeah. Yeah, I tend to think that, you know, there are going to be always a need for human doctors, not for their skills. In fact, I think their skills increasingly will be replaced by machines. And in fact, I’ve talked about a flip. In fact, patients will demand, Oh my god, you mean you’re going to try to do that yourself instead of having the computer do it? There’s going to be that sort of flip. But I do think that when it comes to people’s health, people want the comfort of an authority figure that they trust. And so what is more of a question for me is whether we will ever view a machine as an authority figure that we can trust.  And before I move on to Episode 4, which is on norms, regulations and ethics, I’d like to hear from Chrissy Farr on one more point on consumer health, specifically as it relates to pregnancy:  CHRISTINA FARR: For a lot of women, it’s their first experience with the hospital. And, you know, I think it’s a really big opportunity for these systems to get a whole family on board and keep them kind of loyal. And a lot of that can come through, you know, just delivering an incredible service. Unfortunately, I don’t think that we are delivering incredible services today to women in this country. I see so much room for improvement. LEE: In the consumer space, I don’t think we really had a focus on those periods in a person’s life when they have a lot of engagement, like pregnancy, or I think another one is menopause, cancer. You know, there are points where there is, like, very intense engagement. And we heard that from e-Patient Dave, you know, with his cancer and Chrissy with her pregnancy. Was that a miss in our book? What do think, Carey?  GOLDBERG: I mean, I don’t think so. I think it’s true that there are many points in life when people are highly engaged. To me, the problem thus far is just that I haven’t seen consumer-facing companies offering beautiful AI-based products. I think there’s no question at all that the market is there if you have the products to offer.  LEE: So, what do you think this means, Zak, for, you know, like Boston Children’s or Mass General Brigham—you know, the big places?  KOHANE: So again, all these large healthcare systems are in tough shape. MGB [Mass General Brigham] would be fully in the red if not for the fact that its investments, of all things, have actually produced. If you look at the large healthcare systems around the country, they are in the red. And there’s multiple reasons why they’re in the red, but among them is cost of labor.   And so we’ve created what used to be a very successful beast, the health center. But it’s developed a very expensive model and a highly regulated model. And so when you have high revenue, tiny margins, your ability to disrupt yourself, to innovate, is very, very low because you will have to talk to the board next year if you went from 2% positive margin to 1% negative margin.   LEE: Yeah.  KOHANE: And so I think we’re all waiting for one of the two things to happen, either a new kind of healthcare delivery system being generated or ultimately one of these systems learns how to disrupt itself.   LEE: Yeah. GOLDBERG: We punted. [LAUGHS] We totally punted to the AI.  LEE: We had three amazing guests. One was Laura Adams from National Academy of Medicine. Let’s play a snippet from her.  LAURA ADAMS: I think one of the most provocative and exciting articles that I saw written recently was by Bakul Patel and David Blumenthal, who posited, should we be regulating generative AI as we do a licensed and qualified provider? Should it be treated in the sense that it’s got to have a certain amount of training and a foundation that’s got to pass certain tests? Does it have to report its performance? And I’m thinking, what a provocative idea, but it’s worth considering. LEE: All right, so I very well remember that we had discussed this kind of idea when we were writing our book. And I think before we finished our book, I personally rejected the idea. But now two years later, what do the two of you think? I’m dying to hear.  GOLDBERG: Well, wait, why … what do you think? Like, are you sorry that you rejected it?  LEE: I’m still skeptical because when we are licensing human beings as doctors, you know, we’re making a lot of implicit assumptions that we don’t test as part of their licensure, you know, that first of all, they are [a] human being and they care about life, and that, you know, they have a certain amount of common sense and shared understanding of the world.   And there’s all sorts of sort of implicit assumptions that we have about each other as human beings living in a society together. That you know how to study, you know, because I know you just went through three years of medical or four years of medical school and all sorts of things. And so the standard ways that we license human beings, they don’t need to test all of that stuff. But somehow intuitively, all of that seems really important.  I don’t know. Am I wrong about that?  KOHANE: So it’s compared with what issue? Because we know for a fact that doctors who do a lot of a procedure, like do this procedure, like high-risk deliveries all the time, have better outcomes than ones who only do a few high risk. We talk about it, but we don’t actually make it explicit to patients or regulate that you have to have this minimal amount. And it strikes me that in some sense, and, oh, very importantly, these things called human beings learn on the job. And although I used to be very resentful of it as a resident, when someone would say, I don’t want the resident, I want the …  GOLDBERG: … the attending. [LAUGHTER]  KOHANE: … they had a point. And so the truth is, maybe I was a wonderful resident, but some people were not so great. [LAUGHTER] And so it might be the best outcome if we actually, just like for human beings, we say, yeah, OK, it’s this good, but don’t let it work autonomously, or it’s done a thousand of them, just let it go. We just don’t have practically speaking, we don’t have the environment, the lab, to test them. Now, maybe if they get embodied in robots and literally go around with us, then it’s going to be [in some sense] a lot easier. I don’t know.  LEE: Yeah.   GOLDBERG: Yeah, I think I would take a step back and say, first of all, we weren’t the only ones who were stumped by regulating AI. Like, nobody has done it yet in the United States to this day, right. Like, we do not have standing regulation of AI in medicine at all in fact. And that raises the issue of … the story that you hear often in the biotech business, which is, you know, more prominent here in Boston than anywhere else, is that thank goodness Cambridge put out, the city of Cambridge, put out some regulations about biotech and how you could dump your lab waste and so on. And that enabled the enormous growth of biotech here.   If you don’t have the regulations, then you can’t have the growth of AI in medicine that is worthy of having. And so, I just … we’re not the ones who should do it, but I just wish somebody would.   LEE: Yeah.  GOLDBERG: Zak.  KOHANE: Yeah, but I want to say this as always, execution is everything, even in regulation.   And so I’m mindful that a conference that both of you attended, the RAISE conference [Responsible AI for Social and Ethical Healthcare] (opens in new tab). The Europeans in that conference came to me personally and thanked me for organizing this conference about safe and effective use of AI because they said back home in Europe, all that we’re talking about is risk, not opportunities to improve care.   And so there is a version of regulation which just locks down the present and does not allow the future that we’re talking about to happen. And so, Carey, I absolutely hear you that we need to have a regulation that takes away some of the uncertainty around liability, around the freedom to operate that would allow things to progress. But we wrote in our book that premature regulation might actually focus on the wrong thing. And so since I’m an optimist, it may be the fact that we don’t have much of a regulatory infrastructure today, that it allows … it’s a unique opportunity—I’ve said this now to several leaders—for the healthcare systems to say, this is the regulation we need.   GOLDBERG: It’s true.  KOHANE: And previously it was top-down. It was coming from the administration, and those executive orders are now history. But there is an opportunity, which may or may not be attained, there is an opportunity for the healthcare leadership—for experts in surgery—to say, “This is what we should expect.”   LEE: Yeah.   KOHANE: I would love for this to happen. I haven’t seen evidence that it’s happening yet.  GOLDBERG: No, no. And there’s this other huge issue, which is that it’s changing so fast. It’s moving so fast. That something that makes sense today won’t in six months. So, what do you do about that?  LEE: Yeah, yeah, that is something I feel proud of because when I went back and looked at our chapter on this, you know, we did make that point, which I think has turned out to be true.   But getting back to this conversation, there’s something, a snippet of something, that Vardit Ravitsky said that I think touches on this topic.   VARDIT RAVITSKY: So my pushback is, are we seeing AI exceptionalism in the sense that if it’s AI, huh, panic! We have to inform everybody about everything, and we have to give them choices, and they have to be able to reject that tool and the other tool versus, you know, the rate of human error in medicine is awful. So why are we so focused on informed consent and empowerment regarding implementation of AI and less in other contexts? GOLDBERG: Totally agree. Who cares about informed consent about AI. Don’t want it. Don’t need it. Nope.  LEE: Wow. Yeah. You know, and this … Vardit of course is one of the leading bioethicists, you know, and of course prior to AI, she was really focused on genetics. But now it’s all about AI.   And, Zak, you know, you and other doctors have always told me, you know, the truth of the matter is, you know, what do you call the bottom-of-the-class graduate of a medical school?  And the answer is “doctor.”  KOHANE: “Doctor.” Yeah. Yeah, I think that again, this gets to compared with what? We have to compare AI not to the medicine we imagine we have, or we would like to have, but to the medicine we have today. And if we’re trying to remove inequity, if we’re trying to improve our health, that’s what … those are the right metrics. And so that can be done so long as we avoid catastrophic consequences of AI.   So what would the catastrophic consequence of AI be? It would be a systematic behavior that we were unaware of that was causing poor healthcare. So, for example, you know, changing the dose on a medication, making it 20% higher than normal so that the rate of complications of that medication went from 1% to 5%. And so we do need some sort of monitoring.   We haven’t put out the paper yet, but in computer science, there’s, well, in programming, we know very well the value for understanding how our computer systems work.   And there was a guy by name of Allman, I think he’s still at a company called Sendmail, who created something called syslog. And syslog is basically a log of all the crap that’s happening in our operating system. And so I’ve been arguing now for the creation of MedLog. And MedLog … in other words, what we cannot measure, we cannot regulate, actually.  LEE: Yes.  KOHANE: And so what we need to have is MedLog, which says, “Here’s the context in which a decision was made. Here’s the version of the AI, you know, the exact version of the AI. Here was the data.” And we just have MedLog. And I think MedLog is actually incredibly important for being able to measure, to just do what we do in … it’s basically the black box for, you know, when there’s a crash. You know, we’d like to think we could do better than crash. We can say, “Oh, we’re seeing from MedLog that this practice is turning a little weird.” But worst case, patient dies, [we] can see in MedLog, what was the information this thing knew about it? And did it make the right decision? We can actually go for transparency, which like in aviation, is much greater than in most human endeavors.   GOLDBERG: Sounds great.  LEE: Yeah, it’s sort of like a black box. I was thinking of the aviation black box kind of idea. You know, you bring up medication errors, and I have one more snippet. This is from our guest Roxana Daneshjou from Stanford. ROXANA DANESHJOU: There was a mistake in her after-visit summary about how much Tylenol she could take. But I, as a physician, knew that this dose was a mistake. I actually asked ChatGPT. I gave it the whole after-visit summary, and I said, are there any mistakes here? And it clued in that the dose of the medication was wrong. LEE: Yeah, so this is something we did write about in the book. We made a prediction that AI might be a second set of eyes, I think is the way we put it, catching things. And we actually had examples specifically in medication dose errors. I think for me, I expected to see a lot more of that than we are.  KOHANE: Yeah, it goes back to our conversation about Epic or competitor Epic doing that. I think we’re going to see that having oversight over all medical orders, all orders in the system, critique, real-time critique, where we’re both aware of alert fatigue. So we don’t want to have too many false positives. At the same time, knowing what are critical errors which could immediately affect lives. I think that is going to become in terms of—and driven by quality measures—a product.  GOLDBERG: And I think word will spread among the general public that kind of the same way in a lot of countries when someone’s in a hospital, the first thing people ask relatives are, well, who’s with them? Right?   LEE: Yeah. Yup.  GOLDBERG: You wouldn’t leave someone in hospital without relatives. Well, you wouldn’t maybe leave your medical …   KOHANE: By the way, that country is called the United States.  GOLDBERG: Yes, that’s true. [LAUGHS] It is true here now, too. But similarly, I would tell any loved one that they would be well advised to keep using AI to check on their medical care, right. Why not?  LEE: Yeah. Yeah. Last topic, just for this Episode 4. Roxana, of course, I think really made a name for herself in the AI era writing, actually just prior to ChatGPT, you know, writing some famous papers about how computer vision systems for dermatology were biased against dark-skinned people. And we did talk some about bias in these AI systems, but I feel like we underplayed it, or we didn’t understand the magnitude of the potential issues. What are your thoughts?  KOHANE: OK, I want to push back, because I’ve been asked this question several times. And so I have two comments. One is, over 100,000 doctors practicing medicine, I know they have biases. Some of them actually may be all in the same direction, and not good. But I have no way of actually measuring that. With AI, I know exactly how to measure that at scale and affordably. Number one. Number two, same 100,000 doctors. Let’s say I do know what their biases are. How hard is it for me to change that bias? It’s impossible …  LEE: Yeah, yeah.   KOHANE: … practically speaking. Can I change the bias in the AI? Somewhat. Maybe some completely.  I think that we’re in a much better situation.  GOLDBERG: Agree.  LEE: I think Roxana made also the super interesting point that there’s bias in the whole system, not just in individuals, but, you know, there’s structural bias, so to speak.   KOHANE: There is.  LEE: Yeah. Hmm. There was a super interesting paper that Roxana wrote not too long ago—her and her collaborators—showing AI’s ability to detect, to spot bias decision-making by others. Are we going to see more of that?  KOHANE: Oh, yeah, I was very pleased when, in NEJM AI [New England Journal of Medicine Artificial Intelligence], we published a piece with Marzyeh Ghassemi (opens in new tab), and what they were talking about was actually—and these are researchers who had published extensively on bias and threats from AI. And they actually, in this article, did the flip side, which is how much better AI can do than human beings in this respect.   And so I think that as some of these computer scientists enter the world of medicine, they’re becoming more and more aware of human foibles and can see how these systems, which if they only looked at the pretrained state, would have biases. But now, where we know how to fine-tune the de-bias in a variety of ways, they can do a lot better and, in fact, I think are much more … a much greater reason for optimism that we can change some of these noxious biases than in the pre-AI era.  GOLDBERG: And thinking about Roxana’s dermatological work on how I think there wasn’t sufficient work on skin tone as related to various growths, you know, I think that one thing that we totally missed in the book was the dawn of multimodal uses, right.  LEE: Yeah. Yeah, yeah.  GOLDBERG: That’s been truly amazing that in fact all of these visual and other sorts of data can be entered into the models and move them forward.  LEE: Yeah. Well, maybe on these slightly more optimistic notes, we’re at time. You know, I think ultimately, I feel pretty good still about what we did in our book, although there were a lot of misses. [LAUGHS] I don’t think any of us could really have predicted really the extent of change in the world.   [TRANSITION MUSIC]  So, Carey, Zak, just so much fun to do some reminiscing but also some reflection about what we did.  [THEME MUSIC]  And to our listeners, as always, thank you for joining us. We have some really great guests lined up for the rest of the series, and they’ll help us explore a variety of relevant topics—from AI drug discovery to what medical students are seeing and doing with AI and more.   We hope you’ll continue to tune in. And if you want to catch up on any episodes you might have missed, you can find them at aka.ms/AIrevolutionPodcast (opens in new tab) or wherever you listen to your favorite podcasts.    Until next time.   [MUSIC FADES]
    0 Comentários ·0 Compartilhamentos ·0 Anterior
CGShares https://cgshares.com