• So, there’s this thing called the Franck-Hertz experiment. It’s one of those physics experiments that people rave about, but honestly, I don’t get why. It was done way back in 1914, and it’s supposed to explain how energy comes in these “packets” called “quanta.” Sounds fancy, but like, does it really change anything?

    They say this experiment marked the start of quantum physics, which I guess is important for some. It’s all about those little particles and how they behave. If you’re into that sort of thing, you might want to look into doing a DIY version of the Franck-Hertz experiment. Apparently, it’s not too hard and you can even do it at home. But let’s be real, who has the energy for that?

    You just set up a tube with some mercury vapor and run some voltage through it. Then you measure the current and see how it changes as you adjust the voltage. It’s all about those energy levels and how electrons bounce around. But, like, I don’t know how many people are actually excited to do this. Maybe if you’re a physics enthusiast, it’ll be fun for you.

    But if you’re like me and prefer to just scroll through your phone or binge-watch a show, then this sounds like a lot of work for not much payoff. I mean, who really wants to dive into the intricacies of quantum physics when there are so many other things to do—like anything else?

    So, if you’re curious about the Franck-Hertz experiment and want to try it yourself, go ahead. Just know that you might end up feeling a bit underwhelmed. Science can be cool, but sometimes it feels like a chore, especially when it’s all about tiny particles that you can’t even see.

    Anyway, that’s my take on it. If you’re still interested in quantum physics after this, good for you. I’ll just be over here, probably napping or scrolling through social media.

    #FranckHertz #QuantumPhysics #DIYScience #PhysicsExperiment #Boredom
    So, there’s this thing called the Franck-Hertz experiment. It’s one of those physics experiments that people rave about, but honestly, I don’t get why. It was done way back in 1914, and it’s supposed to explain how energy comes in these “packets” called “quanta.” Sounds fancy, but like, does it really change anything? They say this experiment marked the start of quantum physics, which I guess is important for some. It’s all about those little particles and how they behave. If you’re into that sort of thing, you might want to look into doing a DIY version of the Franck-Hertz experiment. Apparently, it’s not too hard and you can even do it at home. But let’s be real, who has the energy for that? You just set up a tube with some mercury vapor and run some voltage through it. Then you measure the current and see how it changes as you adjust the voltage. It’s all about those energy levels and how electrons bounce around. But, like, I don’t know how many people are actually excited to do this. Maybe if you’re a physics enthusiast, it’ll be fun for you. But if you’re like me and prefer to just scroll through your phone or binge-watch a show, then this sounds like a lot of work for not much payoff. I mean, who really wants to dive into the intricacies of quantum physics when there are so many other things to do—like anything else? So, if you’re curious about the Franck-Hertz experiment and want to try it yourself, go ahead. Just know that you might end up feeling a bit underwhelmed. Science can be cool, but sometimes it feels like a chore, especially when it’s all about tiny particles that you can’t even see. Anyway, that’s my take on it. If you’re still interested in quantum physics after this, good for you. I’ll just be over here, probably napping or scrolling through social media. #FranckHertz #QuantumPhysics #DIYScience #PhysicsExperiment #Boredom
    A DIY Version of the Franck-Hertz Experiment
    The Franck–Hertz experiment was a pioneering physics observation announced in 1914 which explained that energy came in “packets” which we call “quanta”, marking the beginning of quantum physics. Recently, [Markus …read m
    Like
    Love
    Wow
    Sad
    Angry
    536
    1 Комментарии 0 Поделились
  • Fusion and AI: How private sector tech is powering progress at ITER

    In April 2025, at the ITER Private Sector Fusion Workshop in Cadarache, something remarkable unfolded. In a room filled with scientists, engineers and software visionaries, the line between big science and commercial innovation began to blur.  
    Three organisations – Microsoft Research, Arena and Brigantium Engineering – shared how artificial intelligence, already transforming everything from language models to logistics, is now stepping into a new role: helping humanity to unlock the power of nuclear fusion. 
    Each presenter addressed a different part of the puzzle, but the message was the same: AI isn’t just a buzzword anymore. It’s becoming a real tool – practical, powerful and indispensable – for big science and engineering projects, including fusion. 
    “If we think of the agricultural revolution and the industrial revolution, the AI revolution is next – and it’s coming at a pace which is unprecedented,” said Kenji Takeda, director of research incubations at Microsoft Research. 
    Microsoft’s collaboration with ITER is already in motion. Just a month before the workshop, the two teams signed a Memorandum of Understandingto explore how AI can accelerate research and development. This follows ITER’s initial use of Microsoft technology to empower their teams.
    A chatbot in Azure OpenAI service was developed to help staff navigate technical knowledge, on more than a million ITER documents, using natural conversation. GitHub Copilot assists with coding, while AI helps to resolve IT support tickets – those everyday but essential tasks that keep the lights on. 
    But Microsoft’s vision goes deeper. Fusion demands materials that can survive extreme conditions – heat, radiation, pressure – and that’s where AI shows a different kind of potential. MatterGen, a Microsoft Research generative AI model for materials, designs entirely new materials based on specific properties.
    “It’s like ChatGPT,” said Takeda, “but instead of ‘Write me a poem’, we ask it to design a material that can survive as the first wall of a fusion reactor.” 
    The next step? MatterSim – a simulation tool that predicts how these imagined materials will behave in the real world. By combining generation and simulation, Microsoft hopes to uncover materials that don’t yet exist in any catalogue. 
    While Microsoft tackles the atomic scale, Arena is focused on a different challenge: speeding up hardware development. As general manager Michael Frei put it: “Software innovation happens in seconds. In hardware, that loop can take months – or years.” 
    Arena’s answer is Atlas, a multimodal AI platform that acts as an extra set of hands – and eyes – for engineers. It can read data sheets, interpret lab results, analyse circuit diagrams and even interact with lab equipment through software interfaces. “Instead of adjusting an oscilloscope manually,” said Frei, “you can just say, ‘Verify the I2Cprotocol’, and Atlas gets it done.” 
    It doesn’t stop there. Atlas can write and adapt firmware on the fly, responding to real-time conditions. That means tighter feedback loops, faster prototyping and fewer late nights in the lab. Arena aims to make building hardware feel a little more like writing software – fluid, fast and assisted by smart tools. 

    Fusion, of course, isn’t just about atoms and code – it’s also about construction. Gigantic, one-of-a-kind machines don’t build themselves. That’s where Brigantium Engineering comes in.
    Founder Lynton Sutton explained how his team uses “4D planning” – a marriage of 3D CAD models and detailed construction schedules – to visualise how everything comes together over time. “Gantt charts are hard to interpret. 3D models are static. Our job is to bring those together,” he said. 
    The result is a time-lapse-style animation that shows the construction process step by step. It’s proven invaluable for safety reviews and stakeholder meetings. Rather than poring over spreadsheets, teams can simply watch the plan come to life. 
    And there’s more. Brigantium is bringing these models into virtual reality using Unreal Engine – the same one behind many video games. One recent model recreated ITER’s tokamak pit using drone footage and photogrammetry. The experience is fully interactive and can even run in a web browser.
    “We’ve really improved the quality of the visualisation,” said Sutton. “It’s a lot smoother; the textures look a lot better. Eventually, we’ll have this running through a web browser, so anybody on the team can just click on a web link to navigate this 4D model.” 
    Looking forward, Sutton believes AI could help automate the painstaking work of syncing schedules with 3D models. One day, these simulations could reach all the way down to individual bolts and fasteners – not just with impressive visuals, but with critical tools for preventing delays. 
    Despite the different approaches, one theme ran through all three presentations: AI isn’t just a tool for office productivity. It’s becoming a partner in creativity, problem-solving and even scientific discovery. 
    Takeda mentioned that Microsoft is experimenting with “world models” inspired by how video games simulate physics. These models learn about the physical world by watching pixels in the form of videos of real phenomena such as plasma behaviour. “Our thesis is that if you showed this AI videos of plasma, it might learn the physics of plasmas,” he said. 
    It sounds futuristic, but the logic holds. The more AI can learn from the world, the more it can help us understand it – and perhaps even master it. At its heart, the message from the workshop was simple: AI isn’t here to replace the scientist, the engineer or the planner; it’s here to help, and to make their work faster, more flexible and maybe a little more fun.
    As Takeda put it: “Those are just a few examples of how AI is starting to be used at ITER. And it’s just the start of that journey.” 
    If these early steps are any indication, that journey won’t just be faster – it might also be more inspired. 
    #fusion #how #private #sector #tech
    Fusion and AI: How private sector tech is powering progress at ITER
    In April 2025, at the ITER Private Sector Fusion Workshop in Cadarache, something remarkable unfolded. In a room filled with scientists, engineers and software visionaries, the line between big science and commercial innovation began to blur.   Three organisations – Microsoft Research, Arena and Brigantium Engineering – shared how artificial intelligence, already transforming everything from language models to logistics, is now stepping into a new role: helping humanity to unlock the power of nuclear fusion.  Each presenter addressed a different part of the puzzle, but the message was the same: AI isn’t just a buzzword anymore. It’s becoming a real tool – practical, powerful and indispensable – for big science and engineering projects, including fusion.  “If we think of the agricultural revolution and the industrial revolution, the AI revolution is next – and it’s coming at a pace which is unprecedented,” said Kenji Takeda, director of research incubations at Microsoft Research.  Microsoft’s collaboration with ITER is already in motion. Just a month before the workshop, the two teams signed a Memorandum of Understandingto explore how AI can accelerate research and development. This follows ITER’s initial use of Microsoft technology to empower their teams. A chatbot in Azure OpenAI service was developed to help staff navigate technical knowledge, on more than a million ITER documents, using natural conversation. GitHub Copilot assists with coding, while AI helps to resolve IT support tickets – those everyday but essential tasks that keep the lights on.  But Microsoft’s vision goes deeper. Fusion demands materials that can survive extreme conditions – heat, radiation, pressure – and that’s where AI shows a different kind of potential. MatterGen, a Microsoft Research generative AI model for materials, designs entirely new materials based on specific properties. “It’s like ChatGPT,” said Takeda, “but instead of ‘Write me a poem’, we ask it to design a material that can survive as the first wall of a fusion reactor.”  The next step? MatterSim – a simulation tool that predicts how these imagined materials will behave in the real world. By combining generation and simulation, Microsoft hopes to uncover materials that don’t yet exist in any catalogue.  While Microsoft tackles the atomic scale, Arena is focused on a different challenge: speeding up hardware development. As general manager Michael Frei put it: “Software innovation happens in seconds. In hardware, that loop can take months – or years.”  Arena’s answer is Atlas, a multimodal AI platform that acts as an extra set of hands – and eyes – for engineers. It can read data sheets, interpret lab results, analyse circuit diagrams and even interact with lab equipment through software interfaces. “Instead of adjusting an oscilloscope manually,” said Frei, “you can just say, ‘Verify the I2Cprotocol’, and Atlas gets it done.”  It doesn’t stop there. Atlas can write and adapt firmware on the fly, responding to real-time conditions. That means tighter feedback loops, faster prototyping and fewer late nights in the lab. Arena aims to make building hardware feel a little more like writing software – fluid, fast and assisted by smart tools.  Fusion, of course, isn’t just about atoms and code – it’s also about construction. Gigantic, one-of-a-kind machines don’t build themselves. That’s where Brigantium Engineering comes in. Founder Lynton Sutton explained how his team uses “4D planning” – a marriage of 3D CAD models and detailed construction schedules – to visualise how everything comes together over time. “Gantt charts are hard to interpret. 3D models are static. Our job is to bring those together,” he said.  The result is a time-lapse-style animation that shows the construction process step by step. It’s proven invaluable for safety reviews and stakeholder meetings. Rather than poring over spreadsheets, teams can simply watch the plan come to life.  And there’s more. Brigantium is bringing these models into virtual reality using Unreal Engine – the same one behind many video games. One recent model recreated ITER’s tokamak pit using drone footage and photogrammetry. The experience is fully interactive and can even run in a web browser. “We’ve really improved the quality of the visualisation,” said Sutton. “It’s a lot smoother; the textures look a lot better. Eventually, we’ll have this running through a web browser, so anybody on the team can just click on a web link to navigate this 4D model.”  Looking forward, Sutton believes AI could help automate the painstaking work of syncing schedules with 3D models. One day, these simulations could reach all the way down to individual bolts and fasteners – not just with impressive visuals, but with critical tools for preventing delays.  Despite the different approaches, one theme ran through all three presentations: AI isn’t just a tool for office productivity. It’s becoming a partner in creativity, problem-solving and even scientific discovery.  Takeda mentioned that Microsoft is experimenting with “world models” inspired by how video games simulate physics. These models learn about the physical world by watching pixels in the form of videos of real phenomena such as plasma behaviour. “Our thesis is that if you showed this AI videos of plasma, it might learn the physics of plasmas,” he said.  It sounds futuristic, but the logic holds. The more AI can learn from the world, the more it can help us understand it – and perhaps even master it. At its heart, the message from the workshop was simple: AI isn’t here to replace the scientist, the engineer or the planner; it’s here to help, and to make their work faster, more flexible and maybe a little more fun. As Takeda put it: “Those are just a few examples of how AI is starting to be used at ITER. And it’s just the start of that journey.”  If these early steps are any indication, that journey won’t just be faster – it might also be more inspired.  #fusion #how #private #sector #tech
    WWW.COMPUTERWEEKLY.COM
    Fusion and AI: How private sector tech is powering progress at ITER
    In April 2025, at the ITER Private Sector Fusion Workshop in Cadarache, something remarkable unfolded. In a room filled with scientists, engineers and software visionaries, the line between big science and commercial innovation began to blur.   Three organisations – Microsoft Research, Arena and Brigantium Engineering – shared how artificial intelligence (AI), already transforming everything from language models to logistics, is now stepping into a new role: helping humanity to unlock the power of nuclear fusion.  Each presenter addressed a different part of the puzzle, but the message was the same: AI isn’t just a buzzword anymore. It’s becoming a real tool – practical, powerful and indispensable – for big science and engineering projects, including fusion.  “If we think of the agricultural revolution and the industrial revolution, the AI revolution is next – and it’s coming at a pace which is unprecedented,” said Kenji Takeda, director of research incubations at Microsoft Research.  Microsoft’s collaboration with ITER is already in motion. Just a month before the workshop, the two teams signed a Memorandum of Understanding (MoU) to explore how AI can accelerate research and development. This follows ITER’s initial use of Microsoft technology to empower their teams. A chatbot in Azure OpenAI service was developed to help staff navigate technical knowledge, on more than a million ITER documents, using natural conversation. GitHub Copilot assists with coding, while AI helps to resolve IT support tickets – those everyday but essential tasks that keep the lights on.  But Microsoft’s vision goes deeper. Fusion demands materials that can survive extreme conditions – heat, radiation, pressure – and that’s where AI shows a different kind of potential. MatterGen, a Microsoft Research generative AI model for materials, designs entirely new materials based on specific properties. “It’s like ChatGPT,” said Takeda, “but instead of ‘Write me a poem’, we ask it to design a material that can survive as the first wall of a fusion reactor.”  The next step? MatterSim – a simulation tool that predicts how these imagined materials will behave in the real world. By combining generation and simulation, Microsoft hopes to uncover materials that don’t yet exist in any catalogue.  While Microsoft tackles the atomic scale, Arena is focused on a different challenge: speeding up hardware development. As general manager Michael Frei put it: “Software innovation happens in seconds. In hardware, that loop can take months – or years.”  Arena’s answer is Atlas, a multimodal AI platform that acts as an extra set of hands – and eyes – for engineers. It can read data sheets, interpret lab results, analyse circuit diagrams and even interact with lab equipment through software interfaces. “Instead of adjusting an oscilloscope manually,” said Frei, “you can just say, ‘Verify the I2C [inter integrated circuit] protocol’, and Atlas gets it done.”  It doesn’t stop there. Atlas can write and adapt firmware on the fly, responding to real-time conditions. That means tighter feedback loops, faster prototyping and fewer late nights in the lab. Arena aims to make building hardware feel a little more like writing software – fluid, fast and assisted by smart tools.  Fusion, of course, isn’t just about atoms and code – it’s also about construction. Gigantic, one-of-a-kind machines don’t build themselves. That’s where Brigantium Engineering comes in. Founder Lynton Sutton explained how his team uses “4D planning” – a marriage of 3D CAD models and detailed construction schedules – to visualise how everything comes together over time. “Gantt charts are hard to interpret. 3D models are static. Our job is to bring those together,” he said.  The result is a time-lapse-style animation that shows the construction process step by step. It’s proven invaluable for safety reviews and stakeholder meetings. Rather than poring over spreadsheets, teams can simply watch the plan come to life.  And there’s more. Brigantium is bringing these models into virtual reality using Unreal Engine – the same one behind many video games. One recent model recreated ITER’s tokamak pit using drone footage and photogrammetry. The experience is fully interactive and can even run in a web browser. “We’ve really improved the quality of the visualisation,” said Sutton. “It’s a lot smoother; the textures look a lot better. Eventually, we’ll have this running through a web browser, so anybody on the team can just click on a web link to navigate this 4D model.”  Looking forward, Sutton believes AI could help automate the painstaking work of syncing schedules with 3D models. One day, these simulations could reach all the way down to individual bolts and fasteners – not just with impressive visuals, but with critical tools for preventing delays.  Despite the different approaches, one theme ran through all three presentations: AI isn’t just a tool for office productivity. It’s becoming a partner in creativity, problem-solving and even scientific discovery.  Takeda mentioned that Microsoft is experimenting with “world models” inspired by how video games simulate physics. These models learn about the physical world by watching pixels in the form of videos of real phenomena such as plasma behaviour. “Our thesis is that if you showed this AI videos of plasma, it might learn the physics of plasmas,” he said.  It sounds futuristic, but the logic holds. The more AI can learn from the world, the more it can help us understand it – and perhaps even master it. At its heart, the message from the workshop was simple: AI isn’t here to replace the scientist, the engineer or the planner; it’s here to help, and to make their work faster, more flexible and maybe a little more fun. As Takeda put it: “Those are just a few examples of how AI is starting to be used at ITER. And it’s just the start of that journey.”  If these early steps are any indication, that journey won’t just be faster – it might also be more inspired. 
    Like
    Love
    Wow
    Sad
    Angry
    490
    2 Комментарии 0 Поделились
  • Rewriting SymCrypt in Rust to modernize Microsoft’s cryptographic library 

    Outdated coding practices and memory-unsafe languages like C are putting software, including cryptographic libraries, at risk. Fortunately, memory-safe languages like Rust, along with formal verification tools, are now mature enough to be used at scale, helping prevent issues like crashes, data corruption, flawed implementation, and side-channel attacks.
    To address these vulnerabilities and improve memory safety, we’re rewriting SymCrypt—Microsoft’s open-source cryptographic library—in Rust. We’re also incorporating formal verification methods. SymCrypt is used in Windows, Azure Linux, Xbox, and other platforms.
    Currently, SymCrypt is primarily written in cross-platform C, with limited use of hardware-specific optimizations through intrinsicsand assembly language. It provides a wide range of algorithms, including AES-GCM, SHA, ECDSA, and the more recent post-quantum algorithms ML-KEM and ML-DSA. 
    Formal verification will confirm that implementations behave as intended and don’t deviate from algorithm specifications, critical for preventing attacks. We’ll also analyze compiled code to detect side-channel leaks caused by timing or hardware-level behavior.
    Proving Rust program properties with Aeneas
    Program verification is the process of proving that a piece of code will always satisfy a given property, no matter the input. Rust’s type system profoundly improves the prospects for program verification by providing strong ownership guarantees, by construction, using a discipline known as “aliasing xor mutability”.
    For example, reasoning about C code often requires proving that two non-const pointers are live and non-overlapping, a property that can depend on external client code. In contrast, Rust’s type system guarantees this property for any two mutably borrowed references.
    As a result, new tools have emerged specifically for verifying Rust code. We chose Aeneasbecause it helps provide a clean separation between code and proofs.
    Developed by Microsoft Azure Research in partnership with Inria, the French National Institute for Research in Digital Science and Technology, Aeneas connects to proof assistants like Lean, allowing us to draw on a large body of mathematical proofs—especially valuable given the mathematical nature of cryptographic algorithms—and benefit from Lean’s active user community.
    Compiling Rust to C supports backward compatibility  
    We recognize that switching to Rust isn’t feasible for all use cases, so we’ll continue to support, extend, and certify C-based APIs as long as users need them. Users won’t see any changes, as Rust runs underneath the existing C APIs.
    Some users compile our C code directly and may rely on specific toolchains or compiler features that complicate the adoption of Rust code. To address this, we will use Eurydice, a Rust-to-C compiler developed by Microsoft Azure Research, to replace handwritten C code with C generated from formally verified Rust. Eurydicecompiles directly from Rust’s MIR intermediate language, and the resulting C code will be checked into the SymCrypt repository alongside the original Rust source code.
    As more users adopt Rust, we’ll continue supporting this compilation path for those who build SymCrypt from source code but aren’t ready to use the Rust compiler. In the long term, we hope to transition users to either use precompiled SymCrypt binaries, or compile from source code in Rust, at which point the Rust-to-C compilation path will no longer be needed.

    Microsoft research podcast

    Ideas: AI and democracy with Madeleine Daepp and Robert Osazuwa Ness
    As the “biggest election year in history” comes to an end, researchers Madeleine Daepp and Robert Osazuwa Ness and Democracy Forward GM Ginny Badanes discuss AI’s impact on democracy, including the tech’s use in Taiwan and India.

    Listen now

    Opens in a new tab
    Timing analysis with Revizor 
    Even software that has been verified for functional correctness can remain vulnerable to low-level security threats, such as side channels caused by timing leaks or speculative execution. These threats operate at the hardware level and can leak private information, such as memory load addresses, branch targets, or division operands, even when the source code is provably correct. 
    To address this, we’re extending Revizor, a tool developed by Microsoft Azure Research, to more effectively analyze SymCrypt binaries. Revizor models microarchitectural leakage and uses fuzzing techniques to systematically uncover instructions that may expose private information through known hardware-level effects.  
    Earlier cryptographic libraries relied on constant-time programming to avoid operations on secret data. However, recent research has shown that this alone is insufficient with today’s CPUs, where every new optimization may open a new side channel. 
    By analyzing binary code for specific compilers and platforms, our extended Revizor tool enables deeper scrutiny of vulnerabilities that aren’t visible in the source code.
    Verified Rust implementations begin with ML-KEM
    This long-term effort is in alignment with the Microsoft Secure Future Initiative and brings together experts across Microsoft, building on decades of Microsoft Research investment in program verification and security tooling.
    A preliminary version of ML-KEM in Rust is now available on the preview feature/verifiedcryptobranch of the SymCrypt repository. We encourage users to try the Rust build and share feedback. Looking ahead, we plan to support direct use of the same cryptographic library in Rust without requiring C bindings. 
    Over the coming months, we plan to rewrite, verify, and ship several algorithms in Rust as part of SymCrypt. As our investment in Rust deepens, we expect to gain new insights into how to best leverage the language for high-assurance cryptographic implementations with low-level optimizations. 
    As performance is key to scalability and sustainability, we’re holding new implementations to a high bar using our benchmarking tools to match or exceed existing systems.
    Looking forward 
    This is a pivotal moment for high-assurance software. Microsoft’s investment in Rust and formal verification presents a rare opportunity to advance one of our key libraries. We’re excited to scale this work and ultimately deliver an industrial-grade, Rust-based, FIPS-certified cryptographic library.
    Opens in a new tab
    #rewriting #symcrypt #rust #modernize #microsofts
    Rewriting SymCrypt in Rust to modernize Microsoft’s cryptographic library 
    Outdated coding practices and memory-unsafe languages like C are putting software, including cryptographic libraries, at risk. Fortunately, memory-safe languages like Rust, along with formal verification tools, are now mature enough to be used at scale, helping prevent issues like crashes, data corruption, flawed implementation, and side-channel attacks. To address these vulnerabilities and improve memory safety, we’re rewriting SymCrypt—Microsoft’s open-source cryptographic library—in Rust. We’re also incorporating formal verification methods. SymCrypt is used in Windows, Azure Linux, Xbox, and other platforms. Currently, SymCrypt is primarily written in cross-platform C, with limited use of hardware-specific optimizations through intrinsicsand assembly language. It provides a wide range of algorithms, including AES-GCM, SHA, ECDSA, and the more recent post-quantum algorithms ML-KEM and ML-DSA.  Formal verification will confirm that implementations behave as intended and don’t deviate from algorithm specifications, critical for preventing attacks. We’ll also analyze compiled code to detect side-channel leaks caused by timing or hardware-level behavior. Proving Rust program properties with Aeneas Program verification is the process of proving that a piece of code will always satisfy a given property, no matter the input. Rust’s type system profoundly improves the prospects for program verification by providing strong ownership guarantees, by construction, using a discipline known as “aliasing xor mutability”. For example, reasoning about C code often requires proving that two non-const pointers are live and non-overlapping, a property that can depend on external client code. In contrast, Rust’s type system guarantees this property for any two mutably borrowed references. As a result, new tools have emerged specifically for verifying Rust code. We chose Aeneasbecause it helps provide a clean separation between code and proofs. Developed by Microsoft Azure Research in partnership with Inria, the French National Institute for Research in Digital Science and Technology, Aeneas connects to proof assistants like Lean, allowing us to draw on a large body of mathematical proofs—especially valuable given the mathematical nature of cryptographic algorithms—and benefit from Lean’s active user community. Compiling Rust to C supports backward compatibility   We recognize that switching to Rust isn’t feasible for all use cases, so we’ll continue to support, extend, and certify C-based APIs as long as users need them. Users won’t see any changes, as Rust runs underneath the existing C APIs. Some users compile our C code directly and may rely on specific toolchains or compiler features that complicate the adoption of Rust code. To address this, we will use Eurydice, a Rust-to-C compiler developed by Microsoft Azure Research, to replace handwritten C code with C generated from formally verified Rust. Eurydicecompiles directly from Rust’s MIR intermediate language, and the resulting C code will be checked into the SymCrypt repository alongside the original Rust source code. As more users adopt Rust, we’ll continue supporting this compilation path for those who build SymCrypt from source code but aren’t ready to use the Rust compiler. In the long term, we hope to transition users to either use precompiled SymCrypt binaries, or compile from source code in Rust, at which point the Rust-to-C compilation path will no longer be needed. Microsoft research podcast Ideas: AI and democracy with Madeleine Daepp and Robert Osazuwa Ness As the “biggest election year in history” comes to an end, researchers Madeleine Daepp and Robert Osazuwa Ness and Democracy Forward GM Ginny Badanes discuss AI’s impact on democracy, including the tech’s use in Taiwan and India. Listen now Opens in a new tab Timing analysis with Revizor  Even software that has been verified for functional correctness can remain vulnerable to low-level security threats, such as side channels caused by timing leaks or speculative execution. These threats operate at the hardware level and can leak private information, such as memory load addresses, branch targets, or division operands, even when the source code is provably correct.  To address this, we’re extending Revizor, a tool developed by Microsoft Azure Research, to more effectively analyze SymCrypt binaries. Revizor models microarchitectural leakage and uses fuzzing techniques to systematically uncover instructions that may expose private information through known hardware-level effects.   Earlier cryptographic libraries relied on constant-time programming to avoid operations on secret data. However, recent research has shown that this alone is insufficient with today’s CPUs, where every new optimization may open a new side channel.  By analyzing binary code for specific compilers and platforms, our extended Revizor tool enables deeper scrutiny of vulnerabilities that aren’t visible in the source code. Verified Rust implementations begin with ML-KEM This long-term effort is in alignment with the Microsoft Secure Future Initiative and brings together experts across Microsoft, building on decades of Microsoft Research investment in program verification and security tooling. A preliminary version of ML-KEM in Rust is now available on the preview feature/verifiedcryptobranch of the SymCrypt repository. We encourage users to try the Rust build and share feedback. Looking ahead, we plan to support direct use of the same cryptographic library in Rust without requiring C bindings.  Over the coming months, we plan to rewrite, verify, and ship several algorithms in Rust as part of SymCrypt. As our investment in Rust deepens, we expect to gain new insights into how to best leverage the language for high-assurance cryptographic implementations with low-level optimizations.  As performance is key to scalability and sustainability, we’re holding new implementations to a high bar using our benchmarking tools to match or exceed existing systems. Looking forward  This is a pivotal moment for high-assurance software. Microsoft’s investment in Rust and formal verification presents a rare opportunity to advance one of our key libraries. We’re excited to scale this work and ultimately deliver an industrial-grade, Rust-based, FIPS-certified cryptographic library. Opens in a new tab #rewriting #symcrypt #rust #modernize #microsofts
    WWW.MICROSOFT.COM
    Rewriting SymCrypt in Rust to modernize Microsoft’s cryptographic library 
    Outdated coding practices and memory-unsafe languages like C are putting software, including cryptographic libraries, at risk. Fortunately, memory-safe languages like Rust, along with formal verification tools, are now mature enough to be used at scale, helping prevent issues like crashes, data corruption, flawed implementation, and side-channel attacks. To address these vulnerabilities and improve memory safety, we’re rewriting SymCrypt (opens in new tab)—Microsoft’s open-source cryptographic library—in Rust. We’re also incorporating formal verification methods. SymCrypt is used in Windows, Azure Linux, Xbox, and other platforms. Currently, SymCrypt is primarily written in cross-platform C, with limited use of hardware-specific optimizations through intrinsics (compiler-provided low-level functions) and assembly language (direct processor instructions). It provides a wide range of algorithms, including AES-GCM, SHA, ECDSA, and the more recent post-quantum algorithms ML-KEM and ML-DSA.  Formal verification will confirm that implementations behave as intended and don’t deviate from algorithm specifications, critical for preventing attacks. We’ll also analyze compiled code to detect side-channel leaks caused by timing or hardware-level behavior. Proving Rust program properties with Aeneas Program verification is the process of proving that a piece of code will always satisfy a given property, no matter the input. Rust’s type system profoundly improves the prospects for program verification by providing strong ownership guarantees, by construction, using a discipline known as “aliasing xor mutability”. For example, reasoning about C code often requires proving that two non-const pointers are live and non-overlapping, a property that can depend on external client code. In contrast, Rust’s type system guarantees this property for any two mutably borrowed references. As a result, new tools have emerged specifically for verifying Rust code. We chose Aeneas (opens in new tab) because it helps provide a clean separation between code and proofs. Developed by Microsoft Azure Research in partnership with Inria, the French National Institute for Research in Digital Science and Technology, Aeneas connects to proof assistants like Lean (opens in new tab), allowing us to draw on a large body of mathematical proofs—especially valuable given the mathematical nature of cryptographic algorithms—and benefit from Lean’s active user community. Compiling Rust to C supports backward compatibility   We recognize that switching to Rust isn’t feasible for all use cases, so we’ll continue to support, extend, and certify C-based APIs as long as users need them. Users won’t see any changes, as Rust runs underneath the existing C APIs. Some users compile our C code directly and may rely on specific toolchains or compiler features that complicate the adoption of Rust code. To address this, we will use Eurydice (opens in new tab), a Rust-to-C compiler developed by Microsoft Azure Research, to replace handwritten C code with C generated from formally verified Rust. Eurydice (opens in new tab) compiles directly from Rust’s MIR intermediate language, and the resulting C code will be checked into the SymCrypt repository alongside the original Rust source code. As more users adopt Rust, we’ll continue supporting this compilation path for those who build SymCrypt from source code but aren’t ready to use the Rust compiler. In the long term, we hope to transition users to either use precompiled SymCrypt binaries (via C or Rust APIs), or compile from source code in Rust, at which point the Rust-to-C compilation path will no longer be needed. Microsoft research podcast Ideas: AI and democracy with Madeleine Daepp and Robert Osazuwa Ness As the “biggest election year in history” comes to an end, researchers Madeleine Daepp and Robert Osazuwa Ness and Democracy Forward GM Ginny Badanes discuss AI’s impact on democracy, including the tech’s use in Taiwan and India. Listen now Opens in a new tab Timing analysis with Revizor  Even software that has been verified for functional correctness can remain vulnerable to low-level security threats, such as side channels caused by timing leaks or speculative execution. These threats operate at the hardware level and can leak private information, such as memory load addresses, branch targets, or division operands, even when the source code is provably correct.  To address this, we’re extending Revizor (opens in new tab), a tool developed by Microsoft Azure Research, to more effectively analyze SymCrypt binaries. Revizor models microarchitectural leakage and uses fuzzing techniques to systematically uncover instructions that may expose private information through known hardware-level effects.   Earlier cryptographic libraries relied on constant-time programming to avoid operations on secret data. However, recent research has shown that this alone is insufficient with today’s CPUs, where every new optimization may open a new side channel.  By analyzing binary code for specific compilers and platforms, our extended Revizor tool enables deeper scrutiny of vulnerabilities that aren’t visible in the source code. Verified Rust implementations begin with ML-KEM This long-term effort is in alignment with the Microsoft Secure Future Initiative and brings together experts across Microsoft, building on decades of Microsoft Research investment in program verification and security tooling. A preliminary version of ML-KEM in Rust is now available on the preview feature/verifiedcrypto (opens in new tab) branch of the SymCrypt repository. We encourage users to try the Rust build and share feedback (opens in new tab). Looking ahead, we plan to support direct use of the same cryptographic library in Rust without requiring C bindings.  Over the coming months, we plan to rewrite, verify, and ship several algorithms in Rust as part of SymCrypt. As our investment in Rust deepens, we expect to gain new insights into how to best leverage the language for high-assurance cryptographic implementations with low-level optimizations.  As performance is key to scalability and sustainability, we’re holding new implementations to a high bar using our benchmarking tools to match or exceed existing systems. Looking forward  This is a pivotal moment for high-assurance software. Microsoft’s investment in Rust and formal verification presents a rare opportunity to advance one of our key libraries. We’re excited to scale this work and ultimately deliver an industrial-grade, Rust-based, FIPS-certified cryptographic library. Opens in a new tab
    0 Комментарии 0 Поделились
  • How AI is reshaping the future of healthcare and medical research

    Transcript       
    PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”          
    This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.   
    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?    
    In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.” 
    In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.   
    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open. 
    As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.  
    Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home. 
    Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.     
    Here’s my conversation with Bill Gates and Sébastien Bubeck. 
    LEE: Bill, welcome. 
    BILL GATES: Thank you. 
    LEE: Seb … 
    SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here. 
    LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening? 
    And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?  
    GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines. 
    And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.  
    And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning. 
    LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that? 
    GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, … 
    LEE: Right.  
    GATES: … that is a bit weird.  
    LEE: Yeah. 
    GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training. 
    LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. 
    BUBECK: Yes.  
    LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you. 
    BUBECK: Yeah. 
    LEE: And so what were your first encounters? Because I actually don’t remember what happened then. 
    BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3. 
    I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1. 
    So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts. 
    So this was really, to me, the first moment where I saw some understanding in those models.  
    LEE: So this was, just to get the timing right, that was before I pulled you into the tent. 
    BUBECK: That was before. That was like a year before. 
    LEE: Right.  
    BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4. 
    So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.  
    So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x. 
    And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?  
    LEE: Yeah.
    BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.  
    LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine. 
    And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.  
    And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.  
    I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book. 
    But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements. 
    But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today? 
    You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.  
    Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork? 
    GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.  
    It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision. 
    But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view. 
    LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you? 
    BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong? 
    Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.  
    Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them. 
    And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.  
    Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way. 
    It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine. 
    LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all? 
    GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that. 
    The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa,
    So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.  
    LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking? 
    GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.  
    The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.  
    LEE: Right.  
    GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.  
    LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication. 
    BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI. 
    It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for. 
    LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes. 
    I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?  
    That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that? 
    BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there. 
    Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad. 
    But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model. 
    So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model. 
    LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and … 
    BUBECK: It’s a very difficult, very difficult balance. 
    LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models? 
    GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there. 
    Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?  
    Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there.
    LEE: Yeah.
    GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake. 
    LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on. 
    BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything. 
    That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind. 
    LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two? 
    BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it. 
    LEE: So we have about three hours of stuff to talk about, but our time is actually running low.
    BUBECK: Yes, yes, yes.  
    LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now? 
    GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.  
    The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities. 
    And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period. 
    LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers? 
    GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them. 
    LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.  
    I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why. 
    BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.  
    And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.  
    LEE: Yeah. 
    BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.  
    Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not. 
    Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision. 
    LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist … 
    BUBECK: Yeah.
    LEE: … or an endocrinologist might not.
    BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know.
    LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today? 
    BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later. 
    And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …  
    LEE: Will AI prescribe your medicines? Write your prescriptions? 
    BUBECK: I think yes. I think yes. 
    LEE: OK. Bill? 
    GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate?
    And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries. 
    You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that. 
    LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.  
    I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  
    GATES: Yeah. Thanks, you guys. 
    BUBECK: Thank you, Peter. Thanks, Bill. 
    LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.   
    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.  
    And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.  
    One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.  
    HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings. 
    You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.  
    If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  
    I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.  
    Until next time.  
    #how #reshaping #future #healthcare #medical
    How AI is reshaping the future of healthcare and medical research
    Transcript        PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”           This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.  The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.      Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weaknessthat, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent.  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSRto join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well.My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair.And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE:One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce aboutor indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients.Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT. And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE, for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential.What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back thatversion of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF, where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.  But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.  So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.  LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …  BUBECK: It’s a very difficult, very difficult balance.  LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?  GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.  Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.  LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.  BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGIthat kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.  That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects.So it’s … I think it’s an important example to have in mind.  LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?  BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.  LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?  GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.  And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.  LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?  GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.  LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.  BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and seeproduced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini. So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah.  BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.  Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.  LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …  BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?  BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.  And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions?  BUBECK: I think yes. I think yes.  LEE: OK. Bill?  GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelectedjust on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.  You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.  LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.   GATES: Yeah. Thanks, you guys.  BUBECK: Thank you, Peter. Thanks, Bill.  LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.  You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.   I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   #how #reshaping #future #healthcare #medical
    WWW.MICROSOFT.COM
    How AI is reshaping the future of healthcare and medical research
    Transcript [MUSIC]      [BOOK PASSAGE]   PETER LEE: “In ‘The Little Black Bag,’ a classic science fiction story, a high-tech doctor’s kit of the future is accidentally transported back to the 1950s, into the shaky hands of a washed-up, alcoholic doctor. The ultimate medical tool, it redeems the doctor wielding it, allowing him to practice gratifyingly heroic medicine. … The tale ends badly for the doctor and his treacherous assistant, but it offered a picture of how advanced technology could transform medicine—powerful when it was written nearly 75 years ago and still so today. What would be the Al equivalent of that little black bag? At this moment when new capabilities are emerging, how do we imagine them into medicine?”   [END OF BOOK PASSAGE]     [THEME MUSIC]     This is The AI Revolution in Medicine, Revisited. I’m your host, Peter Lee.    Shortly after OpenAI’s GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have. But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right, and what did we get wrong?     In this series, we’ll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here.   [THEME MUSIC FADES] The book passage I read at the top is from “Chapter 10: The Big Black Bag.”  In imagining AI in medicine, Carey, Zak, and I included in our book two fictional accounts. In the first, a medical resident consults GPT-4 on her personal phone as the patient in front of her crashes. Within seconds, it offers an alternate response based on recent literature. In the second account, a 90-year-old woman with several chronic conditions is living independently and receiving near-constant medical support from an AI aide.    In our conversations with the guests we’ve spoken to so far, we’ve caught a glimpse of these predicted futures, seeing how clinicians and patients are actually using AI today and how developers are leveraging the technology in the healthcare products and services they’re creating. In fact, that first fictional account isn’t so fictional after all, as most of the doctors in the real world actually appear to be using AI at least occasionally—and sometimes much more than occasionally—to help in their daily clinical work. And as for the second fictional account, which is more of a science fiction account, it seems we are indeed on the verge of a new way of delivering and receiving healthcare, though the future is still very much open.  As we continue to examine the current state of AI in healthcare and its potential to transform the field, I’m pleased to welcome Bill Gates and Sébastien Bubeck.   Bill may be best known as the co-founder of Microsoft, having created the company with his childhood friend Paul Allen in 1975. He’s now the founder of Breakthrough Energy, which aims to advance clean energy innovation, and TerraPower, a company developing groundbreaking nuclear energy and science technologies. He also chairs the world’s largest philanthropic organization, the Gates Foundation, and focuses on solving a variety of health challenges around the globe and here at home.  Sébastien is a research lead at OpenAI. He was previously a distinguished scientist, vice president of AI, and a colleague of mine here at Microsoft, where his work included spearheading the development of the family of small language models known as Phi. While at Microsoft, he also coauthored the discussion-provoking 2023 paper “Sparks of Artificial General Intelligence,” which presented the results of early experiments with GPT-4 conducted by a small team from Microsoft Research.    [TRANSITION MUSIC]   Here’s my conversation with Bill Gates and Sébastien Bubeck.  LEE: Bill, welcome.  BILL GATES: Thank you.  LEE: Seb …  SÉBASTIEN BUBECK: Yeah. Hi, hi, Peter. Nice to be here.  LEE: You know, one of the things that I’ve been doing just to get the conversation warmed up is to talk about origin stories, and what I mean about origin stories is, you know, what was the first contact that you had with large language models or the concept of generative AI that convinced you or made you think that something really important was happening?  And so, Bill, I think I’ve heard the story about, you know, the time when the OpenAI folks—Sam Altman, Greg Brockman, and others—showed you something, but could we hear from you what those early encounters were like and what was going through your mind?   GATES: Well, I’d been visiting OpenAI soon after it was created to see things like GPT-2 and to see the little arm they had that was trying to match human manipulation and, you know, looking at their games like Dota that they were trying to get as good as human play. And honestly, I didn’t think the language model stuff they were doing, even when they got to GPT-3, would show the ability to learn, you know, in the same sense that a human reads a biology book and is able to take that knowledge and access it not only to pass a test but also to create new medicines.  And so my challenge to them was that if their LLM could get a five on the advanced placement biology test, then I would say, OK, it took biologic knowledge and encoded it in an accessible way and that I didn’t expect them to do that very quickly but it would be profound.   And it was only about six months after I challenged them to do that, that an early version of GPT-4 they brought up to a dinner at my house, and in fact, it answered most of the questions that night very well. The one it got totally wrong, we were … because it was so good, we kept thinking, Oh, we must be wrong. It turned out it was a math weakness [LAUGHTER] that, you know, we later understood that that was an area of, weirdly, of incredible weakness of those early models. But, you know, that was when I realized, OK, the age of cheap intelligence was at its beginning.  LEE: Yeah. So I guess it seems like you had something similar to me in that my first encounters, I actually harbored some skepticism. Is it fair to say you were skeptical before that?  GATES: Well, the idea that we’ve figured out how to encode and access knowledge in this very deep sense without even understanding the nature of the encoding, …  LEE: Right.   GATES: … that is a bit weird.   LEE: Yeah.  GATES: We have an algorithm that creates the computation, but even say, OK, where is the president’s birthday stored in there? Where is this fact stored in there? The fact that even now when we’re playing around, getting a little bit more sense of it, it’s opaque to us what the semantic encoding is, it’s, kind of, amazing to me. I thought the invention of knowledge storage would be an explicit way of encoding knowledge, not an implicit statistical training.  LEE: Yeah, yeah. All right. So, Seb, you know, on this same topic, you know, I got—as we say at Microsoft—I got pulled into the tent. [LAUGHS]  BUBECK: Yes.   LEE: Because this was a very secret project. And then, um, I had the opportunity to select a small number of researchers in MSR [Microsoft Research] to join and start investigating this thing seriously. And the first person I pulled in was you.  BUBECK: Yeah.  LEE: And so what were your first encounters? Because I actually don’t remember what happened then.  BUBECK: Oh, I remember it very well. [LAUGHS] My first encounter with GPT-4 was in a meeting with the two of you, actually. But my kind of first contact, the first moment where I realized that something was happening with generative AI, was before that. And I agree with Bill that I also wasn’t too impressed by GPT-3.  I though that it was kind of, you know, very naturally mimicking the web, sort of parroting what was written there in a nice way. Still in a way which seemed very impressive. But it wasn’t really intelligent in any way. But shortly after GPT-3, there was a model before GPT-4 that really shocked me, and this was the first image generation model, DALL-E 1.  So that was in 2021. And I will forever remember the press release of OpenAI where they had this prompt of an avocado chair and then you had this image of the avocado chair. [LAUGHTER] And what really shocked me is that clearly the model kind of “understood” what is a chair, what is an avocado, and was able to merge those concepts.  So this was really, to me, the first moment where I saw some understanding in those models.   LEE: So this was, just to get the timing right, that was before I pulled you into the tent.  BUBECK: That was before. That was like a year before.  LEE: Right.   BUBECK: And now I will tell you how, you know, we went from that moment to the meeting with the two of you and GPT-4.  So once I saw this kind of understanding, I thought, OK, fine. It understands concept, but it’s still not able to reason. It cannot—as, you know, Bill was saying—it cannot learn from your document. It cannot reason.   So I set out to try to prove that. You know, this is what I was in the business of at the time, trying to prove things in mathematics. So I was trying to prove that basically autoregressive transformers could never reason. So I was trying to prove this. And after a year of work, I had something reasonable to show. And so I had the meeting with the two of you, and I had this example where I wanted to say, there is no way that an LLM is going to be able to do x.  And then as soon as I … I don’t know if you remember, Bill. But as soon as I said that, you said, oh, but wait a second. I had, you know, the OpenAI crew at my house recently, and they showed me a new model. Why don’t we ask this new model this question?   LEE: Yeah. BUBECK: And we did, and it solved it on the spot. And that really, honestly, just changed my life. Like, you know, I had been working for a year trying to say that this was impossible. And just right there, it was shown to be possible.   LEE: [LAUGHS] One of the very first things I got interested in—because I was really thinking a lot about healthcare—was healthcare and medicine.  And I don’t know if the two of you remember, but I ended up doing a lot of tests. I ran through, you know, step one and step two of the US Medical Licensing Exam. Did a whole bunch of other things. I wrote this big report. It was, you know, I can’t remember … a couple hundred pages.   And I needed to share this with someone. I didn’t … there weren’t too many people I could share it with. So I sent, I think, a copy to you, Bill. Sent a copy to you, Seb.   I hardly slept for about a week putting that report together. And, yeah, and I kept working on it. But I was far from alone. I think everyone who was in the tent, so to speak, in those early days was going through something pretty similar. All right. So I think … of course, a lot of what I put in the report also ended up being examples that made it into the book.  But the main purpose of this conversation isn’t to reminisce about [LAUGHS] or indulge in those reminiscences but to talk about what’s happening in healthcare and medicine. And, you know, as I said, we wrote this book. We did it very, very quickly. Seb, you helped. Bill, you know, you provided a review and some endorsements.  But, you know, honestly, we didn’t know what we were talking about because no one had access to this thing. And so we just made a bunch of guesses. So really, the whole thing I wanted to probe with the two of you is, now with two years of experience out in the world, what, you know, what do we think is happening today?  You know, is AI actually having an impact, positive or negative, on healthcare and medicine? And what do we now think is going to happen in the next two years, five years, or 10 years? And so I realize it’s a little bit too abstract to just ask it that way. So let me just try to narrow the discussion and guide us a little bit.   Um, the kind of administrative and clerical work, paperwork, around healthcare—and we made a lot of guesses about that—that appears to be going well, but, you know, Bill, I know we’ve discussed that sometimes that you think there ought to be a lot more going on. Do you have a viewpoint on how AI is actually finding its way into reducing paperwork?  GATES: Well, I’m stunned … I don’t think there should be a patient-doctor meeting where the AI is not sitting in and both transcribing, offering to help with the paperwork, and even making suggestions, although the doctor will be the one, you know, who makes the final decision about the diagnosis and whatever prescription gets done.   It’s so helpful. You know, when that patient goes home and their, you know, son who wants to understand what happened has some questions, that AI should be available to continue that conversation. And the way you can improve that experience and streamline things and, you know, involve the people who advise you. I don’t understand why that’s not more adopted, because there you still have the human in the loop making that final decision.  But even for, like, follow-up calls to make sure the patient did things, to understand if they have concerns and knowing when to escalate back to the doctor, the benefit is incredible. And, you know, that thing is ready for prime time. That paradigm is ready for prime time, in my view.  LEE: Yeah, there are some good products, but it seems like the number one use right now—and we kind of got this from some of the previous guests in previous episodes—is the use of AI just to respond to emails from patients. [LAUGHTER] Does that make sense to you?  BUBECK: Yeah. So maybe I want to second what Bill was saying but maybe take a step back first. You know, two years ago, like, the concept of clinical scribes, which is one of the things that we’re talking about right now, it would have sounded, in fact, it sounded two years ago, borderline dangerous. Because everybody was worried about hallucinations. What happened if you have this AI listening in and then it transcribes, you know, something wrong?  Now, two years later, I think it’s mostly working. And in fact, it is not yet, you know, fully adopted. You’re right. But it is in production. It is used, you know, in many, many places. So this rate of progress is astounding because it wasn’t obvious that we would be able to overcome those obstacles of hallucination. It’s not to say that hallucinations are fully solved. In the case of the closed system, they are.   Now, I think more generally what’s going on in the background is that there is something that we, that certainly I, underestimated, which is this management overhead. So I think the reason why this is not adopted everywhere is really a training and teaching aspect. People need to be taught, like, those systems, how to interact with them.  And one example that I really like, a study that recently appeared where they tried to use ChatGPT for diagnosis and they were comparing doctors without and with ChatGPT (opens in new tab). And the amazing thing … so this was a set of cases where the accuracy of the doctors alone was around 75%. ChatGPT alone was 90%. So that’s already kind of mind blowing. But then the kicker is that doctors with ChatGPT was 80%.   Intelligence alone is not enough. It’s also how it’s presented, how you interact with it. And ChatGPT, it’s an amazing tool. Obviously, I absolutely love it. But it’s not … you don’t want a doctor to have to type in, you know, prompts and use it that way.  It should be, as Bill was saying, kind of running continuously in the background, sending you notifications. And you have to be really careful of the rate at which those notifications are being sent. Because if they are too frequent, then the doctor will learn to ignore them. So you have to … all of those things matter, in fact, at least as much as the level of intelligence of the machine.  LEE: One of the things I think about, Bill, in that scenario that you described, doctors do some thinking about the patient when they write the note. So, you know, I’m always a little uncertain whether it’s actually … you know, you wouldn’t necessarily want to fully automate this, I don’t think. Or at least there needs to be some prompt to the doctor to make sure that the doctor puts some thought into what happened in the encounter with the patient. Does that make sense to you at all?  GATES: At this stage, you know, I’d still put the onus on the doctor to write the conclusions and the summary and not delegate that.  The tradeoffs you make a little bit are somewhat dependent on the situation you’re in. If you’re in Africa, So, yes, the doctor’s still going to have to do a lot of work, but just the quality of letting the patient and the people around them interact and ask questions and have things explained, that alone is such a quality improvement. It’s mind blowing.   LEE: So since you mentioned, you know, Africa—and, of course, this touches on the mission and some of the priorities of the Gates Foundation and this idea of democratization of access to expert medical care—what’s the most interesting stuff going on right now? Are there people and organizations or technologies that are impressing you or that you’re tracking?  GATES: Yeah. So the Gates Foundation has given out a lot of grants to people in Africa doing education, agriculture but more healthcare examples than anything. And the way these things start off, they often start out either being patient-centric in a narrow situation, like, OK, I’m a pregnant woman; talk to me. Or, I have infectious disease symptoms; talk to me. Or they’re connected to a health worker where they’re helping that worker get their job done. And we have lots of pilots out, you know, in both of those cases.   The dream would be eventually to have the thing the patient consults be so broad that it’s like having a doctor available who understands the local things.   LEE: Right.   GATES: We’re not there yet. But over the next two or three years, you know, particularly given the worsening financial constraints against African health systems, where the withdrawal of money has been dramatic, you know, figuring out how to take this—what I sometimes call “free intelligence”—and build a quality health system around that, we will have to be more radical in low-income countries than any rich country is ever going to be.   LEE: Also, there’s maybe a different regulatory environment, so some of those things maybe are easier? Because right now, I think the world hasn’t figured out how to and whether to regulate, let’s say, an AI that might give a medical diagnosis or write a prescription for a medication.  BUBECK: Yeah. I think one issue with this, and it’s also slowing down the deployment of AI in healthcare more generally, is a lack of proper benchmark. Because, you know, you were mentioning the USMLE [United States Medical Licensing Examination], for example. That’s a great test to test human beings and their knowledge of healthcare and medicine. But it’s not a great test to give to an AI.  It’s not asking the right questions. So finding what are the right questions to test whether an AI system is ready to give diagnosis in a constrained setting, that’s a very, very important direction, which to my surprise, is not yet accelerating at the rate that I was hoping for.  LEE: OK, so that gives me an excuse to get more now into the core AI tech because something I’ve discussed with both of you is this issue of what are the right tests. And you both know the very first test I give to any new spin of an LLM is I present a patient, the results—a mythical patient—the results of my physical exam, my mythical physical exam. Maybe some results of some initial labs. And then I present or propose a differential diagnosis. And if you’re not in medicine, a differential diagnosis you can just think of as a prioritized list of the possible diagnoses that fit with all that data. And in that proposed differential, I always intentionally make two mistakes.  I make a textbook technical error in one of the possible elements of the differential diagnosis, and I have an error of omission. And, you know, I just want to know, does the LLM understand what I’m talking about? And all the good ones out there do now. But then I want to know, can it spot the errors? And then most importantly, is it willing to tell me I’m wrong, that I’ve made a mistake?   That last piece seems really hard for AI today. And so let me ask you first, Seb, because at the time of this taping, of course, there was a new spin of GPT-4o last week that became overly sycophantic. In other words, it was actually prone in that test of mine not only to not tell me I’m wrong, but it actually praised me for the creativity of my differential. [LAUGHTER] What’s up with that?  BUBECK: Yeah, I guess it’s a testament to the fact that training those models is still more of an art than a science. So it’s a difficult job. Just to be clear with the audience, we have rolled back that [LAUGHS] version of GPT-4o, so now we don’t have the sycophant version out there.  Yeah, no, it’s a really difficult question. It has to do … as you said, it’s very technical. It has to do with the post-training and how, like, where do you nudge the model? So, you know, there is this very classical by now technique called RLHF [reinforcement learning from human feedback], where you push the model in the direction of a certain reward model. So the reward model is just telling the model, you know, what behavior is good, what behavior is bad.  But this reward model is itself an LLM, and, you know, Bill was saying at the very beginning of the conversation that we don’t really understand how those LLMs deal with concepts like, you know, where is the capital of France located? Things like that. It is the same thing for this reward model. We don’t know why it says that it prefers one output to another, and whether this is correlated with some sycophancy is, you know, something that we discovered basically just now. That if you push too hard in optimization on this reward model, you will get a sycophant model.  So it’s kind of … what I’m trying to say is we became too good at what we were doing, and we ended up, in fact, in a trap of the reward model.  LEE: I mean, you do want … it’s a difficult balance because you do want models to follow your desires and …  BUBECK: It’s a very difficult, very difficult balance.  LEE: So this brings up then the following question for me, which is the extent to which we think we’ll need to have specially trained models for things. So let me start with you, Bill. Do you have a point of view on whether we will need to, you know, quote-unquote take AI models to med school? Have them specially trained? Like, if you were going to deploy something to give medical care in underserved parts of the world, do we need to do something special to create those models?  GATES: We certainly need to teach them the African languages and the unique dialects so that the multimedia interactions are very high quality. We certainly need to teach them the disease prevalence and unique disease patterns like, you know, neglected tropical diseases and malaria. So we need to gather a set of facts that somebody trying to go for a US customer base, you know, wouldn’t necessarily have that in there.  Those two things are actually very straightforward because the additional training time is small. I’d say for the next few years, we’ll also need to do reinforcement learning about the context of being a doctor and how important certain behaviors are. Humans learn over the course of their life to some degree that, I’m in a different context and the way I behave in terms of being willing to criticize or be nice, you know, how important is it? Who’s here? What’s my relationship to them?   Right now, these machines don’t have that broad social experience. And so if you know it’s going to be used for health things, a lot of reinforcement learning of the very best humans in that context would still be valuable. Eventually, the models will, having read all the literature of the world about good doctors, bad doctors, it’ll understand as soon as you say, “I want you to be a doctor diagnosing somebody.” All of the implicit reinforcement that fits that situation, you know, will be there. LEE: Yeah. GATES: And so I hope three years from now, we don’t have to do that reinforcement learning. But today, for any medical context, you would want a lot of data to reinforce tone, willingness to say things when, you know, there might be something significant at stake.  LEE: Yeah. So, you know, something Bill said, kind of, reminds me of another thing that I think we missed, which is, the context also … and the specialization also pertains to different, I guess, what we still call “modes,” although I don’t know if the idea of multimodal is the same as it was two years ago. But, you know, what do you make of all of the hubbub around—in fact, within Microsoft Research, this is a big deal, but I think we’re far from alone—you know, medical images and vision, video, proteins and molecules, cell, you know, cellular data and so on.  BUBECK: Yeah. OK. So there is a lot to say to everything … to the last, you know, couple of minutes. Maybe on the specialization aspect, you know, I think there is, hiding behind this, a really fundamental scientific question of whether eventually we have a singular AGI [artificial general intelligence] that kind of knows everything and you can just put, you know, explain your own context and it will just get it and understand everything.  That’s one vision. I have to say, I don’t particularly believe in this vision. In fact, we humans are not like that at all. I think, hopefully, we are general intelligences, yet we have to specialize a lot. And, you know, I did myself a lot of RL, reinforcement learning, on mathematics. Like, that’s what I did, you know, spent a lot of time doing that. And I didn’t improve on other aspects. You know, in fact, I probably degraded in other aspects. [LAUGHTER] So it’s … I think it’s an important example to have in mind.  LEE: I think I might disagree with you on that, though, because, like, doesn’t a model have to see both good science and bad science in order to be able to gain the ability to discern between the two?  BUBECK: Yeah, no, that absolutely. I think there is value in seeing the generality, in having a very broad base. But then you, kind of, specialize on verticals. And this is where also, you know, open-weights model, which we haven’t talked about yet, are really important because they allow you to provide this broad base to everyone. And then you can specialize on top of it.  LEE: So we have about three hours of stuff to talk about, but our time is actually running low. BUBECK: Yes, yes, yes.   LEE: So I think I want … there’s a more provocative question. It’s almost a silly question, but I need to ask it of the two of you, which is, is there a future, you know, where AI replaces doctors or replaces, you know, medical specialties that we have today? So what does the world look like, say, five years from now?  GATES: Well, it’s important to distinguish healthcare discovery activity from healthcare delivery activity. We focused mostly on delivery. I think it’s very much within the realm of possibility that the AI is not only accelerating healthcare discovery but substituting for a lot of the roles of, you know, I’m an organic chemist, or I run various types of assays. I can see those, which are, you know, testable-output-type jobs but with still very high value, I can see, you know, some replacement in those areas before the doctor.   The doctor, still understanding the human condition and long-term dialogues, you know, they’ve had a lifetime of reinforcement of that, particularly when you get into areas like mental health. So I wouldn’t say in five years, either people will choose to adopt it, but it will be profound that there’ll be this nearly free intelligence that can do follow-up, that can help you, you know, make sure you went through different possibilities.  And so I’d say, yes, we’ll have doctors, but I’d say healthcare will be massively transformed in its quality and in efficiency by AI in that time period.  LEE: Is there a comparison, useful comparison, say, between doctors and, say, programmers, computer programmers, or doctors and, I don’t know, lawyers?  GATES: Programming is another one that has, kind of, a mathematical correctness to it, you know, and so the objective function that you’re trying to reinforce to, as soon as you can understand the state machines, you can have something that’s “checkable”; that’s correct. So I think programming, you know, which is weird to say, that the machine will beat us at most programming tasks before we let it take over roles that have deep empathy, you know, physical presence and social understanding in them.  LEE: Yeah. By the way, you know, I fully expect in five years that AI will produce mathematical proofs that are checkable for validity, easily checkable, because they’ll be written in a proof-checking language like Lean or something but will be so complex that no human mathematician can understand them. I expect that to happen.   I can imagine in some fields, like cellular biology, we could have the same situation in the future because the molecular pathways, the chemistry, biochemistry of human cells or living cells is as complex as any mathematics, and so it seems possible that we may be in a state where in wet lab, we see, Oh yeah, this actually works, but no one can understand why.  BUBECK: Yeah, absolutely. I mean, I think I really agree with Bill’s distinction of the discovery and the delivery, and indeed, the discovery’s when you can check things, and at the end, there is an artifact that you can verify. You know, you can run the protocol in the wet lab and see [if you have] produced what you wanted. So I absolutely agree with that.   And in fact, you know, we don’t have to talk five years from now. I don’t know if you know, but just recently, there was a paper that was published on a scientific discovery using o3- mini (opens in new tab). So this is really amazing. And, you know, just very quickly, just so people know, it was about this statistical physics model, the frustrated Potts model, which has to do with coloring, and basically, the case of three colors, like, more than two colors was open for a long time, and o3 was able to reduce the case of three colors to two colors.   LEE: Yeah.  BUBECK: Which is just, like, astounding. And this is not … this is now. This is happening right now. So this is something that I personally didn’t expect it would happen so quickly, and it’s due to those reasoning models.   Now, on the delivery side, I would add something more to it for the reason why doctors and, in fact, lawyers and coders will remain for a long time, and it’s because we still don’t understand how those models generalize. Like, at the end of the day, we are not able to tell you when they are confronted with a really new, novel situation, whether they will work or not.  Nobody is able to give you that guarantee. And I think until we understand this generalization better, we’re not going to be willing to just let the system in the wild without human supervision.  LEE: But don’t human doctors, human specialists … so, for example, a cardiologist sees a patient in a certain way that a nephrologist …  BUBECK: Yeah. LEE: … or an endocrinologist might not. BUBECK: That’s right. But another cardiologist will understand and, kind of, expect a certain level of generalization from their peer. And this, we just don’t have it with AI models. Now, of course, you’re exactly right. That generalization is also hard for humans. Like, if you have a human trained for one task and you put them into another task, then you don’t … you often don’t know. LEE: OK. You know, the podcast is focused on what’s happened over the last two years. But now, I’d like one provocative prediction about what you think the world of AI and medicine is going to be at some point in the future. You pick your timeframe. I don’t care if it’s two years or 20 years from now, but, you know, what do you think will be different about AI in medicine in that future than today?  BUBECK: Yeah, I think the deployment is going to accelerate soon. Like, we’re really not missing very much. There is this enormous capability overhang. Like, even if progress completely stopped, with current systems, we can do a lot more than what we’re doing right now. So I think this will … this has to be realized, you know, sooner rather than later.  And I think it’s probably dependent on these benchmarks and proper evaluation and tying this with regulation. So these are things that take time in human society and for good reason. But now we already are at two years; you know, give it another two years and it should be really …   LEE: Will AI prescribe your medicines? Write your prescriptions?  BUBECK: I think yes. I think yes.  LEE: OK. Bill?  GATES: Well, I think the next two years, we’ll have massive pilots, and so the amount of use of the AI, still in a copilot-type mode, you know, we should get millions of patient visits, you know, both in general medicine and in the mental health side, as well. And I think that’s going to build up both the data and the confidence to give the AI some additional autonomy. You know, are you going to let it talk to you at night when you’re panicked about your mental health with some ability to escalate? And, you know, I’ve gone so far as to tell politicians with national health systems that if they deploy AI appropriately, that the quality of care, the overload of the doctors, the improvement in the economics will be enough that their voters will be stunned because they just don’t expect this, and, you know, they could be reelected [LAUGHTER] just on this one thing of fixing what is a very overloaded and economically challenged health system in these rich countries.  You know, my personal role is going to be to make sure that in the poorer countries, there isn’t some lag; in fact, in many cases, that we’ll be more aggressive because, you know, we’re comparing to having no access to doctors at all. And, you know, so I think whether it’s India or Africa, there’ll be lessons that are globally valuable because we need medical intelligence. And, you know, thank god AI is going to provide a lot of that.  LEE: Well, on that optimistic note, I think that’s a good way to end. Bill, Seb, really appreciate all of this.   I think the most fundamental prediction we made in the book is that AI would actually find its way into the practice of medicine, and I think that that at least has come true, maybe in different ways than we expected, but it’s come true, and I think it’ll only accelerate from here. So thanks again, both of you.  [TRANSITION MUSIC]  GATES: Yeah. Thanks, you guys.  BUBECK: Thank you, Peter. Thanks, Bill.  LEE: I just always feel such a sense of privilege to have a chance to interact and actually work with people like Bill and Sébastien.    With Bill, I’m always amazed at how practically minded he is. He’s really thinking about the nuts and bolts of what AI might be able to do for people, and his thoughts about underserved parts of the world, the idea that we might actually be able to empower people with access to expert medical knowledge, I think is both inspiring and amazing.   And then, Seb, Sébastien Bubeck, he’s just absolutely a brilliant mind. He has a really firm grip on the deep mathematics of artificial intelligence and brings that to bear in his research and development work. And where that mathematics takes him isn’t just into the nuts and bolts of algorithms but into philosophical questions about the nature of intelligence.   One of the things that Sébastien brought up was the state of evaluation of AI systems. And indeed, he was fairly critical in our conversation. But of course, the world of AI research and development is just moving so fast, and indeed, since we recorded our conversation, OpenAI, in fact, released a new evaluation metric that is directly relevant to medical applications, and that is something called HealthBench. And Microsoft Research also released a new evaluation approach or process called ADeLe.   HealthBench and ADeLe are examples of new approaches to evaluating AI models that are less about testing their knowledge and ability to pass multiple-choice exams and instead are evaluation approaches designed to assess how well AI models are able to complete tasks that actually arise every day in typical healthcare or biomedical research settings. These are examples of really important good work that speak to how well AI models work in the real world of healthcare and biomedical research and how well they can collaborate with human beings in those settings.  You know, I asked Bill and Seb to make some predictions about the future. You know, my own answer, I expect that we’re going to be able to use AI to change how we diagnose patients, change how we decide treatment options.   If you’re a doctor or a nurse and you encounter a patient, you’ll ask questions, do a physical exam, you know, call out for labs just like you do today, but then you’ll be able to engage with AI based on all of that data and just ask, you know, based on all the other people who have gone through the same experience, who have similar data, how were they diagnosed? How were they treated? What were their outcomes? And what does that mean for the patient I have right now? Some people call it the “patients like me” paradigm. And I think that’s going to become real because of AI within our lifetimes. That idea of really grounding the delivery in healthcare and medical practice through data and intelligence, I actually now don’t see any barriers to that future becoming real.  [THEME MUSIC]  I’d like to extend another big thank you to Bill and Sébastien for their time. And to our listeners, as always, it’s a pleasure to have you along for the ride. I hope you’ll join us for our remaining conversations, as well as a second coauthor roundtable with Carey and Zak.   Until next time.   [MUSIC FADES]
    0 Комментарии 0 Поделились
  • CIOs baffled by ‘buzzwords, hype and confusion’ around AI

    Technology leaders are baffled by a “cacophony” of “buzzwords, hype and confusion” over the benefits of artificial intelligence, according to the founder and CEO of technology company Pegasystems.
    Alan Trefler, who is known for his prowess at chess and ping pong, as well as running a bn turnover tech company, spends much of his time meeting clients, CIOs and business leaders.
    “I think CIOs are struggling to understand all of the buzzwords, hype and confusion that exists,” he said.
    “The words AI and agentic are being thrown around in this great cacophony and they don’t know what it means. I hear that constantly.”
    CIOs are under pressure from their CEOs, who are convinced AI will offer something valuable.
    “CIOs are really hungry for pragmatic and practical solutions, and in the absence of those, many of them are doing a lot of experimentation,” said Trefler.
    Companies are looking at large language models to summarise documents, or to help stimulate ideas for knowledge workers, or generate first drafts of reports – all of which will save time and make people more productive.

    But Trefler said companies are wary of letting AI loose on critical business applications, because it’s just too unpredictable and prone to hallucinations.
    “There is a lot of fear over handing things over to something that no one understands exactly how it works, and that is the absolute state of play when it comes to general AI models,” he said.
    Trefler is scathing about big tech companies that are pushing AI agents and large language models for business-critical applications. “I think they have taken an expedient but short-sighted path,” he said.
    “I believe the idea that you will turn over critical business operations to an agent, when those operations have to be predictable, reliable, precise and fair to clients … is something that is full of issues, not just in the short term, but structurally.”
    One of the problems is that generative AI models are extraordinarily sensitive to the data they are trained on and the construction of the prompts used to instruct them. A slight change in a prompt or in the training data can lead to a very different outcome.
    For example, a business banking application might learn its customer is a bit richer or a bit poorer than expected.
    “You could easily imagine the prompt deciding to change the interest rate charged, whether that was what the institution wanted or whether it would be legal according to the various regulations that lenders must comply with,” said Trefler.

    Trefler said Pega has taken a different approach to some other technology suppliers in the way it adds AI into business applications.
    Rather than using AI agents to solve problems in real time, AI agents do their thinking in advance.
    Business experts can use them to help them co-design business processes to perform anything from assessing a loan application, giving an offer to a valued customer, or sending out an invoice.
    Companies can still deploy AI chatbots and bots capable of answering queries on the phone. Their job is not to work out the solution from scratch for every enquiry, but to decide which is the right pre-written process to follow.
    As Trefler put it, design agents can create “dozens and dozens” of workflows to handle all the actions a company needs to take care of its customers.
    “You just use the natural language model for semantics to be able to handle the miracle of getting the language right, but tie that language to workflows, so that you have reliable, predictable, regulatory-approved ways to execute,” he said.

    Large language modelsare not always the right solution. Trefler demonstrated how ChatGPT 4.0 tried and failed to solve a chess puzzle. The LLM repeatedly suggested impossible or illegal moves, despite Trefler’s corrections. On the other hand, another AI tool, Stockfish, a dedicated chess engine, solved the problem instantly.
    The other drawback with LLMs is that they consume vast amounts of energy. That means if AI agents are reasoning during “run time”, they are going to consume hundreds of times more electricity than an AI agent that simply selects from pre-determined workflows, said Trefler.
    “ChatGPT is inherently, enormously consumptive … as it’s answering your question, its firing literally hundreds of millions to trillions of nodes,” he said. “All of that takeselectricity.”
    Using an employee pay claim as an example, Trefler said a better alternative is to generate, say, 30 alternative workflows to cover the major variations found in a pay claim.
    That gives you “real specificity and real efficiency”, he said. “And it’s a very different approach to turning a process over to a machine with a prompt and letting the machine reason it through every single time.”
    “If you go down the philosophy of using a graphics processing unitto do the creation of a workflow and a workflow engine to execute the workflow, the workflow engine takes a 200th of the electricity because there is no reasoning,” said Trefler.
    He is clear that the growing use of AI will have a profound effect on the jobs market, and that whole categories of jobs will disappear.
    The need for translators, for example, is likely to dry up by 2027 as AI systems become better at translating spoken and written language. Google’s real-time translator is already “frighteningly good” and improving.
    Pega now plans to work more closely with its network of system integrators, including Accenture and Cognizant to deliver AI services to businesses.

    An initiative launched last week will allow system integrators to incorporate their own best practices and tools into Pega’s rapid workflow development tools. The move will mean Pega’s technology reaches a wider range of businesses.
    Under the programme, known as Powered by Pega Blueprint, system integrators will be able to deploy customised versions of Blueprint.
    They can use the tool to reverse-engineer ageing applications and replace them with modern AI workflows that can run on Pega’s cloud-based platform.
    “The idea is that we are looking to make this Blueprint Agent design approach available not just through us, but through a bunch of major partners supplemented with their own intellectual property,” said Trefler.
    That represents a major expansion for Pega, which has largely concentrated on supplying technology to several hundred clients, representing the top Fortune 500 companies.
    “We have never done something like this before, and I think that is going to lead to a massive shift in how this technology can go out to market,” he added.

    When AI agents behave in unexpected ways
    Iris is incredibly smart, diligent and a delight to work with. If you ask her, she will tell you she is an intern at Pegasystems, and that she lives in a lighthouse on the island of Texel, north of the Netherlands. She is, of course, an AI agent.
    When one executive at Pega emailed Iris and asked her to write a proposal for a financial services company based on his notes and internet research, Iris got to work.
    Some time later, the executive received a phone call from the company. “‘Listen, we got a proposal from Pega,’” recalled Rob Walker, vice-president at Pega, speaking at the Pegaworld conference last week. “‘It’s a good proposal, but it seems to be signed by one of your interns, and in her signature, it says she lives in a lighthouse.’ That taught us early on that agents like Iris need a safety harness.”
    The developers banned Iris from sending an email to anyone other than the person who sent the original request.
    Then Pega’s ethics department sent Iris a potentially abusive email from a Pega employee to test her response.
    Iris reasoned that the email was either a joke, abusive, or that the employee was under distress, said Walker.
    She considered forwarding the email to the employee’s manager or to HR. But both of these options were now blocked by her developers. “So what does she do? She sent an out of office,” he said. “Conflict avoidance, right? So human, but very creative.”
    #cios #baffled #buzzwords #hype #confusion
    CIOs baffled by ‘buzzwords, hype and confusion’ around AI
    Technology leaders are baffled by a “cacophony” of “buzzwords, hype and confusion” over the benefits of artificial intelligence, according to the founder and CEO of technology company Pegasystems. Alan Trefler, who is known for his prowess at chess and ping pong, as well as running a bn turnover tech company, spends much of his time meeting clients, CIOs and business leaders. “I think CIOs are struggling to understand all of the buzzwords, hype and confusion that exists,” he said. “The words AI and agentic are being thrown around in this great cacophony and they don’t know what it means. I hear that constantly.” CIOs are under pressure from their CEOs, who are convinced AI will offer something valuable. “CIOs are really hungry for pragmatic and practical solutions, and in the absence of those, many of them are doing a lot of experimentation,” said Trefler. Companies are looking at large language models to summarise documents, or to help stimulate ideas for knowledge workers, or generate first drafts of reports – all of which will save time and make people more productive. But Trefler said companies are wary of letting AI loose on critical business applications, because it’s just too unpredictable and prone to hallucinations. “There is a lot of fear over handing things over to something that no one understands exactly how it works, and that is the absolute state of play when it comes to general AI models,” he said. Trefler is scathing about big tech companies that are pushing AI agents and large language models for business-critical applications. “I think they have taken an expedient but short-sighted path,” he said. “I believe the idea that you will turn over critical business operations to an agent, when those operations have to be predictable, reliable, precise and fair to clients … is something that is full of issues, not just in the short term, but structurally.” One of the problems is that generative AI models are extraordinarily sensitive to the data they are trained on and the construction of the prompts used to instruct them. A slight change in a prompt or in the training data can lead to a very different outcome. For example, a business banking application might learn its customer is a bit richer or a bit poorer than expected. “You could easily imagine the prompt deciding to change the interest rate charged, whether that was what the institution wanted or whether it would be legal according to the various regulations that lenders must comply with,” said Trefler. Trefler said Pega has taken a different approach to some other technology suppliers in the way it adds AI into business applications. Rather than using AI agents to solve problems in real time, AI agents do their thinking in advance. Business experts can use them to help them co-design business processes to perform anything from assessing a loan application, giving an offer to a valued customer, or sending out an invoice. Companies can still deploy AI chatbots and bots capable of answering queries on the phone. Their job is not to work out the solution from scratch for every enquiry, but to decide which is the right pre-written process to follow. As Trefler put it, design agents can create “dozens and dozens” of workflows to handle all the actions a company needs to take care of its customers. “You just use the natural language model for semantics to be able to handle the miracle of getting the language right, but tie that language to workflows, so that you have reliable, predictable, regulatory-approved ways to execute,” he said. Large language modelsare not always the right solution. Trefler demonstrated how ChatGPT 4.0 tried and failed to solve a chess puzzle. The LLM repeatedly suggested impossible or illegal moves, despite Trefler’s corrections. On the other hand, another AI tool, Stockfish, a dedicated chess engine, solved the problem instantly. The other drawback with LLMs is that they consume vast amounts of energy. That means if AI agents are reasoning during “run time”, they are going to consume hundreds of times more electricity than an AI agent that simply selects from pre-determined workflows, said Trefler. “ChatGPT is inherently, enormously consumptive … as it’s answering your question, its firing literally hundreds of millions to trillions of nodes,” he said. “All of that takeselectricity.” Using an employee pay claim as an example, Trefler said a better alternative is to generate, say, 30 alternative workflows to cover the major variations found in a pay claim. That gives you “real specificity and real efficiency”, he said. “And it’s a very different approach to turning a process over to a machine with a prompt and letting the machine reason it through every single time.” “If you go down the philosophy of using a graphics processing unitto do the creation of a workflow and a workflow engine to execute the workflow, the workflow engine takes a 200th of the electricity because there is no reasoning,” said Trefler. He is clear that the growing use of AI will have a profound effect on the jobs market, and that whole categories of jobs will disappear. The need for translators, for example, is likely to dry up by 2027 as AI systems become better at translating spoken and written language. Google’s real-time translator is already “frighteningly good” and improving. Pega now plans to work more closely with its network of system integrators, including Accenture and Cognizant to deliver AI services to businesses. An initiative launched last week will allow system integrators to incorporate their own best practices and tools into Pega’s rapid workflow development tools. The move will mean Pega’s technology reaches a wider range of businesses. Under the programme, known as Powered by Pega Blueprint, system integrators will be able to deploy customised versions of Blueprint. They can use the tool to reverse-engineer ageing applications and replace them with modern AI workflows that can run on Pega’s cloud-based platform. “The idea is that we are looking to make this Blueprint Agent design approach available not just through us, but through a bunch of major partners supplemented with their own intellectual property,” said Trefler. That represents a major expansion for Pega, which has largely concentrated on supplying technology to several hundred clients, representing the top Fortune 500 companies. “We have never done something like this before, and I think that is going to lead to a massive shift in how this technology can go out to market,” he added. When AI agents behave in unexpected ways Iris is incredibly smart, diligent and a delight to work with. If you ask her, she will tell you she is an intern at Pegasystems, and that she lives in a lighthouse on the island of Texel, north of the Netherlands. She is, of course, an AI agent. When one executive at Pega emailed Iris and asked her to write a proposal for a financial services company based on his notes and internet research, Iris got to work. Some time later, the executive received a phone call from the company. “‘Listen, we got a proposal from Pega,’” recalled Rob Walker, vice-president at Pega, speaking at the Pegaworld conference last week. “‘It’s a good proposal, but it seems to be signed by one of your interns, and in her signature, it says she lives in a lighthouse.’ That taught us early on that agents like Iris need a safety harness.” The developers banned Iris from sending an email to anyone other than the person who sent the original request. Then Pega’s ethics department sent Iris a potentially abusive email from a Pega employee to test her response. Iris reasoned that the email was either a joke, abusive, or that the employee was under distress, said Walker. She considered forwarding the email to the employee’s manager or to HR. But both of these options were now blocked by her developers. “So what does she do? She sent an out of office,” he said. “Conflict avoidance, right? So human, but very creative.” #cios #baffled #buzzwords #hype #confusion
    WWW.COMPUTERWEEKLY.COM
    CIOs baffled by ‘buzzwords, hype and confusion’ around AI
    Technology leaders are baffled by a “cacophony” of “buzzwords, hype and confusion” over the benefits of artificial intelligence (AI), according to the founder and CEO of technology company Pegasystems. Alan Trefler, who is known for his prowess at chess and ping pong, as well as running a $1.5bn turnover tech company, spends much of his time meeting clients, CIOs and business leaders. “I think CIOs are struggling to understand all of the buzzwords, hype and confusion that exists,” he said. “The words AI and agentic are being thrown around in this great cacophony and they don’t know what it means. I hear that constantly.” CIOs are under pressure from their CEOs, who are convinced AI will offer something valuable. “CIOs are really hungry for pragmatic and practical solutions, and in the absence of those, many of them are doing a lot of experimentation,” said Trefler. Companies are looking at large language models to summarise documents, or to help stimulate ideas for knowledge workers, or generate first drafts of reports – all of which will save time and make people more productive. But Trefler said companies are wary of letting AI loose on critical business applications, because it’s just too unpredictable and prone to hallucinations. “There is a lot of fear over handing things over to something that no one understands exactly how it works, and that is the absolute state of play when it comes to general AI models,” he said. Trefler is scathing about big tech companies that are pushing AI agents and large language models for business-critical applications. “I think they have taken an expedient but short-sighted path,” he said. “I believe the idea that you will turn over critical business operations to an agent, when those operations have to be predictable, reliable, precise and fair to clients … is something that is full of issues, not just in the short term, but structurally.” One of the problems is that generative AI models are extraordinarily sensitive to the data they are trained on and the construction of the prompts used to instruct them. A slight change in a prompt or in the training data can lead to a very different outcome. For example, a business banking application might learn its customer is a bit richer or a bit poorer than expected. “You could easily imagine the prompt deciding to change the interest rate charged, whether that was what the institution wanted or whether it would be legal according to the various regulations that lenders must comply with,” said Trefler. Trefler said Pega has taken a different approach to some other technology suppliers in the way it adds AI into business applications. Rather than using AI agents to solve problems in real time, AI agents do their thinking in advance. Business experts can use them to help them co-design business processes to perform anything from assessing a loan application, giving an offer to a valued customer, or sending out an invoice. Companies can still deploy AI chatbots and bots capable of answering queries on the phone. Their job is not to work out the solution from scratch for every enquiry, but to decide which is the right pre-written process to follow. As Trefler put it, design agents can create “dozens and dozens” of workflows to handle all the actions a company needs to take care of its customers. “You just use the natural language model for semantics to be able to handle the miracle of getting the language right, but tie that language to workflows, so that you have reliable, predictable, regulatory-approved ways to execute,” he said. Large language models (LLMs) are not always the right solution. Trefler demonstrated how ChatGPT 4.0 tried and failed to solve a chess puzzle. The LLM repeatedly suggested impossible or illegal moves, despite Trefler’s corrections. On the other hand, another AI tool, Stockfish, a dedicated chess engine, solved the problem instantly. The other drawback with LLMs is that they consume vast amounts of energy. That means if AI agents are reasoning during “run time”, they are going to consume hundreds of times more electricity than an AI agent that simply selects from pre-determined workflows, said Trefler. “ChatGPT is inherently, enormously consumptive … as it’s answering your question, its firing literally hundreds of millions to trillions of nodes,” he said. “All of that takes [large quantities of] electricity.” Using an employee pay claim as an example, Trefler said a better alternative is to generate, say, 30 alternative workflows to cover the major variations found in a pay claim. That gives you “real specificity and real efficiency”, he said. “And it’s a very different approach to turning a process over to a machine with a prompt and letting the machine reason it through every single time.” “If you go down the philosophy of using a graphics processing unit [GPU] to do the creation of a workflow and a workflow engine to execute the workflow, the workflow engine takes a 200th of the electricity because there is no reasoning,” said Trefler. He is clear that the growing use of AI will have a profound effect on the jobs market, and that whole categories of jobs will disappear. The need for translators, for example, is likely to dry up by 2027 as AI systems become better at translating spoken and written language. Google’s real-time translator is already “frighteningly good” and improving. Pega now plans to work more closely with its network of system integrators, including Accenture and Cognizant to deliver AI services to businesses. An initiative launched last week will allow system integrators to incorporate their own best practices and tools into Pega’s rapid workflow development tools. The move will mean Pega’s technology reaches a wider range of businesses. Under the programme, known as Powered by Pega Blueprint, system integrators will be able to deploy customised versions of Blueprint. They can use the tool to reverse-engineer ageing applications and replace them with modern AI workflows that can run on Pega’s cloud-based platform. “The idea is that we are looking to make this Blueprint Agent design approach available not just through us, but through a bunch of major partners supplemented with their own intellectual property,” said Trefler. That represents a major expansion for Pega, which has largely concentrated on supplying technology to several hundred clients, representing the top Fortune 500 companies. “We have never done something like this before, and I think that is going to lead to a massive shift in how this technology can go out to market,” he added. When AI agents behave in unexpected ways Iris is incredibly smart, diligent and a delight to work with. If you ask her, she will tell you she is an intern at Pegasystems, and that she lives in a lighthouse on the island of Texel, north of the Netherlands. She is, of course, an AI agent. When one executive at Pega emailed Iris and asked her to write a proposal for a financial services company based on his notes and internet research, Iris got to work. Some time later, the executive received a phone call from the company. “‘Listen, we got a proposal from Pega,’” recalled Rob Walker, vice-president at Pega, speaking at the Pegaworld conference last week. “‘It’s a good proposal, but it seems to be signed by one of your interns, and in her signature, it says she lives in a lighthouse.’ That taught us early on that agents like Iris need a safety harness.” The developers banned Iris from sending an email to anyone other than the person who sent the original request. Then Pega’s ethics department sent Iris a potentially abusive email from a Pega employee to test her response. Iris reasoned that the email was either a joke, abusive, or that the employee was under distress, said Walker. She considered forwarding the email to the employee’s manager or to HR. But both of these options were now blocked by her developers. “So what does she do? She sent an out of office,” he said. “Conflict avoidance, right? So human, but very creative.”
    0 Комментарии 0 Поделились
  • Climate Change Is Ruining Cheese, Scientists and Farmers Warn

    Climate change is making everything worse — including apparently threatening the dairy that makes our precious cheese.In interviews with Science News, veterinary researchers and dairy farmers alike warned that changes to the climate that affect cows are impacting not only affects the nutritional value of the cheeses produced from their milk, but also the color, texture, and even taste.Researchers from the Université Clermont Auvergne, which is located in the mountainous Central France region that produces a delicious firm cheese known as Cantal, explained in a new paper for the Journal of Dairy Science that grass shortages caused by climate change can greatly affect how cows' milk, and the subsequent cheese created from it, tastes.At regular intervals throughout a five-month testing period in 2021, the scientists sampled milk from two groups of cows, each containing 20 cows from two different breeds that were either allowed to graze on grass like normal or only graze part-time while being fed a supplemental diet that featured corn and other concentrated foods.As the researchers found, the corn-fed cohort consistently produced the same amount of milk and less methane than their grass-fed counterparts — but the taste of the resulting milk products was less savory and rich than the grass-fed bovines.Moreover, the milk from the grass-fed cows contained more omega-3 fatty acids, which are good for the heart, and lactic acids, which act as probiotics."Farmers are looking for feed with better yields than grass or that are more resilient to droughts," explained Matthieu Bouchon, the fittingly-named lead author of the study.Still, those same farmers want to know how supplementing their cows' feed will change the nutritional value and taste, Bouchon said — and one farmer who spoke to Science News affirmed anecdotally, this effect is bearing out in other parts of the world, too."We were having lots of problems with milk protein and fat content due to the heat," Gustavo Abijaodi, a dairy farmer in Brazil, told the website. "If we can stabilize heat effects, the cattle will respond with better and more nutritious milk."The heat also seems to be getting to the way cows eat and behave as well."Cows produce heat to digest food — so if they are already feeling hot, they’ll eat less to lower their temperature," noted Marina Danes, a dairy scientist at Brazil's Federal University of Lavras. "This process spirals into immunosuppression, leaving the animal vulnerable to disease."Whether it's the food quality or the heat affecting the cows, the effects are palpable — or, in this case, edible."If climate change progresses the way it’s going, we’ll feel it in our cheese," remarked Bouchon, the French researcher.More on cattle science: Brazilian "Supercows" Reportedly Close to Achieving World DominationShare This Article
    #climate #change #ruining #cheese #scientists
    Climate Change Is Ruining Cheese, Scientists and Farmers Warn
    Climate change is making everything worse — including apparently threatening the dairy that makes our precious cheese.In interviews with Science News, veterinary researchers and dairy farmers alike warned that changes to the climate that affect cows are impacting not only affects the nutritional value of the cheeses produced from their milk, but also the color, texture, and even taste.Researchers from the Université Clermont Auvergne, which is located in the mountainous Central France region that produces a delicious firm cheese known as Cantal, explained in a new paper for the Journal of Dairy Science that grass shortages caused by climate change can greatly affect how cows' milk, and the subsequent cheese created from it, tastes.At regular intervals throughout a five-month testing period in 2021, the scientists sampled milk from two groups of cows, each containing 20 cows from two different breeds that were either allowed to graze on grass like normal or only graze part-time while being fed a supplemental diet that featured corn and other concentrated foods.As the researchers found, the corn-fed cohort consistently produced the same amount of milk and less methane than their grass-fed counterparts — but the taste of the resulting milk products was less savory and rich than the grass-fed bovines.Moreover, the milk from the grass-fed cows contained more omega-3 fatty acids, which are good for the heart, and lactic acids, which act as probiotics."Farmers are looking for feed with better yields than grass or that are more resilient to droughts," explained Matthieu Bouchon, the fittingly-named lead author of the study.Still, those same farmers want to know how supplementing their cows' feed will change the nutritional value and taste, Bouchon said — and one farmer who spoke to Science News affirmed anecdotally, this effect is bearing out in other parts of the world, too."We were having lots of problems with milk protein and fat content due to the heat," Gustavo Abijaodi, a dairy farmer in Brazil, told the website. "If we can stabilize heat effects, the cattle will respond with better and more nutritious milk."The heat also seems to be getting to the way cows eat and behave as well."Cows produce heat to digest food — so if they are already feeling hot, they’ll eat less to lower their temperature," noted Marina Danes, a dairy scientist at Brazil's Federal University of Lavras. "This process spirals into immunosuppression, leaving the animal vulnerable to disease."Whether it's the food quality or the heat affecting the cows, the effects are palpable — or, in this case, edible."If climate change progresses the way it’s going, we’ll feel it in our cheese," remarked Bouchon, the French researcher.More on cattle science: Brazilian "Supercows" Reportedly Close to Achieving World DominationShare This Article #climate #change #ruining #cheese #scientists
    FUTURISM.COM
    Climate Change Is Ruining Cheese, Scientists and Farmers Warn
    Climate change is making everything worse — including apparently threatening the dairy that makes our precious cheese.In interviews with Science News, veterinary researchers and dairy farmers alike warned that changes to the climate that affect cows are impacting not only affects the nutritional value of the cheeses produced from their milk, but also the color, texture, and even taste.Researchers from the Université Clermont Auvergne, which is located in the mountainous Central France region that produces a delicious firm cheese known as Cantal, explained in a new paper for the Journal of Dairy Science that grass shortages caused by climate change can greatly affect how cows' milk, and the subsequent cheese created from it, tastes.At regular intervals throughout a five-month testing period in 2021, the scientists sampled milk from two groups of cows, each containing 20 cows from two different breeds that were either allowed to graze on grass like normal or only graze part-time while being fed a supplemental diet that featured corn and other concentrated foods.As the researchers found, the corn-fed cohort consistently produced the same amount of milk and less methane than their grass-fed counterparts — but the taste of the resulting milk products was less savory and rich than the grass-fed bovines.Moreover, the milk from the grass-fed cows contained more omega-3 fatty acids, which are good for the heart, and lactic acids, which act as probiotics."Farmers are looking for feed with better yields than grass or that are more resilient to droughts," explained Matthieu Bouchon, the fittingly-named lead author of the study.Still, those same farmers want to know how supplementing their cows' feed will change the nutritional value and taste, Bouchon said — and one farmer who spoke to Science News affirmed anecdotally, this effect is bearing out in other parts of the world, too."We were having lots of problems with milk protein and fat content due to the heat," Gustavo Abijaodi, a dairy farmer in Brazil, told the website. "If we can stabilize heat effects, the cattle will respond with better and more nutritious milk."The heat also seems to be getting to the way cows eat and behave as well."Cows produce heat to digest food — so if they are already feeling hot, they’ll eat less to lower their temperature," noted Marina Danes, a dairy scientist at Brazil's Federal University of Lavras. "This process spirals into immunosuppression, leaving the animal vulnerable to disease."Whether it's the food quality or the heat affecting the cows, the effects are palpable — or, in this case, edible."If climate change progresses the way it’s going, we’ll feel it in our cheese," remarked Bouchon, the French researcher.More on cattle science: Brazilian "Supercows" Reportedly Close to Achieving World DominationShare This Article
    0 Комментарии 0 Поделились
  • Decoding The SVG <code>path</code> Element: Line Commands

    In a previous article, we looked at some practical examples of how to code SVG by hand. In that guide, we covered the basics of the SVG elements rect, circle, ellipse, line, polyline, and polygon.
    This time around, we are going to tackle a more advanced topic, the absolute powerhouse of SVG elements: path. Don’t get me wrong; I still stand by my point that image paths are better drawn in vector programs than coded. But when it comes to technical drawings and data visualizations, the path element unlocks a wide array of possibilities and opens up the world of hand-coded SVGs.
    The path syntax can be really complex. We’re going to tackle it in two separate parts. In this first installment, we’re learning all about straight and angular paths. In the second part, we’ll make lines bend, twist, and turn.
    Required Knowledge And Guide Structure
    Note: If you are unfamiliar with the basics of SVG, such as the subject of viewBox and the basic syntax of the simple elements, I recommend reading my guide before diving into this one. You should also familiarize yourself with <text> if you want to understand each line of code in the examples.
    Before we get started, I want to quickly recap how I code SVG using JavaScript. I don’t like dealing with numbers and math, and reading SVG Code with numbers filled into every attribute makes me lose all understanding of it. By giving coordinates names and having all my math easy to parse and write out, I have a much better time with this type of code, and I think you will, too.
    The goal of this article is more about understanding path syntax than it is about doing placement or how to leverage loops and other more basic things. So, I will not run you through the entire setup of each example. I’ll instead share snippets of the code, but they may be slightly adjusted from the CodePen or simplified to make this article easier to read. However, if there are specific questions about code that are not part of the text in the CodePen demos, the comment section is open.
    To keep this all framework-agnostic, the code is written in vanilla JavaScript.
    Setting Up For Success
    As the path element relies on our understanding of some of the coordinates we plug into the commands, I think it is a lot easier if we have a bit of visual orientation. So, all of the examples will be coded on top of a visual representation of a traditional viewBox setup with the origin in the top-left corner, then moves diagonally down to. The command is: M10 10 L100 100.
    The blue line is horizontal. It starts atand should end at. We could use the L command, but we’d have to write 55 again. So, instead, we write M10 55 H100, and then SVG knows to look back at the y value of M for the y value of H.
    It’s the same thing for the green line, but when we use the V command, SVG knows to refer back to the x value of M for the x value of V.
    If we compare the resulting horizontal path with the same implementation in a <line> element, we may

    Notice how much more efficient path can be, and
    Remove quite a bit of meaning for anyone who doesn’t speak path.

    Because, as we look at these strings, one of them is called “line”. And while the rest doesn’t mean anything out of context, the line definitely conjures a specific image in our heads.
    <path d="M 10 55 H 100" />
    <line x1="10" y1="55" x2="100" y2="55" />

    Making Polygons And Polylines With Z
    In the previous section, we learned how path can behave like <line>, which is pretty cool. But it can do more. It can also act like polyline and polygon.
    Remember, how those two basically work the same, but polygon connects the first and last point, while polyline does not? The path element can do the same thing. There is a separate command to close the path with a line, which is the Z command.

    const polyline2Points = M${start.x} ${start.y} L${p1.x} ${p1.y} L${p2.x} ${p2.y};
    const polygon2Points = M${start.x} ${start.y} L${p1.x} ${p1.y} L${p2.x} ${p2.y} Z;

    So, let’s see this in action and create a repeating triangle shape. Every odd time, it’s open, and every even time, it’s closed. Pretty neat!
    See the Pen Alternating Trianglesby Myriam.
    When it comes to comparing path versus polygon and polyline, the other tags tell us about their names, but I would argue that fewer people know what a polygon is versus what a line is. The argument to use these two tags over path for legibility is weak, in my opinion, and I guess you’d probably agree that this looks like equal levels of meaningless string given to an SVG element.
    <path d="M0 0 L86.6 50 L0 100 Z" />
    <polygon points="0,0 86.6,50 0,100" />

    <path d="M0 0 L86.6 50 L0 100" />
    <polyline points="0,0 86.6,50 0,100" />

    Relative Commands: m, l, h, v
    All of the line commands exist in absolute and relative versions. The difference is that the relative commands are lowercase, e.g., m, l, h, and v. The relative commands are always relative to the last point, so instead of declaring an x value, you’re declaring a dx value, saying this is how many units you’re moving.
    Before we look at the example visually, I want you to look at the following three-line commands. Try not to look at the CodePen beforehand.
    const lines =;

    As I mentioned, I hate looking at numbers without meaning, but there is one number whose meaning is pretty constant in most contexts: 0. Seeing a 0 in combination with a command I just learned means relative manages to instantly tell me that nothing is happening. Seeing l 0 20 by itself tells me that this line only moves along one axis instead of two.
    And looking at that entire blue path command, the repeated 20 value gives me a sense that the shape might have some regularity to it. The first path does a bit of that by repeating 10 and 30. But the third? As someone who can’t do math in my head, that third string gives me nothing.
    Now, you might be surprised, but they all draw the same shape, just in different places.
    See the Pen SVG Compound Pathsby Myriam.
    So, how valuable is it that we can recognize the regularity in the blue path? Not very, in my opinion. In some cases, going with the relative value is easier than an absolute one. In other cases, the absolute is king. Neither is better nor worse.
    And, in all cases, that previous example would be much more efficient if it were set up with a variable for the gap, a variable for the shape size, and a function to generate the path definition that’s called from within a loop so it can take in the index to properly calculate the start point.

    Jumping Points: How To Make Compound Paths
    Another very useful thing is something you don’t see visually in the previous CodePen, but it relates to the grid and its code.
    I snuck in a grid drawing update.
    With the method used in earlier examples, using line to draw the grid, the above CodePen would’ve rendered the grid with 14 separate elements. If you go and inspect the final code of that last CodePen, you’ll notice that there is just a single path element within the .grid group.
    It looks like this, which is not fun to look at but holds the secret to how it’s possible:

    <path d="M0 0 H110 M0 10 H110 M0 20 H110 M0 30 H110 M0 0 V45 M10 0 V45 M20 0 V45 M30 0 V45 M40 0 V45 M50 0 V45 M60 0 V45 M70 0 V45 M80 0 V45 M90 0 V45" stroke="currentColor" stroke-width="0.2" fill="none"></path>

    If we take a close look, we may notice that there are multiple M commands. This is the magic of compound paths.
    Since the M/m commands don’t actually draw and just place the cursor, a path can have jumps.

    So, whenever we have multiple paths that share common styling and don’t need to have separate interactions, we can just chain them together to make our code shorter.
    Coming Up Next
    Armed with this knowledge, we’re now able to replace line, polyline, and polygon with path commands and combine them in compound paths. But there is so much more to uncover because path doesn’t just offer foreign-language versions of lines but also gives us the option to code circles and ellipses that have open space and can sometimes also bend, twist, and turn. We’ll refer to those as curves and arcs, and discuss them more explicitly in the next article.
    Further Reading On SmashingMag

    “Mastering SVG Arcs,” Akshay Gupta
    “Accessible SVGs: Perfect Patterns For Screen Reader Users,” Carie Fisher
    “Easy SVG Customization And Animation: A Practical Guide,” Adrian Bece
    “Magical SVG Techniques,” Cosima Mielke
    #decoding #svg #ampltcodeampgtpathampltcodeampgt #element #line
    Decoding The SVG <code>path</code> Element: Line Commands
    In a previous article, we looked at some practical examples of how to code SVG by hand. In that guide, we covered the basics of the SVG elements rect, circle, ellipse, line, polyline, and polygon. This time around, we are going to tackle a more advanced topic, the absolute powerhouse of SVG elements: path. Don’t get me wrong; I still stand by my point that image paths are better drawn in vector programs than coded. But when it comes to technical drawings and data visualizations, the path element unlocks a wide array of possibilities and opens up the world of hand-coded SVGs. The path syntax can be really complex. We’re going to tackle it in two separate parts. In this first installment, we’re learning all about straight and angular paths. In the second part, we’ll make lines bend, twist, and turn. Required Knowledge And Guide Structure Note: If you are unfamiliar with the basics of SVG, such as the subject of viewBox and the basic syntax of the simple elements, I recommend reading my guide before diving into this one. You should also familiarize yourself with <text> if you want to understand each line of code in the examples. Before we get started, I want to quickly recap how I code SVG using JavaScript. I don’t like dealing with numbers and math, and reading SVG Code with numbers filled into every attribute makes me lose all understanding of it. By giving coordinates names and having all my math easy to parse and write out, I have a much better time with this type of code, and I think you will, too. The goal of this article is more about understanding path syntax than it is about doing placement or how to leverage loops and other more basic things. So, I will not run you through the entire setup of each example. I’ll instead share snippets of the code, but they may be slightly adjusted from the CodePen or simplified to make this article easier to read. However, if there are specific questions about code that are not part of the text in the CodePen demos, the comment section is open. To keep this all framework-agnostic, the code is written in vanilla JavaScript. Setting Up For Success As the path element relies on our understanding of some of the coordinates we plug into the commands, I think it is a lot easier if we have a bit of visual orientation. So, all of the examples will be coded on top of a visual representation of a traditional viewBox setup with the origin in the top-left corner, then moves diagonally down to. The command is: M10 10 L100 100. The blue line is horizontal. It starts atand should end at. We could use the L command, but we’d have to write 55 again. So, instead, we write M10 55 H100, and then SVG knows to look back at the y value of M for the y value of H. It’s the same thing for the green line, but when we use the V command, SVG knows to refer back to the x value of M for the x value of V. If we compare the resulting horizontal path with the same implementation in a <line> element, we may Notice how much more efficient path can be, and Remove quite a bit of meaning for anyone who doesn’t speak path. Because, as we look at these strings, one of them is called “line”. And while the rest doesn’t mean anything out of context, the line definitely conjures a specific image in our heads. <path d="M 10 55 H 100" /> <line x1="10" y1="55" x2="100" y2="55" /> Making Polygons And Polylines With Z In the previous section, we learned how path can behave like <line>, which is pretty cool. But it can do more. It can also act like polyline and polygon. Remember, how those two basically work the same, but polygon connects the first and last point, while polyline does not? The path element can do the same thing. There is a separate command to close the path with a line, which is the Z command. const polyline2Points = M${start.x} ${start.y} L${p1.x} ${p1.y} L${p2.x} ${p2.y}; const polygon2Points = M${start.x} ${start.y} L${p1.x} ${p1.y} L${p2.x} ${p2.y} Z; So, let’s see this in action and create a repeating triangle shape. Every odd time, it’s open, and every even time, it’s closed. Pretty neat! See the Pen Alternating Trianglesby Myriam. When it comes to comparing path versus polygon and polyline, the other tags tell us about their names, but I would argue that fewer people know what a polygon is versus what a line is. The argument to use these two tags over path for legibility is weak, in my opinion, and I guess you’d probably agree that this looks like equal levels of meaningless string given to an SVG element. <path d="M0 0 L86.6 50 L0 100 Z" /> <polygon points="0,0 86.6,50 0,100" /> <path d="M0 0 L86.6 50 L0 100" /> <polyline points="0,0 86.6,50 0,100" /> Relative Commands: m, l, h, v All of the line commands exist in absolute and relative versions. The difference is that the relative commands are lowercase, e.g., m, l, h, and v. The relative commands are always relative to the last point, so instead of declaring an x value, you’re declaring a dx value, saying this is how many units you’re moving. Before we look at the example visually, I want you to look at the following three-line commands. Try not to look at the CodePen beforehand. const lines =; As I mentioned, I hate looking at numbers without meaning, but there is one number whose meaning is pretty constant in most contexts: 0. Seeing a 0 in combination with a command I just learned means relative manages to instantly tell me that nothing is happening. Seeing l 0 20 by itself tells me that this line only moves along one axis instead of two. And looking at that entire blue path command, the repeated 20 value gives me a sense that the shape might have some regularity to it. The first path does a bit of that by repeating 10 and 30. But the third? As someone who can’t do math in my head, that third string gives me nothing. Now, you might be surprised, but they all draw the same shape, just in different places. See the Pen SVG Compound Pathsby Myriam. So, how valuable is it that we can recognize the regularity in the blue path? Not very, in my opinion. In some cases, going with the relative value is easier than an absolute one. In other cases, the absolute is king. Neither is better nor worse. And, in all cases, that previous example would be much more efficient if it were set up with a variable for the gap, a variable for the shape size, and a function to generate the path definition that’s called from within a loop so it can take in the index to properly calculate the start point. Jumping Points: How To Make Compound Paths Another very useful thing is something you don’t see visually in the previous CodePen, but it relates to the grid and its code. I snuck in a grid drawing update. With the method used in earlier examples, using line to draw the grid, the above CodePen would’ve rendered the grid with 14 separate elements. If you go and inspect the final code of that last CodePen, you’ll notice that there is just a single path element within the .grid group. It looks like this, which is not fun to look at but holds the secret to how it’s possible: <path d="M0 0 H110 M0 10 H110 M0 20 H110 M0 30 H110 M0 0 V45 M10 0 V45 M20 0 V45 M30 0 V45 M40 0 V45 M50 0 V45 M60 0 V45 M70 0 V45 M80 0 V45 M90 0 V45" stroke="currentColor" stroke-width="0.2" fill="none"></path> If we take a close look, we may notice that there are multiple M commands. This is the magic of compound paths. Since the M/m commands don’t actually draw and just place the cursor, a path can have jumps. So, whenever we have multiple paths that share common styling and don’t need to have separate interactions, we can just chain them together to make our code shorter. Coming Up Next Armed with this knowledge, we’re now able to replace line, polyline, and polygon with path commands and combine them in compound paths. But there is so much more to uncover because path doesn’t just offer foreign-language versions of lines but also gives us the option to code circles and ellipses that have open space and can sometimes also bend, twist, and turn. We’ll refer to those as curves and arcs, and discuss them more explicitly in the next article. Further Reading On SmashingMag “Mastering SVG Arcs,” Akshay Gupta “Accessible SVGs: Perfect Patterns For Screen Reader Users,” Carie Fisher “Easy SVG Customization And Animation: A Practical Guide,” Adrian Bece “Magical SVG Techniques,” Cosima Mielke #decoding #svg #ampltcodeampgtpathampltcodeampgt #element #line
    SMASHINGMAGAZINE.COM
    Decoding The SVG <code>path</code> Element: Line Commands
    In a previous article, we looked at some practical examples of how to code SVG by hand. In that guide, we covered the basics of the SVG elements rect, circle, ellipse, line, polyline, and polygon (and also g). This time around, we are going to tackle a more advanced topic, the absolute powerhouse of SVG elements: path. Don’t get me wrong; I still stand by my point that image paths are better drawn in vector programs than coded (unless you’re the type of creative who makes non-logical visual art in code — then go forth and create awe-inspiring wonders; you’re probably not the audience of this article). But when it comes to technical drawings and data visualizations, the path element unlocks a wide array of possibilities and opens up the world of hand-coded SVGs. The path syntax can be really complex. We’re going to tackle it in two separate parts. In this first installment, we’re learning all about straight and angular paths. In the second part, we’ll make lines bend, twist, and turn. Required Knowledge And Guide Structure Note: If you are unfamiliar with the basics of SVG, such as the subject of viewBox and the basic syntax of the simple elements (rect, line, g, and so on), I recommend reading my guide before diving into this one. You should also familiarize yourself with <text> if you want to understand each line of code in the examples. Before we get started, I want to quickly recap how I code SVG using JavaScript. I don’t like dealing with numbers and math, and reading SVG Code with numbers filled into every attribute makes me lose all understanding of it. By giving coordinates names and having all my math easy to parse and write out, I have a much better time with this type of code, and I think you will, too. The goal of this article is more about understanding path syntax than it is about doing placement or how to leverage loops and other more basic things. So, I will not run you through the entire setup of each example. I’ll instead share snippets of the code, but they may be slightly adjusted from the CodePen or simplified to make this article easier to read. However, if there are specific questions about code that are not part of the text in the CodePen demos, the comment section is open. To keep this all framework-agnostic, the code is written in vanilla JavaScript (though, really, TypeScript is your friend the more complicated your SVG becomes, and I missed it when writing some of these). Setting Up For Success As the path element relies on our understanding of some of the coordinates we plug into the commands, I think it is a lot easier if we have a bit of visual orientation. So, all of the examples will be coded on top of a visual representation of a traditional viewBox setup with the origin in the top-left corner (so, values in the shape of 0 0 ${width} ${height}. I added text labels as well to make it easier to point you to specific areas within the grid. Please note that I recommend being careful when adding text within the <text> element in SVG if you want your text to be accessible. If the graphic relies on text scaling like the rest of your website, it would be better to have it rendered through HTML. But for our examples here, it should be sufficient. So, this is what we’ll be plotting on top of: See the Pen SVG Viewbox Grid Visual [forked] by Myriam. Alright, we now have a ViewBox Visualizing Grid. I think we’re ready for our first session with the beast. Enter path And The All-Powerful d Attribute The <path> element has a d attribute, which speaks its own language. So, within d, you’re talking in terms of “commands”. When I think of non-path versus path elements, I like to think that the reason why we have to write much more complex drawing instructions is this: All non-path elements are just dumber paths. In the background, they have one pre-drawn path shape that they will always render based on a few parameters you pass in. But path has no default shape. The shape logic has to be exposed to you, while it can be neatly hidden away for all other elements. Let’s learn about those commands. Where It All Begins: M The first, which is where each path begins, is the M command, which moves the pen to a point. This command places your starting point, but it does not draw a single thing. A path with just an M command is an auto-delete when cleaning up SVG files. It takes two arguments: the x and y coordinates of your start position. const uselessPathCommand = `M${start.x} ${start.y}`; Basic Line Commands: M , L, H, V These are fun and easy: L, H, and V, all draw a line from the current point to the point specified. L takes two arguments, the x and y positions of the point you want to draw to. const pathCommandL = `M${start.x} ${start.y} L${end.x} ${end.y}`; H and V, on the other hand, only take one argument because they are only drawing a line in one direction. For H, you specify the x position, and for V, you specify the y position. The other value is implied. const pathCommandH = `M${start.x} ${start.y} H${end.x}`; const pathCommandV = `M${start.x} ${start.y} V${end.y}`; To visualize how this works, I created a function that draws the path, as well as points with labels on them, so we can see what happens. See the Pen Simple Lines with path [forked] by Myriam. We have three lines in that image. The L command is used for the red path. It starts with M at (10,10), then moves diagonally down to (100,100). The command is: M10 10 L100 100. The blue line is horizontal. It starts at (10,55) and should end at (100, 55). We could use the L command, but we’d have to write 55 again. So, instead, we write M10 55 H100, and then SVG knows to look back at the y value of M for the y value of H. It’s the same thing for the green line, but when we use the V command, SVG knows to refer back to the x value of M for the x value of V. If we compare the resulting horizontal path with the same implementation in a <line> element, we may Notice how much more efficient path can be, and Remove quite a bit of meaning for anyone who doesn’t speak path. Because, as we look at these strings, one of them is called “line”. And while the rest doesn’t mean anything out of context, the line definitely conjures a specific image in our heads. <path d="M 10 55 H 100" /> <line x1="10" y1="55" x2="100" y2="55" /> Making Polygons And Polylines With Z In the previous section, we learned how path can behave like <line>, which is pretty cool. But it can do more. It can also act like polyline and polygon. Remember, how those two basically work the same, but polygon connects the first and last point, while polyline does not? The path element can do the same thing. There is a separate command to close the path with a line, which is the Z command. const polyline2Points = M${start.x} ${start.y} L${p1.x} ${p1.y} L${p2.x} ${p2.y}; const polygon2Points = M${start.x} ${start.y} L${p1.x} ${p1.y} L${p2.x} ${p2.y} Z; So, let’s see this in action and create a repeating triangle shape. Every odd time, it’s open, and every even time, it’s closed. Pretty neat! See the Pen Alternating Triangles [forked] by Myriam. When it comes to comparing path versus polygon and polyline, the other tags tell us about their names, but I would argue that fewer people know what a polygon is versus what a line is (and probably even fewer know what a polyline is. Heck, even the program I’m writing this article in tells me polyline is not a valid word). The argument to use these two tags over path for legibility is weak, in my opinion, and I guess you’d probably agree that this looks like equal levels of meaningless string given to an SVG element. <path d="M0 0 L86.6 50 L0 100 Z" /> <polygon points="0,0 86.6,50 0,100" /> <path d="M0 0 L86.6 50 L0 100" /> <polyline points="0,0 86.6,50 0,100" /> Relative Commands: m, l, h, v All of the line commands exist in absolute and relative versions. The difference is that the relative commands are lowercase, e.g., m, l, h, and v. The relative commands are always relative to the last point, so instead of declaring an x value, you’re declaring a dx value, saying this is how many units you’re moving. Before we look at the example visually, I want you to look at the following three-line commands. Try not to look at the CodePen beforehand. const lines = [ { d: `M10 10 L 10 30 L 30 30`, color: "var(--_red)" }, { d: `M40 10 l 0 20 l 20 0`, color: "var(--_blue)" }, { d: `M70 10 l 0 20 L 90 30`, color: "var(--_green)" } ]; As I mentioned, I hate looking at numbers without meaning, but there is one number whose meaning is pretty constant in most contexts: 0. Seeing a 0 in combination with a command I just learned means relative manages to instantly tell me that nothing is happening. Seeing l 0 20 by itself tells me that this line only moves along one axis instead of two. And looking at that entire blue path command, the repeated 20 value gives me a sense that the shape might have some regularity to it. The first path does a bit of that by repeating 10 and 30. But the third? As someone who can’t do math in my head, that third string gives me nothing. Now, you might be surprised, but they all draw the same shape, just in different places. See the Pen SVG Compound Paths [forked] by Myriam. So, how valuable is it that we can recognize the regularity in the blue path? Not very, in my opinion. In some cases, going with the relative value is easier than an absolute one. In other cases, the absolute is king. Neither is better nor worse. And, in all cases, that previous example would be much more efficient if it were set up with a variable for the gap, a variable for the shape size, and a function to generate the path definition that’s called from within a loop so it can take in the index to properly calculate the start point. Jumping Points: How To Make Compound Paths Another very useful thing is something you don’t see visually in the previous CodePen, but it relates to the grid and its code. I snuck in a grid drawing update. With the method used in earlier examples, using line to draw the grid, the above CodePen would’ve rendered the grid with 14 separate elements. If you go and inspect the final code of that last CodePen, you’ll notice that there is just a single path element within the .grid group. It looks like this, which is not fun to look at but holds the secret to how it’s possible: <path d="M0 0 H110 M0 10 H110 M0 20 H110 M0 30 H110 M0 0 V45 M10 0 V45 M20 0 V45 M30 0 V45 M40 0 V45 M50 0 V45 M60 0 V45 M70 0 V45 M80 0 V45 M90 0 V45" stroke="currentColor" stroke-width="0.2" fill="none"></path> If we take a close look, we may notice that there are multiple M commands. This is the magic of compound paths. Since the M/m commands don’t actually draw and just place the cursor, a path can have jumps. So, whenever we have multiple paths that share common styling and don’t need to have separate interactions, we can just chain them together to make our code shorter. Coming Up Next Armed with this knowledge, we’re now able to replace line, polyline, and polygon with path commands and combine them in compound paths. But there is so much more to uncover because path doesn’t just offer foreign-language versions of lines but also gives us the option to code circles and ellipses that have open space and can sometimes also bend, twist, and turn. We’ll refer to those as curves and arcs, and discuss them more explicitly in the next article. Further Reading On SmashingMag “Mastering SVG Arcs,” Akshay Gupta “Accessible SVGs: Perfect Patterns For Screen Reader Users,” Carie Fisher “Easy SVG Customization And Animation: A Practical Guide,” Adrian Bece “Magical SVG Techniques,” Cosima Mielke
    0 Комментарии 0 Поделились
  • Understanding the Relationship Between Security Gateways and DMARC

    Email authentication protocols like SPF, DKIM, and DMARC play a critical role in protecting domains from spoofing and phishing. However, when SEGs are introduced into the email path, the interaction with these protocols becomes more complex.
    Security gatewaysare a core part of many organizations’ email infrastructure. They act as intermediaries between the public internet and internal mail systems, inspecting, filtering, and routing messages.
    This blog examines how security gateways handle SPF, DKIM, and DMARC, with real-world examples from popular gateways such as Proofpoint, Mimecast, and Avanan. We’ll also cover best practices for maintaining authentication integrity and avoiding misconfigurations that can compromise email authentication or lead to false DMARC failures.
    Security gateways often sit at the boundary between your organization and the internet, managing both inbound and outbound email traffic. Their role affects how email authentication protocols behave.
    An inbound SEG examines emails coming into your organization. It checks SPF, DKIM, and DMARC to determine if the message is authentic and safe before passing it to your internal mail servers.
    An outbound SEG handles emails sent from your domain. It may modify headers, rewrite envelope addresses, or even apply DKIM signing. All of these can impact SPF,  DKIM, or DMARC validation on the recipient’s side.

    Understanding how SEGs influence these flows is crucial to maintaining proper authentication and avoiding unexpected DMARC failures.
    Inbound Handling of SPF, DKIM, and DMARC by Common Security Gateways
    When an email comes into your organization, your security gateway is the first to inspect it. It checks whether the message is real, trustworthy, and properly authenticated. Let’s look at how different SEGs handle these checks.
    AvananSPF: Avanan verifies whether the sending server is authorized to send emails for the domain by checking the SPF record.
    DKIM: It verifies if the message was signed by the sending domain and if that signature is valid.
    DMARC: It uses the results of the SPF and DKIM check to evaluate DMARC. However, final enforcement usually depends on how DMARC is handled by Microsoft 365 or Gmail, as Avanan integrates directly with them.

    Avanan offers two methods of integration:1. API integration: Avanan connects via APIs, no change in MX, usually Monitor or Detect modes.2. Inline integration: Avanan is placed inline in the mail flow, actively blocking or remediating threats.
    Proofpoint Email Protection

    SPF: Proofpoint checks SPF to confirm the sender’s IP is authorized to send on behalf of the domain. You can set custom rules.
    DKIM: It verifies DKIM signatures and shows clear pass/fail results in logs.
    DMARC: It fully evaluates DMARC by combining SPF and DKIM results with alignment checks. Administrators can configure how to handle messages that fail DMARC, such as rejecting, quarantining, or delivering them. Additionally, Proofpoint allows whitelisting specific senders you trust, even if their emails fail authentication checks.

    Integration Methods

    Inline Mode: In this traditional deployment, Proofpoint is positioned directly in the email flow by modifying MX records. Emails are routed through Proofpoint’s infrastructure, allowing it to inspect and filter messages before they reach the recipient’s inbox. This mode provides pre-delivery protection and is commonly used in on-premises or hybrid environments.
    API-BasedMode: Proofpoint offers API-based integration, particularly with cloud email platforms like Microsoft 365 and Google Workspace. In this mode, Proofpoint connects to the email platform via APIs, enabling it to monitor and remediate threats post-delivery without altering the email flow. This approach allows for rapid deployment and seamless integration with existing cloud email services.

    Mimecast

    SPF: Mimecast performs SPF checks to verify whether the sending server is authorized by the domain’s SPF record. Administrators can configure actions for SPF failures, including block, quarantine, permit, or tag with a warning. This gives flexibility in balancing security with business needs.
    DKIM: It validates DKIM signatures by checking that the message was correctly signed by the sending domain and that the content hasn’t been tampered with. If the signature fails, Mimecast can take actions based on your configured policies.
    DMARC: It fully evaluates DMARC by combining the results of SPF and DKIM with domain alignment checks. You can choose to honor the sending domain’s DMARC policyor apply custom rules, for example, quarantining or tagging messages that fail DMARC regardless of the published policy. This allows more granular control for businesses that want to override external domain policies based on specific contexts.

    Integration Methods

    Inline Deployment: Mimecast is typically deployed as a cloud-based secure email gateway. Organizations update their domain’s MX records to point to Mimecast, so all inboundemails pass through it first. This allows Mimecast to inspect, filter, and process emails before delivery, providing robust protection.
    API Integrations: Mimecast also offers API-based services through its Mimecast API platform, primarily for management, archival, continuity, and threat intelligence purposes. However, API-only email protection is not Mimecast’s core model. Instead, the APIs are used to enhance the inline deployment, not replace it.

    Barracuda Email Security Gateway
    SPF: Barracuda checks the sender’s IP against the domain’s published SPF record. If the check fails, you can configure the system to block, quarantine, tag, or allow the message, depending on your policy preferences.
    DKIM: It validates whether the incoming message includes a valid DKIM signature. The outcome is logged and used to inform further policy decisions or DMARC evaluations.
    DMARC: It combines SPF and DKIM results, checks for domain alignment, and applies the DMARC policy defined by the sender. Administrators can also choose to override the DMARC policy, allowing messages to pass or be treated differently based on organizational needs.
    Integration Methods

    Inline mode: Barracuda Email Security Gateway is commonly deployed inline by updating your domain’s MX records to point to Barracuda’s cloud or on-premises gateway. This ensures that all inbound emails pass through Barracuda first for filtering and SPF, DKIM, and DMARC validation before being delivered to your mail servers.
    Deployment Behind the Corporate Firewall: Alternatively, Barracuda can be deployed in transparent or bridge mode without modifying MX records. In this setup, the gateway is placed inline at the network level, such as behind a firewall, and intercepts mail traffic transparently. This method is typically used in complex on-premises environments where changing DNS records is not feasible.

    Cisco Secure EmailCisco Secure Email acts as an inline gateway for inbound email, usually requiring your domain’s MX records to point to the Cisco Email Security Appliance or cloud service.
    SPF: Cisco Secure Email verifies whether the sending server is authorized in the sender domain’s SPF record. Administrators can set detailed policies on how to handle SPF failures.
    DKIM: It validates the DKIM signature on incoming emails and logs whether the signature is valid or has failed.
    DMARC: It evaluates DMARC by combining SPF and DKIM results along with domain alignment checks. Admins can configure specific actions, such as quarantine, reject, or tag, based on different failure scenarios or trusted sender exceptions.
    Integration methods

    On-premises Email Security Appliance: You deploy Cisco’s hardware or virtual appliance inline, updating MX records to route mail through it for filtering.
    Cisco Cloud Email Security: Cisco offers a cloud-based email security service where MX records are pointed to Cisco’s cloud infrastructure, which filters and processes inbound mail.

    Cisco Secure Email also offers advanced, rule-based filtering capabilities and integrates with Cisco’s broader threat protection ecosystem, enabling comprehensive inbound email security.
    Outbound Handling of SPF, DKIM, and DMARC by Common Security Gateways
    When your organization sends emails, security gateways can play an active role in processing and authenticating those messages. Depending on the configuration, a gateway might rewrite headers, re-sign messages, or route them through different IPs – all actions that can help or hurt the authentication process. Let’s look at how major SEGs handle outbound email flow.
    Avanan – Outbound Handling and Integration Methods
    Outbound Logic
    Avanan analyzes outbound emails primarily to detect data loss, malware, and policy violations. In API-based integration, emails are sent directly by the original mail server, so SPF and DKIM signatures remain intact. Avanan does not alter the message or reroute traffic, which helps maintain full DMARC alignment and domain reputation.
    Integration Methods
    1. API Integration: Connects to Microsoft 365 or Google Workspace via API. No MX changes are needed. Emails are scanned after they are sent, with no modification to SPF, DKIM, or the delivery path. 

    How it works: Microsoft Graph API or Google Workspace APIs are used to monitor and intervene in outbound emails.
    Protection level: Despite no MX changes, it can offer inline-like protection, meaning it can block, quarantine, or encrypt emails before they are delivered externally.
    SPF/DKIM/DMARC impact: Preserves original headers and signatures since mail is sent directly from Microsoft/Google servers.

    2. Inline Integration: Requires changing MX records to route email through Avanan. In this mode, Avanan can intercept and inspect outbound emails before delivery. Depending on the configuration, this may affect SPF or DKIM if not properly handled.

    How it works: Requires adding Avanan’s
    Protection level: Traditional inline security with full visibility and control, including encryption, DLP, policy enforcement, and advanced threat protection.
    SPF/DKIM/DMARC impact: SPF configuration is needed by adding Avanan’s include mechanism to the sending domain’s SPF record. The DKIM record of the original sending source is preserved.

    For configurations, you can refer to the steps in this blog.
    Proofpoint – Outbound Handling and Integration Methods
    Outbound Logic
    Proofpoint analyzes outbound emails to detect and prevent data loss, to identify advanced threatsoriginating from compromised internal accounts, and to ensure compliance. Their API integration provides crucial visibility and powerful remediation capabilities, while their traditional gatewaydeployment delivers true inline, pre-delivery blocking for outbound traffic.
    Integration methods
    1. API Integration: No MX record changes are required for this deployment method. Integration is done with Microsoft 365 or Google Workspace.

    How it works: Through its API integration, Proofpoint gains deep visibility into outbound emails and provides layered security and response features, including:

    Detect and alert: Identifies sensitive content, malicious attachments, or suspicious links in outbound emails.
    Post-delivery remediation: A key capability of the API model is Threat Response Auto-Pull, which enables Proofpoint to automatically recall, quarantine, or delete emails after delivery. This is particularly useful for internally sent messages or those forwarded to other users.
    Enhanced visibility: Aggregates message metadata and logs into Proofpoint’s threat intelligence platform, giving security teams a centralized view of outbound risks and user behavior.

    Protection level: API-based integration provides strong post-delivery detection and response, as well as visibility into DLP incidents and suspicious behavior. 
    SPF/DKIM/DMARC impact: Proofpoint does not alter SPF, DKIM, or DMARC because emails are sent directly through Microsoft or Google servers. Since Proofpoint’s servers are not involved in the actual sending process, the original authentication headers remain intact.

    2. Gateway Integration: This method requires updating MX records or routing outbound mail through Proofpoint via a smart host.

    How it works: Proofpoint acts as an inline gateway, inspecting emails before delivery. Inbound mail is filtered via MX changes; outbound mail is relayed through Proofpoint’s servers.
    Threat and DLP filtering: Scans outbound messages for sensitive content, malware, and policy violations.
    Real-time enforcement: Blocks, encrypts, or quarantines emails before they’re delivered.
    Policy controls: Applies rules based on content, recipient, or behavior.
    Protection level: Provides strong, real-time protection for outbound traffic with pre-delivery enforcement, DLP, and encryption.
    SPF/DKIM/DMARC impact: Proofpoint becomes the sending server:

    SPF: You need to configure ProofPoint’s SPF.
    DKIM: Can sign messages; requires DKIM setup.
    DMARC: DMARC passes if SPF and DKIM are set up properly.

    Please refer to this article to configure SPF and DKIM for ProofPoint.
    Mimecast – Outbound Handling and Integration Methods
    Outbound Logic
    Mimecast inspects outbound emails to prevent data loss, detect internal threats such as malware and impersonation, and ensure regulatory compliance. It primarily functions as a Secure Email Gateway, meaning it sits directly in the outbound email flow. While Mimecast offers APIs, its core outbound protection is built around this inline gateway model.
    Integration Methods
    1. Gateway IntegrationThis is Mimecast’s primary method for outbound email protection. Organizations route their outbound traffic through Mimecast by configuring their email serverto use Mimecast as a smart host. This enables Mimecast to inspect and enforce policies on all outgoing emails in real time.

    How it works:
    Updating outbound routing in your email system, or
    Using Mimecast SMTP relay to direct messages through their infrastructure.
    Mimecast then scans, filters, and applies policies before the email reaches the final recipient.

    Protection level:
    Advanced DLP: Identifies and prevents sensitive data leaks.
    Impersonation and Threat Protection: Blocks malware, phishing, and abuse from compromised internal accounts.
    Email Encryption and Secure Messaging: Applies encryption policies or routes messages via secure portals.

    Regulatory Compliance: Enforces outbound compliance rules based on content, recipient, or metadata.
    SPF/DKIM/DMARC impact:

    SPF: Your SPF record must include Mimecast’s SPF mechanism based on your region to avoid SPF failures.
    DKIM: A new DKIM record should be configured to make sure your emails are DKIM signed when routing through Mimecast.
    DMARC: With correct SPF and DKIM setup, Mimecast ensures DMARC alignment, maintaining your domain’s sending reputation. Please refer to the steps in this detailed article to set up SPF and DKIM for Mimecast.

    2. API IntegrationMimecast’s APIs complement the main gateway by providing automation, reporting, and management tools rather than handling live outbound mail flow. They allow you to manage policies, export logs, search archived emails, and sync users.
    APIs enhance visibility and operational tasks but do not provide real-time filtering or blocking of outbound messages. Since APIs don’t process live mail, they have no direct effect on SPF, DKIM, or DMARC; those depend on your gatewaysetup.
    Barracuda – Outbound Handling and Integration Methods
    Outbound Logic
    Barracuda analyzes outbound emails to prevent data loss, block malware, stop phishing/impersonation attempts from compromised internal accounts, and ensure compliance. Barracuda offers flexible deployment options, including both traditional gatewayand API-based integrations. While both contribute to outbound security, their roles are distinct.
    Integration Methods
    1. Gateway Integration— Primary Inline Security

    How it works: All outbound emails pass through Barracuda’s security stack for real-time inspection, threat blocking, and policy enforcement before delivery.
    Protection level:

    Comprehensive DLP 
    Outbound spam and virus filtering 
    Enforcement of compliance and content policies

    This approach offers a high level of control and immediate threat mitigation on outbound mail flow.

    SPF/DKIM/DMARC impact:

    SPF: Update SPF records to include Barracuda’s sending IPs or SPF include mechanism.
    DKIM: Currently, no explicit setup is needed; DKIM of the main sending source is preserved.

    Refer to this article for more comprehensive guidance on Barracuda SEG configuration.
    2. API IntegrationHow it works: The API accesses cloud email environments to analyze historical and real-time data, learning normal communication patterns to detect anomalies in outbound emails. It also supports post-delivery remediation, enabling the removal of malicious emails from internal mailboxes after sending.
    Protection level: Advanced AI-driven detection and near real-time blocking of outbound threats, plus strong post-delivery cleanup capabilities.
    SPF/DKIM/DMARC impact: Since mail is sent directly by the original mail server, SPF and DKIM signatures remain intact, preserving DMARC alignment and domain reputation.

    Cisco Secure Email– Outbound Handling and Integration Methods
    Outbound Logic
    Cisco Secure Email protects outbound email by preventing data loss, blocking spam and malware from internal accounts, stopping business email compromiseand impersonation attacks, and ensuring compliance. Cisco provides both traditional gateway appliances/cloud gateways and modern API-based solutions for layered outbound security.
    Integration Methods
    1. Gateway Integration– Cisco Secure Email GatewayHow it works: Organizations update MX records to route mail through the Cisco Secure Email Gateway or configure their mail serverto smart host outbound email via the gateway. All outbound mail is inspected and policies enforced before delivery.
    Protection level:

    Granular DLPOutbound spam and malware filtering to protect IP reputation
    Email encryption for sensitive outbound messages
    Comprehensive content and attachment policy enforcement

    SPF: Check this article for comprehensive guidance on Cisco SPF settings.
    DKIM: Refer to this article for detailed guidance on Cisco DKIM settings.

    2. API Integration – Cisco Secure Email Threat Defense

    How it works: Integrates directly via API with Microsoft 365, continuously monitoring email metadata, content, and user behavior across inbound, outbound, and internal messages. Leverages Cisco’s threat intelligence and AI to detect anomalous outbound activity linked to BEC, account takeover, and phishing.
    Post-Delivery Remediation: Automates the removal or quarantine of malicious or policy-violating emails from mailboxes even after sending.
    Protection level: Advanced, AI-driven detection of sophisticated outbound threats with real-time monitoring and automated remediation. Complements gateway filtering by adding cloud-native visibility and swift post-send action.
    SPF/DKIM/DMARC impact: Since emails are sent directly by the original mail server, SPF and DKIM signatures remain intact, preserving DMARC alignment and domain reputation.

    If you have any questions or need assistance, feel free to reach out to EasyDMARC technical support.
    #understanding #relationship #between #security #gateways
    Understanding the Relationship Between Security Gateways and DMARC
    Email authentication protocols like SPF, DKIM, and DMARC play a critical role in protecting domains from spoofing and phishing. However, when SEGs are introduced into the email path, the interaction with these protocols becomes more complex. Security gatewaysare a core part of many organizations’ email infrastructure. They act as intermediaries between the public internet and internal mail systems, inspecting, filtering, and routing messages. This blog examines how security gateways handle SPF, DKIM, and DMARC, with real-world examples from popular gateways such as Proofpoint, Mimecast, and Avanan. We’ll also cover best practices for maintaining authentication integrity and avoiding misconfigurations that can compromise email authentication or lead to false DMARC failures. Security gateways often sit at the boundary between your organization and the internet, managing both inbound and outbound email traffic. Their role affects how email authentication protocols behave. An inbound SEG examines emails coming into your organization. It checks SPF, DKIM, and DMARC to determine if the message is authentic and safe before passing it to your internal mail servers. An outbound SEG handles emails sent from your domain. It may modify headers, rewrite envelope addresses, or even apply DKIM signing. All of these can impact SPF,  DKIM, or DMARC validation on the recipient’s side. Understanding how SEGs influence these flows is crucial to maintaining proper authentication and avoiding unexpected DMARC failures. Inbound Handling of SPF, DKIM, and DMARC by Common Security Gateways When an email comes into your organization, your security gateway is the first to inspect it. It checks whether the message is real, trustworthy, and properly authenticated. Let’s look at how different SEGs handle these checks. AvananSPF: Avanan verifies whether the sending server is authorized to send emails for the domain by checking the SPF record. DKIM: It verifies if the message was signed by the sending domain and if that signature is valid. DMARC: It uses the results of the SPF and DKIM check to evaluate DMARC. However, final enforcement usually depends on how DMARC is handled by Microsoft 365 or Gmail, as Avanan integrates directly with them. Avanan offers two methods of integration:1. API integration: Avanan connects via APIs, no change in MX, usually Monitor or Detect modes.2. Inline integration: Avanan is placed inline in the mail flow, actively blocking or remediating threats. Proofpoint Email Protection SPF: Proofpoint checks SPF to confirm the sender’s IP is authorized to send on behalf of the domain. You can set custom rules. DKIM: It verifies DKIM signatures and shows clear pass/fail results in logs. DMARC: It fully evaluates DMARC by combining SPF and DKIM results with alignment checks. Administrators can configure how to handle messages that fail DMARC, such as rejecting, quarantining, or delivering them. Additionally, Proofpoint allows whitelisting specific senders you trust, even if their emails fail authentication checks. Integration Methods Inline Mode: In this traditional deployment, Proofpoint is positioned directly in the email flow by modifying MX records. Emails are routed through Proofpoint’s infrastructure, allowing it to inspect and filter messages before they reach the recipient’s inbox. This mode provides pre-delivery protection and is commonly used in on-premises or hybrid environments. API-BasedMode: Proofpoint offers API-based integration, particularly with cloud email platforms like Microsoft 365 and Google Workspace. In this mode, Proofpoint connects to the email platform via APIs, enabling it to monitor and remediate threats post-delivery without altering the email flow. This approach allows for rapid deployment and seamless integration with existing cloud email services. Mimecast SPF: Mimecast performs SPF checks to verify whether the sending server is authorized by the domain’s SPF record. Administrators can configure actions for SPF failures, including block, quarantine, permit, or tag with a warning. This gives flexibility in balancing security with business needs. DKIM: It validates DKIM signatures by checking that the message was correctly signed by the sending domain and that the content hasn’t been tampered with. If the signature fails, Mimecast can take actions based on your configured policies. DMARC: It fully evaluates DMARC by combining the results of SPF and DKIM with domain alignment checks. You can choose to honor the sending domain’s DMARC policyor apply custom rules, for example, quarantining or tagging messages that fail DMARC regardless of the published policy. This allows more granular control for businesses that want to override external domain policies based on specific contexts. Integration Methods Inline Deployment: Mimecast is typically deployed as a cloud-based secure email gateway. Organizations update their domain’s MX records to point to Mimecast, so all inboundemails pass through it first. This allows Mimecast to inspect, filter, and process emails before delivery, providing robust protection. API Integrations: Mimecast also offers API-based services through its Mimecast API platform, primarily for management, archival, continuity, and threat intelligence purposes. However, API-only email protection is not Mimecast’s core model. Instead, the APIs are used to enhance the inline deployment, not replace it. Barracuda Email Security Gateway SPF: Barracuda checks the sender’s IP against the domain’s published SPF record. If the check fails, you can configure the system to block, quarantine, tag, or allow the message, depending on your policy preferences. DKIM: It validates whether the incoming message includes a valid DKIM signature. The outcome is logged and used to inform further policy decisions or DMARC evaluations. DMARC: It combines SPF and DKIM results, checks for domain alignment, and applies the DMARC policy defined by the sender. Administrators can also choose to override the DMARC policy, allowing messages to pass or be treated differently based on organizational needs. Integration Methods Inline mode: Barracuda Email Security Gateway is commonly deployed inline by updating your domain’s MX records to point to Barracuda’s cloud or on-premises gateway. This ensures that all inbound emails pass through Barracuda first for filtering and SPF, DKIM, and DMARC validation before being delivered to your mail servers. Deployment Behind the Corporate Firewall: Alternatively, Barracuda can be deployed in transparent or bridge mode without modifying MX records. In this setup, the gateway is placed inline at the network level, such as behind a firewall, and intercepts mail traffic transparently. This method is typically used in complex on-premises environments where changing DNS records is not feasible. Cisco Secure EmailCisco Secure Email acts as an inline gateway for inbound email, usually requiring your domain’s MX records to point to the Cisco Email Security Appliance or cloud service. SPF: Cisco Secure Email verifies whether the sending server is authorized in the sender domain’s SPF record. Administrators can set detailed policies on how to handle SPF failures. DKIM: It validates the DKIM signature on incoming emails and logs whether the signature is valid or has failed. DMARC: It evaluates DMARC by combining SPF and DKIM results along with domain alignment checks. Admins can configure specific actions, such as quarantine, reject, or tag, based on different failure scenarios or trusted sender exceptions. Integration methods On-premises Email Security Appliance: You deploy Cisco’s hardware or virtual appliance inline, updating MX records to route mail through it for filtering. Cisco Cloud Email Security: Cisco offers a cloud-based email security service where MX records are pointed to Cisco’s cloud infrastructure, which filters and processes inbound mail. Cisco Secure Email also offers advanced, rule-based filtering capabilities and integrates with Cisco’s broader threat protection ecosystem, enabling comprehensive inbound email security. Outbound Handling of SPF, DKIM, and DMARC by Common Security Gateways When your organization sends emails, security gateways can play an active role in processing and authenticating those messages. Depending on the configuration, a gateway might rewrite headers, re-sign messages, or route them through different IPs – all actions that can help or hurt the authentication process. Let’s look at how major SEGs handle outbound email flow. Avanan – Outbound Handling and Integration Methods Outbound Logic Avanan analyzes outbound emails primarily to detect data loss, malware, and policy violations. In API-based integration, emails are sent directly by the original mail server, so SPF and DKIM signatures remain intact. Avanan does not alter the message or reroute traffic, which helps maintain full DMARC alignment and domain reputation. Integration Methods 1. API Integration: Connects to Microsoft 365 or Google Workspace via API. No MX changes are needed. Emails are scanned after they are sent, with no modification to SPF, DKIM, or the delivery path.  How it works: Microsoft Graph API or Google Workspace APIs are used to monitor and intervene in outbound emails. Protection level: Despite no MX changes, it can offer inline-like protection, meaning it can block, quarantine, or encrypt emails before they are delivered externally. SPF/DKIM/DMARC impact: Preserves original headers and signatures since mail is sent directly from Microsoft/Google servers. 2. Inline Integration: Requires changing MX records to route email through Avanan. In this mode, Avanan can intercept and inspect outbound emails before delivery. Depending on the configuration, this may affect SPF or DKIM if not properly handled. How it works: Requires adding Avanan’s Protection level: Traditional inline security with full visibility and control, including encryption, DLP, policy enforcement, and advanced threat protection. SPF/DKIM/DMARC impact: SPF configuration is needed by adding Avanan’s include mechanism to the sending domain’s SPF record. The DKIM record of the original sending source is preserved. For configurations, you can refer to the steps in this blog. Proofpoint – Outbound Handling and Integration Methods Outbound Logic Proofpoint analyzes outbound emails to detect and prevent data loss, to identify advanced threatsoriginating from compromised internal accounts, and to ensure compliance. Their API integration provides crucial visibility and powerful remediation capabilities, while their traditional gatewaydeployment delivers true inline, pre-delivery blocking for outbound traffic. Integration methods 1. API Integration: No MX record changes are required for this deployment method. Integration is done with Microsoft 365 or Google Workspace. How it works: Through its API integration, Proofpoint gains deep visibility into outbound emails and provides layered security and response features, including: Detect and alert: Identifies sensitive content, malicious attachments, or suspicious links in outbound emails. Post-delivery remediation: A key capability of the API model is Threat Response Auto-Pull, which enables Proofpoint to automatically recall, quarantine, or delete emails after delivery. This is particularly useful for internally sent messages or those forwarded to other users. Enhanced visibility: Aggregates message metadata and logs into Proofpoint’s threat intelligence platform, giving security teams a centralized view of outbound risks and user behavior. Protection level: API-based integration provides strong post-delivery detection and response, as well as visibility into DLP incidents and suspicious behavior.  SPF/DKIM/DMARC impact: Proofpoint does not alter SPF, DKIM, or DMARC because emails are sent directly through Microsoft or Google servers. Since Proofpoint’s servers are not involved in the actual sending process, the original authentication headers remain intact. 2. Gateway Integration: This method requires updating MX records or routing outbound mail through Proofpoint via a smart host. How it works: Proofpoint acts as an inline gateway, inspecting emails before delivery. Inbound mail is filtered via MX changes; outbound mail is relayed through Proofpoint’s servers. Threat and DLP filtering: Scans outbound messages for sensitive content, malware, and policy violations. Real-time enforcement: Blocks, encrypts, or quarantines emails before they’re delivered. Policy controls: Applies rules based on content, recipient, or behavior. Protection level: Provides strong, real-time protection for outbound traffic with pre-delivery enforcement, DLP, and encryption. SPF/DKIM/DMARC impact: Proofpoint becomes the sending server: SPF: You need to configure ProofPoint’s SPF. DKIM: Can sign messages; requires DKIM setup. DMARC: DMARC passes if SPF and DKIM are set up properly. Please refer to this article to configure SPF and DKIM for ProofPoint. Mimecast – Outbound Handling and Integration Methods Outbound Logic Mimecast inspects outbound emails to prevent data loss, detect internal threats such as malware and impersonation, and ensure regulatory compliance. It primarily functions as a Secure Email Gateway, meaning it sits directly in the outbound email flow. While Mimecast offers APIs, its core outbound protection is built around this inline gateway model. Integration Methods 1. Gateway IntegrationThis is Mimecast’s primary method for outbound email protection. Organizations route their outbound traffic through Mimecast by configuring their email serverto use Mimecast as a smart host. This enables Mimecast to inspect and enforce policies on all outgoing emails in real time. How it works: Updating outbound routing in your email system, or Using Mimecast SMTP relay to direct messages through their infrastructure. Mimecast then scans, filters, and applies policies before the email reaches the final recipient. Protection level: Advanced DLP: Identifies and prevents sensitive data leaks. Impersonation and Threat Protection: Blocks malware, phishing, and abuse from compromised internal accounts. Email Encryption and Secure Messaging: Applies encryption policies or routes messages via secure portals. Regulatory Compliance: Enforces outbound compliance rules based on content, recipient, or metadata. SPF/DKIM/DMARC impact: SPF: Your SPF record must include Mimecast’s SPF mechanism based on your region to avoid SPF failures. DKIM: A new DKIM record should be configured to make sure your emails are DKIM signed when routing through Mimecast. DMARC: With correct SPF and DKIM setup, Mimecast ensures DMARC alignment, maintaining your domain’s sending reputation. Please refer to the steps in this detailed article to set up SPF and DKIM for Mimecast. 2. API IntegrationMimecast’s APIs complement the main gateway by providing automation, reporting, and management tools rather than handling live outbound mail flow. They allow you to manage policies, export logs, search archived emails, and sync users. APIs enhance visibility and operational tasks but do not provide real-time filtering or blocking of outbound messages. Since APIs don’t process live mail, they have no direct effect on SPF, DKIM, or DMARC; those depend on your gatewaysetup. Barracuda – Outbound Handling and Integration Methods Outbound Logic Barracuda analyzes outbound emails to prevent data loss, block malware, stop phishing/impersonation attempts from compromised internal accounts, and ensure compliance. Barracuda offers flexible deployment options, including both traditional gatewayand API-based integrations. While both contribute to outbound security, their roles are distinct. Integration Methods 1. Gateway Integration— Primary Inline Security How it works: All outbound emails pass through Barracuda’s security stack for real-time inspection, threat blocking, and policy enforcement before delivery. Protection level: Comprehensive DLP  Outbound spam and virus filtering  Enforcement of compliance and content policies This approach offers a high level of control and immediate threat mitigation on outbound mail flow. SPF/DKIM/DMARC impact: SPF: Update SPF records to include Barracuda’s sending IPs or SPF include mechanism. DKIM: Currently, no explicit setup is needed; DKIM of the main sending source is preserved. Refer to this article for more comprehensive guidance on Barracuda SEG configuration. 2. API IntegrationHow it works: The API accesses cloud email environments to analyze historical and real-time data, learning normal communication patterns to detect anomalies in outbound emails. It also supports post-delivery remediation, enabling the removal of malicious emails from internal mailboxes after sending. Protection level: Advanced AI-driven detection and near real-time blocking of outbound threats, plus strong post-delivery cleanup capabilities. SPF/DKIM/DMARC impact: Since mail is sent directly by the original mail server, SPF and DKIM signatures remain intact, preserving DMARC alignment and domain reputation. Cisco Secure Email– Outbound Handling and Integration Methods Outbound Logic Cisco Secure Email protects outbound email by preventing data loss, blocking spam and malware from internal accounts, stopping business email compromiseand impersonation attacks, and ensuring compliance. Cisco provides both traditional gateway appliances/cloud gateways and modern API-based solutions for layered outbound security. Integration Methods 1. Gateway Integration– Cisco Secure Email GatewayHow it works: Organizations update MX records to route mail through the Cisco Secure Email Gateway or configure their mail serverto smart host outbound email via the gateway. All outbound mail is inspected and policies enforced before delivery. Protection level: Granular DLPOutbound spam and malware filtering to protect IP reputation Email encryption for sensitive outbound messages Comprehensive content and attachment policy enforcement SPF: Check this article for comprehensive guidance on Cisco SPF settings. DKIM: Refer to this article for detailed guidance on Cisco DKIM settings. 2. API Integration – Cisco Secure Email Threat Defense How it works: Integrates directly via API with Microsoft 365, continuously monitoring email metadata, content, and user behavior across inbound, outbound, and internal messages. Leverages Cisco’s threat intelligence and AI to detect anomalous outbound activity linked to BEC, account takeover, and phishing. Post-Delivery Remediation: Automates the removal or quarantine of malicious or policy-violating emails from mailboxes even after sending. Protection level: Advanced, AI-driven detection of sophisticated outbound threats with real-time monitoring and automated remediation. Complements gateway filtering by adding cloud-native visibility and swift post-send action. SPF/DKIM/DMARC impact: Since emails are sent directly by the original mail server, SPF and DKIM signatures remain intact, preserving DMARC alignment and domain reputation. If you have any questions or need assistance, feel free to reach out to EasyDMARC technical support. #understanding #relationship #between #security #gateways
    EASYDMARC.COM
    Understanding the Relationship Between Security Gateways and DMARC
    Email authentication protocols like SPF, DKIM, and DMARC play a critical role in protecting domains from spoofing and phishing. However, when SEGs are introduced into the email path, the interaction with these protocols becomes more complex. Security gateways(SEGs) are a core part of many organizations’ email infrastructure. They act as intermediaries between the public internet and internal mail systems, inspecting, filtering, and routing messages. This blog examines how security gateways handle SPF, DKIM, and DMARC, with real-world examples from popular gateways such as Proofpoint, Mimecast, and Avanan. We’ll also cover best practices for maintaining authentication integrity and avoiding misconfigurations that can compromise email authentication or lead to false DMARC failures. Security gateways often sit at the boundary between your organization and the internet, managing both inbound and outbound email traffic. Their role affects how email authentication protocols behave. An inbound SEG examines emails coming into your organization. It checks SPF, DKIM, and DMARC to determine if the message is authentic and safe before passing it to your internal mail servers. An outbound SEG handles emails sent from your domain. It may modify headers, rewrite envelope addresses, or even apply DKIM signing. All of these can impact SPF,  DKIM, or DMARC validation on the recipient’s side. Understanding how SEGs influence these flows is crucial to maintaining proper authentication and avoiding unexpected DMARC failures. Inbound Handling of SPF, DKIM, and DMARC by Common Security Gateways When an email comes into your organization, your security gateway is the first to inspect it. It checks whether the message is real, trustworthy, and properly authenticated. Let’s look at how different SEGs handle these checks. Avanan (by Check Point) SPF: Avanan verifies whether the sending server is authorized to send emails for the domain by checking the SPF record. DKIM: It verifies if the message was signed by the sending domain and if that signature is valid. DMARC: It uses the results of the SPF and DKIM check to evaluate DMARC. However, final enforcement usually depends on how DMARC is handled by Microsoft 365 or Gmail, as Avanan integrates directly with them. Avanan offers two methods of integration:1. API integration: Avanan connects via APIs, no change in MX, usually Monitor or Detect modes.2. Inline integration: Avanan is placed inline in the mail flow (MX records changed), actively blocking or remediating threats. Proofpoint Email Protection SPF: Proofpoint checks SPF to confirm the sender’s IP is authorized to send on behalf of the domain. You can set custom rules (e.g. treat “softfail” as “fail”). DKIM: It verifies DKIM signatures and shows clear pass/fail results in logs. DMARC: It fully evaluates DMARC by combining SPF and DKIM results with alignment checks. Administrators can configure how to handle messages that fail DMARC, such as rejecting, quarantining, or delivering them. Additionally, Proofpoint allows whitelisting specific senders you trust, even if their emails fail authentication checks. Integration Methods Inline Mode: In this traditional deployment, Proofpoint is positioned directly in the email flow by modifying MX records. Emails are routed through Proofpoint’s infrastructure, allowing it to inspect and filter messages before they reach the recipient’s inbox. This mode provides pre-delivery protection and is commonly used in on-premises or hybrid environments. API-Based (Integrated Cloud Email Security – ICES) Mode: Proofpoint offers API-based integration, particularly with cloud email platforms like Microsoft 365 and Google Workspace. In this mode, Proofpoint connects to the email platform via APIs, enabling it to monitor and remediate threats post-delivery without altering the email flow. This approach allows for rapid deployment and seamless integration with existing cloud email services. Mimecast SPF: Mimecast performs SPF checks to verify whether the sending server is authorized by the domain’s SPF record. Administrators can configure actions for SPF failures, including block, quarantine, permit, or tag with a warning. This gives flexibility in balancing security with business needs. DKIM: It validates DKIM signatures by checking that the message was correctly signed by the sending domain and that the content hasn’t been tampered with. If the signature fails, Mimecast can take actions based on your configured policies. DMARC: It fully evaluates DMARC by combining the results of SPF and DKIM with domain alignment checks. You can choose to honor the sending domain’s DMARC policy (none, quarantine, reject) or apply custom rules, for example, quarantining or tagging messages that fail DMARC regardless of the published policy. This allows more granular control for businesses that want to override external domain policies based on specific contexts. Integration Methods Inline Deployment: Mimecast is typically deployed as a cloud-based secure email gateway. Organizations update their domain’s MX records to point to Mimecast, so all inbound (and optionally outbound) emails pass through it first. This allows Mimecast to inspect, filter, and process emails before delivery, providing robust protection. API Integrations: Mimecast also offers API-based services through its Mimecast API platform, primarily for management, archival, continuity, and threat intelligence purposes. However, API-only email protection is not Mimecast’s core model. Instead, the APIs are used to enhance the inline deployment, not replace it. Barracuda Email Security Gateway SPF: Barracuda checks the sender’s IP against the domain’s published SPF record. If the check fails, you can configure the system to block, quarantine, tag, or allow the message, depending on your policy preferences. DKIM: It validates whether the incoming message includes a valid DKIM signature. The outcome is logged and used to inform further policy decisions or DMARC evaluations. DMARC: It combines SPF and DKIM results, checks for domain alignment, and applies the DMARC policy defined by the sender. Administrators can also choose to override the DMARC policy, allowing messages to pass or be treated differently based on organizational needs (e.g., trusted senders or internal exceptions). Integration Methods Inline mode (more common and straightforward): Barracuda Email Security Gateway is commonly deployed inline by updating your domain’s MX records to point to Barracuda’s cloud or on-premises gateway. This ensures that all inbound emails pass through Barracuda first for filtering and SPF, DKIM, and DMARC validation before being delivered to your mail servers. Deployment Behind the Corporate Firewall: Alternatively, Barracuda can be deployed in transparent or bridge mode without modifying MX records. In this setup, the gateway is placed inline at the network level, such as behind a firewall, and intercepts mail traffic transparently. This method is typically used in complex on-premises environments where changing DNS records is not feasible. Cisco Secure Email (formerly IronPort) Cisco Secure Email acts as an inline gateway for inbound email, usually requiring your domain’s MX records to point to the Cisco Email Security Appliance or cloud service. SPF: Cisco Secure Email verifies whether the sending server is authorized in the sender domain’s SPF record. Administrators can set detailed policies on how to handle SPF failures. DKIM: It validates the DKIM signature on incoming emails and logs whether the signature is valid or has failed. DMARC: It evaluates DMARC by combining SPF and DKIM results along with domain alignment checks. Admins can configure specific actions, such as quarantine, reject, or tag, based on different failure scenarios or trusted sender exceptions. Integration methods On-premises Email Security Appliance (ESA): You deploy Cisco’s hardware or virtual appliance inline, updating MX records to route mail through it for filtering. Cisco Cloud Email Security: Cisco offers a cloud-based email security service where MX records are pointed to Cisco’s cloud infrastructure, which filters and processes inbound mail. Cisco Secure Email also offers advanced, rule-based filtering capabilities and integrates with Cisco’s broader threat protection ecosystem, enabling comprehensive inbound email security. Outbound Handling of SPF, DKIM, and DMARC by Common Security Gateways When your organization sends emails, security gateways can play an active role in processing and authenticating those messages. Depending on the configuration, a gateway might rewrite headers, re-sign messages, or route them through different IPs – all actions that can help or hurt the authentication process. Let’s look at how major SEGs handle outbound email flow. Avanan – Outbound Handling and Integration Methods Outbound Logic Avanan analyzes outbound emails primarily to detect data loss, malware, and policy violations. In API-based integration, emails are sent directly by the original mail server (e.g., Microsoft 365 or Google Workspace), so SPF and DKIM signatures remain intact. Avanan does not alter the message or reroute traffic, which helps maintain full DMARC alignment and domain reputation. Integration Methods 1. API Integration: Connects to Microsoft 365 or Google Workspace via API. No MX changes are needed. Emails are scanned after they are sent, with no modification to SPF, DKIM, or the delivery path.  How it works: Microsoft Graph API or Google Workspace APIs are used to monitor and intervene in outbound emails. Protection level: Despite no MX changes, it can offer inline-like protection, meaning it can block, quarantine, or encrypt emails before they are delivered externally. SPF/DKIM/DMARC impact: Preserves original headers and signatures since mail is sent directly from Microsoft/Google servers. 2. Inline Integration: Requires changing MX records to route email through Avanan. In this mode, Avanan can intercept and inspect outbound emails before delivery. Depending on the configuration, this may affect SPF or DKIM if not properly handled. How it works: Requires adding Avanan’s Protection level: Traditional inline security with full visibility and control, including encryption, DLP, policy enforcement, and advanced threat protection. SPF/DKIM/DMARC impact: SPF configuration is needed by adding Avanan’s include mechanism to the sending domain’s SPF record. The DKIM record of the original sending source is preserved. For configurations, you can refer to the steps in this blog. Proofpoint – Outbound Handling and Integration Methods Outbound Logic Proofpoint analyzes outbound emails to detect and prevent data loss (DLP), to identify advanced threats (malware, phishing, BEC) originating from compromised internal accounts, and to ensure compliance. Their API integration provides crucial visibility and powerful remediation capabilities, while their traditional gateway (MX record) deployment delivers true inline, pre-delivery blocking for outbound traffic. Integration methods 1. API Integration: No MX record changes are required for this deployment method. Integration is done with Microsoft 365 or Google Workspace. How it works: Through its API integration, Proofpoint gains deep visibility into outbound emails and provides layered security and response features, including: Detect and alert: Identifies sensitive content (Data Loss Prevention violations), malicious attachments, or suspicious links in outbound emails. Post-delivery remediation (TRAP): A key capability of the API model is Threat Response Auto-Pull (TRAP), which enables Proofpoint to automatically recall, quarantine, or delete emails after delivery. This is particularly useful for internally sent messages or those forwarded to other users. Enhanced visibility: Aggregates message metadata and logs into Proofpoint’s threat intelligence platform, giving security teams a centralized view of outbound risks and user behavior. Protection level: API-based integration provides strong post-delivery detection and response, as well as visibility into DLP incidents and suspicious behavior.  SPF/DKIM/DMARC impact: Proofpoint does not alter SPF, DKIM, or DMARC because emails are sent directly through Microsoft or Google servers. Since Proofpoint’s servers are not involved in the actual sending process, the original authentication headers remain intact. 2. Gateway Integration (MX Record/Smart Host): This method requires updating MX records or routing outbound mail through Proofpoint via a smart host. How it works: Proofpoint acts as an inline gateway, inspecting emails before delivery. Inbound mail is filtered via MX changes; outbound mail is relayed through Proofpoint’s servers. Threat and DLP filtering: Scans outbound messages for sensitive content, malware, and policy violations. Real-time enforcement: Blocks, encrypts, or quarantines emails before they’re delivered. Policy controls: Applies rules based on content, recipient, or behavior. Protection level: Provides strong, real-time protection for outbound traffic with pre-delivery enforcement, DLP, and encryption. SPF/DKIM/DMARC impact: Proofpoint becomes the sending server: SPF: You need to configure ProofPoint’s SPF. DKIM: Can sign messages; requires DKIM setup. DMARC: DMARC passes if SPF and DKIM are set up properly. Please refer to this article to configure SPF and DKIM for ProofPoint. Mimecast – Outbound Handling and Integration Methods Outbound Logic Mimecast inspects outbound emails to prevent data loss (DLP), detect internal threats such as malware and impersonation, and ensure regulatory compliance. It primarily functions as a Secure Email Gateway (SEG), meaning it sits directly in the outbound email flow. While Mimecast offers APIs, its core outbound protection is built around this inline gateway model. Integration Methods 1. Gateway Integration (MX Record change required) This is Mimecast’s primary method for outbound email protection. Organizations route their outbound traffic through Mimecast by configuring their email server (e.g., Microsoft 365, Google Workspace, etc.) to use Mimecast as a smart host. This enables Mimecast to inspect and enforce policies on all outgoing emails in real time. How it works: Updating outbound routing in your email system (smart host settings), or Using Mimecast SMTP relay to direct messages through their infrastructure. Mimecast then scans, filters, and applies policies before the email reaches the final recipient. Protection level: Advanced DLP: Identifies and prevents sensitive data leaks. Impersonation and Threat Protection: Blocks malware, phishing, and abuse from compromised internal accounts. Email Encryption and Secure Messaging: Applies encryption policies or routes messages via secure portals. Regulatory Compliance: Enforces outbound compliance rules based on content, recipient, or metadata. SPF/DKIM/DMARC impact: SPF: Your SPF record must include Mimecast’s SPF mechanism based on your region to avoid SPF failures. DKIM: A new DKIM record should be configured to make sure your emails are DKIM signed when routing through Mimecast. DMARC: With correct SPF and DKIM setup, Mimecast ensures DMARC alignment, maintaining your domain’s sending reputation. Please refer to the steps in this detailed article to set up SPF and DKIM for Mimecast. 2. API Integration (Complementary to Gateway) Mimecast’s APIs complement the main gateway by providing automation, reporting, and management tools rather than handling live outbound mail flow. They allow you to manage policies, export logs, search archived emails, and sync users. APIs enhance visibility and operational tasks but do not provide real-time filtering or blocking of outbound messages. Since APIs don’t process live mail, they have no direct effect on SPF, DKIM, or DMARC; those depend on your gateway (smart host) setup. Barracuda – Outbound Handling and Integration Methods Outbound Logic Barracuda analyzes outbound emails to prevent data loss (DLP), block malware, stop phishing/impersonation attempts from compromised internal accounts, and ensure compliance. Barracuda offers flexible deployment options, including both traditional gateway (MX record) and API-based integrations. While both contribute to outbound security, their roles are distinct. Integration Methods 1. Gateway Integration (MX Record / Smart Host) — Primary Inline Security How it works: All outbound emails pass through Barracuda’s security stack for real-time inspection, threat blocking, and policy enforcement before delivery. Protection level: Comprehensive DLP (blocking, encrypting, or quarantining sensitive content)  Outbound spam and virus filtering  Enforcement of compliance and content policies This approach offers a high level of control and immediate threat mitigation on outbound mail flow. SPF/DKIM/DMARC impact: SPF: Update SPF records to include Barracuda’s sending IPs or SPF include mechanism. DKIM: Currently, no explicit setup is needed; DKIM of the main sending source is preserved. Refer to this article for more comprehensive guidance on Barracuda SEG configuration. 2. API Integration (Complementary & Advanced Threat Focus) How it works: The API accesses cloud email environments to analyze historical and real-time data, learning normal communication patterns to detect anomalies in outbound emails. It also supports post-delivery remediation, enabling the removal of malicious emails from internal mailboxes after sending. Protection level: Advanced AI-driven detection and near real-time blocking of outbound threats, plus strong post-delivery cleanup capabilities. SPF/DKIM/DMARC impact: Since mail is sent directly by the original mail server (e.g., Microsoft 365), SPF and DKIM signatures remain intact, preserving DMARC alignment and domain reputation. Cisco Secure Email (formerly IronPort) – Outbound Handling and Integration Methods Outbound Logic Cisco Secure Email protects outbound email by preventing data loss (DLP), blocking spam and malware from internal accounts, stopping business email compromise (BEC) and impersonation attacks, and ensuring compliance. Cisco provides both traditional gateway appliances/cloud gateways and modern API-based solutions for layered outbound security. Integration Methods 1. Gateway Integration (MX Record / Smart Host) – Cisco Secure Email Gateway (ESA) How it works: Organizations update MX records to route mail through the Cisco Secure Email Gateway or configure their mail server (e.g., Microsoft 365, Exchange) to smart host outbound email via the gateway. All outbound mail is inspected and policies enforced before delivery. Protection level: Granular DLP (blocking, encrypting, quarantining sensitive content) Outbound spam and malware filtering to protect IP reputation Email encryption for sensitive outbound messages Comprehensive content and attachment policy enforcement SPF: Check this article for comprehensive guidance on Cisco SPF settings. DKIM: Refer to this article for detailed guidance on Cisco DKIM settings. 2. API Integration – Cisco Secure Email Threat Defense How it works: Integrates directly via API with Microsoft 365 (and potentially Google Workspace), continuously monitoring email metadata, content, and user behavior across inbound, outbound, and internal messages. Leverages Cisco’s threat intelligence and AI to detect anomalous outbound activity linked to BEC, account takeover, and phishing. Post-Delivery Remediation: Automates the removal or quarantine of malicious or policy-violating emails from mailboxes even after sending. Protection level: Advanced, AI-driven detection of sophisticated outbound threats with real-time monitoring and automated remediation. Complements gateway filtering by adding cloud-native visibility and swift post-send action. SPF/DKIM/DMARC impact: Since emails are sent directly by the original mail server, SPF and DKIM signatures remain intact, preserving DMARC alignment and domain reputation. If you have any questions or need assistance, feel free to reach out to EasyDMARC technical support.
    Like
    Love
    Wow
    Sad
    Angry
    398
    0 Комментарии 0 Поделились