• Google’s AlphaEvolve: Getting Started with Evolutionary Coding Agents

    Introduction: AlphaEvolve [1] is a promising new coding agent from Google DeepMind. Let’s look at what it is and why it is generating hype. Much of the Google paper centers on the claim that AlphaEvolve facilitates novel research through its ability to iteratively improve code until it solves a problem exceptionally well. […] The post Google’s AlphaEvolve: Getting Started with Evolutionary Coding Agents appeared first on Towards Data Science.
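    The "evolutionary coding" idea behind agents like AlphaEvolve can be sketched in a few lines: generate candidate solutions, score them with an automatic evaluator, and keep the best to seed the next generation. The sketch below is a generic illustration of that loop, not DeepMind's implementation; the `score` and `mutate` functions are hypothetical stand-ins for a real code evaluator and code-rewriting step.

    ```python
    import random

    def score(candidate):
        # Hypothetical evaluator: higher is better. Rewarding candidates
        # close to a target value stands in for "code that solves the
        # problem well" in a real system.
        target = 42.0
        return -abs(candidate - target)

    def mutate(candidate, rng):
        # Hypothetical mutation operator: a small random perturbation,
        # standing in for an LLM proposing a code change.
        return candidate + rng.uniform(-1.0, 1.0)

    def evolve(generations=200, population_size=16, seed=0):
        rng = random.Random(seed)
        population = [rng.uniform(0.0, 100.0) for _ in range(population_size)]
        for _ in range(generations):
            # Score every candidate and keep the best half as parents.
            population.sort(key=score, reverse=True)
            parents = population[: population_size // 2]
            # Refill the population with mutated copies of the parents.
            population = parents + [mutate(p, rng) for p in parents]
        return max(population, key=score)

    best = evolve()
    ```

    The loop is deliberately minimal; the papers' contribution is in making the evaluator and mutation steps work on real programs at scale.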
  • DeepMind’s AlphaEvolve AI: History In The Making!

    WWW.YOUTUBE.COM
  • Google’s AlphaEvolve: The AI agent that reclaimed 0.7% of Google’s compute – and how to copy it

    Google’s AlphaEvolve is the epitome of best-practice AI agent orchestration. It offers a lesson in production-grade agent engineering. Discover its architecture and essential takeaways for your enterprise AI strategy.
    VENTUREBEAT.COM
  • AI Agents Now Write Code in Parallel: OpenAI Introduces Codex, a Cloud-Based Coding Agent Inside ChatGPT

    OpenAI has introduced Codex, a cloud-native software engineering agent integrated into ChatGPT, signaling a new era in AI-assisted software development. Unlike traditional coding assistants, Codex is not just a tool for autocompletion—it acts as a cloud-based agent capable of autonomously performing a wide range of programming tasks, from writing and debugging code to running tests and generating pull requests.
    A Shift Toward Parallel, Agent-Driven Development
    At the core of Codex is codex-1, a fine-tuned version of OpenAI’s reasoning model, optimized specifically for software engineering workflows. Codex can handle multiple tasks simultaneously, operating inside isolated cloud sandboxes that are preloaded with the user’s codebase. Each request is handled in its own environment, allowing users to delegate different coding operations in parallel without disrupting their local development environment.
    This architecture introduces a fundamentally new approach to software engineering—developers now interact with an agent that behaves more like a collaborative teammate than a static code tool. You can ask Codex to “fix a bug,” “add logging,” or “refactor this module,” and it will return a verifiable response, including diffs, terminal logs, and test results. If the output looks good, you can copy the patch directly into your repository—or ask for revisions.
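    The delegate-in-parallel workflow described above can be sketched generically: independent tasks are dispatched to isolated workers, and each returns a reviewable result. The snippet below is a hypothetical illustration using Python's standard library, not OpenAI's actual Codex API; the task names and the `run_in_sandbox` function are invented for the example.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def run_in_sandbox(task: str) -> dict:
        # Hypothetical stand-in for dispatching one task to an isolated
        # environment preloaded with the codebase. A real agent would
        # apply edits, run the test suite, and collect logs here.
        return {"task": task, "diff": f"patch for: {task}", "tests_passed": True}

    tasks = ["fix a bug", "add logging", "refactor this module"]

    # Each task runs in its own worker, mirroring how each request is
    # handled in its own environment without blocking the others.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = list(pool.map(run_in_sandbox, tasks))

    for r in results:
        # Review each result (diff, test outcome) before merging.
        print(r["task"], "->", "tests passed" if r["tests_passed"] else "tests failed")
    ```

    The point of the sketch is the shape of the workflow: parallel delegation plus a human review step over each returned diff, rather than a single blocking assistant session.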
    Embedded Within ChatGPT, Accessible to Teams
    Codex lives in the ChatGPT interface, currently available to Pro, Team, and Enterprise users, with broader access expected soon. The interface includes a dedicated sidebar where developers can describe what they want in natural language. Codex then interprets the intent and handles the coding behind the scenes, surfacing results for review and feedback.
    This integration offers a significant boost to developer productivity. As OpenAI notes, Codex is designed to take on many of the repetitive or boilerplate-heavy aspects of coding—allowing developers to focus on architecture, design, and higher-order problem solving. In one case, an OpenAI staffer even “checked in two bug fixes written entirely by Codex,” all while working on unrelated tasks.
    Codex Understands Your Codebase
    What makes Codex more than just a smart code generator is its context-awareness. Each instance runs with full access to your project’s file structure, coding conventions, and style. This allows it to write code that aligns with your team’s standards—whether you’re using Flask or FastAPI, React or Vue, or a custom internal framework.
    Codex’s ability to adapt to a codebase makes it particularly useful for large-scale enterprise teams and open-source maintainers. It supports workflows like branch-based pull request generation, test suite execution, and static analysis—all initiated by simple English prompts. Over time, it learns the nuances of the repository it works in, leading to better suggestions and more accurate code synthesis.
    Broader Implications: Lowering the Barrier to Software Creation
    OpenAI frames Codex as a research preview, but its long-term vision is clear: AI will increasingly take over much of the routine work involved in building software. The aim isn’t to replace developers but to democratize software creation, allowing more people—especially non-traditional developers—to build working applications using natural language alone.
    In this light, Codex is not just a coding tool, but a stepping stone toward a world where software development is collaborative between humans and machines. It brings software creation closer to the realm of design and ideation, and further away from syntax and implementation details.
    What’s Next?
    Codex is rolling out gradually, with usage limits in place during the preview phase. OpenAI is gathering feedback to refine the agent’s capabilities, improve safety, and optimize its performance across different environments and languages.
    Whether you’re a solo developer, part of a DevOps team, or leading an enterprise platform, Codex represents a significant shift in how code is written, tested, and shipped. As AI agents continue to mature, the future of software engineering will be less about writing every line yourself—and more about knowing what to build, and asking the right questions.

    WWW.MARKTECHPOST.COM
  • ChatGPT is getting an AI coding agent

    OpenAI’s next “low-key research preview” has arrived. This time, it’s not ChatGPT, but a coding agent dubbed Codex that is being made available to ChatGPT Pro, Enterprise, and Team subscribers starting Friday. By drumming up comparisons to how ChatGPT was first described, CEO Sam Altman and other company leaders are positioning Codex as the company’s next major product. It doesn’t cost extra to use for now, though OpenAI plans to eventually charge for access once it gets a sense of demand.
    The goal for Codex is to make ChatGPT a “virtual coworker” for engineers, Josh Tobin, OpenAI’s research lead for agents, said during a press call I attended this week. Like other vibe coding tools, Codex generates code from natural language. It can act independently on sandboxed code to fix bugs, run tests, and suggest changes to how code should run in the real world. This process can take up to 30 minutes, and OpenAI plans to let Codex work in the background for longer over time. Codex is integrated into ChatGPT’s web app to start, but it’s intentionally cut off from internet access to mitigate security risks. It’s powered by a version of OpenAI’s o3 reasoning model that is customized for coding, called codex-1.
    According to Tobin, the company sees Codex as complementary to more granular AI coding assistants like Cursor and Windsurf, the latter of which OpenAI is in talks to acquire for roughly $3 billion. Inside OpenAI, Codex is already being used by engineers as a “morning to-do list”: they have it spin up multiple tasks in parallel that they can come back to check on, according to Alexander Embiricos, Codex’s product lead. He said that a handful of companies that have tested it externally are seeing Codex used by on-call engineers who oversee a service’s stability. For now, Codex is relatively limited in what it can do autonomously. Eventually, OpenAI’s goal is for it to fully abstract away the complexity of coding. “The way that we think most development will happen in the future is that the agent will work on its own computer, and we’ll delegate to it,” said Embiricos.
    During a recent talk, Altman described coding as “central to the future of OpenAI.” There’s a belief in Silicon Valley that whoever creates a general-purpose AI engineer, which Codex is supposed to become, will have an edge in the race to build artificial general intelligence. Codex was the name OpenAI gave its first AI coding tool back in 2021, before ChatGPT was released. Now, models that help people code are perhaps the hottest area of AI, with Anthropic and others betting heavily on them as a business. On Thursday, Windsurf announced its own suite of coding models. And earlier this week, Google’s Gemini added the ability to connect to GitHub, and Google announced AlphaEvolve, an AI coding agent specifically designed for developing algorithms. Speaking of Google, the company’s annual conference, I/O, happens to be next week. Given the rivalry between OpenAI and Google, the timing of this week’s Codex announcement is likely not a coincidence. We’ll see how Google responds.
    Elsewhere
    Airbnb’s Apple envy: A minimal, all-black stage. A company founder in a black shirt showing live demos from an iPhone to an audience. Jony Ive sitting in the front row. You may be thinking of an early Apple keynote, but that was what I witnessed in Los Angeles earlier this week at Airbnb’s “Summer Release” event. There, CEO Brian Chesky announced a new “Services” offering, a reboot of Airbnb Experiences, and an app redesign that would make Scott Forstall grin. Steve Jobs has inspired a generation of founders who want to present like him. I understand the impulse, but I’m ready to see a fresh take on what a tech keynote can be. That said, if you’re curious about this week’s Airbnb news, I’d recommend two longreads from Wired and The Wall Street Journal.
    AI news rapid fire: Microsoft laid off three percent of employees, mainly targeting engineers. / Databricks bought the database startup Neon for billion. / Meta delayed the release of its Llama 4.0 “Behemoth” frontier model. / Perplexity integrated with PayPal and Venmo to allow for purchases. / Cohere missed its revenue forecast by 85 percent. / Chegg laid off 22 percent of employees in large part due to the impact of AI.
    Overheard
    “On May 14 at approximately 3:15 AM PST, an unauthorized modification was made to the Grok response bot’s prompt on X.” - xAI’s post explaining why Grok was suddenly trying to debunk claims of white genocide in South Africa.
    “When AR really works, I think that will wow people.” - Google CEO Sundar Pichai on the All-In podcast.
    “Flat design is over. The future is colorful and dimensional.” - Airbnb CEO Brian Chesky trying to will it into existence on X.
    Personnel log
    Anand Swaminathan, Tesla’s senior manager of Optimus, left to be head of delivery flight performance for Zipline. Benjamin Joe is Meta’s new VP of Asia Pacific. Susie Dickson, Meta’s head of content design, left after 12 years. Alston Cheek, Snap’s former director of platform partnerships, joined Airbnb in the same role. Sterling Anderson, Aurora’s co-founder, joined GM as chief product officer. Richard Gingras, Google’s VP of News, left after 14 years. Christina Wootton, Roblox’s chief partnerships officer, left after over 11 years.
    Link list
    More to click on: If you haven’t already, don’t forget to subscribe to The Verge, which includes unlimited access to Command Line and all of our reporting. As always, I welcome your feedback, especially if you have thoughts on this issue or a story idea to share. You can respond here or ping me securely on Signal. Thanks for subscribing.
    WWW.THEVERGE.COM
    ChatGPT is getting an AI coding agent
OpenAI’s next “low-key research preview” has arrived. This time, it’s not ChatGPT, but a coding agent dubbed Codex that is being made available to ChatGPT Pro, Enterprise, and Team subscribers starting Friday. By drumming up comparisons to how ChatGPT was first described, CEO Sam Altman and other company leaders are positioning Codex as the company’s next major product. It doesn’t cost extra to use for now, though OpenAI plans to eventually charge for access once it gets a sense of demand.

The goal for Codex is to make ChatGPT a “virtual coworker” for engineers, Josh Tobin, OpenAI’s research lead for agents, said during a press call I attended this week. Like other vibe coding tools, Codex generates code from natural language. It can act independently on sandboxed code to fix bugs, run tests, and suggest changes to how code should run in the real world. This process can take up to 30 minutes, and OpenAI plans to let Codex work in the background for longer over time. Codex is integrated into ChatGPT’s web app to start, but it’s intentionally cut off from the internet to mitigate security risks. It’s powered by codex-1, a version of OpenAI’s o3 reasoning model customized for coding.

According to Tobin, the company sees Codex as complementary to more granular AI coding assistants like Cursor and Windsurf, the latter of which OpenAI is in talks to acquire for roughly $3 billion. Inside OpenAI, engineers already use Codex as a “morning to-do list,” spinning up multiple tasks in parallel that they can come back to check on, according to Alexander Embiricos, Codex’s product lead. He said that a handful of companies that have tested it externally are seeing Codex used by on-call engineers who oversee a service’s stability.

For now, Codex is relatively limited in what it can do autonomously. 
Eventually, OpenAI’s goal is for it to fully abstract away the complexity of coding. “The way that we think most development will happen in the future is that the agent will work on its own computer, and we’ll delegate to it,” said Embiricos. During a recent talk, Altman described coding as “central to the future of OpenAI.” There’s a belief in Silicon Valley that whoever creates a general-purpose AI engineer, which Codex is supposed to become, will have an edge in the race to build artificial general intelligence.

Codex was what OpenAI called its first AI coding tool way back in 2021, before ChatGPT was released. Now, models helping people code is perhaps the hottest area of AI, with Anthropic and others betting heavily on it as a business. On Thursday, Windsurf announced its own suite of coding models. And earlier this week, Google’s Gemini added the ability to connect to GitHub and announced AlphaEvolve, an AI coding agent specifically designed for developing algorithms.

Speaking of Google, the company’s annual conference, I/O, happens to be next week. Given the rivalry between OpenAI and Google, the timing of this week’s Codex announcement is likely not a coincidence. We’ll see how Google responds.

Elsewhere

Airbnb’s Apple envy: A minimal, all-black stage. A company founder in a black shirt showing live demos from an iPhone to an audience. Jony Ive sitting in the front row. You may be thinking of an early Apple keynote, but that was what I witnessed in Los Angeles earlier this week at Airbnb’s “Summer Release” event. There, CEO Brian Chesky announced a new “Services” offering, a reboot of Airbnb Experiences, and an app redesign that would make Scott Forstall grin. Steve Jobs has inspired a generation of founders who want to present like him. I understand the impulse, but I’m ready to see a fresh take on what a tech keynote can be. That said, if you’re curious about this week’s Airbnb news, I’d recommend two longreads from Wired and The Wall Street Journal.

AI news rapid fire: Microsoft laid off three percent of employees, mainly targeting engineers. / Databricks bought the database startup Neon for $1 billion. / Meta delayed the release of its Llama 4.0 “Behemoth” frontier model. / Perplexity integrated with PayPal and Venmo to allow for purchases. / Cohere missed its revenue forecast by 85 percent. / Chegg laid off 22 percent of employees in large part due to the impact of AI.

Overheard

“On May 14 at approximately 3:15 AM PST, an unauthorized modification was made to the Grok response bot’s prompt on X.” - xAI’s post explaining why Grok was suddenly trying to debunk claims of white genocide in South Africa.

“When AR really works, I think that will wow people” - Google CEO Sundar Pichai on the All-In podcast.

“Flat design is over. The future is colorful and dimensional.” - Airbnb CEO Brian Chesky trying to will it into existence on X.

Personnel log

Anand Swaminathan, Tesla’s senior manager of Optimus, left to be head of delivery flight performance for Zipline. Benjamin Joe is Meta’s new VP of Asia Pacific. Susie Dickson, Meta’s head of content design, left after 12 years. Alston Cheek, Snap’s former director of platform partnerships, joined Airbnb in the same role. Sterling Anderson, Aurora’s co-founder, joined GM as chief product officer. Richard Gringras, Google’s VP of News, left after 14 years. Christina Wootton, Roblox’s chief partnerships officer, left after over 11 years.

Link list

More to click on: If you haven’t already, don’t forget to subscribe to The Verge, which includes unlimited access to Command Line and all of our reporting. As always, I welcome your feedback, especially if you have thoughts on this issue or a story idea to share. You can respond here or ping me securely on Signal. Thanks for subscribing.
  • Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer

    AlphaEvolve imagined as a genetic algorithm coupled to a large language model. Picture created by the author using various tools including DALL-E 3 via ChatGPT.

    Large Language Models have undeniably revolutionized how many of us approach coding, but they’re often more like a super-powered intern than a seasoned architect. Errors, bugs and hallucinations happen all the time, and sometimes the code even runs well but… doesn’t do exactly what we wanted.

    Now, imagine an AI that doesn’t just write code based on what it’s seen, but actively evolves it. At first glance, this simply means better chances of getting the right code written; but it goes far beyond that: Google showed that the same methodology can also discover new algorithms that are faster, more efficient, and sometimes entirely new.

    I’m talking about AlphaEvolve, the recent bombshell from Google DeepMind. Let me say it again: it isn’t just another code generator, but a system that generates and evolves code, allowing it to discover new algorithms. Powered by Google’s formidable Gemini models, AlphaEvolve could revolutionize how we approach coding, mathematics, algorithm design, and even data analysis itself.

    How Does AlphaEvolve ‘Evolve’ Code?

    Think of it like natural selection, but for software. That is, think of Genetic Algorithms, which have existed in data science, numerical methods and computational mathematics for decades. Briefly, instead of starting from scratch every time, AlphaEvolve takes an initial piece of code – possibly a “skeleton” provided by a human, with specific areas marked for improvement – and then runs an iterative refinement process on it.

    Let me summarize here the procedure detailed in DeepMind’s white paper:

    Intelligent prompting: AlphaEvolve is “smart” enough to craft its own prompts for the underlying Gemini LLM. These prompts instruct Gemini to act like a world-class expert in a specific domain, armed with context from previous attempts, including which points seemed to work correctly and which were clear failures. This is where the massive context windows of models like Gemini (you can use up to a million tokens even in Google’s AI Studio) come into play.

    Creative mutation: The LLM then generates a diverse pool of “candidate” solutions – variations and mutations of the original code, exploring different approaches to solve the given problem. This closely parallels the inner workings of regular genetic algorithms.

    Survival of the fittest: Again as in genetic algorithms, candidate solutions are automatically compiled, run, and rigorously evaluated against predefined metrics.

    Breeding of the top programs: The best-performing solutions are selected and become the “parents” for the next generation, just like in genetic algorithms. The successful traits of the parent programs are fed back into the prompting mechanism.

    Repeat (to evolve): This cycle – generate, test, select, learn – repeats, and with each iteration AlphaEvolve explores the vast search space of possible programs, gradually homing in on better and better solutions while purging those that fail. The longer you let it run (what the researchers call “test-time compute”), the more sophisticated and optimized the solutions can become.
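    The cycle above is, at its core, a genetic algorithm with an LLM serving as the mutation operator. As a toy illustration only (this is not DeepMind’s implementation), here is a minimal Python sketch of the generate–test–select–learn loop, where the LLM’s creative mutation is replaced by random numeric perturbation and the evaluator is a simple made-up scoring function:

```python
import random

def evaluate(candidate):
    # Stand-in for AlphaEvolve's automated evaluator: higher is better.
    # Here "fitness" is just closeness to a hidden target vector.
    target = [3.0, -1.0, 2.0]
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def mutate(parent):
    # Stand-in for the LLM's "creative mutation" step: in AlphaEvolve this
    # would be Gemini rewriting code; here it is a small random perturbation.
    return [c + random.gauss(0, 0.1) for c in parent]

def evolve(seed, generations=200, pool_size=20, survivors=4):
    population = [seed]
    for _ in range(generations):
        # Generate: breed a diverse pool of candidates from the parents.
        pool = [mutate(random.choice(population)) for _ in range(pool_size)]
        pool.extend(population)  # parents compete too (elitism)
        # Test + select: keep only the best-scoring candidates.
        pool.sort(key=evaluate, reverse=True)
        population = pool[:survivors]
    return population[0]

random.seed(42)
best = evolve([0.0, 0.0, 0.0])
print(best, evaluate(best))
```

    Letting it run longer (more generations, i.e. more “test-time compute”) keeps improving the best candidate; conceptually, swapping the toy mutate for an LLM call and evaluate for code that is actually compiled and benchmarked is the jump from this sketch to AlphaEvolve.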

    Building on Previous Attempts

    AlphaEvolve is the successor to earlier Google projects like AlphaCode (which tackled competitive programming) and, more directly, FunSearch. FunSearch was a fascinating proof of concept that showed how LLMs could discover new mathematical insights by evolving small Python functions.

    AlphaEvolve took that concept and “injected it with steroids”. I mean this for various reasons…

    First, because thanks to Gemini’s huge token window, AlphaEvolve can grapple with entire codebases, hundreds of lines long, not just tiny functions as in early tests like FunSearch. Second, because like other LLMs, Gemini has seen vast amounts of code in dozens of programming languages; hence it has covered a wider variety of tasks (different languages tend to dominate in different domains) and has become a kind of polyglot programmer.

    Note that with smarter LLMs as engines, AlphaEvolve can itself evolve to become faster and more efficient in its search for solutions and optimal programs.

    AlphaEvolve’s Mind-Blowing Results on Real-World Problems

    Here are the most interesting applications presented in the white paper:

    Optimizing efficiency at Google’s data centers: AlphaEvolve discovered a new scheduling heuristic that squeezed out a 0.7% saving in Google’s computing resources. This may look small, but at Google’s scale it means a substantial ecological and monetary cut!

    Designing better AI chips: AlphaEvolve could simplify some of the complex circuits within Google’s TPUs, specifically for the matrix multiplication operations that are the lifeblood of modern AI. This improves calculation speeds and again contributes to lower ecological and economic costs.

    Faster AI training: AlphaEvolve even turned its optimization gaze inward, accelerating a matrix multiplication library used in training the very Gemini models that power it! This means a small but meaningful reduction in AI training times – and, once more, lower ecological and economic costs!

    Numerical methods: In a kind of validation test, AlphaEvolve was set loose on over 50 notoriously tricky open problems in mathematics. In around 75% of them, it independently rediscovered the best-known human solutions!

    Towards Self-Improving AI?

    One of the most profound implications of tools like AlphaEvolve is the “virtuous cycle” by which AI could improve AI models themselves. Moreover, more efficient models and hardware make AlphaEvolve itself more powerful, enabling it to discover even deeper optimizations. That’s a feedback loop that could dramatically accelerate AI progress – and lead who knows where. This is, in effect, using AI to make AI better, faster, and smarter – a genuine step on the path towards more powerful and perhaps general artificial intelligence.

    Leaving aside this reflection, which quickly gets close to the realm of science fiction, the point is that for a vast class of problems in science, engineering, and computation, AlphaEvolve could represent a paradigm shift. As a computational chemist and biologist, I myself use tools based on LLMs and reasoning AI systems to assist my work, write and debug programs, test them, analyze data more rapidly, and more. With what DeepMind has presented now, it becomes even clearer that we are approaching a future where AI doesn’t just execute human instructions but becomes a creative partner in discovery and innovation.

    For some months already we have been moving from AI that completes our code to AI that creates it almost entirely, and tools like AlphaEvolve will push us towards times where AI just sits down to crack problems with (or for!) us, writing and evolving code to get to optimal and possibly entirely unexpected solutions. No doubt the next few years are going to be wild.

    References and Related Reads

    DeepMind’s blog post and white paper on AlphaEvolve

    A Google Colab notebook with the mathematical discoveries of AlphaEvolve outlined in Section 3 of the paper!

    Powerful Data Analysis and Plotting via Natural Language Requests by Giving LLMs Access to Functions

    New DeepMind Work Unveils Supreme Prompt Seeds for Language Models

    www.lucianoabriata.com I write about everything that lies in my broad sphere of interests: nature, science, technology, programming, etc. Subscribe to get my new stories by email. To consult about small jobs check my services page here. You can contact me here. You can tip me here.

    The post Google’s AlphaEvolve Is Evolving New Algorithms — And It Could Be a Game Changer appeared first on Towards Data Science.
  • New Google AI Chatbot Tackles Complex Math and Science

    May 15, 2025 | 3 min read

    New Google AI Chatbot Tackles Complex Math and Science

    A Google DeepMind system improves chip designs and addresses unsolved math problems but has not been rolled out to researchers outside the company

    By Elizabeth Gibney & Nature magazine

    DeepMind says that AlphaEvolve has helped to improve the design of AI chips. MF3d/Getty Images

    Google DeepMind has used chatbot models to come up with solutions to major problems in mathematics and computer science. The system, called AlphaEvolve, combines the creativity of a large language model (LLM) with algorithms that can scrutinize the model’s suggestions to filter and improve solutions. It was described in a white paper released by the company on 14 May.

    “The paper is quite spectacular,” says Mario Krenn, who leads the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany. “I think AlphaEvolve is the first successful demonstration of new discoveries based on general-purpose LLMs.”

    On supporting science journalism: If you're enjoying this article, consider supporting our award-winning journalism by subscribing. By purchasing a subscription you are helping to ensure the future of impactful stories about the discoveries and ideas shaping our world today.

    As well as using the system to discover solutions to open maths problems, DeepMind has already applied the artificial intelligence technique to its own practical challenges, says Pushmeet Kohli, head of science at the firm in London. AlphaEvolve has helped to improve the design of the company’s next generation of tensor processing units — computing chips developed specially for AI — and has found a way to more efficiently exploit Google’s worldwide computing capacity, saving 0.7% of total resources. 
“It has had substantial impact,” says Kohli.General-purpose AIMost of the successful applications of AI in science so far — including the protein-designing tool AlphaFold — have involved a learning algorithm that was hand-crafted for its task, says Krenn. But AlphaEvolve is general-purpose, tapping the abilities of LLMs to generate code to solve problems in a wide range of domains.DeepMind describes AlphaEvolve as an ‘agent’, because it involves using interacting AI models. But it targets a different point in the scientific process from many other ‘agentic’ AI science systems, which have been used to review the literature and suggest hypotheses.AlphaEvolve is based on the firm’s Gemini family of LLMs. Each task starts with the user inputting a question, criteria for evaluation and a suggested solution, for which the LLM proposes hundreds or thousands of modifications. An ‘evaluator’ algorithm then assesses the modifications against the metrics for a good solution.On the basis of which solutions are judged to be the best, the LLM suggests fresh ideas and over time the system evolves a population of stronger algorithms, says Matej Balog, an AI scientist at DeepMind who co-led the research. “We explore this diverse set of possibilities of how the problem can be solved,” he says.AlphaEvolve builds on the firm’s FunSearch system, which in 2023 was shown to use a similar evolutionary approach to outdo humans in unsolved problems in maths. Compared with FunSearch, AlphaEvolve can handle much larger pieces of code and tackle more complex algorithms across a wide range of scientific domains, says Balog.DeepMind says that AlphaEvolve has come up with a way to perform a calculation, known as matrix multiplication, that in some cases is faster than the fastest-known method, which was developed by German mathematician Volker Strassen in 1969. Such calculations involve multiplying numbers in grids and are used to train neural networks. 
Despite being general-purpose, AlphaEvolve outperformed AlphaTensor, an AI tool described by the firm in 2022 and designed specifically for matrix multiplication.

The approach could be used to tackle optimization problems, says Krenn, or anywhere in science where there are concrete metrics, or simulations, to evaluate what makes a good solution. This could include designing new microscopes, telescopes or even materials, he adds.

Narrow applications

In mathematics, AlphaEvolve seems to allow significant speed-ups in tackling some problems, says Simon Frieder, a mathematician and AI researcher at the University of Oxford, UK. But it will probably be applied only to the “narrow slice” of tasks that can be presented as problems to be solved through code, he says.

Other researchers are reserving judgement about the tool’s usefulness until it has been trialled outside DeepMind. “Until the systems have been tested by a broader community, I would stay sceptical and take the reported results with a grain of salt,” says Huan Sun, an AI researcher at the Ohio State University in Columbus. Frieder says he will wait until an open-source version is recreated by researchers, rather than rely on DeepMind’s proprietary system, which could be withdrawn or changed.

Although AlphaEvolve requires less computing power to run than AlphaTensor, it is still too resource-intensive to be made freely available on DeepMind’s servers, says Kohli. But the company hopes that announcing the system will encourage researchers to suggest areas of science in which to apply AlphaEvolve. “We are definitely committed to make sure that the most people in the scientific community get access to it,” says Kohli.

This article is reproduced with permission and was first published on May 14, 2025.
    WWW.SCIENTIFICAMERICAN.COM