Cerca | CGShares

Cerca

Articoli

Blogs

Utenti

Pagine

Gruppi

Microsoft Academic @MicrosoftAcademic ha condiviso un link
2025-06-06 07:52:50 ·

BenchmarkQED: Automated benchmarking of RAG systems

One of the key use cases for generative AI involves answering questions over private datasets, with retrieval-augmented generation as the go-to framework. As new RAG techniques emerge, there’s a growing need to benchmark their performance across diverse datasets and metrics.
To meet this need, we’re introducing BenchmarkQED, a new suite of tools that automates RAG benchmarking at scale, available on GitHub. It includes components for query generation, evaluation, and dataset preparation, each designed to support rigorous, reproducible testing.
BenchmarkQED complements the RAG methods in our open-source GraphRAG library, enabling users to run a GraphRAG-style evaluation across models, metrics, and datasets. GraphRAG uses a large language model to generate and summarize entity-based knowledge graphs, producing more comprehensive and diverse answers than standard RAG for large-scale tasks.
In this post, we walk through the core components of BenchmarkQED that contribute to the overall benchmarking process. We also share some of the latest benchmark results comparing our LazyGraphRAG system to competing methods, including a vector-based RAG with a 1M-token context window, where the leading LazyGraphRAG configuration showed significant win rates across all combinations of quality metrics and query classes.
In the paper, we distinguish between local queries, where answers are found in a small number of text regions, and sometimes even a single region, and global queries, which require reasoning over large portions of or even the entire dataset.
Conventional vector-based RAG excels at local queries because the regions containing the answer to the query resemble the query itself and can be retrieved as the nearest neighbor in the vector space of text embeddings. However, it struggles with global questions, such as, “What are the main themes of the dataset?” which require understanding dataset qualities not explicitly stated in the text.
AutoQ: Automated query synthesis
This limitation motivated the development of GraphRAG a system designed to answer global queries. GraphRAG’s evaluation requirements subsequently led to the creation of AutoQ, a method for synthesizing these global queries for any dataset.
AutoQ extends this approach by generating synthetic queries across the spectrum of queries, from local to global. It defines four distinct classes based on the source and scope of the queryforming a logical progression along the spectrum.
Figure 1. Construction of a 2×2 design space for synthetic query generation with AutoQ, showing how the four resulting query classes map onto the local-global query spectrum.
AutoQ can be configured to generate any number and distribution of synthetic queries along these classes, enabling consistent benchmarking across datasets without requiring user customization. Figure 2 shows the synthesis process and sample queries from each class, using an AP News dataset.
Figure 2. Synthesis process and example query for each of the four AutoQ query classes.

About Microsoft Research
Advancing science and technology to benefit humanity

View our story

Opens in a new tab
AutoE: Automated evaluation framework
Our evaluation of GraphRAG focused on analyzing key qualities of answers to global questions. The following qualities were used for the current evaluation:

Comprehensiveness: Does the answer address all relevant aspects of the question?
Diversity: Does it present varied perspectives or insights?
Empowerment: Does it help the reader understand and make informed judgments?
Relevance: Does it address what the question is specifically asking?

The AutoE component scales evaluation of these qualities using the LLM-as-a-Judge method. It presents pairs of answers to an LLM, along with the query and target metric, in counterbalanced order. The model determines whether the first answer wins, loses, or ties with the second. Over a set of queries, whether from AutoQ or elsewhere, this produces win rates between competing methods. When ground truth is available, AutoE can also score answers on correctness, completeness, and related metrics.
An illustrative evaluation is shown in Figure 3. Using a dataset of 1,397 AP News articles on health and healthcare, AutoQ generated 50 queries per class . AutoE then compared LazyGraphRAG to a competing RAG method, running six trials per query across four metrics, using GPT-4.1 as a judge.
These trial-level results were aggregated using metric-based win rates, where each trial is scored 1 for a win, 0.5 for a tie, and 0 for a loss, and then averaged to calculate the overall win rate for each RAG method.
Figure 3. Win rates of four LazyGraphRAG configurations across methods, broken down by the AutoQ query class and averaged across AutoE’s four metrics: comprehensiveness, diversity, empowerment, and relevance. LazyGraphRAG outperforms comparison conditions where the bar is above 50%.
The four LazyGraphRAG conditionsdiffer by query budgetand chunk size. All used GPT-4o mini for relevance tests and GPT-4o for query expansionand answer generation, except for LGR_b200_c200_mini, which used GPT-4o mini throughout.
Comparison systems were GraphRAG , Vector RAG with 8k- and 120k-token windows, and three published methods: LightRAG, RAPTOR, and TREX. All methods were limited to the same 8k tokens for answer generation. GraphRAG Global Search used level 2 of the community hierarchy.
LazyGraphRAG outperformed every comparison condition using the same generative model, winning all 96 comparisons, with all but one reaching statistical significance. The best overall performance came from the larger budget, smaller chunk size configuration. For DataLocal queries, the smaller budgetperformed slightly better, likely because fewer chunks were relevant. For ActivityLocal queries, the larger chunk sizehad a slight edge, likely because longer chunks provide a more coherent context.
Competing methods performed relatively better on the query classes for which they were designed: GraphRAG Global for global queries, Vector RAG for local queries, and GraphRAG Drift Search, which combines both strategies, posed the strongest challenge overall.
Increasing Vector RAG’s context window from 8k to 120k tokens did not improve its performance compared to LazyGraphRAG. This raised the question of how LazyGraphRAG would perform against Vector RAG with 1-million token context window containing most of the dataset.
Figure 4 shows the follow-up experiment comparing LazyGraphRAG to Vector RAG using GPT-4.1 that enabled this comparison. Even against the 1M-token window, LazyGraphRAG achieved higher win rates across all comparisons, failing to reach significance only for the relevance of answers to DataLocal queries. These queries tend to benefit most from Vector RAG’s ranking of directly relevant chunks, making it hard for LazyGraphRAG to generate answers that have greater relevance to the query, even though these answers may be dramatically more comprehensive, diverse, and empowering overall.
Figure 4. Win rates of LazyGraphRAG  over Vector RAG across different context window sizes, broken down by the four AutoQ query classes and four AutoE metrics: comprehensiveness, diversity, empowerment, and relevance. Bars above 50% indicate that LazyGraphRAG outperformed the comparison condition.
AutoD: Automated data sampling and summarization
Text datasets have an underlying topical structure, but the depth, breadth, and connectivity of that structure can vary widely. This variability makes it difficult to evaluate RAG systems consistently, as results may reflect the idiosyncrasies of the dataset rather than the system’s general capabilities.
The AutoD component addresses this by sampling datasets to meet a target specification, defined by the number of topic clustersand the number of samples per cluster. This creates consistency across datasets, enabling more meaningful comparisons, as structurally aligned datasets lead to comparable AutoQ queries, which in turn support consistent AutoE evaluations.
AutoD also includes tools for summarizing input or output datasets in a way that reflects their topical coverage. These summaries play an important role in the AutoQ query synthesis process, but they can also be used more broadly, such as in prompts where context space is limited.
Since the release of the GraphRAG paper, we’ve received many requests to share the dataset of the Behind the Tech podcast transcripts we used in our evaluation. An updated version of this dataset is now available in the BenchmarkQED repository, alongside the AP News dataset containing 1,397 health-related articles, licensed for open release.
We hope these datasets, together with the BenchmarkQED tools, help accelerate benchmark-driven development of RAG systems and AI question-answering. We invite the community to try them on GitHub.
Opens in a new tab
#benchmarkqedautomatedbenchmarking #ofrag #systems

BenchmarkQED: Automated benchmarking of RAG systems
One of the key use cases for generative AI involves answering questions over private datasets, with retrieval-augmented generation as the go-to framework. As new RAG techniques emerge, there’s a growing need to benchmark their performance across diverse datasets and metrics. To meet this need, we’re introducing BenchmarkQED, a new suite of tools that automates RAG benchmarking at scale, available on GitHub. It includes components for query generation, evaluation, and dataset preparation, each designed to support rigorous, reproducible testing.   BenchmarkQED complements the RAG methods in our open-source GraphRAG library, enabling users to run a GraphRAG-style evaluation across models, metrics, and datasets. GraphRAG uses a large language model to generate and summarize entity-based knowledge graphs, producing more comprehensive and diverse answers than standard RAG for large-scale tasks. In this post, we walk through the core components of BenchmarkQED that contribute to the overall benchmarking process. We also share some of the latest benchmark results comparing our LazyGraphRAG system to competing methods, including a vector-based RAG with a 1M-token context window, where the leading LazyGraphRAG configuration showed significant win rates across all combinations of quality metrics and query classes. In the paper, we distinguish between local queries, where answers are found in a small number of text regions, and sometimes even a single region, and global queries, which require reasoning over large portions of or even the entire dataset. Conventional vector-based RAG excels at local queries because the regions containing the answer to the query resemble the query itself and can be retrieved as the nearest neighbor in the vector space of text embeddings. However, it struggles with global questions, such as, “What are the main themes of the dataset?” which require understanding dataset qualities not explicitly stated in the text.   AutoQ: Automated query synthesis This limitation motivated the development of GraphRAG a system designed to answer global queries. GraphRAG’s evaluation requirements subsequently led to the creation of AutoQ, a method for synthesizing these global queries for any dataset. AutoQ extends this approach by generating synthetic queries across the spectrum of queries, from local to global. It defines four distinct classes based on the source and scope of the queryforming a logical progression along the spectrum. Figure 1. Construction of a 2×2 design space for synthetic query generation with AutoQ, showing how the four resulting query classes map onto the local-global query spectrum. AutoQ can be configured to generate any number and distribution of synthetic queries along these classes, enabling consistent benchmarking across datasets without requiring user customization. Figure 2 shows the synthesis process and sample queries from each class, using an AP News dataset. Figure 2. Synthesis process and example query for each of the four AutoQ query classes. About Microsoft Research Advancing science and technology to benefit humanity View our story Opens in a new tab AutoE: Automated evaluation framework Our evaluation of GraphRAG focused on analyzing key qualities of answers to global questions. The following qualities were used for the current evaluation: Comprehensiveness: Does the answer address all relevant aspects of the question? Diversity: Does it present varied perspectives or insights? Empowerment: Does it help the reader understand and make informed judgments? Relevance: Does it address what the question is specifically asking?   The AutoE component scales evaluation of these qualities using the LLM-as-a-Judge method. It presents pairs of answers to an LLM, along with the query and target metric, in counterbalanced order. The model determines whether the first answer wins, loses, or ties with the second. Over a set of queries, whether from AutoQ or elsewhere, this produces win rates between competing methods. When ground truth is available, AutoE can also score answers on correctness, completeness, and related metrics. An illustrative evaluation is shown in Figure 3. Using a dataset of 1,397 AP News articles on health and healthcare, AutoQ generated 50 queries per class . AutoE then compared LazyGraphRAG to a competing RAG method, running six trials per query across four metrics, using GPT-4.1 as a judge. These trial-level results were aggregated using metric-based win rates, where each trial is scored 1 for a win, 0.5 for a tie, and 0 for a loss, and then averaged to calculate the overall win rate for each RAG method. Figure 3. Win rates of four LazyGraphRAG configurations across methods, broken down by the AutoQ query class and averaged across AutoE’s four metrics: comprehensiveness, diversity, empowerment, and relevance. LazyGraphRAG outperforms comparison conditions where the bar is above 50%. The four LazyGraphRAG conditionsdiffer by query budgetand chunk size. All used GPT-4o mini for relevance tests and GPT-4o for query expansionand answer generation, except for LGR_b200_c200_mini, which used GPT-4o mini throughout. Comparison systems were GraphRAG , Vector RAG with 8k- and 120k-token windows, and three published methods: LightRAG, RAPTOR, and TREX. All methods were limited to the same 8k tokens for answer generation. GraphRAG Global Search used level 2 of the community hierarchy. LazyGraphRAG outperformed every comparison condition using the same generative model, winning all 96 comparisons, with all but one reaching statistical significance. The best overall performance came from the larger budget, smaller chunk size configuration. For DataLocal queries, the smaller budgetperformed slightly better, likely because fewer chunks were relevant. For ActivityLocal queries, the larger chunk sizehad a slight edge, likely because longer chunks provide a more coherent context. Competing methods performed relatively better on the query classes for which they were designed: GraphRAG Global for global queries, Vector RAG for local queries, and GraphRAG Drift Search, which combines both strategies, posed the strongest challenge overall. Increasing Vector RAG’s context window from 8k to 120k tokens did not improve its performance compared to LazyGraphRAG. This raised the question of how LazyGraphRAG would perform against Vector RAG with 1-million token context window containing most of the dataset. Figure 4 shows the follow-up experiment comparing LazyGraphRAG to Vector RAG using GPT-4.1 that enabled this comparison. Even against the 1M-token window, LazyGraphRAG achieved higher win rates across all comparisons, failing to reach significance only for the relevance of answers to DataLocal queries. These queries tend to benefit most from Vector RAG’s ranking of directly relevant chunks, making it hard for LazyGraphRAG to generate answers that have greater relevance to the query, even though these answers may be dramatically more comprehensive, diverse, and empowering overall. Figure 4. Win rates of LazyGraphRAG  over Vector RAG across different context window sizes, broken down by the four AutoQ query classes and four AutoE metrics: comprehensiveness, diversity, empowerment, and relevance. Bars above 50% indicate that LazyGraphRAG outperformed the comparison condition. AutoD: Automated data sampling and summarization Text datasets have an underlying topical structure, but the depth, breadth, and connectivity of that structure can vary widely. This variability makes it difficult to evaluate RAG systems consistently, as results may reflect the idiosyncrasies of the dataset rather than the system’s general capabilities. The AutoD component addresses this by sampling datasets to meet a target specification, defined by the number of topic clustersand the number of samples per cluster. This creates consistency across datasets, enabling more meaningful comparisons, as structurally aligned datasets lead to comparable AutoQ queries, which in turn support consistent AutoE evaluations. AutoD also includes tools for summarizing input or output datasets in a way that reflects their topical coverage. These summaries play an important role in the AutoQ query synthesis process, but they can also be used more broadly, such as in prompts where context space is limited. Since the release of the GraphRAG paper, we’ve received many requests to share the dataset of the Behind the Tech podcast transcripts we used in our evaluation. An updated version of this dataset is now available in the BenchmarkQED repository, alongside the AP News dataset containing 1,397 health-related articles, licensed for open release.   We hope these datasets, together with the BenchmarkQED tools, help accelerate benchmark-driven development of RAG systems and AI question-answering. We invite the community to try them on GitHub. Opens in a new tab #benchmarkqedautomatedbenchmarking #ofrag #systems

BenchmarkQED: Automated benchmarking of RAG systems

www.microsoft.com
One of the key use cases for generative AI involves answering questions over private datasets, with retrieval-augmented generation (RAG) as the go-to framework. As new RAG techniques emerge, there’s a growing need to benchmark their performance across diverse datasets and metrics. To meet this need, we’re introducing BenchmarkQED, a new suite of tools that automates RAG benchmarking at scale, available on GitHub (opens in new tab). It includes components for query generation, evaluation, and dataset preparation, each designed to support rigorous, reproducible testing.   BenchmarkQED complements the RAG methods in our open-source GraphRAG library, enabling users to run a GraphRAG-style evaluation across models, metrics, and datasets. GraphRAG uses a large language model (LLM) to generate and summarize entity-based knowledge graphs, producing more comprehensive and diverse answers than standard RAG for large-scale tasks. In this post, we walk through the core components of BenchmarkQED that contribute to the overall benchmarking process. We also share some of the latest benchmark results comparing our LazyGraphRAG system to competing methods, including a vector-based RAG with a 1M-token context window, where the leading LazyGraphRAG configuration showed significant win rates across all combinations of quality metrics and query classes. In the paper, we distinguish between local queries, where answers are found in a small number of text regions, and sometimes even a single region, and global queries, which require reasoning over large portions of or even the entire dataset. Conventional vector-based RAG excels at local queries because the regions containing the answer to the query resemble the query itself and can be retrieved as the nearest neighbor in the vector space of text embeddings. However, it struggles with global questions, such as, “What are the main themes of the dataset?” which require understanding dataset qualities not explicitly stated in the text.   AutoQ: Automated query synthesis This limitation motivated the development of GraphRAG a system designed to answer global queries. GraphRAG’s evaluation requirements subsequently led to the creation of AutoQ, a method for synthesizing these global queries for any dataset. AutoQ extends this approach by generating synthetic queries across the spectrum of queries, from local to global. It defines four distinct classes based on the source and scope of the query (Figure 1, top) forming a logical progression along the spectrum (Figure 1, bottom). Figure 1. Construction of a 2×2 design space for synthetic query generation with AutoQ, showing how the four resulting query classes map onto the local-global query spectrum. AutoQ can be configured to generate any number and distribution of synthetic queries along these classes, enabling consistent benchmarking across datasets without requiring user customization. Figure 2 shows the synthesis process and sample queries from each class, using an AP News dataset. Figure 2. Synthesis process and example query for each of the four AutoQ query classes. About Microsoft Research Advancing science and technology to benefit humanity View our story Opens in a new tab AutoE: Automated evaluation framework Our evaluation of GraphRAG focused on analyzing key qualities of answers to global questions. The following qualities were used for the current evaluation: Comprehensiveness: Does the answer address all relevant aspects of the question? Diversity: Does it present varied perspectives or insights? Empowerment: Does it help the reader understand and make informed judgments? Relevance: Does it address what the question is specifically asking?   The AutoE component scales evaluation of these qualities using the LLM-as-a-Judge method. It presents pairs of answers to an LLM, along with the query and target metric, in counterbalanced order. The model determines whether the first answer wins, loses, or ties with the second. Over a set of queries, whether from AutoQ or elsewhere, this produces win rates between competing methods. When ground truth is available, AutoE can also score answers on correctness, completeness, and related metrics. An illustrative evaluation is shown in Figure 3. Using a dataset of 1,397 AP News articles on health and healthcare, AutoQ generated 50 queries per class (200 total). AutoE then compared LazyGraphRAG to a competing RAG method, running six trials per query across four metrics, using GPT-4.1 as a judge. These trial-level results were aggregated using metric-based win rates, where each trial is scored 1 for a win, 0.5 for a tie, and 0 for a loss, and then averaged to calculate the overall win rate for each RAG method. Figure 3. Win rates of four LazyGraphRAG (LGR) configurations across methods, broken down by the AutoQ query class and averaged across AutoE’s four metrics: comprehensiveness, diversity, empowerment, and relevance. LazyGraphRAG outperforms comparison conditions where the bar is above 50%. The four LazyGraphRAG conditions (LGR_b200_c200, LGR_b50_c200, LGR_b50_c600, LGR_b200_c200_mini) differ by query budget (b50, b200) and chunk size (c200, c600). All used GPT-4o mini for relevance tests and GPT-4o for query expansion (to five subqueries) and answer generation, except for LGR_b200_c200_mini, which used GPT-4o mini throughout. Comparison systems were GraphRAG (Local, Global, and Drift Search), Vector RAG with 8k- and 120k-token windows, and three published methods: LightRAG (opens in new tab), RAPTOR (opens in new tab), and TREX (opens in new tab). All methods were limited to the same 8k tokens for answer generation. GraphRAG Global Search used level 2 of the community hierarchy. LazyGraphRAG outperformed every comparison condition using the same generative model (GPT-4o), winning all 96 comparisons, with all but one reaching statistical significance. The best overall performance came from the larger budget, smaller chunk size configuration (LGR_b200_c200). For DataLocal queries, the smaller budget (LGR_b50_c200) performed slightly better, likely because fewer chunks were relevant. For ActivityLocal queries, the larger chunk size (LGR_b50_c600) had a slight edge, likely because longer chunks provide a more coherent context. Competing methods performed relatively better on the query classes for which they were designed: GraphRAG Global for global queries, Vector RAG for local queries, and GraphRAG Drift Search, which combines both strategies, posed the strongest challenge overall. Increasing Vector RAG’s context window from 8k to 120k tokens did not improve its performance compared to LazyGraphRAG. This raised the question of how LazyGraphRAG would perform against Vector RAG with 1-million token context window containing most of the dataset. Figure 4 shows the follow-up experiment comparing LazyGraphRAG to Vector RAG using GPT-4.1 that enabled this comparison. Even against the 1M-token window, LazyGraphRAG achieved higher win rates across all comparisons, failing to reach significance only for the relevance of answers to DataLocal queries. These queries tend to benefit most from Vector RAG’s ranking of directly relevant chunks, making it hard for LazyGraphRAG to generate answers that have greater relevance to the query, even though these answers may be dramatically more comprehensive, diverse, and empowering overall. Figure 4. Win rates of LazyGraphRAG (LGR) over Vector RAG across different context window sizes, broken down by the four AutoQ query classes and four AutoE metrics: comprehensiveness, diversity, empowerment, and relevance. Bars above 50% indicate that LazyGraphRAG outperformed the comparison condition. AutoD: Automated data sampling and summarization Text datasets have an underlying topical structure, but the depth, breadth, and connectivity of that structure can vary widely. This variability makes it difficult to evaluate RAG systems consistently, as results may reflect the idiosyncrasies of the dataset rather than the system’s general capabilities. The AutoD component addresses this by sampling datasets to meet a target specification, defined by the number of topic clusters (breadth) and the number of samples per cluster (depth). This creates consistency across datasets, enabling more meaningful comparisons, as structurally aligned datasets lead to comparable AutoQ queries, which in turn support consistent AutoE evaluations. AutoD also includes tools for summarizing input or output datasets in a way that reflects their topical coverage. These summaries play an important role in the AutoQ query synthesis process, but they can also be used more broadly, such as in prompts where context space is limited. Since the release of the GraphRAG paper, we’ve received many requests to share the dataset of the Behind the Tech (opens in new tab) podcast transcripts we used in our evaluation. An updated version of this dataset is now available in the BenchmarkQED repository (opens in new tab), alongside the AP News dataset containing 1,397 health-related articles, licensed for open release.   We hope these datasets, together with the BenchmarkQED tools (opens in new tab), help accelerate benchmark-driven development of RAG systems and AI question-answering. We invite the community to try them on GitHub (opens in new tab). Opens in a new tab

487

· 0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
9to5Mac @9to5Mac ha condiviso un link
2025-06-01 16:04:02 ·

Gurman: Apple needs a major AI comeback, but this WWDC probably won’t be it

According to Mark Gurman in his latest Power On newsletter, Apple insiders “believe that the conference may be a letdown from an AI standpoint,” highlighting how far behind Apple still is. Still, Apple has a few AI-related announcements slated for June 9.

As previously reported, this year’s biggest AI announcement will be Apple’s plans to open up its on-device foundation models to third-party developers.
These are the same ~3B parameter models Apple currently uses for things like text summarization and autocorrect, and they’ll soon be available for devs to integrate into their own apps.
To be clear, this is a meaningful milestone for Apple’s AI platform. It gives developers a powerful tool to natively integrate into their apps and potentially unlock genuinely useful features.
Still, these on-device models are far less capable than the large-scale, cloud-based systems used by OpenAI and Google, so don’t expect earth-shattering features.
AI features slated for this year’s iOS 26
Elsewhere in its AI efforts, Apple will reportedly:

Launch a new battery power management mode;
Reboot its Translate app, “now integrated with AirPods and Siri”;
Start describing some features within apps like Safari and Photos as “AI-powered”.

As Gurman puts it, this feels like a risky “gap year.” Internally, Apple is aiming to make up for it at WWDC 2026, with bigger swings that “it hopes it can try to convince consumers that it’s an AI innovator.“. However, given how fast the competition is moving, waiting until next year might put Apple even further behind, perception-wise.
What’s still in the works?
Currently, Apple’s ongoing AI developments include an LLM Siri, a revamped Shortcuts app, the ambitious health-related Project Mulberry, and a full-blown ChatGPT competitor with web search capabilities.
According to Gurman, Apple is holding off on previewing some of these features to avoid repeating last year’s mistake, when it showed off Apple Intelligence with features that were nowhere near ready and are still MIA.
Behind the scenes, Gurman reports Apple has made progress. It now has models with 3B, 7B, 33B, and 150B parameters in testing, with the largest ones relying on the cloud.
Internal benchmarks suggest its top model is close to recent ChatGPT updates in quality. Still, concerns over hallucinations and internal debates over Apple’s approach to generative AI are keeping things private, for now.
Apple’s dev AI story
As for Apple’s developer offerings, Gurman reports:

“Developers will see AI get more deeply integrated into Apple’s developer tools, including those for user interface testing. And, in a development that will certainly appease many developers, SwiftUI, a set of Apple frameworks and tools for creating app user interfaces, will finally get a built-in rich text editor.”

And if you’re still waiting for Swift Assist, the AI code-completion tool Apple announced last year, Gurman says Apple is expected to provide an update on it. Still, there is no word yet on whether this update includes releasing the Anthropic-powered code completion version that its employees have been testing for the past few months.

Add 9to5Mac to your Google News feed.

FTC: We use income earning auto affiliate links. More.You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day. Be sure to check out our homepage for all the latest news, and follow 9to5Mac on Twitter, Facebook, and LinkedIn to stay in the loop. Don’t know where to start? Check out our exclusive stories, reviews, how-tos, and subscribe to our YouTube channel
#gurman #apple #needs #major #comeback

Gurman: Apple needs a major AI comeback, but this WWDC probably won’t be it
According to Mark Gurman in his latest Power On newsletter, Apple insiders “believe that the conference may be a letdown from an AI standpoint,” highlighting how far behind Apple still is. Still, Apple has a few AI-related announcements slated for June 9. As previously reported, this year’s biggest AI announcement will be Apple’s plans to open up its on-device foundation models to third-party developers. These are the same ~3B parameter models Apple currently uses for things like text summarization and autocorrect, and they’ll soon be available for devs to integrate into their own apps. To be clear, this is a meaningful milestone for Apple’s AI platform. It gives developers a powerful tool to natively integrate into their apps and potentially unlock genuinely useful features. Still, these on-device models are far less capable than the large-scale, cloud-based systems used by OpenAI and Google, so don’t expect earth-shattering features. AI features slated for this year’s iOS 26 Elsewhere in its AI efforts, Apple will reportedly: Launch a new battery power management mode; Reboot its Translate app, “now integrated with AirPods and Siri”; Start describing some features within apps like Safari and Photos as “AI-powered”. As Gurman puts it, this feels like a risky “gap year.” Internally, Apple is aiming to make up for it at WWDC 2026, with bigger swings that “it hopes it can try to convince consumers that it’s an AI innovator.“. However, given how fast the competition is moving, waiting until next year might put Apple even further behind, perception-wise. What’s still in the works? Currently, Apple’s ongoing AI developments include an LLM Siri, a revamped Shortcuts app, the ambitious health-related Project Mulberry, and a full-blown ChatGPT competitor with web search capabilities. According to Gurman, Apple is holding off on previewing some of these features to avoid repeating last year’s mistake, when it showed off Apple Intelligence with features that were nowhere near ready and are still MIA. Behind the scenes, Gurman reports Apple has made progress. It now has models with 3B, 7B, 33B, and 150B parameters in testing, with the largest ones relying on the cloud. Internal benchmarks suggest its top model is close to recent ChatGPT updates in quality. Still, concerns over hallucinations and internal debates over Apple’s approach to generative AI are keeping things private, for now. Apple’s dev AI story As for Apple’s developer offerings, Gurman reports: “Developers will see AI get more deeply integrated into Apple’s developer tools, including those for user interface testing. And, in a development that will certainly appease many developers, SwiftUI, a set of Apple frameworks and tools for creating app user interfaces, will finally get a built-in rich text editor.” And if you’re still waiting for Swift Assist, the AI code-completion tool Apple announced last year, Gurman says Apple is expected to provide an update on it. Still, there is no word yet on whether this update includes releasing the Anthropic-powered code completion version that its employees have been testing for the past few months. Add 9to5Mac to your Google News feed. FTC: We use income earning auto affiliate links. More.You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day. Be sure to check out our homepage for all the latest news, and follow 9to5Mac on Twitter, Facebook, and LinkedIn to stay in the loop. Don’t know where to start? Check out our exclusive stories, reviews, how-tos, and subscribe to our YouTube channel #gurman #apple #needs #major #comeback

Gurman: Apple needs a major AI comeback, but this WWDC probably won’t be it

9to5mac.com
According to Mark Gurman in his latest Power On newsletter, Apple insiders “believe that the conference may be a letdown from an AI standpoint,” highlighting how far behind Apple still is. Still, Apple has a few AI-related announcements slated for June 9. As previously reported, this year’s biggest AI announcement will be Apple’s plans to open up its on-device foundation models to third-party developers. These are the same ~3B parameter models Apple currently uses for things like text summarization and autocorrect, and they’ll soon be available for devs to integrate into their own apps. To be clear, this is a meaningful milestone for Apple’s AI platform. It gives developers a powerful tool to natively integrate into their apps and potentially unlock genuinely useful features. Still, these on-device models are far less capable than the large-scale, cloud-based systems used by OpenAI and Google, so don’t expect earth-shattering features. AI features slated for this year’s iOS 26 Elsewhere in its AI efforts, Apple will reportedly: Launch a new battery power management mode; Reboot its Translate app, “now integrated with AirPods and Siri”; Start describing some features within apps like Safari and Photos as “AI-powered”. As Gurman puts it, this feels like a risky “gap year.” Internally, Apple is aiming to make up for it at WWDC 2026, with bigger swings that “it hopes it can try to convince consumers that it’s an AI innovator.“. However, given how fast the competition is moving, waiting until next year might put Apple even further behind, perception-wise. What’s still in the works? Currently, Apple’s ongoing AI developments include an LLM Siri, a revamped Shortcuts app, the ambitious health-related Project Mulberry, and a full-blown ChatGPT competitor with web search capabilities. According to Gurman, Apple is holding off on previewing some of these features to avoid repeating last year’s mistake, when it showed off Apple Intelligence with features that were nowhere near ready and are still MIA. Behind the scenes, Gurman reports Apple has made progress. It now has models with 3B, 7B, 33B, and 150B parameters in testing, with the largest ones relying on the cloud. Internal benchmarks suggest its top model is close to recent ChatGPT updates in quality. Still, concerns over hallucinations and internal debates over Apple’s approach to generative AI are keeping things private, for now. Apple’s dev AI story As for Apple’s developer offerings, Gurman reports: “Developers will see AI get more deeply integrated into Apple’s developer tools, including those for user interface testing. And, in a development that will certainly appease many developers, SwiftUI, a set of Apple frameworks and tools for creating app user interfaces, will finally get a built-in rich text editor.” And if you’re still waiting for Swift Assist, the AI code-completion tool Apple announced last year, Gurman says Apple is expected to provide an update on it. Still, there is no word yet on whether this update includes releasing the Anthropic-powered code completion version that its employees have been testing for the past few months. Add 9to5Mac to your Google News feed. FTC: We use income earning auto affiliate links. More.You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day. Be sure to check out our homepage for all the latest news, and follow 9to5Mac on Twitter, Facebook, and LinkedIn to stay in the loop. Don’t know where to start? Check out our exclusive stories, reviews, how-tos, and subscribe to our YouTube channel

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
9to5Mac @9to5Mac ha condiviso un link
2025-05-31 22:55:02 ·

These are three Apple Intelligence features I’d like to see with iOS 26

Apple Intelligence has been off to a rocky start, especially when it comes to Siri. The assistant still has a lot to be desired, and that should definitely be at the forefront of Apple’s priorities.
Regardless, Bloomberg’s Mark Gurman reports that Apple plans on expanding current Apple Intelligence capabilities to additional apps in iOS 26, and I figured I’d throw out some ideas I’d like to see.

Summaries in more places
I think providing summaries is probably one of the better use cases of on-device large language models. Apple introduced notification summaries in iOS 18, and while there were some major inaccuracies early on, things seem to be mostly fine. Apple recently enabled Apple Intelligence on compatible devices by default, rather than making it an opt-in feature.
For one, I think it’d be neat if there were an API for developers to use summarization models in their apps. I’m sure Apple would put strict guardrails on it, but allowing third-parties to utilize Apple’s summarization models would be a big win. It’d empower indie developers to create AI features without having to worry about an OpenAI bill.
On top of that, I’d really like to see some summarization improvements in the Messages app, particularly in group chats. If you missed out on a 100-message conversation, Apple should provide a more detailed summary than what can fit within two lines.
Or, say you’re a student – imagine being able to summarize the notes you took in a class after the fact. You’d still need to read the notes to get a thorough understanding, but a note summary could serve as a great way to jog your memory if you’re quickly trying to recall something.
Genmoji for everyone
Genmoji is probably one of the most popular Apple Intelligence features unveiled at WWDC24. Unfortunately though, it’s only available on some of the most recent iPhone models: iPhone 15 Pro/Pro Max, iPhone 16e, iPhone 16/16 Plus, and iPhone 16 Pro/Pro Max.
If you have anything older, including the one-year-old iPhone 15, you can’t use Genmoji.
I don’t expect Apple to make its models run locally on less capable hardware, as nice as that would be. However, they did announce Private Cloud Compute – a private server for handling Apple Intelligence requests in the cloud.
Those servers were likely low capacity when they just begun rolling them out, but it’ll have been over a year since the rollout begun by the time iOS 19 releases to the public.
While I don’t expect Apple to give out Private Cloud Compute usage for free, I think it’d be pretty neat if they bundled Genmoji in iCloud+ subscriptions for users with older devices – giving people a taste of what Apple Intelligence offers.

More customizable focus modes
One of my favorite features in iOS 18 has been the new Reduce Interruptions focus mode. In short, it analyzes every notification that comes through, and only presents what it thinks is important. The rest just stay in notification center.

I agree with the more granular focus options. I'd like a focus option for when I connect to my home wifi. A separate focus for different workout options...I'd like to be "more" silent when running outside, and less when working out in the gym. Just a few ideas here, and they should be able to work within the present APP options.
View all comments

I’d really like to see Apple offer additional granularity here. For example, you could configure a focus mode that only triggers on key words that you set up. I could also see the inverse being useful, where you’d normally allow an app to notify you, but you’d like notifications with matching key words to be muted.
That’s just scratching the surface, but I really think there could be a lot of opportunity for AI to enable more granular notification management. The new “Reduce Interruptions” focus is just the start.

My favorite Apple accessory recommendations:
Follow Michael: X/Twitter, Bluesky, Instagram

Add 9to5Mac to your Google News feed.

FTC: We use income earning auto affiliate links. More.You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day. Be sure to check out our homepage for all the latest news, and follow 9to5Mac on Twitter, Facebook, and LinkedIn to stay in the loop. Don’t know where to start? Check out our exclusive stories, reviews, how-tos, and subscribe to our YouTube channel
#these #are #three #apple #intelligence

These are three Apple Intelligence features I’d like to see with iOS 26
Apple Intelligence has been off to a rocky start, especially when it comes to Siri. The assistant still has a lot to be desired, and that should definitely be at the forefront of Apple’s priorities. Regardless, Bloomberg’s Mark Gurman reports that Apple plans on expanding current Apple Intelligence capabilities to additional apps in iOS 26, and I figured I’d throw out some ideas I’d like to see. Summaries in more places I think providing summaries is probably one of the better use cases of on-device large language models. Apple introduced notification summaries in iOS 18, and while there were some major inaccuracies early on, things seem to be mostly fine. Apple recently enabled Apple Intelligence on compatible devices by default, rather than making it an opt-in feature. For one, I think it’d be neat if there were an API for developers to use summarization models in their apps. I’m sure Apple would put strict guardrails on it, but allowing third-parties to utilize Apple’s summarization models would be a big win. It’d empower indie developers to create AI features without having to worry about an OpenAI bill. On top of that, I’d really like to see some summarization improvements in the Messages app, particularly in group chats. If you missed out on a 100-message conversation, Apple should provide a more detailed summary than what can fit within two lines. Or, say you’re a student – imagine being able to summarize the notes you took in a class after the fact. You’d still need to read the notes to get a thorough understanding, but a note summary could serve as a great way to jog your memory if you’re quickly trying to recall something. Genmoji for everyone Genmoji is probably one of the most popular Apple Intelligence features unveiled at WWDC24. Unfortunately though, it’s only available on some of the most recent iPhone models: iPhone 15 Pro/Pro Max, iPhone 16e, iPhone 16/16 Plus, and iPhone 16 Pro/Pro Max. If you have anything older, including the one-year-old iPhone 15, you can’t use Genmoji. I don’t expect Apple to make its models run locally on less capable hardware, as nice as that would be. However, they did announce Private Cloud Compute – a private server for handling Apple Intelligence requests in the cloud. Those servers were likely low capacity when they just begun rolling them out, but it’ll have been over a year since the rollout begun by the time iOS 19 releases to the public. While I don’t expect Apple to give out Private Cloud Compute usage for free, I think it’d be pretty neat if they bundled Genmoji in iCloud+ subscriptions for users with older devices – giving people a taste of what Apple Intelligence offers. More customizable focus modes One of my favorite features in iOS 18 has been the new Reduce Interruptions focus mode. In short, it analyzes every notification that comes through, and only presents what it thinks is important. The rest just stay in notification center. I agree with the more granular focus options. I'd like a focus option for when I connect to my home wifi. A separate focus for different workout options...I'd like to be "more" silent when running outside, and less when working out in the gym. Just a few ideas here, and they should be able to work within the present APP options. View all comments I’d really like to see Apple offer additional granularity here. For example, you could configure a focus mode that only triggers on key words that you set up. I could also see the inverse being useful, where you’d normally allow an app to notify you, but you’d like notifications with matching key words to be muted. That’s just scratching the surface, but I really think there could be a lot of opportunity for AI to enable more granular notification management. The new “Reduce Interruptions” focus is just the start. My favorite Apple accessory recommendations: Follow Michael: X/Twitter, Bluesky, Instagram Add 9to5Mac to your Google News feed. FTC: We use income earning auto affiliate links. More.You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day. Be sure to check out our homepage for all the latest news, and follow 9to5Mac on Twitter, Facebook, and LinkedIn to stay in the loop. Don’t know where to start? Check out our exclusive stories, reviews, how-tos, and subscribe to our YouTube channel #these #are #three #apple #intelligence

These are three Apple Intelligence features I’d like to see with iOS 26

9to5mac.com
Apple Intelligence has been off to a rocky start, especially when it comes to Siri. The assistant still has a lot to be desired, and that should definitely be at the forefront of Apple’s priorities. Regardless, Bloomberg’s Mark Gurman reports that Apple plans on expanding current Apple Intelligence capabilities to additional apps in iOS 26, and I figured I’d throw out some ideas I’d like to see. Summaries in more places I think providing summaries is probably one of the better use cases of on-device large language models. Apple introduced notification summaries in iOS 18, and while there were some major inaccuracies early on, things seem to be mostly fine. Apple recently enabled Apple Intelligence on compatible devices by default, rather than making it an opt-in feature. For one, I think it’d be neat if there were an API for developers to use summarization models in their apps. I’m sure Apple would put strict guardrails on it, but allowing third-parties to utilize Apple’s summarization models would be a big win. It’d empower indie developers to create AI features without having to worry about an OpenAI bill. On top of that, I’d really like to see some summarization improvements in the Messages app, particularly in group chats. If you missed out on a 100-message conversation, Apple should provide a more detailed summary than what can fit within two lines. Or, say you’re a student – imagine being able to summarize the notes you took in a class after the fact. You’d still need to read the notes to get a thorough understanding, but a note summary could serve as a great way to jog your memory if you’re quickly trying to recall something. Genmoji for everyone Genmoji is probably one of the most popular Apple Intelligence features unveiled at WWDC24. Unfortunately though, it’s only available on some of the most recent iPhone models: iPhone 15 Pro/Pro Max, iPhone 16e, iPhone 16/16 Plus, and iPhone 16 Pro/Pro Max. If you have anything older, including the one-year-old iPhone 15, you can’t use Genmoji. I don’t expect Apple to make its models run locally on less capable hardware, as nice as that would be. However, they did announce Private Cloud Compute – a private server for handling Apple Intelligence requests in the cloud. Those servers were likely low capacity when they just begun rolling them out, but it’ll have been over a year since the rollout begun by the time iOS 19 releases to the public. While I don’t expect Apple to give out Private Cloud Compute usage for free, I think it’d be pretty neat if they bundled Genmoji in iCloud+ subscriptions for users with older devices – giving people a taste of what Apple Intelligence offers. More customizable focus modes One of my favorite features in iOS 18 has been the new Reduce Interruptions focus mode. In short, it analyzes every notification that comes through, and only presents what it thinks is important. The rest just stay in notification center. I agree with the more granular focus options. I'd like a focus option for when I connect to my home wifi. A separate focus for different workout options...I'd like to be "more" silent when running outside, and less when working out in the gym. Just a few ideas here, and they should be able to work within the present APP options. View all comments I’d really like to see Apple offer additional granularity here. For example, you could configure a focus mode that only triggers on key words that you set up. I could also see the inverse being useful, where you’d normally allow an app to notify you, but you’d like notifications with matching key words to be muted. That’s just scratching the surface, but I really think there could be a lot of opportunity for AI to enable more granular notification management. The new “Reduce Interruptions” focus is just the start. My favorite Apple accessory recommendations: Follow Michael: X/Twitter, Bluesky, Instagram Add 9to5Mac to your Google News feed. FTC: We use income earning auto affiliate links. More.You’re reading 9to5Mac — experts who break news about Apple and its surrounding ecosystem, day after day. Be sure to check out our homepage for all the latest news, and follow 9to5Mac on Twitter, Facebook, and LinkedIn to stay in the loop. Don’t know where to start? Check out our exclusive stories, reviews, how-tos, and subscribe to our YouTube channel

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
VentureBeat @VentureBeat ha condiviso un link
2025-05-31 22:53:06 ·

The future of engineering belongs to those who build with AI, not without it

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More

When Salesforce CEO Marc Benioff recently announced that the company would not hire any more engineers in 2025, citing a “30% productivity increase on engineering” due to AI, it sent ripples through the tech industry. Headlines quickly framed this as the beginning of the end for human engineers — AI was coming for their jobs.
But those headlines miss the mark entirely. What’s really happening is a transformation of engineering itself. Gartner named agentic AI as its top tech trend for this year. The firm also predicts that 33% of enterprise software applications will include agentic AI by 2028 — a significant portion, but far from universal adoption. The extended timeline suggests a gradual evolution rather than a wholesale replacement. The real risk isn’t AI taking jobs; it’s engineers who fail to adapt and are left behind as the nature of engineering work evolves.
The reality across the tech industry reveals an explosion of demand for engineers with AI expertise. Professional services firms are aggressively recruiting engineers with generative AI experience, and technology companies are creating entirely new engineering positions focused on AI implementation. The market for professionals who can effectively leverage AI tools is extraordinarily competitive.
While claims of AI-driven productivity gains may be grounded in real progress, such announcements often reflect investor pressure for profitability as much as technological advancement. Many companies are adept at shaping narratives to position themselves as leaders in enterprise AI — a strategy that aligns well with broader market expectations.
How AI is transforming engineering work
The relationship between AI and engineering is evolving in four key ways, each representing a distinct capability that augments human engineering talent but certainly doesn’t replace it.
AI excels at summarization, helping engineers distill massive codebases, documentation and technical specifications into actionable insights. Rather than spending hours poring over documentation, engineers can get AI-generated summaries and focus on implementation.
Also, AI’s inferencing capabilities allow it to analyze patterns in code and systems and proactively suggest optimizations. This empowers engineers to identify potential bugs and make informed decisions more quickly and with greater confidence.
Third, AI has proven remarkably adept at converting code between languages. This capability is proving invaluable as organizations modernize their tech stacks and attempt to preserve institutional knowledge embedded in legacy systems.
Finally, the true power of gen AI lies in its expansion capabilities — creating novel content like code, documentation or even system architectures. Engineers are using AI to explore more possibilities than they could alone, and we’re seeing these capabilities transform engineering across industries.
In healthcare, AI helps create personalized medical instruction systems that adjust based on a patient’s specific conditions and medical history. In pharmaceutical manufacturing, AI-enhanced systems optimize production schedules to reduce waste and ensure an adequate supply of critical medications. Major banks have invested in gen AI for longer than most people realize, too; they are building systems that help manage complex compliance requirements while improving customer service.
The new engineering skills landscape
As AI reshapes engineering work, it’s creating entirely new in-demand specializations and skill sets, like the ability to effectively communicate with AI systems. Engineers who excel at working with AI can extract significantly better results.
Similar to how DevOps emerged as a discipline, large language model operationsfocuses on deploying, monitoring and optimizing LLMs in production environments. Practitioners of LLMOps track model drift, evaluate alternative models and help to ensure consistent quality of AI-generated outputs.
Creating standardized environments where AI tools can be safely and effectively deployed is becoming crucial. Platform engineering provides templates and guardrails that enable engineers to build AI-enhanced applications more efficiently. This standardization helps ensure consistency, security and maintainability across an organization’s AI implementations.
Human-AI collaboration ranges from AI merely providing recommendations that humans may ignore, to fully autonomous systems that operate independently. The most effective engineers understand when and how to apply the appropriate level of AI autonomy based on the context and consequences of the task at hand.
Keys to successful AI integration
Effective AI governance frameworks — which ranks No. 2 on Gartner’s top trends list — establish clear guidelines while leaving room for innovation. These frameworks address ethical considerations, regulatory compliance and risk management without stifling the creativity that makes AI valuable.
Rather than treating security as an afterthought, successful organizations build it into their AI systems from the beginning. This includes robust testing for vulnerabilities like hallucinations, prompt injection and data leakage. By incorporating security considerations into the development process, organizations can move quickly without compromising safety.
Engineers who can design agentic AI systems create significant value. We’re seeing systems where one AI model handles natural language understanding, another performs reasoning and a third generates appropriate responses, all working in concert to deliver better results than any single model could provide.
As we look ahead, the relationship between engineers and AI systems will likely evolve from tool and user to something more symbiotic. Today’s AI systems are powerful but limited; they lack true understanding and rely heavily on human guidance. Tomorrow’s systems may become true collaborators, proposing novel solutions beyond what engineers might have considered and identifying potential risks humans might overlook.
Yet the engineer’s essential role — understanding requirements, making ethical judgments and translating human needs into technological solutions — will remain irreplaceable. In this partnership between human creativity and AI, there lies the potential to solve problems we’ve never been able to tackle before — and that’s anything but a replacement.
Rizwan Patel is head of information security and emerging technology at Altimetrik.

Daily insights on business use cases with VB Daily
If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI.
Read our Privacy Policy

Thanks for subscribing. Check out more VB newsletters here.

An error occured.
#future #engineering #belongs #those #who

The future of engineering belongs to those who build with AI, not without it
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More When Salesforce CEO Marc Benioff recently announced that the company would not hire any more engineers in 2025, citing a “30% productivity increase on engineering” due to AI, it sent ripples through the tech industry. Headlines quickly framed this as the beginning of the end for human engineers — AI was coming for their jobs. But those headlines miss the mark entirely. What’s really happening is a transformation of engineering itself. Gartner named agentic AI as its top tech trend for this year. The firm also predicts that 33% of enterprise software applications will include agentic AI by 2028 — a significant portion, but far from universal adoption. The extended timeline suggests a gradual evolution rather than a wholesale replacement. The real risk isn’t AI taking jobs; it’s engineers who fail to adapt and are left behind as the nature of engineering work evolves. The reality across the tech industry reveals an explosion of demand for engineers with AI expertise. Professional services firms are aggressively recruiting engineers with generative AI experience, and technology companies are creating entirely new engineering positions focused on AI implementation. The market for professionals who can effectively leverage AI tools is extraordinarily competitive. While claims of AI-driven productivity gains may be grounded in real progress, such announcements often reflect investor pressure for profitability as much as technological advancement. Many companies are adept at shaping narratives to position themselves as leaders in enterprise AI — a strategy that aligns well with broader market expectations. How AI is transforming engineering work The relationship between AI and engineering is evolving in four key ways, each representing a distinct capability that augments human engineering talent but certainly doesn’t replace it. AI excels at summarization, helping engineers distill massive codebases, documentation and technical specifications into actionable insights. Rather than spending hours poring over documentation, engineers can get AI-generated summaries and focus on implementation. Also, AI’s inferencing capabilities allow it to analyze patterns in code and systems and proactively suggest optimizations. This empowers engineers to identify potential bugs and make informed decisions more quickly and with greater confidence. Third, AI has proven remarkably adept at converting code between languages. This capability is proving invaluable as organizations modernize their tech stacks and attempt to preserve institutional knowledge embedded in legacy systems. Finally, the true power of gen AI lies in its expansion capabilities — creating novel content like code, documentation or even system architectures. Engineers are using AI to explore more possibilities than they could alone, and we’re seeing these capabilities transform engineering across industries. In healthcare, AI helps create personalized medical instruction systems that adjust based on a patient’s specific conditions and medical history. In pharmaceutical manufacturing, AI-enhanced systems optimize production schedules to reduce waste and ensure an adequate supply of critical medications. Major banks have invested in gen AI for longer than most people realize, too; they are building systems that help manage complex compliance requirements while improving customer service. The new engineering skills landscape As AI reshapes engineering work, it’s creating entirely new in-demand specializations and skill sets, like the ability to effectively communicate with AI systems. Engineers who excel at working with AI can extract significantly better results. Similar to how DevOps emerged as a discipline, large language model operationsfocuses on deploying, monitoring and optimizing LLMs in production environments. Practitioners of LLMOps track model drift, evaluate alternative models and help to ensure consistent quality of AI-generated outputs. Creating standardized environments where AI tools can be safely and effectively deployed is becoming crucial. Platform engineering provides templates and guardrails that enable engineers to build AI-enhanced applications more efficiently. This standardization helps ensure consistency, security and maintainability across an organization’s AI implementations. Human-AI collaboration ranges from AI merely providing recommendations that humans may ignore, to fully autonomous systems that operate independently. The most effective engineers understand when and how to apply the appropriate level of AI autonomy based on the context and consequences of the task at hand. Keys to successful AI integration Effective AI governance frameworks — which ranks No. 2 on Gartner’s top trends list — establish clear guidelines while leaving room for innovation. These frameworks address ethical considerations, regulatory compliance and risk management without stifling the creativity that makes AI valuable. Rather than treating security as an afterthought, successful organizations build it into their AI systems from the beginning. This includes robust testing for vulnerabilities like hallucinations, prompt injection and data leakage. By incorporating security considerations into the development process, organizations can move quickly without compromising safety. Engineers who can design agentic AI systems create significant value. We’re seeing systems where one AI model handles natural language understanding, another performs reasoning and a third generates appropriate responses, all working in concert to deliver better results than any single model could provide. As we look ahead, the relationship between engineers and AI systems will likely evolve from tool and user to something more symbiotic. Today’s AI systems are powerful but limited; they lack true understanding and rely heavily on human guidance. Tomorrow’s systems may become true collaborators, proposing novel solutions beyond what engineers might have considered and identifying potential risks humans might overlook. Yet the engineer’s essential role — understanding requirements, making ethical judgments and translating human needs into technological solutions — will remain irreplaceable. In this partnership between human creativity and AI, there lies the potential to solve problems we’ve never been able to tackle before — and that’s anything but a replacement. Rizwan Patel is head of information security and emerging technology at Altimetrik. Daily insights on business use cases with VB Daily If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI. Read our Privacy Policy Thanks for subscribing. Check out more VB newsletters here. An error occured. #future #engineering #belongs #those #who

The future of engineering belongs to those who build with AI, not without it

venturebeat.com
Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More When Salesforce CEO Marc Benioff recently announced that the company would not hire any more engineers in 2025, citing a “30% productivity increase on engineering” due to AI, it sent ripples through the tech industry. Headlines quickly framed this as the beginning of the end for human engineers — AI was coming for their jobs. But those headlines miss the mark entirely. What’s really happening is a transformation of engineering itself. Gartner named agentic AI as its top tech trend for this year. The firm also predicts that 33% of enterprise software applications will include agentic AI by 2028 — a significant portion, but far from universal adoption. The extended timeline suggests a gradual evolution rather than a wholesale replacement. The real risk isn’t AI taking jobs; it’s engineers who fail to adapt and are left behind as the nature of engineering work evolves. The reality across the tech industry reveals an explosion of demand for engineers with AI expertise. Professional services firms are aggressively recruiting engineers with generative AI experience, and technology companies are creating entirely new engineering positions focused on AI implementation. The market for professionals who can effectively leverage AI tools is extraordinarily competitive. While claims of AI-driven productivity gains may be grounded in real progress, such announcements often reflect investor pressure for profitability as much as technological advancement. Many companies are adept at shaping narratives to position themselves as leaders in enterprise AI — a strategy that aligns well with broader market expectations. How AI is transforming engineering work The relationship between AI and engineering is evolving in four key ways, each representing a distinct capability that augments human engineering talent but certainly doesn’t replace it. AI excels at summarization, helping engineers distill massive codebases, documentation and technical specifications into actionable insights. Rather than spending hours poring over documentation, engineers can get AI-generated summaries and focus on implementation. Also, AI’s inferencing capabilities allow it to analyze patterns in code and systems and proactively suggest optimizations. This empowers engineers to identify potential bugs and make informed decisions more quickly and with greater confidence. Third, AI has proven remarkably adept at converting code between languages. This capability is proving invaluable as organizations modernize their tech stacks and attempt to preserve institutional knowledge embedded in legacy systems. Finally, the true power of gen AI lies in its expansion capabilities — creating novel content like code, documentation or even system architectures. Engineers are using AI to explore more possibilities than they could alone, and we’re seeing these capabilities transform engineering across industries. In healthcare, AI helps create personalized medical instruction systems that adjust based on a patient’s specific conditions and medical history. In pharmaceutical manufacturing, AI-enhanced systems optimize production schedules to reduce waste and ensure an adequate supply of critical medications. Major banks have invested in gen AI for longer than most people realize, too; they are building systems that help manage complex compliance requirements while improving customer service. The new engineering skills landscape As AI reshapes engineering work, it’s creating entirely new in-demand specializations and skill sets, like the ability to effectively communicate with AI systems. Engineers who excel at working with AI can extract significantly better results. Similar to how DevOps emerged as a discipline, large language model operations (LLMOps) focuses on deploying, monitoring and optimizing LLMs in production environments. Practitioners of LLMOps track model drift, evaluate alternative models and help to ensure consistent quality of AI-generated outputs. Creating standardized environments where AI tools can be safely and effectively deployed is becoming crucial. Platform engineering provides templates and guardrails that enable engineers to build AI-enhanced applications more efficiently. This standardization helps ensure consistency, security and maintainability across an organization’s AI implementations. Human-AI collaboration ranges from AI merely providing recommendations that humans may ignore, to fully autonomous systems that operate independently. The most effective engineers understand when and how to apply the appropriate level of AI autonomy based on the context and consequences of the task at hand. Keys to successful AI integration Effective AI governance frameworks — which ranks No. 2 on Gartner’s top trends list — establish clear guidelines while leaving room for innovation. These frameworks address ethical considerations, regulatory compliance and risk management without stifling the creativity that makes AI valuable. Rather than treating security as an afterthought, successful organizations build it into their AI systems from the beginning. This includes robust testing for vulnerabilities like hallucinations, prompt injection and data leakage. By incorporating security considerations into the development process, organizations can move quickly without compromising safety. Engineers who can design agentic AI systems create significant value. We’re seeing systems where one AI model handles natural language understanding, another performs reasoning and a third generates appropriate responses, all working in concert to deliver better results than any single model could provide. As we look ahead, the relationship between engineers and AI systems will likely evolve from tool and user to something more symbiotic. Today’s AI systems are powerful but limited; they lack true understanding and rely heavily on human guidance. Tomorrow’s systems may become true collaborators, proposing novel solutions beyond what engineers might have considered and identifying potential risks humans might overlook. Yet the engineer’s essential role — understanding requirements, making ethical judgments and translating human needs into technological solutions — will remain irreplaceable. In this partnership between human creativity and AI, there lies the potential to solve problems we’ve never been able to tackle before — and that’s anything but a replacement. Rizwan Patel is head of information security and emerging technology at Altimetrik. Daily insights on business use cases with VB Daily If you want to impress your boss, VB Daily has you covered. We give you the inside scoop on what companies are doing with generative AI, from regulatory shifts to practical deployments, so you can share insights for maximum ROI. Read our Privacy Policy Thanks for subscribing. Check out more VB newsletters here. An error occured.

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
Apple @Apple ha condiviso un link
2025-05-31 09:59:57 ·

What to expect from iOS 26 at WWDC: New games app, redesign, and more

Apple's iOS 26 is expected to have a radical redesign, and it's probably going to mark the beginning of an entirely new numbering system. Here's everything that's rumored to be in the update.Apple may introduce iOS 26 rather than iOS 19 at WWDC.While the iOS 18 update focused largely on Apple Intelligence features, such as Image Playground, email summarization, and a new Clean Up tool in the Photos app, its successor is expected to take a noticeably different approach.Rather than simply being an iterative upgrade, with a few new apps and capabilities added to the mix, the next generation of iOS might introduce a major visual overhaul. An entirely new version numbering system appears to be in the works as well. Continue Reading on AppleInsider | Discuss on our Forums
#what #expect #ios #wwdc #new

What to expect from iOS 26 at WWDC: New games app, redesign, and more
Apple's iOS 26 is expected to have a radical redesign, and it's probably going to mark the beginning of an entirely new numbering system. Here's everything that's rumored to be in the update.Apple may introduce iOS 26 rather than iOS 19 at WWDC.While the iOS 18 update focused largely on Apple Intelligence features, such as Image Playground, email summarization, and a new Clean Up tool in the Photos app, its successor is expected to take a noticeably different approach.Rather than simply being an iterative upgrade, with a few new apps and capabilities added to the mix, the next generation of iOS might introduce a major visual overhaul. An entirely new version numbering system appears to be in the works as well. Continue Reading on AppleInsider | Discuss on our Forums #what #expect #ios #wwdc #new

What to expect from iOS 26 at WWDC: New games app, redesign, and more

appleinsider.com
Apple's iOS 26 is expected to have a radical redesign, and it's probably going to mark the beginning of an entirely new numbering system. Here's everything that's rumored to be in the update.Apple may introduce iOS 26 rather than iOS 19 at WWDC.While the iOS 18 update focused largely on Apple Intelligence features, such as Image Playground, email summarization, and a new Clean Up tool in the Photos app, its successor is expected to take a noticeably different approach.Rather than simply being an iterative upgrade, with a few new apps and capabilities added to the mix, the next generation of iOS might introduce a major visual overhaul. An entirely new version numbering system appears to be in the works as well. Continue Reading on AppleInsider | Discuss on our Forums

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
Slashdot @Slashdot ha condiviso un link
2025-05-30 15:20:37 ·

Gmail's AI Summaries Now Appear Automatically

Google has begun automatically generating AI-powered email summaries for Gmail Workspace users, eliminating the need to manually trigger the feature that has been available since last year. The company's Gemini AI will now independently determine when longer email threads or messages with multiple replies would benefit from summarization, displaying these summaries above the email content itself. The automatic summaries currently appear only on mobile devices for English-language emails and may take up to two weeks to roll out to individual accounts, with Google providing no timeline for desktop expansion or availability to non-Workspace Gmail users.

of this story at Slashdot.
#gmail039s #summaries #now #appear #automatically

Gmail's AI Summaries Now Appear Automatically
Google has begun automatically generating AI-powered email summaries for Gmail Workspace users, eliminating the need to manually trigger the feature that has been available since last year. The company's Gemini AI will now independently determine when longer email threads or messages with multiple replies would benefit from summarization, displaying these summaries above the email content itself. The automatic summaries currently appear only on mobile devices for English-language emails and may take up to two weeks to roll out to individual accounts, with Google providing no timeline for desktop expansion or availability to non-Workspace Gmail users. of this story at Slashdot. #gmail039s #summaries #now #appear #automatically

Gmail's AI Summaries Now Appear Automatically

slashdot.org
Google has begun automatically generating AI-powered email summaries for Gmail Workspace users, eliminating the need to manually trigger the feature that has been available since last year. The company's Gemini AI will now independently determine when longer email threads or messages with multiple replies would benefit from summarization, displaying these summaries above the email content itself. The automatic summaries currently appear only on mobile devices for English-language emails and may take up to two weeks to roll out to individual accounts, with Google providing no timeline for desktop expansion or availability to non-Workspace Gmail users. Read more of this story at Slashdot.

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
NVIDIA @NVIDIA ha condiviso un link
2025-05-30 13:45:40 ·

Run LLMs on AnythingLLM Faster With NVIDIA RTX AI PCs

Large language models, trained on datasets with billions of tokens, can generate high-quality content. They’re the backbone for many of the most popular AI applications, including chatbots, assistants, code generators and much more.
One of today’s most accessible ways to work with LLMs is with AnythingLLM, a desktop app built for enthusiasts who want an all-in-one, privacy-focused AI assistant directly on their PC.
With new support for NVIDIA NIM microservices on NVIDIA GeForce RTX and NVIDIA RTX PRO GPUs, AnythingLLM users can now get even faster performance for more responsive local AI workflows.
What Is AnythingLLM?
AnythingLLM is an all-in-one AI application that lets users run local LLMs, retrieval-augmented generationsystems and agentic tools.
It acts as a bridge between a user’s preferred LLMs and their data, and enables access to tools, making it easier and more efficient to use LLMs for specific tasks like:

Question answering: Getting answers to questions from top LLMs — like Llama and DeepSeek R1 — without incurring costs.
Personal data queries: Use RAG to query content privately, including PDFs, Word files, codebases and more.
Document summarization: Generating summaries of lengthy documents, like research papers.
Data analysis: Extracting data insights by loading files and querying it with LLMs.
Agentic actions: Dynamically researching content using local or remote resources, running generative tools and actions based on user prompts.

AnythingLLM can connect to a wide variety of open-source local LLMs, as well as larger LLMs in the cloud, including those provided by OpenAI, Microsoft and Anthropic. In addition, the application provides access to skills for extending its agentic AI capabilities via its community hub.
With a one-click install and the ability to launch as a standalone app or browser extension — wrapped in an intuitive experience with no complicated setup required — AnythingLLM is a great option for AI enthusiasts, especially those with GeForce RTX and NVIDIA RTX PRO GPU-equipped systems.
RTX Powers AnythingLLM Acceleration
GeForce RTX and NVIDIA RTX PRO GPUs offer significant performance gains for running LLMs and agents in AnythingLLM — speeding up inference with Tensor Cores designed to accelerate AI.

AnythingLLM runs LLMs with Ollama for on-device execution accelerated through Llama.cpp and ggml tensor libraries for machine learning.
Ollama, Llama.cpp and GGML are optimized for NVIDIA RTX GPUs and the fifth-generation Tensor Cores. Performance on GeForce RTX 5090 is 2.4X compared to an Apple M3 Ultra.
GeForce RTX 5090 delivers 2.4x faster LLM inference in AnythingLLM than Apple M3 Ultra on both Llama 3.1 8B and DeepSeek R1 8B.
As NVIDIA adds new NIM microservices and reference workflows — like its growing library of AI Blueprints — tools like AnythingLLM will unlock even more multimodal AI use cases.
AnythingLLM — Now With NVIDIA NIM
AnythingLLM recently added support for NVIDIA NIM microservices — performance-optimized, prepackaged generative AI models that make it easy to get started with AI workflows on RTX AI PCs with a streamlined API.

NVIDIA NIMs are great for developers looking for a quick way to test a Generative AI model in a workflow. Instead of having to find the right model, download all the files and figure out how to connect everything, they provide a single container that has everything you need. And they can run both on Cloud and PC, making it easy to prototype locally and then deploy on the cloud.
By offering them within AnythingLLM’s user-friendly UI, users have a quick way to test them and experiment with them. And then they can either connect them to their workflows with AnythingLLM, or leverage NVIDIA AI Blueprints and NIM documentation and sample code to plug them directly to their apps or projects.
Explore the wide variety of NIM microservices available to elevate AI-powered workflows, including language and image generation, computer vision and speech processing.
Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations.
Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter.
Follow NVIDIA Workstation on LinkedIn and X. See notice regarding software product information.
#run #llms #anythingllm #faster #with

Run LLMs on AnythingLLM Faster With NVIDIA RTX AI PCs
Large language models, trained on datasets with billions of tokens, can generate high-quality content. They’re the backbone for many of the most popular AI applications, including chatbots, assistants, code generators and much more. One of today’s most accessible ways to work with LLMs is with AnythingLLM, a desktop app built for enthusiasts who want an all-in-one, privacy-focused AI assistant directly on their PC. With new support for NVIDIA NIM microservices on NVIDIA GeForce RTX and NVIDIA RTX PRO GPUs, AnythingLLM users can now get even faster performance for more responsive local AI workflows. What Is AnythingLLM? AnythingLLM is an all-in-one AI application that lets users run local LLMs, retrieval-augmented generationsystems and agentic tools. It acts as a bridge between a user’s preferred LLMs and their data, and enables access to tools, making it easier and more efficient to use LLMs for specific tasks like: Question answering: Getting answers to questions from top LLMs — like Llama and DeepSeek R1 — without incurring costs. Personal data queries: Use RAG to query content privately, including PDFs, Word files, codebases and more. Document summarization: Generating summaries of lengthy documents, like research papers. Data analysis: Extracting data insights by loading files and querying it with LLMs. Agentic actions: Dynamically researching content using local or remote resources, running generative tools and actions based on user prompts. AnythingLLM can connect to a wide variety of open-source local LLMs, as well as larger LLMs in the cloud, including those provided by OpenAI, Microsoft and Anthropic. In addition, the application provides access to skills for extending its agentic AI capabilities via its community hub. With a one-click install and the ability to launch as a standalone app or browser extension — wrapped in an intuitive experience with no complicated setup required — AnythingLLM is a great option for AI enthusiasts, especially those with GeForce RTX and NVIDIA RTX PRO GPU-equipped systems. RTX Powers AnythingLLM Acceleration GeForce RTX and NVIDIA RTX PRO GPUs offer significant performance gains for running LLMs and agents in AnythingLLM — speeding up inference with Tensor Cores designed to accelerate AI. AnythingLLM runs LLMs with Ollama for on-device execution accelerated through Llama.cpp and ggml tensor libraries for machine learning. Ollama, Llama.cpp and GGML are optimized for NVIDIA RTX GPUs and the fifth-generation Tensor Cores. Performance on GeForce RTX 5090 is 2.4X compared to an Apple M3 Ultra. GeForce RTX 5090 delivers 2.4x faster LLM inference in AnythingLLM than Apple M3 Ultra on both Llama 3.1 8B and DeepSeek R1 8B. As NVIDIA adds new NIM microservices and reference workflows — like its growing library of AI Blueprints — tools like AnythingLLM will unlock even more multimodal AI use cases. AnythingLLM — Now With NVIDIA NIM AnythingLLM recently added support for NVIDIA NIM microservices — performance-optimized, prepackaged generative AI models that make it easy to get started with AI workflows on RTX AI PCs with a streamlined API. NVIDIA NIMs are great for developers looking for a quick way to test a Generative AI model in a workflow. Instead of having to find the right model, download all the files and figure out how to connect everything, they provide a single container that has everything you need. And they can run both on Cloud and PC, making it easy to prototype locally and then deploy on the cloud. By offering them within AnythingLLM’s user-friendly UI, users have a quick way to test them and experiment with them. And then they can either connect them to their workflows with AnythingLLM, or leverage NVIDIA AI Blueprints and NIM documentation and sample code to plug them directly to their apps or projects. Explore the wide variety of NIM microservices available to elevate AI-powered workflows, including language and image generation, computer vision and speech processing. Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations. Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter. Follow NVIDIA Workstation on LinkedIn and X. See notice regarding software product information. #run #llms #anythingllm #faster #with

Run LLMs on AnythingLLM Faster With NVIDIA RTX AI PCs

blogs.nvidia.com
Large language models (LLMs), trained on datasets with billions of tokens, can generate high-quality content. They’re the backbone for many of the most popular AI applications, including chatbots, assistants, code generators and much more. One of today’s most accessible ways to work with LLMs is with AnythingLLM, a desktop app built for enthusiasts who want an all-in-one, privacy-focused AI assistant directly on their PC. With new support for NVIDIA NIM microservices on NVIDIA GeForce RTX and NVIDIA RTX PRO GPUs, AnythingLLM users can now get even faster performance for more responsive local AI workflows. What Is AnythingLLM? AnythingLLM is an all-in-one AI application that lets users run local LLMs, retrieval-augmented generation (RAG) systems and agentic tools. It acts as a bridge between a user’s preferred LLMs and their data, and enables access to tools (called skills), making it easier and more efficient to use LLMs for specific tasks like: Question answering: Getting answers to questions from top LLMs — like Llama and DeepSeek R1 — without incurring costs. Personal data queries: Use RAG to query content privately, including PDFs, Word files, codebases and more. Document summarization: Generating summaries of lengthy documents, like research papers. Data analysis: Extracting data insights by loading files and querying it with LLMs. Agentic actions: Dynamically researching content using local or remote resources, running generative tools and actions based on user prompts. AnythingLLM can connect to a wide variety of open-source local LLMs, as well as larger LLMs in the cloud, including those provided by OpenAI, Microsoft and Anthropic. In addition, the application provides access to skills for extending its agentic AI capabilities via its community hub. With a one-click install and the ability to launch as a standalone app or browser extension — wrapped in an intuitive experience with no complicated setup required — AnythingLLM is a great option for AI enthusiasts, especially those with GeForce RTX and NVIDIA RTX PRO GPU-equipped systems. RTX Powers AnythingLLM Acceleration GeForce RTX and NVIDIA RTX PRO GPUs offer significant performance gains for running LLMs and agents in AnythingLLM — speeding up inference with Tensor Cores designed to accelerate AI. AnythingLLM runs LLMs with Ollama for on-device execution accelerated through Llama.cpp and ggml tensor libraries for machine learning. Ollama, Llama.cpp and GGML are optimized for NVIDIA RTX GPUs and the fifth-generation Tensor Cores. Performance on GeForce RTX 5090 is 2.4X compared to an Apple M3 Ultra. GeForce RTX 5090 delivers 2.4x faster LLM inference in AnythingLLM than Apple M3 Ultra on both Llama 3.1 8B and DeepSeek R1 8B. As NVIDIA adds new NIM microservices and reference workflows — like its growing library of AI Blueprints — tools like AnythingLLM will unlock even more multimodal AI use cases. AnythingLLM — Now With NVIDIA NIM AnythingLLM recently added support for NVIDIA NIM microservices — performance-optimized, prepackaged generative AI models that make it easy to get started with AI workflows on RTX AI PCs with a streamlined API. NVIDIA NIMs are great for developers looking for a quick way to test a Generative AI model in a workflow. Instead of having to find the right model, download all the files and figure out how to connect everything, they provide a single container that has everything you need. And they can run both on Cloud and PC, making it easy to prototype locally and then deploy on the cloud. By offering them within AnythingLLM’s user-friendly UI, users have a quick way to test them and experiment with them. And then they can either connect them to their workflows with AnythingLLM, or leverage NVIDIA AI Blueprints and NIM documentation and sample code to plug them directly to their apps or projects. Explore the wide variety of NIM microservices available to elevate AI-powered workflows, including language and image generation, computer vision and speech processing. Each week, the RTX AI Garage blog series features community-driven AI innovations and content for those looking to learn more about NIM microservices and AI Blueprints, as well as building AI agents, creative workflows, digital humans, productivity apps and more on AI PCs and workstations. Plug in to NVIDIA AI PC on Facebook, Instagram, TikTok and X — and stay informed by subscribing to the RTX AI PC newsletter. Follow NVIDIA Workstation on LinkedIn and X. See notice regarding software product information.

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!
Reddit @Reddit ha condiviso un link
2025-05-26 12:46:02 ·

Google Is Burying the Web Alive

screen time

Google Is Burying the Web Alive

5:00 A.M.

saved

this article to read it later.

Find this story in your account’s ‘Saved for Later’ section.

Photo-Illustration: Intelligencer

By now, there’s a good chance you’ve encountered Google’s AI Overviews, possibly thousands of times. Appearing as blurbs at the top of search results, they attempt to settle your queries before you scroll — to offer answers, or relevant information, gleaned from websites that you no longer need to click on. The feature was officially rolled out at Google’s developer conference last year and had been in testing for quite some time before that; on the occasion of this year’s conference, the company characterized it as “one of the most successful launches in Search in the past decade,” a strangely narrow claim that is almost certainly true: Google put AI summaries on top of everything else, for everyone, as if to say, “Before you use our main product, see if this works instead.”
This year’s conference included another change to search, this one more profound but less aggressively deployed. “AI Mode,” which has similarly been in beta testing for a while, will appear as an option for all users. It’s not like AI Overviews; that is, it’s not an extra module taking up space on a familiar search-results page but rather a complete replacement for conventional search. It’s Google’s “most powerful AI search, with more advanced reasoning and multimodality, and the ability to go deeper through follow-up questions and helpful links to the web,” the company says, “breaking down your question into subtopics and issuing a multitude of queries simultaneously on your behalf.” It’s available to everyone. It’s a lot like using AI-first chatbots that have search functions, like those from OpenAI, Anthropic, and Perplexity, and Google says it’s destined for greater things than a small tab. “As we get feedback, we’ll graduate many features and capabilities from AI Mode right into the core Search experience,” the company says.
I’ve been testing AI Mode for a few months now, and in some ways it’s less radical than it sounds andfeels. It resembles the initial demos of AI search tools, including those by Google, meaning it responds to many questions with clean, ad-free answers. Sometimes it answers in extended plain language, but it also makes a lot of lists and pulls in familiar little gridded modules — especially when you ask about things you can buy — resulting in a product that, despite its chatty interface, feels an awful lot like … search.
Again, now you can try it yourself, and your mileage may vary; it hasn’t drawn me away from Google proper for a lot of thoughtless rote tasks, but it’s competitive with ChatGPT for the expanding range of searchish tasks you might attempt with a chatbot.
From the very first use, however, AI Mode crystallized something about Google’s priorities and in particular its relationship to the web from which the company has drawn, and returned, many hundreds of billions of dollars of value. AI Overviews demoted links, quite literally pushing content from the web down on the page, and summarizing its contents for digestion without clicking:

Photo-Illustration: Intelligencer; Screenshot: Google

Meanwhile, AI Mode all but buries them, not just summarizing their content for reading within Google’s product but inviting you to explore and expand on those summaries by asking more questions, rather than clicking out. In many cases, links are retained merely to provide backup and sourcing, included as footnotes and appendices rather than destinations:

Photo-Illustration: Intelligencer; Screenshot: Google

This is typical with AI search tools and all but inevitable now that such things are possible. In terms of automation, this means companies like OpenAI and Google are mechanizing some of the “work” that goes into using tools like Google search, removing, when possible, the step where users leave their platforms and reducing, in theory, the time and effort it takes to navigate to somewhere else when necessary. In even broader terms — contra Google’s effort to brand this as “going beyond information to intelligence” — this is an example of how LLMs offer different ways to interact with much of the same information: summarization rather than retrieval, regeneration rather than fact-finding, and vibe-y reconstruction over deterministic reproduction.
This is interesting to think about and often compelling to use but leaves unresolved one of the first questions posed by chatbots-as-search: Where will they get all the data they need to continue to work well? When Microsoft and Google showed off their first neo-search mockups in 2023, which are pretty close to today’s AI mode, it revealed a dilemma:
Search engines still provide the de facto gateway to the broader web, and have a deeply codependent relationship with the people and companies whose content they crawl, index, and rank; a Google that instantly but sometimes unreliably summarizes the websites to which it used to send people would destroy that relationship, and probably a lot of websites, including the ones on which its models were trained.
And, well, yep! Now, both AI Overviews and AI Mode, when they aren’t occasionally hallucinating, produce relatively clean answers that benefit in contrast to increasingly degraded regular search results on Google, which are full of hyperoptimized and duplicative spamlike content designed first and foremost with the demands of Google’s ranking algorithms and advertising in mind. AI Mode feels one step further removed from that ecosystem and once again looks good in contrast, a placid textual escape from Google’s own mountain of links that look like ads and ads that look like links. In its drive to embrace AI, Google is further concealing the raw material that fuels it, demoting links as it continues to ingest them for abstraction. Google may still retain plenty of attention to monetize and perhaps keep even more of it for itself, now that it doesn’t need to send people elsewhere; in the process, however, it really is starving the web that supplies it with data on which to train and
Two years later, Google has become more explicit about the extent to which it’s moving on from the “you provide us results to rank, and we send you visitors to monetize” bargain, with the head of search telling The Verge, “I think the search results page was a construct.” Which is true, as far as it goes, but also a remarkable thing to hear from a company that’s communicated carefully and voluminously to website operators about small updates to its search algorithms for years.
I don’t doubt that Google has been thinking about this stuff for a while and that there are people at the company who deem it strategically irrelevant or at least of secondary importance to winning the AI race — the fate of the web might not sound terribly important when your bosses are talking nonstop about cashing out its accumulated data and expertise for AGI. I also don’t want to be precious about the web as it actually exists in 2025, nor do I suggest that websites working with or near companies like Meta and Google should have expected anything but temporary, incidental alignment with their businesses. If I had to guess, the future of Google search looks more like AI Overviews than AI mode — a jumble of widgets and modules including and united by AI-generated content, rather than a clean break — if only for purposes of sustaining Google’s multi-hundred-billion-dollar advertising business.
But I also don’t want to assume Google knows exactly how this stuff will play out for Google, much less what it will actually mean for millions of websites, and their visitors, if Google stops sending as many people beyond its results pages. Google’s push into productizing generative AI is substantially fear-driven, faith-based, and informed by the actions of competitors that are far less invested in and dependent on the vast collection of behaviors — websites full of content authentic and inauthentic, volunteer and commercial, social and antisocial, archival and up-to-date — that make up what’s left of the web and have far less to lose. Maybe, in a few years, a fresh economy will grow around the new behaviors produced by searchlike AI tools; perhaps companies like OpenAI and Google will sign a bunch more licensing deals; conceivably, this style of search automation simply collapses the marketplace supported by search, leveraging training based on years of scraped data to do more with less. In any case, the signals from Google — despite its unconvincing suggestions to the contrary — are clear: It’ll do anything to win the AI race. If that means burying the web, then so be it.

Sign Up for the Intelligencer Newsletter
Daily news about the politics, business, and technology shaping our world.

This site is protected by reCAPTCHA and the Google
Privacy Policy and
Terms of Service apply.

By submitting your email, you agree to our Terms and Privacy Notice and to receive email correspondence from us.

Tags:

Google Is Burying the Web Alive
#google #burying #web #alive

Google Is Burying the Web Alive
screen time Google Is Burying the Web Alive 5:00 A.M. saved this article to read it later. Find this story in your account’s ‘Saved for Later’ section. Photo-Illustration: Intelligencer By now, there’s a good chance you’ve encountered Google’s AI Overviews, possibly thousands of times. Appearing as blurbs at the top of search results, they attempt to settle your queries before you scroll — to offer answers, or relevant information, gleaned from websites that you no longer need to click on. The feature was officially rolled out at Google’s developer conference last year and had been in testing for quite some time before that; on the occasion of this year’s conference, the company characterized it as “one of the most successful launches in Search in the past decade,” a strangely narrow claim that is almost certainly true: Google put AI summaries on top of everything else, for everyone, as if to say, “Before you use our main product, see if this works instead.” This year’s conference included another change to search, this one more profound but less aggressively deployed. “AI Mode,” which has similarly been in beta testing for a while, will appear as an option for all users. It’s not like AI Overviews; that is, it’s not an extra module taking up space on a familiar search-results page but rather a complete replacement for conventional search. It’s Google’s “most powerful AI search, with more advanced reasoning and multimodality, and the ability to go deeper through follow-up questions and helpful links to the web,” the company says, “breaking down your question into subtopics and issuing a multitude of queries simultaneously on your behalf.” It’s available to everyone. It’s a lot like using AI-first chatbots that have search functions, like those from OpenAI, Anthropic, and Perplexity, and Google says it’s destined for greater things than a small tab. “As we get feedback, we’ll graduate many features and capabilities from AI Mode right into the core Search experience,” the company says. I’ve been testing AI Mode for a few months now, and in some ways it’s less radical than it sounds andfeels. It resembles the initial demos of AI search tools, including those by Google, meaning it responds to many questions with clean, ad-free answers. Sometimes it answers in extended plain language, but it also makes a lot of lists and pulls in familiar little gridded modules — especially when you ask about things you can buy — resulting in a product that, despite its chatty interface, feels an awful lot like … search. Again, now you can try it yourself, and your mileage may vary; it hasn’t drawn me away from Google proper for a lot of thoughtless rote tasks, but it’s competitive with ChatGPT for the expanding range of searchish tasks you might attempt with a chatbot. From the very first use, however, AI Mode crystallized something about Google’s priorities and in particular its relationship to the web from which the company has drawn, and returned, many hundreds of billions of dollars of value. AI Overviews demoted links, quite literally pushing content from the web down on the page, and summarizing its contents for digestion without clicking: Photo-Illustration: Intelligencer; Screenshot: Google Meanwhile, AI Mode all but buries them, not just summarizing their content for reading within Google’s product but inviting you to explore and expand on those summaries by asking more questions, rather than clicking out. In many cases, links are retained merely to provide backup and sourcing, included as footnotes and appendices rather than destinations: Photo-Illustration: Intelligencer; Screenshot: Google This is typical with AI search tools and all but inevitable now that such things are possible. In terms of automation, this means companies like OpenAI and Google are mechanizing some of the “work” that goes into using tools like Google search, removing, when possible, the step where users leave their platforms and reducing, in theory, the time and effort it takes to navigate to somewhere else when necessary. In even broader terms — contra Google’s effort to brand this as “going beyond information to intelligence” — this is an example of how LLMs offer different ways to interact with much of the same information: summarization rather than retrieval, regeneration rather than fact-finding, and vibe-y reconstruction over deterministic reproduction. This is interesting to think about and often compelling to use but leaves unresolved one of the first questions posed by chatbots-as-search: Where will they get all the data they need to continue to work well? When Microsoft and Google showed off their first neo-search mockups in 2023, which are pretty close to today’s AI mode, it revealed a dilemma: Search engines still provide the de facto gateway to the broader web, and have a deeply codependent relationship with the people and companies whose content they crawl, index, and rank; a Google that instantly but sometimes unreliably summarizes the websites to which it used to send people would destroy that relationship, and probably a lot of websites, including the ones on which its models were trained. And, well, yep! Now, both AI Overviews and AI Mode, when they aren’t occasionally hallucinating, produce relatively clean answers that benefit in contrast to increasingly degraded regular search results on Google, which are full of hyperoptimized and duplicative spamlike content designed first and foremost with the demands of Google’s ranking algorithms and advertising in mind. AI Mode feels one step further removed from that ecosystem and once again looks good in contrast, a placid textual escape from Google’s own mountain of links that look like ads and ads that look like links. In its drive to embrace AI, Google is further concealing the raw material that fuels it, demoting links as it continues to ingest them for abstraction. Google may still retain plenty of attention to monetize and perhaps keep even more of it for itself, now that it doesn’t need to send people elsewhere; in the process, however, it really is starving the web that supplies it with data on which to train and Two years later, Google has become more explicit about the extent to which it’s moving on from the “you provide us results to rank, and we send you visitors to monetize” bargain, with the head of search telling The Verge, “I think the search results page was a construct.” Which is true, as far as it goes, but also a remarkable thing to hear from a company that’s communicated carefully and voluminously to website operators about small updates to its search algorithms for years. I don’t doubt that Google has been thinking about this stuff for a while and that there are people at the company who deem it strategically irrelevant or at least of secondary importance to winning the AI race — the fate of the web might not sound terribly important when your bosses are talking nonstop about cashing out its accumulated data and expertise for AGI. I also don’t want to be precious about the web as it actually exists in 2025, nor do I suggest that websites working with or near companies like Meta and Google should have expected anything but temporary, incidental alignment with their businesses. If I had to guess, the future of Google search looks more like AI Overviews than AI mode — a jumble of widgets and modules including and united by AI-generated content, rather than a clean break — if only for purposes of sustaining Google’s multi-hundred-billion-dollar advertising business. But I also don’t want to assume Google knows exactly how this stuff will play out for Google, much less what it will actually mean for millions of websites, and their visitors, if Google stops sending as many people beyond its results pages. Google’s push into productizing generative AI is substantially fear-driven, faith-based, and informed by the actions of competitors that are far less invested in and dependent on the vast collection of behaviors — websites full of content authentic and inauthentic, volunteer and commercial, social and antisocial, archival and up-to-date — that make up what’s left of the web and have far less to lose. Maybe, in a few years, a fresh economy will grow around the new behaviors produced by searchlike AI tools; perhaps companies like OpenAI and Google will sign a bunch more licensing deals; conceivably, this style of search automation simply collapses the marketplace supported by search, leveraging training based on years of scraped data to do more with less. In any case, the signals from Google — despite its unconvincing suggestions to the contrary — are clear: It’ll do anything to win the AI race. If that means burying the web, then so be it. Sign Up for the Intelligencer Newsletter Daily news about the politics, business, and technology shaping our world. This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply. By submitting your email, you agree to our Terms and Privacy Notice and to receive email correspondence from us. Tags: Google Is Burying the Web Alive #google #burying #web #alive

Google Is Burying the Web Alive

nymag.com
screen time Google Is Burying the Web Alive 5:00 A.M. saved Save this article to read it later. Find this story in your account’s ‘Saved for Later’ section. Photo-Illustration: Intelligencer By now, there’s a good chance you’ve encountered Google’s AI Overviews, possibly thousands of times. Appearing as blurbs at the top of search results, they attempt to settle your queries before you scroll — to offer answers, or relevant information, gleaned from websites that you no longer need to click on. The feature was officially rolled out at Google’s developer conference last year and had been in testing for quite some time before that; on the occasion of this year’s conference, the company characterized it as “one of the most successful launches in Search in the past decade,” a strangely narrow claim that is almost certainly true: Google put AI summaries on top of everything else, for everyone, as if to say, “Before you use our main product, see if this works instead.” This year’s conference included another change to search, this one more profound but less aggressively deployed. “AI Mode,” which has similarly been in beta testing for a while, will appear as an option for all users. It’s not like AI Overviews; that is, it’s not an extra module taking up space on a familiar search-results page but rather a complete replacement for conventional search. It’s Google’s “most powerful AI search, with more advanced reasoning and multimodality, and the ability to go deeper through follow-up questions and helpful links to the web,” the company says, “breaking down your question into subtopics and issuing a multitude of queries simultaneously on your behalf.” It’s available to everyone. It’s a lot like using AI-first chatbots that have search functions, like those from OpenAI, Anthropic, and Perplexity, and Google says it’s destined for greater things than a small tab. “As we get feedback, we’ll graduate many features and capabilities from AI Mode right into the core Search experience,” the company says. I’ve been testing AI Mode for a few months now, and in some ways it’s less radical than it sounds and (at first) feels. It resembles the initial demos of AI search tools, including those by Google, meaning it responds to many questions with clean, ad-free answers. Sometimes it answers in extended plain language, but it also makes a lot of lists and pulls in familiar little gridded modules — especially when you ask about things you can buy — resulting in a product that, despite its chatty interface, feels an awful lot like … search. Again, now you can try it yourself, and your mileage may vary; it hasn’t drawn me away from Google proper for a lot of thoughtless rote tasks, but it’s competitive with ChatGPT for the expanding range of searchish tasks you might attempt with a chatbot. From the very first use, however, AI Mode crystallized something about Google’s priorities and in particular its relationship to the web from which the company has drawn, and returned, many hundreds of billions of dollars of value. AI Overviews demoted links, quite literally pushing content from the web down on the page, and summarizing its contents for digestion without clicking: Photo-Illustration: Intelligencer; Screenshot: Google Meanwhile, AI Mode all but buries them, not just summarizing their content for reading within Google’s product but inviting you to explore and expand on those summaries by asking more questions, rather than clicking out. In many cases, links are retained merely to provide backup and sourcing, included as footnotes and appendices rather than destinations: Photo-Illustration: Intelligencer; Screenshot: Google This is typical with AI search tools and all but inevitable now that such things are possible. In terms of automation, this means companies like OpenAI and Google are mechanizing some of the “work” that goes into using tools like Google search, removing, when possible, the step where users leave their platforms and reducing, in theory, the time and effort it takes to navigate to somewhere else when necessary. In even broader terms — contra Google’s effort to brand this as “going beyond information to intelligence” — this is an example of how LLMs offer different ways to interact with much of the same information: summarization rather than retrieval, regeneration rather than fact-finding, and vibe-y reconstruction over deterministic reproduction. This is interesting to think about and often compelling to use but leaves unresolved one of the first questions posed by chatbots-as-search: Where will they get all the data they need to continue to work well? When Microsoft and Google showed off their first neo-search mockups in 2023, which are pretty close to today’s AI mode, it revealed a dilemma: Search engines still provide the de facto gateway to the broader web, and have a deeply codependent relationship with the people and companies whose content they crawl, index, and rank; a Google that instantly but sometimes unreliably summarizes the websites to which it used to send people would destroy that relationship, and probably a lot of websites, including the ones on which its models were trained. And, well, yep! Now, both AI Overviews and AI Mode, when they aren’t occasionally hallucinating, produce relatively clean answers that benefit in contrast to increasingly degraded regular search results on Google, which are full of hyperoptimized and duplicative spamlike content designed first and foremost with the demands of Google’s ranking algorithms and advertising in mind. AI Mode feels one step further removed from that ecosystem and once again looks good in contrast, a placid textual escape from Google’s own mountain of links that look like ads and ads that look like links (of course, Google is already working on ads for both Overviews and AI Mode). In its drive to embrace AI, Google is further concealing the raw material that fuels it, demoting links as it continues to ingest them for abstraction. Google may still retain plenty of attention to monetize and perhaps keep even more of it for itself, now that it doesn’t need to send people elsewhere; in the process, however, it really is starving the web that supplies it with data on which to train and Two years later, Google has become more explicit about the extent to which it’s moving on from the “you provide us results to rank, and we send you visitors to monetize” bargain, with the head of search telling The Verge, “I think the search results page was a construct.” Which is true, as far as it goes, but also a remarkable thing to hear from a company that’s communicated carefully and voluminously to website operators about small updates to its search algorithms for years. I don’t doubt that Google has been thinking about this stuff for a while and that there are people at the company who deem it strategically irrelevant or at least of secondary importance to winning the AI race — the fate of the web might not sound terribly important when your bosses are talking nonstop about cashing out its accumulated data and expertise for AGI. I also don’t want to be precious about the web as it actually exists in 2025, nor do I suggest that websites working with or near companies like Meta and Google should have expected anything but temporary, incidental alignment with their businesses. If I had to guess, the future of Google search looks more like AI Overviews than AI mode — a jumble of widgets and modules including and united by AI-generated content, rather than a clean break — if only for purposes of sustaining Google’s multi-hundred-billion-dollar advertising business. But I also don’t want to assume Google knows exactly how this stuff will play out for Google, much less what it will actually mean for millions of websites, and their visitors, if Google stops sending as many people beyond its results pages. Google’s push into productizing generative AI is substantially fear-driven, faith-based, and informed by the actions of competitors that are far less invested in and dependent on the vast collection of behaviors — websites full of content authentic and inauthentic, volunteer and commercial, social and antisocial, archival and up-to-date — that make up what’s left of the web and have far less to lose. Maybe, in a few years, a fresh economy will grow around the new behaviors produced by searchlike AI tools; perhaps companies like OpenAI and Google will sign a bunch more licensing deals; conceivably, this style of search automation simply collapses the marketplace supported by search, leveraging training based on years of scraped data to do more with less. In any case, the signals from Google — despite its unconvincing suggestions to the contrary — are clear: It’ll do anything to win the AI race. If that means burying the web, then so be it. Sign Up for the Intelligencer Newsletter Daily news about the politics, business, and technology shaping our world. This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply. By submitting your email, you agree to our Terms and Privacy Notice and to receive email correspondence from us. Tags: Google Is Burying the Web Alive

0 Commenti ·0 condivisioni ·0 Anteprima

Effettua l'accesso per mettere mi piace, condividere e commentare!

Passa a Pro