

Google talked AI for 2 hours. It didn't mention hallucinations.

The era of AI Search is officially here.
Credit: Google

This year, Google I/O 2025 had one focus: artificial intelligence.

We've already covered all of the biggest news to come out of the annual developers conference: a new AI video generation tool called Flow. A $250 AI Ultra subscription plan. Tons of new changes to Gemini. A virtual shopping try-on feature. And critically, the launch of the search tool AI Mode to all users in the United States.

Yet over nearly two hours of Google leaders talking about AI, one word we didn't hear was "hallucination."


Hallucinations remain one of the most stubborn and concerning problems with AI models. The term refers to the invented facts and inaccuracies that large language models "hallucinate" in their replies. And according to the big AI brands' own metrics, hallucinations are getting worse — with some models hallucinating more than 40 percent of the time.

But if you were watching Google I/O 2025, you wouldn't know this problem existed. You'd think models like Gemini never hallucinate; you would certainly be surprised to see the warning appended to every Google AI Overview. ("AI responses may include mistakes.")


The closest Google came to acknowledging the hallucination problem was during a segment of the presentation on AI Mode and Gemini's Deep Search capabilities. The model would check its own work before delivering an answer, we were told — but without more detail on this process, it sounds more like the blind leading the blind than genuine fact-checking.

For AI skeptics, the degree of confidence Silicon Valley has in these tools seems divorced from actual results. Real users notice when AI tools fail at simple tasks like counting, spellchecking, or answering questions like "Will water freeze at 27 degrees Fahrenheit?"

Google was eager to remind viewers that its newest AI model, Gemini 2.5 Pro, sits atop many AI leaderboards. But when it comes to truthfulness and the ability to answer simple questions, AI chatbots are graded on a curve. Gemini 2.5 Pro is Google's most intelligent AI model (according to Google), yet it scores just 52.9 percent on the SimpleQA benchmarking test. According to an OpenAI research paper, SimpleQA is "a benchmark that evaluates the ability of language models to answer short, fact-seeking questions."

A Google representative declined to discuss the SimpleQA benchmark, or hallucinations in general — but did point us to Google's official explainer on AI Mode and AI Overviews. Here's what it has to say:

[AI Mode] uses a large language model to help answer queries and it is possible that, in rare cases, it may sometimes confidently present information that is inaccurate, which is commonly known as 'hallucination.' As with AI Overviews, in some cases this experiment may misinterpret web content or miss context, as can happen with any automated system in Search...

We're also using novel approaches with the model's reasoning capabilities to improve factuality. For example, in collaboration with Google DeepMind research teams, we use agentic reinforcement learning (RL) in our custom training to reward the model to generate statements it knows are more likely to be accurate (not hallucinated) and also backed up by inputs.

Is Google wrong to be optimistic? Hallucinations may yet prove to be a solvable problem, after all. But it seems increasingly clear from the research that hallucinations from LLMs are not a solvable problem right now. That hasn't stopped companies like Google and OpenAI from sprinting ahead into the era of AI Search — and that's likely to be an error-filled era, unless we're the ones hallucinating.

Timothy Beck Werth
Tech Editor

Timothy Beck Werth is the Tech Editor at Mashable, where he leads coverage and assignments for the Tech and Shopping verticals. Tim has over 15 years of experience as a journalist and editor, with particular experience covering and testing consumer technology, smart home gadgets, and men's grooming and style products. Previously, he was the Managing Editor and then Site Director of SPY.com, a men's product review and lifestyle website. As a writer for GQ, he covered everything from bull-riding competitions to the best Legos for adults, and he's also contributed to publications such as The Daily Beast, Gear Patrol, and The Awl. Tim studied print journalism at the University of Southern California. He currently splits his time between Brooklyn, NY, and Charleston, SC, and is working on his second novel, a science-fiction book.