Gemini 2.5 Pro is here with bigger numbers and great vibes
arstechnica.com
Supposedly Smarter Gemini 2.5 Pro is here with bigger numbers and great vibes Google's new "thinking" model is ready to think for you. Ryan Whitwam Mar 26, 2025 11:36 am | 11 Credit: Google Credit: Google Story textSizeSmallStandardLargeWidth *StandardWideLinksStandardOrange* Subscribers only Learn moreJust a few months after releasing its first Gemini 2.0 AI models, Google is upgrading again. The company says the new Gemini 2.5 Pro Experimental is its "most intelligent" model yet, offering a massive context window, multimodality, and reasoning capabilities. Google points to a raft of benchmarks that show the new Gemini clobbering other large language models (LLMs), and our testing seems to back that upGemini 2.5 Pro is one of the most impressive generative AI models we've seen.Gemini 2.5, like all Google's models going forward, has reasoning built in. The AI essentially fact-checks itself along the way to generating an output. We like to call this "simulated reasoning," as there's no evidence that this process is akin to human reasoning. However, it can go a long way to improving LLM outputs. Google specifically cites the model's "agentic" coding capabilities as a beneficiary of this process. Gemini 2.5 Pro Experimental can, for example, generate a full working video game from a single prompt. We've tested this, and it works with the publicly available version of the model. Gemini 2.5 Pro builds a game in one step. Google says a lot of things about Gemini 2.5 Pro; it's smarter, it's context-aware, it thinksbut it's hard to quantify what constitutes improvement in generative AI bots. There are some clear technical upsides, though. Gemini 2.5 Pro comes with a 1 million token context window, which is common for the big Gemini models but massive compared to competing models like OpenAI GPT or Anthropic Claude. You could feed multiple very long books to Gemini 2.5 Pro in a single prompt, and the output maxes out at 64,000 tokens. That's the same as Flash 2.0, but it's still objectively a lot of tokens compared to other LLMs.Naturally, Google has run Gemini 2.5 Experimental through a battery of benchmarks, in which it scores a bit higher than other AI systems. For example, it squeaks past OpenAI's o3-mini in GPQA and AIME 2025, which measure how well the AI answers complex questions about science and math, respectively. It also set a new record in the Humanitys Last Exam benchmark, which consists of 3,000 questions curated by domain experts. Google's new AI managed a score of 18.8 percent to OpenAI's 14 percent.It's not clear how effective these attempts at objectively measuring AI capabilities are. Sometimes, a subjective assessment of AI can be more helpful"vibemarking" if you will. Google's new AI is already at the top of the LMSYS Chatbot arena leaderboard, which is a notable feat. This shows that users generally prefer Gemini 2.5 Pro Experimental's output to what you'd get from OpenAI o3-mini, Grok, DeepSeek, and others.Instant AI upgradeThe vibes we're getting while using Gemini 2.5 Pro Experimental are good, too. We threw some complex tasks at Gemini 2.5things that often confused the 2.0 modelsand the updated AI handled them much better. Coding, math, and science questions are also trending better than what we saw with previous versions of Gemini.Google's new pro model is also blazing fast. It still ticks along like other models, outputting tokens as it "reasons" its way to an answer, but everything feels faster than even the latest OpenAI and Anthropic models. Google has a ton of AI compute at its disposal, which is clearly being leveraged to great effect here. This is also why Gemini models like Gemini 2.5 Pro Experimental have such beefy context windowsin this case, it's about five times the size of o3-mini's input limit. And this is just the first stop. Google says the context window will be increased to 2 million tokens soon. Google just can't stop releasing new Gemini models. Credit: Ryan Whitwam Google just can't stop releasing new Gemini models. Credit: Ryan Whitwam Google's 2.0 Pro model seemed reasonably impressive when it launched a few months ago, but that AI is gone. Google says Gemini 2.5 Pro is a drop-in replacement for 2.0 that will be available across Google's products for anyone with a Gemini Advanced subscription ($20 per month). The new model is available right now in the mobile app and on the web, as well as in Google's AI Studio. It will be in Vertex AI soon.Google has not yet announced API pricing for Gemini 2.5 Pro Experimental, but you won't be able to do much with it right now anyway. Google has set the same 50-message daily limit as its older experimental models, and it's free for the moment. That will change, though. Google's Logan Kilpatrick said on X (formerly Twitter) that 2.5 Pro Experimental will be the first experimental model with higher API limits and pricing. There will be an announcement on that later.Ryan WhitwamSenior Technology ReporterRyan WhitwamSenior Technology Reporter Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he's written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards. 11 Comments
0 التعليقات ·0 المشاركات ·69 مشاهدة