Into the Gemmaverse

Google's Gemma 3 is an open source, single-GPU AI with a 128K context window

Gemma 3 is optimized to run on powerful multi-GPU PCs or a single smartphone.

Ryan Whitwam | Mar 12, 2025 1:15 pm

Most new AI models go big: more parameters, more tokens, more everything. Google's newest AI model has some big numbers, but it's also tuned for efficiency. Google says the Gemma 3 open source model is the best in the world for running on a single GPU or AI accelerator. The latest Gemma model is aimed primarily at developers who need to create AI to run in various environments, be it a data center or a smartphone. And you can tinker with Gemma 3 right now.

Google claims Gemma 3 can tackle more challenging tasks than the older open source Google models. The context window, a measure of how much data you can input, has been expanded to 128,000 tokens from 8,192 in previous Gemma models. Gemma 3, which is based on the proprietary Gemini 2.0 foundation, is also a multimodal model capable of processing text, high-resolution images, and even video. Google also has a new solution for image safety called ShieldGemma 2, which can be integrated with Gemma to help block unwanted images in three content categories: dangerous, sexual, or violent.

Most of the popular AI models you've heard of run on collections of servers in a data center, filled to the brim with AI computing power. Many of them are far too large to run on the kind of hardware you have at home or in the office. The release of the first Gemma models last year gave developers and enthusiasts another low-hardware option to compete with the likes of Meta's Llama 3. There has been a drive for efficiency in AI lately, with models like DeepSeek R1 gaining traction on the basis of lower computing costs.

Google: What is Gemma 3?
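To put the jump from 8,192 to 128,000 tokens in perspective, here is a back-of-the-envelope sketch of how much text each window holds. The ~4 characters per token figure is a rough rule of thumb for English text, not a property of Gemma's actual tokenizer, and the ~3,000 characters per page is likewise an assumed average.

```python
# Rough comparison of older Gemma context windows vs. Gemma 3's 128K window.
# CHARS_PER_TOKEN and chars_per_page are illustrative assumptions, not
# figures from Google.

CHARS_PER_TOKEN = 4  # assumed average for English text

def approx_pages(tokens: int, chars_per_page: int = 3000) -> float:
    """Very rough page count for a given token budget."""
    return tokens * CHARS_PER_TOKEN / chars_per_page

print(f"Old Gemma (8,192 tokens): ~{approx_pages(8_192):.0f} pages")
print(f"Gemma 3 (128,000 tokens): ~{approx_pages(128_000):.0f} pages")
```

Under those assumptions, the new window fits roughly 15 times more text, on the order of a short novel rather than a long article.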
Google says Gemma 3 is the "world's best single-accelerator model." However, not all versions of the model are ideal for local processing. It comes in various sizes, from a petite text-only 1 billion-parameter model that can run on almost anything to the chunky 27 billion-parameter version that gobbles up RAM. In lower-precision modes, the smallest Gemma 3 model could occupy less than a gigabyte of memory, but the super-size versions need 20GB to 30GB even at 4-bit precision.

But how good is Gemma 3? Google has provided some data that appears to show substantial improvements over most other open source models. Using the Elo metric, which measures user preference, Gemma 3 27B blows past Gemma 2, Meta's Llama 3, OpenAI's o3-mini, and others in chat capabilities. It doesn't quite catch up to DeepSeek R1 in this relatively subjective test. However, it runs on a single Nvidia H100 accelerator here, whereas most other models need a gaggle of GPUs. Google says Gemma 3 is also more capable when it comes to math, coding, and following complex instructions, though it does not offer any numbers to back that up.

The subjective user preference Elo score shows people dig Gemma 3 as a chatbot. Credit: Google

Google has the latest Gemma model available online in Google AI Studio. You can also fine-tune the model's training using tools like Google Colab and Vertex AI, or simply use your own GPU. The new Gemma 3 models are open source, so you can download them from repositories like Kaggle or Hugging Face. However, Google's license agreement limits what you can do with them. Regardless, Google won't know what you're exploring on your own hardware, which is the advantage of having more efficient local models like Gemma 3.

No matter what you want to do, there's a Gemma model that will fit on your hardware. Need inspiration?
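The memory figures above follow from simple arithmetic: the weights alone take roughly (parameter count × bits per parameter ÷ 8) bytes, plus runtime overhead for activations and the KV cache. This sketch estimates that footprint for the two model sizes the article mentions; the 1.2× overhead factor is an illustrative assumption, not an official figure, and real usage grows with context length.

```python
# Rough weight-memory estimate for Gemma 3 sizes at different precisions.
# Parameter counts (1B, 27B) are from the article; the 1.2x overhead
# multiplier for activations/KV cache/buffers is an assumption.

GIB = 1024 ** 3

def weight_memory_gib(params: float, bits_per_param: int, overhead: float = 1.2) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    return params * bits_per_param / 8 * overhead / GIB

for params, name in [(1e9, "Gemma 3 1B"), (27e9, "Gemma 3 27B")]:
    for bits in (16, 8, 4):
        print(f"{name} @ {bits}-bit: ~{weight_memory_gib(params, bits):.1f} GiB")
```

This lines up with the article's figures: the 1B model at 4-bit fits in well under a gigabyte, while the 27B model's weights alone approach 16GB at 4-bit, and a long-context session pushes total usage into the 20GB to 30GB range.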
Google has a new "Gemmaverse" community to highlight applications built with Gemma models.

Ryan Whitwam, Senior Technology Reporter

Ryan Whitwam is a senior technology reporter at Ars Technica, covering the ways Google, AI, and mobile technology continue to change the world. Over his 20-year career, he's written for Android Police, ExtremeTech, Wirecutter, NY Times, and more. He has reviewed more phones than most people will ever own. You can follow him on Bluesky, where you will see photos of his dozens of mechanical keyboards.