
OpenAI just released GPT-4.5 and says it is its biggest and best chat model yet
www.technologyreview.com
OpenAI has just released GPT-4.5, a new version of its flagship large language model. The company claims it is its biggest and best model for all-round chat yet. "It's really a step forward for us," says Mia Glaese, a research scientist at OpenAI.

Since the releases of its so-called reasoning models o1 and o3, OpenAI has been pushing two product lines. GPT-4.5 is part of the non-reasoning lineup, what Glaese's colleague Nick Ryder, also a research scientist, calls "an installment in the classic GPT series."

People with a $200-a-month ChatGPT Pro account can try out GPT-4.5 today. OpenAI says it will begin rolling the model out to other users next week.

With each release of its GPT models, OpenAI has shown that bigger means better. But there has been a lot of talk about how that approach is hitting a wall, including from OpenAI's former chief scientist Ilya Sutskever. The company's claims about GPT-4.5 feel like a thumb in the eye to the naysayers.

All large language models pick up patterns across the billions of documents they are trained on. Smaller models learned syntax and basic facts. Bigger models can find more specific patterns, like emotional cues, such as when a speaker's words signal hostility, says Ryder: "All of these subtle patterns that come through a human conversation, those are the bits that these larger and larger models will pick up on."

"It has the ability to engage in warm, intuitive, natural, flowing conversations," says Glaese. "And we think that it has a stronger understanding of what users mean, especially when their expectations are more implicit, leading to nuanced and thoughtful responses."

"We kind of know what the engine looks like at this point, and now it's really about making it hum," says Ryder. "This is primarily an exercise in scaling up the compute, scaling up the data, finding more efficient training methods, and then pushing the frontier."

OpenAI won't say exactly how big its new model is.
But it says the jump in scale from GPT-4o to GPT-4.5 is the same as the jump from GPT-3.5 to GPT-4o. Experts have estimated that GPT-4 could have as many as 1.8 trillion parameters, the values that get tweaked when a model is trained.

GPT-4.5 was trained using similar techniques to its predecessor GPT-4o, including human-led fine-tuning and reinforcement learning with human feedback. "The key to creating intelligent systems is a recipe we've been following for many years, which is to find scalable paradigms where we can pour more and more resources in to get more intelligent systems out," says Ryder.

Unlike reasoning models such as o1 and o3, which work through answers step by step, normal large language models like GPT-4.5 spit out the first response they come up with. But GPT-4.5 is more general-purpose. Tested on SimpleQA, a kind of general-knowledge quiz that includes questions on a wide range of topics, from science and technology to TV shows and video games, GPT-4.5 scores 62.5%, compared with 38.6% for GPT-4o and 15% for o3-mini.

What's more, OpenAI claims that GPT-4.5 responds with far fewer made-up answers (known as hallucinations). On the same test, GPT-4.5 made up answers 37.1% of the time, compared with 59.8% for GPT-4o and 80.3% for o3-mini.

But on other benchmarks, including MMLU, a standard multitask test for language models, gains over OpenAI's previous models were marginal. And on standard science and math benchmarks, GPT-4.5 scores worse than o3.

GPT-4.5's special charm seems to be its conversation. Human testers employed by OpenAI say they preferred GPT-4.5's answers to GPT-4o's for everyday queries, professional queries, and creative tasks, including coming up with poems. (Ryder says it is also great at old-school internet ASCII art.)

But after years at the top, OpenAI now has a tough crowd.
The focus on emotional intelligence and creativity is cool for niche use cases like writing coaches and brainstorming buddies, says Waseem AlShikh, cofounder and CTO of Writer, a startup that develops large language models for enterprise customers. But GPT-4.5 "feels like a shiny new coat of paint on the same old car," he says. "Throwing more compute and data at a model can make it sound smoother, but it's not a game-changer."

"The juice isn't worth the squeeze when you consider the energy costs, the complexity, and the fact that most users won't notice the difference in daily use," he says. "I'd rather see them pivot to efficiency or niche problem-solving than keep supersizing the same recipe."

Sam Altman has said that GPT-4.5 will be the last release in OpenAI's classic lineup and that GPT-5 will be a hybrid that combines a general-purpose large language model with a reasoning model. "GPT-4.5 is OpenAI phoning it in while they cook up something bigger behind closed doors," says AlShikh. "Until then, this feels like a pit stop."

And yet OpenAI is convinced that its supersized approach still has legs. "Personally, I'm very optimistic about finding ways through those bottlenecks and continuing to scale," says Ryder. "I think there's something extremely profound and exciting about pattern-matching across all of human knowledge."