CMU research shows compression alone may unlock AI puzzle-solving abilities
'Tis the season for a squeezin'

New research challenges prevailing idea that AI needs massive datasets to solve problems.

Benj Edwards, Mar 6, 2025 6:22 pm

Credit: Eugene Mymrin via Getty Images

A pair of Carnegie Mellon University researchers recently discovered hints that the process of compressing information can solve complex reasoning tasks without pre-training on a large number of examples. Their system tackles some types of abstract pattern-matching tasks using only the puzzles themselves, challenging conventional wisdom about how machine learning systems acquire problem-solving abilities.

"Can lossless information compression by itself produce intelligent behavior?" ask Isaac Liao, a first-year PhD student, and his advisor, Professor Albert Gu, from CMU's Machine Learning Department. Their work suggests the answer might be yes. To demonstrate, they created CompressARC and published the results in a comprehensive post on Liao's website.

The pair tested their approach on the Abstraction and Reasoning Corpus (ARC-AGI), an unbeaten visual benchmark created in 2019 by machine learning researcher François Chollet to test AI systems' abstract reasoning skills. ARC presents systems with grid-based image puzzles where each provides several examples demonstrating an underlying rule, and the system must infer that rule and apply it to a new example.

For instance, one ARC-AGI puzzle shows a grid with light blue rows and columns dividing the space into boxes. The task requires figuring out which colors belong in which boxes based on their position: black for corners, magenta for the middle, and directional colors (red for up, blue for down, green for right, and yellow for left) for the remaining boxes; a code sketch of this rule follows below. Here are three other example ARC-AGI puzzles, taken from Liao's website:

Three example ARC-AGI benchmarking puzzles. Credit: Isaac Liao / Albert Gu
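To make the puzzle format concrete, here is a minimal Python sketch, illustrative only and not code from the researchers: the color codes and the 3x3 box layout are assumptions. It hard-codes the positional rule from the first puzzle described above, the very rule an ARC solver would have to infer from a few solved examples:

```python
# Hypothetical sketch of the ARC-AGI puzzle format: grids are small 2D
# arrays of color codes, and a solver must infer the transformation rule.
# The color names and box layout below are illustrative assumptions.

BLACK, RED, BLUE, GREEN, YELLOW, MAGENTA = range(6)

def color_for_box(row, col, n=3):
    """Apply the positional rule from the example puzzle to an n x n
    arrangement of boxes: corners are black, the center is magenta, and
    edge boxes take a color indicating their direction from the center."""
    top, bottom = row == 0, row == n - 1
    left, right = col == 0, col == n - 1
    if (top or bottom) and (left or right):
        return BLACK            # corner boxes
    if top:
        return RED              # "up" boxes
    if bottom:
        return BLUE             # "down" boxes
    if right:
        return GREEN            # "right" boxes
    if left:
        return YELLOW           # "left" boxes
    return MAGENTA              # the middle box

solution = [[color_for_box(r, c) for c in range(3)] for r in range(3)]
for row in solution:
    print(row)
```

The point of the benchmark is that nothing like `color_for_box` is ever given to the system: it sees only a handful of solved grids and must recover the rule on its own.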
The puzzles test capabilities that some experts believe may be fundamental to general human-like reasoning (often called "AGI" for artificial general intelligence). Those properties include understanding object persistence, goal-directed behavior, counting, and basic geometry without requiring specialized knowledge. The average human solves 76.2 percent of the ARC-AGI puzzles, while human experts reach 98.5 percent.

OpenAI made waves in December for the claim that its o3 simulated reasoning model earned a record-breaking score on the ARC-AGI benchmark. In testing with computational limits, o3 scored 75.7 percent on the test, while in high-compute testing (basically unlimited thinking time), it reached 87.5 percent, which OpenAI says is comparable to human performance.

CompressARC achieves 34.75 percent accuracy on the ARC-AGI training set (the collection of puzzles used to develop the system) and 20 percent on the evaluation set (a separate group of unseen puzzles used to test how well the approach generalizes to new problems). Each puzzle takes about 20 minutes to process on a consumer-grade RTX 4070 GPU, compared to top-performing methods that use heavy-duty data center-grade machines and what the researchers describe as "astronomical amounts of compute."

Not your typical AI approach

CompressARC takes a completely different approach than most current AI systems. Instead of relying on pre-training (the process where machine learning models learn from massive datasets before tackling specific tasks), it works with no external training data whatsoever. The system trains itself in real time using only the specific puzzle it needs to solve.

"No pretraining; models are randomly initialized and trained during inference time. No dataset; one model trains on just the target ARC-AGI puzzle and outputs one answer," the researchers write, describing their strict constraints.

When the researchers say "No search," they're referring to another common technique in AI problem-solving where systems try many different possible solutions and select the best one. Search algorithms work by systematically exploring options (like a chess program evaluating thousands of possible moves) rather than directly learning a solution. CompressARC avoids this trial-and-error approach, relying solely on gradient descent, a mathematical technique that incrementally adjusts the network's parameters to reduce errors, similar to how you might find the bottom of a valley by always walking downhill.

A block diagram of the CompressARC architecture, created by the researchers. Credit: Isaac Liao / Albert Gu

The system's core principle uses compression (finding the most efficient way to represent information by identifying patterns and regularities) as the driving force behind intelligence. CompressARC searches for the shortest possible description of a puzzle that can accurately reproduce the examples and the solution when unpacked.

While CompressARC borrows some structural principles from transformers (like using a residual stream with representations that are operated upon), it's a custom neural network architecture designed specifically for this compression task. It's not based on an LLM or a standard transformer model.

Unlike typical machine learning methods, CompressARC uses its neural network only as a decoder. During encoding (the process of converting information into a compressed format), the system fine-tunes the network's internal settings and the data fed into it, gradually making small adjustments to minimize errors. This creates the most compressed representation while correctly reproducing known parts of the puzzle. These optimized parameters then become the compressed representation that stores the puzzle and its solution in an efficient format.

An animated GIF showing the multi-step process of CompressARC solving an ARC-AGI puzzle. Credit: Isaac Liao

"The key challenge is to obtain this compact representation without needing the answers as inputs," the researchers explain. The system essentially uses compression as a form of inference.
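As a rough illustration of that encoding loop, here is a minimal PyTorch sketch, not the researchers' code: the toy grid, the tiny decoder, and the latent-size penalty are all stand-ins for CompressARC's custom architecture and its actual description-length objective. A randomly initialized decoder and latent code are optimized by gradient descent at inference time to reproduce only the known cells of a single puzzle, and the decoder's output on the hidden cell is read off as the answer:

```python
# A minimal sketch (not the researchers' code) of the general idea behind
# CompressARC: no pretraining and no dataset. A randomly initialized decoder
# and a latent code are tuned, at inference time, to reproduce only the
# known cells of one puzzle; the output on the unknown cell is the answer.
import torch

torch.manual_seed(0)
grid = torch.tensor([[0., 1., 0.],
                     [1., 0., 1.],
                     [0., 1., 0.]])          # one toy "puzzle" grid
known = torch.tensor([[1, 1, 1],
                      [1, 1, 1],
                      [1, 1, 0]], dtype=torch.bool)  # last cell is hidden

latent = torch.randn(8, requires_grad=True)   # small code to be compressed
decoder = torch.nn.Sequential(                # randomly initialized decoder
    torch.nn.Linear(8, 32), torch.nn.ReLU(), torch.nn.Linear(32, 9))
opt = torch.optim.Adam([latent, *decoder.parameters()], lr=1e-2)

for step in range(2000):
    opt.zero_grad()
    pred = decoder(latent).view(3, 3)
    # Reconstruction loss on known cells only; the small penalty on the
    # latent code is a crude stand-in for "shortest description" pressure,
    # not the paper's actual coding-cost objective.
    loss = ((pred - grid)[known] ** 2).mean() + 1e-3 * latent.pow(2).mean()
    loss.backward()
    opt.step()

print(decoder(latent).view(3, 3)[2, 2].item())  # prediction for hidden cell
```

In this toy, a generic MLP carries almost no inductive bias, so the hidden-cell prediction may come out arbitrary; in CompressARC, the carefully designed architecture is what steers the shortest-description completion toward the intended solution.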
This approach could prove valuable in domains where large datasets don't exist or when systems need to learn new tasks with minimal examples. The work suggests that some forms of intelligence might emerge not from memorizing patterns across vast datasets, but from efficiently representing information in compact forms.

The compression-intelligence connection

The potential connection between compression and intelligence may sound strange at first glance, but it has deep theoretical roots in computer science concepts like Kolmogorov complexity (the shortest program that produces a specified output) and Solomonoff induction, a theoretical gold standard for prediction equivalent to an optimal compression algorithm.

To compress information efficiently, a system must recognize patterns, find regularities, and "understand" the underlying structure of the data: abilities that mirror what many consider intelligent behavior. A system that can predict what comes next in a sequence can compress that sequence efficiently. As a result, some computer scientists over the decades have suggested that compression may be equivalent to general intelligence. Based on these principles, the Hutter Prize has offered awards to researchers who can compress a 1GB file to the smallest size.
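That prediction-compression link can be made concrete with a little arithmetic: an ideal entropy coder spends about -log2(p) bits on a symbol the model assigns probability p, so better predictions mean shorter codes. A small Python illustration (not from the paper):

```python
# A small illustration (not from the paper) of the prediction-compression
# link: an ideal entropy coder spends -log2 p(symbol) bits per symbol, so
# a model that predicts the data better yields a shorter encoding.
from collections import Counter
from math import log2

text = "abababababababababababababababab"

def bits_needed(text, prob):
    """Total bits under an ideal coder for a given symbol-probability model."""
    return sum(-log2(prob(ch)) for ch in text)

# Model 1: knows nothing; assumes all 26 letters are equally likely.
uniform_bits = bits_needed(text, lambda ch: 1 / 26)

# Model 2: has learned the letter frequencies of this particular text.
freq = Counter(text)
learned_bits = bits_needed(text, lambda ch: freq[ch] / len(text))

print(f"uniform model: {uniform_bits:.1f} bits")   # ~150.4 bits
print(f"learned model: {learned_bits:.1f} bits")   # 32.0 bits
```

A model that also captured the alternating structure, assigning near-certain probability to each next letter, would push the total toward zero bits, which is exactly the sense in which better understanding yields better compression.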
We previously wrote about intelligence and compression in September 2023, when a DeepMind paper discovered that large language models can sometimes outperform specialized compression algorithms. In that study, researchers found that DeepMind's Chinchilla 70B model could compress image patches to 43.4 percent of their original size (beating PNG's 58.5 percent) and audio samples to just 16.4 percent (outperforming FLAC's 30.3 percent).

That 2023 research suggested a deep connection between compression and intelligence: the idea that truly understanding patterns in data enables more efficient compression, which aligns with this new CMU research. While DeepMind demonstrated compression capabilities in an already-trained model, Liao and Gu's work takes a different approach by showing that the compression process itself can generate intelligent behavior from scratch.

This new research matters because it challenges the prevailing wisdom in AI development, which typically relies on massive pre-training datasets and computationally expensive models. While leading AI companies push toward ever-larger models trained on more extensive datasets, CompressARC suggests intelligence emerging from a fundamentally different principle.

"CompressARC's intelligence emerges not from pretraining, vast datasets, exhaustive search, or massive compute, but from compression," the researchers conclude. "We challenge the conventional reliance on extensive pretraining and data, and propose a future where tailored compressive objectives and efficient inference-time computation work together to extract deep intelligence from minimal input."

Limitations and looking ahead

Even with its successes, Liao and Gu's system comes with clear limitations that may prompt skepticism. While it successfully solves puzzles involving color assignments, infilling, cropping, and identifying adjacent pixels, it struggles with tasks requiring counting, long-range pattern recognition, rotations, reflections, or simulating agent behavior. These limitations highlight areas where simple compression principles may not be sufficient.

The research has not been peer-reviewed, and the 20 percent accuracy on unseen puzzles, though notable without pre-training, falls significantly below both human performance and top AI systems. Critics might argue that CompressARC could be exploiting specific structural patterns in the ARC puzzles that might not generalize to other domains, challenging the idea that compression alone can serve as a foundation for broader intelligence rather than being just one component among many required for robust reasoning.

And yet, as AI development continues its rapid advance, if CompressARC holds up to further scrutiny, it offers a glimpse of a possible alternative path that might lead to useful intelligent behavior without the resource demands of today's dominant approaches. Or, at the very least, it might unlock an important component of general intelligence in machines, which is still poorly understood.

Benj Edwards, Senior AI Reporter

Benj Edwards is Ars Technica's Senior AI Reporter and founder of the site's dedicated AI beat in 2022. He's also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.