ARSTECHNICA.COM
New Lego-building AI creates models that actually stand up in real life
Another brick in the wall

Carnegie Mellon "LegoGPT" system uses physics checks to ensure models don't collapse.

Benj Edwards – May 9, 2025 5:29 pm

Several examples of shapes created by LegoGPT. Credit: Pun et al.

On Thursday, researchers at Carnegie Mellon University unveiled LegoGPT, an AI model that creates physically stable Lego structures from text prompts. The new system not only designs Lego models that match text descriptions (prompts) but also ensures they can be built brick by brick in the real world, either by hand or with robotic assistance.

"To achieve this, we construct a large-scale, physically stable dataset of LEGO designs, along with their associated captions," the researchers wrote in their paper, which was posted on arXiv, "and train an autoregressive large language model to predict the next brick to add via next-token prediction."

This trained model generates Lego designs that match text prompts like "a streamlined, elongated vessel" or "a classic-style car with a prominent front grille." The resulting designs are simple, using just a few brick types to create primitive shapes, but they stand up. As one Ars Technica staffer joked this morning upon seeing the research, "It builds Lego like it's 1974."

A LegoGPT demo video assembled by the research team.

In the paper titled "Generating Physically Stable and Buildable Lego Designs from Text," the research team led by Ava Pun explained that many existing 3D generation models focus on making diverse objects with detailed geometry, but these digital designs often can't be physically made. "Without proper support, parts of the design can collapse, float, or remain disconnected," they wrote.
Unlike previous attempts at autonomous Lego modeling, LegoGPT reportedly produces step-by-step instructions for building Lego creations that don't fall apart. You can see demos of the system in action on the project's website.

How LegoGPT works

To build LegoGPT, the Carnegie Mellon team repurposed the technology behind large language models (LLMs), similar to the kind that runs ChatGPT, for "next-brick prediction" instead of next-word prediction. To do so, the team fine-tuned LLaMA-3.2-1B-Instruct, an instruction-following language model from Meta. The team then augmented the brick-predicting model with a separate software tool that can verify physical stability using mathematical models that simulate gravity and structural forces.

To train the model, the team assembled a new dataset called "StableText2Lego," which contained over 47,000 stable Lego structures paired with descriptive captions generated by a separate AI model, OpenAI's GPT-4o. Each structure underwent physics analysis to ensure it could be built in the real world.

To build the Lego dataset, the team fed images rendered from 24 different viewpoints into GPT-4o and let that model write captions for each Lego structure, asking it to focus on geometric features while omitting color information. Credit: Pun et al.

LegoGPT works by first generating a sequence of precisely placed Lego bricks. For each new brick in the sequence, the system makes sure it doesn't collide with existing bricks and that it fits within the building space. After completing a design, it uses the aforementioned mathematical models to verify that the model can stand upright without falling apart. If parts would collapse in real life, the system identifies the first unstable brick and backtracks, removing it and all subsequent bricks before trying a different approach. This "physics-aware rollback" method proved essential to the team's approach.
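
The generate, check, and roll back loop described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the researchers' code: `Brick`, `in_bounds`, `collides`, `first_unstable`, and `generate` are hypothetical names, the `propose` callback stands in for the fine-tuned language model's next-brick prediction, and the stability test is a crude stand-in (a brick counts as stable only if it connects to the ground through vertically adjacent, overlapping bricks) rather than the force-based physics analysis the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set, Tuple

GRID = 20  # the article describes a 20x20x20 building space


@dataclass(frozen=True)
class Brick:
    """Hypothetical one-layer-tall brick: corner (x, y), layer z, footprint w x d studs."""
    x: int
    y: int
    z: int
    w: int
    d: int

    def footprint(self) -> Set[Tuple[int, int]]:
        return {(self.x + i, self.y + j) for i in range(self.w) for j in range(self.d)}


def in_bounds(b: Brick) -> bool:
    """The brick must fit entirely inside the building space."""
    return (b.w > 0 and b.d > 0 and b.x >= 0 and b.y >= 0
            and 0 <= b.z < GRID and b.x + b.w <= GRID and b.y + b.d <= GRID)


def collides(b: Brick, placed: List[Brick]) -> bool:
    """True if the new brick shares any cell with a brick already placed."""
    return any(p.z == b.z and p.footprint() & b.footprint() for p in placed)


def first_unstable(bricks: List[Brick]) -> Optional[int]:
    """Stand-in stability check (an assumption, NOT the paper's physics model):
    return the index of the first brick with no chain of connections to the ground."""
    def touches(a: Brick, b: Brick) -> bool:
        return abs(a.z - b.z) == 1 and bool(a.footprint() & b.footprint())

    grounded = {i for i, b in enumerate(bricks) if b.z == 0}
    frontier = list(grounded)
    while frontier:
        i = frontier.pop()
        for j, other in enumerate(bricks):
            if j not in grounded and touches(bricks[i], other):
                grounded.add(j)
                frontier.append(j)
    return next((i for i in range(len(bricks)) if i not in grounded), None)


def generate(propose: Callable[[List[Brick]], Optional[Brick]],
             max_rollbacks: int = 10, max_rejects: int = 100) -> List[Brick]:
    """Place proposed bricks, then roll back to just before the first unstable one."""
    placed: List[Brick] = []
    for _ in range(max_rollbacks + 1):
        rejects = 0
        # propose() returns None when the (hypothetical) model ends the design.
        while (b := propose(placed)) is not None:
            if in_bounds(b) and not collides(b, placed):
                placed.append(b)
            else:
                rejects += 1  # invalid brick: skip it and ask for another
                if rejects >= max_rejects:
                    break
        bad = first_unstable(placed)
        if bad is None:
            return placed
        placed = placed[:bad]  # drop the unstable brick and everything after it
    # Rollback budget exhausted: trim until what remains passes the check.
    while (bad := first_unstable(placed)) is not None:
        placed = placed[:bad]
    return placed


# Scripted stand-in for the model: the third brick floats, so it gets rolled back.
script = iter([Brick(0, 0, 0, 2, 4), Brick(0, 0, 1, 2, 4), Brick(10, 10, 5, 2, 4),
               None, Brick(0, 0, 2, 2, 4), None])
design = generate(lambda placed: next(script, None))
print([b.z for b in design])  # [0, 1, 2]
```

In the scripted run, the floating brick at layer 5 passes the per-brick collision and bounds checks but fails the whole-design check, so the loop discards it and continues from the two bricks that were already sound. The part of this sketch that mirrors the article most directly is the rollback step.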
Without the rollback step, only 24 percent of designs remained standing, compared to 98.8 percent with the full system.

The LegoGPT system works in three parts, shown in this diagram. Credit: Pun et al.

The researchers also expanded the system's abilities by adding texture and color options. For example, using an appearance prompt like "Electric guitar in metallic purple," LegoGPT can generate a guitar model, with bricks assigned a purple color.

Testing with robots and humans

To prove their designs worked in real life, the researchers had robots assemble the AI-created Lego models. They used a dual-robot arm system with force sensors to pick up and place bricks according to the AI-generated instructions. Human testers also built some of the designs by hand, showing that the AI creates genuinely buildable models. "Our experiments show that LegoGPT produces stable, diverse, and aesthetically pleasing Lego designs that align closely with the input text prompts," the team noted in its paper.

When tested against other AI systems for 3D creation, LegoGPT stands out through its focus on structural integrity. The team tested against several alternatives, including LLaMA-Mesh and other 3D generation models, and found its approach produced the highest percentage of stable structures.

A video of two robot arms building a LegoGPT creation, provided by the researchers.

Still, there are some limitations. The current version of LegoGPT only works within a 20×20×20 building space and uses a mere eight standard brick types. "Our method currently supports a fixed set of commonly used Lego bricks," the team acknowledged. "In future work, we plan to expand the brick library to include a broader range of dimensions and brick types, such as slopes and tiles." The researchers also hope to scale up their training dataset to include more objects than the 21 categories currently available.
Meanwhile, others can literally build on their work: the researchers released their dataset, code, and models on their project website and GitHub.

Benj Edwards, Senior AI Reporter

Benj Edwards is Ars Technica's Senior AI Reporter and founder of the site's dedicated AI beat in 2022. He's also a tech historian with almost two decades of experience. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.