A hot potato: The open-source project llama2.c is designed to run a lightweight version of the Llama 2 model entirely in C. It was inspired by llama.cpp, a project created to enable LLM inference across a wide range of hardware, from local devices to cloud-based platforms. These compact code experiments are now being leveraged to run AI models on virtually any device with a chip, highlighting the growing accessibility and versatility of AI tools.

After seeing Exo Labs run a large language model on an ancient Pentium II running Windows 98, developer Andrei David decided to take on an even more unconventional challenge. Dusting off his Xbox 360 console, he set out to force the nearly two-decade-old machine to load an AI model from Meta AI's Llama family of LLMs.

David shared on X that he successfully ported llama2.c to Microsoft's 2005-era gaming console. The process wasn't without significant hurdles, however. The Xbox 360's PowerPC CPU uses a big-endian architecture, which required extensive endianness conversion for both the model's configuration and its weights. He also had to make substantial adjustments and optimizations to the original code to get it running on the aging hardware.

Memory management posed another significant challenge. The 60MB Llama 2 model had to be carefully structured to fit within the Xbox 360's unified memory architecture, where the CPU and GPU share the same pool of RAM. According to David, the Xbox 360's memory architecture was remarkably forward-thinking for its time, foreshadowing the memory management techniques now standard in modern gaming consoles and APUs.

After extensive coding and optimization, David successfully ran llama2.c on his Xbox 360 using a simple prompt: "Sleep Joe said." Despite the implementation being just around 700 lines of C with no external dependencies, David noted that it can deliver "surprisingly" strong performance when tailored to a sufficiently narrow domain.

David explained that working within the constraints of a limited platform like the Xbox 360 forces you to prioritize efficient memory usage above all else. In response, another X user suggested that the 512MB of RAM in Microsoft's old console might be enough to run other small LLM implementations, such as Hugging Face's SmolLM.

David gladly accepted the challenge, so we will likely see additional LLM experiments on the Xbox 360 in the not-so-distant future.
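
The endianness work David describes is conceptually simple, if tedious: every multi-byte value in the checkpoint file has to be byte-swapped before a big-endian CPU can use it. The sketch below is an illustration only, not David's actual code; it shows a minimal in-place swap of 32-bit values, the kind of conversion the llama2.c config fields (plain ints in the reference code) and float32 weights would need on the Xbox 360's PowerPC. The helper name swap32_buffer is hypothetical.

    #include <stdint.h>
    #include <stddef.h>

    /* Byte-swap a buffer of 4-byte values (int32 config fields or float32
       weights) in place. llama2.c checkpoints are typically written on a
       little-endian machine, so a big-endian CPU such as the Xbox 360's
       PowerPC must reverse each 4-byte value after reading the file. */
    static void swap32_buffer(void *buf, size_t count) {
        uint8_t *p = (uint8_t *)buf;
        for (size_t i = 0; i < count; i++, p += 4) {
            uint8_t b0 = p[0], b1 = p[1];
            p[0] = p[3];
            p[1] = p[2];
            p[2] = b1;
            p[3] = b0;
        }
    }

In llama2.c terms, such a helper would presumably be applied to the config struct right after it is read and to each weight tensor once loaded into memory, before any inference math runs.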