WWW.TECHNEWSWORLD.COM
Nvidia Blackwell Is One Hot Processor
Nvidia has faced scrutiny this month because some servers with a whopping 72 Blackwell processors were overheating. The issue arose because some initial OEM deployments were not properly water-cooled, a problem Lenovo aggressively identified and mitigated with its Neptune warm-water-cooling solutions.

As AI advances, we'll need more highly dense, incredibly powerful AI processors, which suggests that air cooling in server rooms may become obsolete.

Let's talk about Blackwell, water cooling, and why Lenovo's Neptune solution stands out at the moment. We'll close with my Product of the Week: Microsoft's Windows 365 Link, which could be the missing link between PCs and terminals that could forever change desktop computing.

Blackwell

Blackwell is Nvidia's premier, AI-focused GPU. When it was announced, it was so far beyond what most would have thought practical that it almost seemed more like a pipe dream than a solution. But it works, and there is nothing close to its class right now. However, it is massively dense in terms of technology and generates a lot of heat.

Some argue it is a potential ecological disaster. Don't get me wrong, it does pull a lot of power and generate a tremendous amount of heat. But its performance is so high compared to the kind of load that you'd typically get with more conventional parts that it is relatively economical to run.

It's like comparing a semi-truck with three trailers to a U-Haul van. Yes, the semi will get comparatively crappy gas mileage, but it will also hold more cargo than 10 U-Haul vans and use a lot less gas than those 10 vans, making it more ecologically friendly. The same is true of Blackwell. It is so far beyond its competition in terms of performance that its relatively high energy use is below what otherwise would be required for a competitive AI server.

But Blackwell chips do run hot, and most servers today are air-cooled.
So, it shouldn't be surprising that some Blackwell servers were configured with air cooling and that those with 72 or more Blackwell processors in a rack overheated. While 72 Blackwells in a rack is unusual today, as AI advances, it will become more common, given Nvidia is currently the king of AI.

You can only go so far with air-cooled technology in terms of performance before you have to move to liquid cooling. While Nvidia did respond to this issue with a water-cooled rack specification that Dell is now using, Lenovo was way ahead of the curve with its Neptune water-cooling solution.

Lenovo Neptune

Lenovo was the first to realize this, mainly because it is currently the market leader in its class in terms of water cooling, a technology initially acquired from IBM, which has been doing water cooling for decades.

What is important with water cooling isn't just the technology but the knowledge of how to deploy it safely. Mixing water and high-amperage electronics can be a disaster if you don't know what you're doing. As a result of the IBM server acquisition, Lenovo has decades of water-cooling experience behind the offering it calls Neptune.

Given Nvidia has specified a water-cooled rack, what makes Neptune better? The answer is experience. Most who will use the Nvidia-specified solution, including Nvidia, don't often deploy water-cooled solutions. As a result, particularly with these high-end Blackwell implementations, they'll essentially be learning on the job.

It can be really dangerous when you mix water with high-amperage electronics. Water and electricity don't mix. Not only can a leak fry an expensive part or even an entire rack, but if a person is present, it can fry them, too, if the breakers don't trip. In a raised-floor environment, unless it has been designed with leaks in mind, terrible things can happen. I observed this myself decades ago when I was at IBM, and it turned out they hadn't stress-tested the water-cooling system for our massive (for the time) data center.
The site lost a transformer that shut off the water-cooling system, which hadn't been stress-tested for a sudden stop. The pipes burst, and the data center became a dangerous swimming pool. Most of the hardware, costing hundreds of millions of dollars, was lost, and the building was flooded, doing additional damage.

Through experiences like this, IBM became the leading OEM for safe water cooling, and Lenovo acquired that knowledge and experience when it bought the IBM x86 server group. Now, Lenovo, along with IBM, knows how to do water cooling better than most, which means that you can rest assured that a Lenovo Blackwell server won't overheat or suddenly begin to leak.

Plus, Lenovo's expertise is in warm water cooling, a far safer and far less expensive way to cool servers than cold water cooling, which requires huge, inefficient evaporators or chillers.

Implementing this technology is no trivial task. Unlike automobiles or PCs that are water-cooled, servers have to have hot swapping capabilities, which means you need exceptional and highly tested drip-free connections, aggressive alerting, preventive maintenance schedules based on past knowledge of components, and technicians experienced with working with this level of water-cooling tech.

Wrapping Up: A Future of Warm-Water-Cooled Data Centers

Blackwell is only the first of these incredibly powerful processors to hit the market because, as AI pushes the envelope, Nvidia's competitors will also have to push into something similar, suggesting all servers may eventually need to be warm-water cooled.

As warm-water cooling moves into the market more aggressively, these data centers will quiet down, making them far more pleasant places to work. That will make many of us who have to work in them very happy.

Windows 365 Link

Image Credit: Microsoft

Ever since we replaced terminals with PCs, IT has wanted the terminal experience back.
Terminals were like pre-smart TVs in that you didn't have to do patches or OS upgrades or deal with the blue screen of death. If the thing broke, it was pretty easy to fix or was relatively inexpensive to replace. From an IT perspective, terminals were a ton better than PCs.

But from the user's perspective, terminals sucked. You couldn't run what you wanted to run without getting IT support, and it could take months for IT to respond to a request.

Terminals were connected to aging mainframes that couldn't run modern applications at the time (they can now). New applications were usually custom-built, but a gap in communication between users and IT frequently led to problems. Users struggled to articulate their needs, and IT often failed to probe for better specifications, resulting in frequently unusable applications.

Well, at Microsoft Ignite last week, Microsoft announced the Windows 365 Link, which may be the closest thing to a perfect wired (there's no laptop solution yet) terminal with PC-like features and performance.

While we call the class a thin client, Microsoft calls this a Cloud PC. At $349 and the size of a micro-PC, it appears to be the closest we've seen to a near-perfect PC/terminal blend.

Windows 365 Link will be more reliable, cheaper, more secure, and far smaller than most desktop PCs, making it very attractive for IT. At the same time, it connects to a Cloud PC instance, providing the user with a very PC-like experience.

It only targets enterprise accounts right now, mainly because they have the greatest need and the necessary infrastructure. I see this moving to markets like travel, education, government, manufacturing, and other vertical markets with similar needs.
Although it doesn't yet address mobile users, fully deployed 5G and the coming 6G specification should allow future mobile implementations.

Given Microsoft was one of the companies that launched the PC and made terminals obsolete, it seems ironic and poetic that Microsoft is taking the lead in eventually making PCs obsolete. We'll see if that happens. For now, the Windows 365 Link is my Product of the Week.