
Is Jensen Huang Nvidia's Chief Revenue Destruction Officer?
Nvidia CEO Jensen Huang delivers a keynote address during the Nvidia GTC Conference at SAP Center in San Jose, California. (Photo by Justin Sullivan/Getty Images)

At this year's GTC event in San Jose, Nvidia CEO Jensen Huang held over 25,000 people in the palm of his hand, captivated by his vision of AI and how it could transform the world we live in. Some folks in the audience couldn't keep up and started fiddling with their phones. (Among many other semiconductor vendors in the AI space, Nvidia is a client of my firm, Cambrian-AI Research.)

But most were hanging on every word to a) understand the potential of AI going forward, b) get a feel for the sustainability of AI demand, and c) assess whether Nvidia is moving too fast for its customers. Moving to a yearly technology cadence can help one stay ahead of slower competitors. Still, it can also make it difficult for customers to adopt the latest tech when they have just bought a trainload of the previous generation.

Jensen half-jokingly said nobody should buy a Hopper GPU now that Blackwell is in full production, playing the role of what he called the company's "Chief Revenue Destruction Officer." He noted the frustration his sellers would feel upon hearing his advice. Investors did not appreciate his sarcasm either, and the stock is down 2.6% since GTC25 kicked off. However, according to Jensen, AI inference requires 100 times more computing now than it did a year ago, thanks mainly to the introduction of "reasoning" and agentic AI.

Is Nvidia Blackwell So Hot That Nobody Wants Hopper?

While Blackwell is amazingly fast, with up to 40 times more tokens per second for inference, it also requires significant data center power and water-cooling upgrades for the AI Factories Jensen is pushing. However, Hopper may be just fine for many AI developers for now. Nvidia has already shipped nearly three times as many Blackwell GPUs as all the Hopper chips it shipped in 2024. There is no doubt now that the $11B of Blackwell-based systems Nvidia shipped in Q4 is the start of a new demand cycle. The next system, Blackwell Ultra, ships later this year and can plug right into the Oberon Blackwell NVL72 chassis, so that revenue transition should be much smoother.

Moving to an annual product cycle creates tension, but disclosing the roadmap is essential to prepare the supply chain and ecosystem for the next two years of innovation. No data center today can power the 800-kilowatt rack that Rubin Ultra will require, but operators will need to power and cool the beast if they want to remain competitive when Nvidia starts shipping its future racks.

In any case, Jensen told everyone a year ago how fast Blackwell would be, so sellers and buyers should not be surprised. Back then he said 30X; the new Dynamo software accounts for the jump to 40X.

With the new Dynamo inference management software, Blackwell is 40 times faster than Hopper. (Nvidia)

Nvidia Dynamo: Another Defensive Moat?

While the rest of the industry is trying (unsuccessfully) to match Nvidia's GPU performance, Nvidia is optimizing the entire AI factory. The new Dynamo "AI Factory OS" and co-packaged optical networking are two examples.

Nvidia has wholly rewritten its Triton inference software to manage and optimize the distributed inference processing needed to perform agentic and reasoning AI. Nvidia Dynamo is an open-source, modular inference framework for serving generative AI models in distributed environments. Think of it as the OS, or the Kubernetes, of an AI factory. It enables scaling inference workloads across large GPU fleets (potentially millions of GPUs) with dynamic resource scheduling, intelligent request routing, optimized memory management, and accelerated data transfer. The central memory-management concept in Dynamo is Key-Value Cache (KV Cache) distribution, which lets the entire AI factory access previous reasoning results and serve them up instantly, avoiding costly recalculations.

When serving the open-source DeepSeek-R1 671B reasoning model on Nvidia GB200 NVL72, Nvidia Dynamo increased the number of requests served by up to 30x, helping AI factories lower costs and maximize token revenue generation.
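The routing piece of that design is easy to illustrate. Below is a minimal Python sketch of KV-cache-aware request routing, the idea behind Dynamo's "intelligent request routing." Every class and function name here is a hypothetical illustration of mine, not the actual Dynamo API.

```python
# Toy sketch of KV-cache-aware request routing. All names are
# hypothetical illustrations, not the actual Nvidia Dynamo API.
from dataclasses import dataclass, field

@dataclass
class Worker:
    name: str
    load: int = 0                                        # in-flight requests
    cached_prefixes: set = field(default_factory=set)    # fingerprints of KV-cached prompt prefixes

class KVAwareRouter:
    def __init__(self, workers):
        self.workers = workers

    def route(self, prompt: str) -> Worker:
        prefix_key = hash(prompt[:256])  # stand-in for a real prefix fingerprint
        # Prefer a worker that already holds the KV cache for this prefix,
        # so the expensive prefill (prior context and reasoning tokens)
        # is not recomputed from scratch.
        hits = [w for w in self.workers if prefix_key in w.cached_prefixes]
        target = min(hits or self.workers, key=lambda w: w.load)
        target.load += 1
        target.cached_prefixes.add(prefix_key)
        return target

workers = [Worker("gpu-node-0"), Worker("gpu-node-1")]
router = KVAwareRouter(workers)
print(router.route("Summarize this 100-page filing ...").name)  # least-loaded worker
print(router.route("Summarize this 100-page filing ...").name)  # same worker again: KV cache hit
```

The design choice worth noting is the fallback order: a cache hit beats raw load balancing, because skipping prefill recomputation usually saves more GPU time than a slightly busier worker costs.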
Co-Packaged Optical Photonic Scale-Out

My colleague Jim McGregor of Tirias Research has already covered this innovation in Forbes, so I won't belabor the point. The bottom line is that co-packaged optics were not expected to be ready from any vendor for another couple of years. Now Nvidia will ship CPO networking for scale-out to millions of GPUs later this year.

The new and unexpected Nvidia Photonics portfolio, for both Ethernet and InfiniBand scale-out networking. (Nvidia)

The Updated Nvidia Hardware Roadmap

Here's the new Nvidia GPU lineup through 2028. The annual increase in computing power, memory, and networking should be awe-inspiring, but the audience at GTC seemed to expect nothing less from the company that practically reinvented computing.

While it took Jensen Huang over two hours to get there, here's the new Nvidia chip roadmap. (Nvidia)

First up is Blackwell Ultra, due later this year.

Blackwell Ultra. (Nvidia)

After another year, in late 2026, Rubin shows up in the same Oberon rack, meaning Nvidia is delivering three generations of GPUs into the same rack infrastructure. At this point, the naming changes to identify the number of GPU dies in the NVLink-connected rack (144) instead of the number of dual-die GPU packages (72), all in preparation for Rubin Ultra. Yeah, it's confusing, but it was the right long-term marketing call.

Vera Rubin NVL144: it's not twice the GPUs. Nvidia has changed its naming convention for multi-die packages; by this nomenclature, Blackwell is also 144 GPUs in a rack. (Nvidia)

Rubin Ultra will be a huge step up in density: 576 GPUs in a single Kyber rack, with four full-sized GPU dies per package, a new Arm-based Vera CPU, and the new NVLink7 with twelve times the throughput. This rack will deliver some 15 exaflops of FP4 performance (14 times that of the next Blackwell, the GB300) and 4.6 petabytes of HBM4e memory.

Rubin Ultra NVL576. (Nvidia)
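To put those rack-level numbers in perspective, here is a quick back-of-the-envelope calculation using only the figures quoted above. The per-die and per-watt results are my own derived estimates, not numbers Nvidia has published.

```python
# Back-of-the-envelope math on the Rubin Ultra NVL576 figures quoted above.
# Inputs are the keynote's claims; the derived values are rough estimates.
rack_fp4_exaflops = 15.0   # claimed FP4 compute per Kyber rack
gpu_dies_per_rack = 576    # per-die counting, per the new naming convention
rack_power_kw = 800        # projected rack power draw
hbm4e_petabytes = 4.6      # total HBM4e per rack

fp4_per_die = rack_fp4_exaflops * 1000 / gpu_dies_per_rack      # exa -> peta
hbm_per_die = hbm4e_petabytes * 1000 / gpu_dies_per_rack        # peta -> tera
fp4_per_watt = rack_fp4_exaflops * 1e6 / (rack_power_kw * 1e3)  # exa -> tera, kW -> W

print(f"~{fp4_per_die:.0f} PFLOPS FP4 per GPU die")   # ~26 PFLOPS
print(f"~{hbm_per_die:.1f} TB HBM4e per GPU die")     # ~8.0 TB
print(f"~{fp4_per_watt:.0f} TFLOPS FP4 per watt")     # ~19 TFLOPS/W
```

Even as rough estimates, these figures make it clear why power delivery and cooling, not silicon, become the gating factors for Rubin-class racks.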
I think the chart below says it all. Rubin will be 900 times faster than last year's Hopper, at 3% of the TCO. The industry has never seen anything like this pace of performance increase in two years.

Rubin will be 900 times faster than Hopper. Mind blown. (Nvidia)

Adding in the new 1-petaflop DGX Spark, the 20-petaflop DGX Station, and more, this is this year's Nvidia enterprise AI lineup. Note that all of these products are available from Nvidia partners.

The Nvidia lineup now includes AI PCs starting at $4,000. (Nvidia)

Nvidia: The Foundational AI Company

When asked at a small-group analyst session what Nvidia is becoming, Jensen said the company is already transitioning to its future place in the industry as THE foundational AI company. Jensen showed slides from many large enterprise clients that made this point; each had green icons marking up to a dozen Nvidia Inference Microservices (NIM) modules and Nvidia hardware embedded in their AI stacks. Once those green icons are in there, they won't come out easily.

Nvidia shared the AI stacks of eight enterprise companies, demonstrating how embedded Nvidia technology has become. (Nvidia)

While Nvidia will certainly face increasing competition from its hyperscale customers and other chip vendors, I reiterate my worldview: while the rest of the industry is just trying (and failing) to match Nvidia's GPU performance, Nvidia is completely revamping the industry landscape. I haven't even touched on all the new software Jensen announced in his keynote, nor the new AI PCs and workstations that blow every other AI PC out of the water by some one to two orders of magnitude.