AMD has a few things in its favor as it takes on Nvidia in the high-performance computing (HPC) and artificial intelligence (AI) arena.

Last week, AMD announced it was ready to take on Nvidia in the GPU space for the data center, a market the company had basically ignored for the last several years in its struggle just to survive. But now, buoyed by its new CPU business, AMD is ready to take the fight to Nvidia.

It would seem a herculean task. Or perhaps a quixotic one. Nvidia has spent the past decade tilling the soil for artificial intelligence (AI) and high-performance computing (HPC), but it turns out AMD has a few things in its favor.

For starters, it has both a CPU and a GPU business, and it can tie them together in a way Nvidia and Intel cannot. Yes, Intel has a GPU product line, but those GPUs are integrated into its consumer CPUs, not its Xeons. And Nvidia has no x86 line.

AMD's next-generation Epyc server processors

AMD is preparing the next generation of its Epyc server processors under the codename "Rome," and they look like monsters:

- A 7nm design while Intel is still stuck at 14nm. That means twice as many transistors in the same space as the existing chip and Intel's Xeon, which means better performance.
- 64 cores and 128 threads per socket.
- An I/O die in the middle of each package to handle DDR4, Infinity Fabric, PCIe, and other I/O.
- PCIe Gen4 support, providing twice the bandwidth of PCIe 3.
- Greatly improved Infinity Fabric speeds, enabling inter-chip and memory communication.
- Most important, the ability to connect GPUs to CPUs and do inter-GPU communication through the CPU.

The design of Epyc 2 is actually eight "chiplets" with eight cores each, connected by the fabric, with the I/O die sitting in between the chiplets. Communication between the CPU and GPU, however, is done over PCI Express 4, which is not as fast as the fabric but still mighty quick, and it gives AMD the advantage.
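The "twice the bandwidth" claim for PCIe Gen4 is easy to sanity-check with back-of-the-envelope math. A short Python sketch, assuming the standard per-lane transfer rates (8 GT/s for Gen3, 16 GT/s for Gen4) and the 128b/130b line encoding both generations use:

```python
# Rough usable bandwidth of an x16 slot under PCIe Gen3 vs. Gen4.
LANES = 16  # a typical x16 GPU slot

def x16_bandwidth_gb_s(transfer_rate_gt_s: float) -> float:
    """Approximate one-direction bandwidth of an x16 link in GB/s."""
    encoding_efficiency = 128 / 130  # 128b/130b line coding overhead
    # Each transfer carries 1 bit per lane; divide by 8 to get bytes.
    return transfer_rate_gt_s * encoding_efficiency * LANES / 8

gen3 = x16_bandwidth_gb_s(8.0)   # PCIe 3.0: 8 GT/s per lane
gen4 = x16_bandwidth_gb_s(16.0)  # PCIe 4.0: 16 GT/s per lane
print(f"Gen3 x16: {gen3:.2f} GB/s, Gen4 x16: {gen4:.2f} GB/s")
```

Because the encoding overhead is unchanged between the two generations, doubling the per-lane transfer rate doubles usable bandwidth exactly, to roughly 31.5 GB/s per direction on an x16 link.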
One thing I have learned is that AMD is not at so great a disadvantage after all. It turns out that although Nvidia has the CUDA language and a huge support base, neither CUDA nor any other proprietary language is needed to bring GPUs to bear.

"If you are just walking into the market for the first time looking to develop some AI algorithm, then you're either going to try and grab some software or write your own. If you write your own, then you use whatever language you are most comfortable in," said Jon Peddie, president of Jon Peddie Research, which follows the graphics market. Google's TensorFlow, for example, is written in C/C++ and Python, he noted. AI training apps use CUDA because the developers knew it, not because it was necessary.

Nvidia's advantages

One advantage Nvidia does have is container technology that takes code written in one language and translates it into a language Nvidia's hardware understands. "As far as I know, AMD doesn't have a container," said Peddie.

Nvidia has other technological advantages as well. It put Tensor cores in the new Turing generation of GPUs to offer basic matrix math engines, much as Google's Tensor processor does. That makes the Turing generation well suited for matrix math, the foundational math in AI training.

Peddie also noted that Nvidia has mindshare. Its status rivals that of Intel, and it could be argued Nvidia has eclipsed Intel. Nvidia shareholders would certainly agree.

AMD's "biggest challenge is the challenge they've always faced: Can they market? Nvidia is one of the most powerful brands you've ever heard of, up there with Sony and Apple," Peddie said. AMD has competitive GPUs, but as Peddie put it, "they got the ammo. They need to figure out how to pull the trigger."
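The two points above, that training code needs no proprietary language and that training boils down to matrix math, can be illustrated with a minimal sketch in plain Python/NumPy. This is illustrative only, not code from any vendor's toolkit: one forward and backward pass of a single dense layer, where every heavy step is a matrix multiply, exactly the operation Tensor cores and similar matrix engines accelerate.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 64))        # batch of 32 inputs, 64 features each
W = rng.standard_normal((64, 10)) * 0.1  # layer weights (64 in, 10 out)
y_true = rng.standard_normal((32, 10))   # targets for this toy regression

# Forward pass: one matrix multiply.
y_pred = x @ W

# Mean-squared-error loss, and its gradient w.r.t. W -- another matrix multiply.
loss = ((y_pred - y_true) ** 2).mean()
grad_W = x.T @ (y_pred - y_true) * (2 / y_pred.size)

# One gradient-descent step; repeating this loop is "training."
W -= 0.01 * grad_W
```

Nothing here is CUDA-specific; the same matrix multiplies can be dispatched to any GPU backend, which is why a matrix-math engine, not a particular language, is the real hardware requirement.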