Nvidia unveils a new GPU architecture designed for AI data centers

News Analysis
Mar 22, 2022 | 4 mins
Data Center

Nvidia’s H100 GPU is the first in its new family of Hopper processors that the company claims will underpin the world’s fastest supercomputer.

[Image: Nvidia Hopper GPU. Credit: Nvidia]

While the rest of the computing industry struggles to get to one exaflop of computing, Nvidia is about to blow past everyone with an 18-exaflop supercomputer powered by a new GPU architecture.

The H100 GPU has 80 billion transistors (the previous generation, Ampere, had 54 billion) with nearly 5TB/s of external connectivity and support for PCIe Gen5, as well as High Bandwidth Memory 3 (HBM3) enabling 3TB/s of memory bandwidth, the company says. It is the first in a new family of GPUs codenamed “Hopper,” after Rear Admiral Grace Hopper, the computing pioneer whose work led to COBOL and who popularized the term “computer bug.” It is due in the third quarter.

This GPU is meant to power data centers designed to handle heavy AI workloads, and Nvidia claims that 20 of them could sustain the equivalent of the entire world’s Internet traffic.

Hopper also comes with the second generation of Nvidia’s Secure Multi-Instance GPU (MIG) technology, which allows a single GPU to be partitioned to support security in multi-tenant use. The key change with the H100 is that MIG instances are now fully isolated with I/O virtualization, and each instance is independently secured with confidential-computing capabilities.

Previously, researchers with smaller workloads had to rent a full A100 instance from a cloud service provider (CSP) to get isolation. With the H100, they can use MIG to securely isolate a portion of a GPU and be assured that their data is secure.

“Now this computing power can be securely divided between different users and cloud tenants,” said Paresh Kharya, Nvidia’s senior director of data center computing, on the pre-briefing call. “That’s seven times the MIG capabilities of the previous generation.”

New to the H100 is a feature called confidential computing, which protects AI models and customer data while they are being processed. Kharya noted that sensitive data is often encrypted at rest and in transit over the network but is frequently left unprotected during use. Confidential computing addresses this gap by protecting data in use, he said.

Hopper also has the fourth-generation NVLink, Nvidia’s high-speed interconnect technology. Combined with a new external NVLink Switch, the new NVLink can connect up to 256 H100 GPUs at nine times the bandwidth of the previous generation.

Finally, Hopper adds new DPX instructions to accelerate dynamic programming, the technique of breaking down problems of combinatorial complexity into simpler subproblems. It is employed in a wide range of algorithms used in genomics and graph optimization. Hopper’s DPX instructions will accelerate dynamic programming by seven times, Kharya said.
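To see what dynamic programming looks like in practice, here is a minimal sketch of Levenshtein edit distance, the kind of sequence-comparison kernel used in genomics workloads. This is purely illustrative of the technique the DPX instructions target; it does not use or represent Nvidia’s instruction set.

```python
def edit_distance(a: str, b: str) -> int:
    """Break the full comparison into subproblems: the distance between
    prefixes a[:i] and b[:j], filled in one table cell at a time."""
    m, n = len(a), len(b)
    # dp[i][j] = edit distance between a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i              # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j              # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]

print(edit_distance("GATTACA", "GCATGCU"))  # a small genomic-style comparison
```

Each cell of the table reuses the answers to three smaller subproblems, which is exactly the overlapping-subproblem structure that makes dynamic programming a candidate for hardware acceleration.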

Promise of the fastest supercomputer

Pieced together, this technology will be used to create Nvidia DGX H100 systems, 5U rack-mounted units that serve as the building blocks for powerful DGX SuperPOD supercomputers.

Kharya said the new DGX H100 will offer 32 petaflops of AI performance, six times more than the DGX A100 currently on the market. Combined with the NVLink Switch system, 32 nodes form a DGX SuperPOD that will offer one exaflop of AI performance, along with a bisection bandwidth of 70 terabytes per second, 11 times higher than that of the DGX A100 SuperPOD.

To show off the H100’s capabilities, Nvidia is building a supercomputer called Eos with 18 DGX H100 SuperPODs comprising 4,608 H100 GPUs joined by fourth-generation NVLink and InfiniBand switches, for a total of 18 exaflops of AI performance. To put that in perspective, according to the most recent Top500 list of supercomputers, the peak 8-bit performance of the fastest supercomputer, Fugaku, reaches four exaflops; Nvidia is promising more than four times that.
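The quoted figures hang together as a back-of-envelope calculation. The sketch below uses the numbers from the article plus one assumption not stated in it, that each DGX H100 node houses eight GPUs:

```python
# Back-of-envelope check of the Eos figures quoted above.
# gpus_per_node = 8 is an assumption (standard DGX configuration),
# not a number stated in this article.
gpus_per_node = 8
nodes_per_superpod = 32      # per the article: a 32-node DGX SuperPOD
superpods = 18               # Eos is built from 18 SuperPODs

total_gpus = superpods * nodes_per_superpod * gpus_per_node
print(total_gpus)            # 4608, matching the article's GPU count

node_pflops = 32             # DGX H100: 32 petaflops of AI performance
superpod_eflops = nodes_per_superpod * node_pflops / 1000
print(superpod_eflops)       # roughly 1 exaflop per SuperPOD
print(superpods * superpod_eflops)  # roughly 18 exaflops for Eos
```

The per-SuperPOD product comes out slightly above one exaflop (1,024 petaflops), which is consistent with the article’s rounded "one exaflop" and "18 exaflops" figures.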

Eos will provide bare-metal performance with multi-tenant isolation, as well as performance isolation to ensure that one application does not impact any other, said Kharya.

“Eos will be used by our AI research teams, as well as by numerous other software engineers and teams who are creating our products, including our autonomous vehicle platform and conversational AI software,” he said.

Nvidia did not offer a timeline for when Eos would be deployed. DGX H100 PODs and SuperPODs are expected later this year.

Andy Patrizio is a freelance journalist based in southern California who has covered the computer industry for 20 years and has built every x86 PC he’s ever owned, laptops not included.

The opinions expressed in this blog are those of the author and do not necessarily represent those of ITworld, Network World, its parent, subsidiary or affiliated companies.