The Nvidia H200 Tensor Core GPU is the first GPU to use HBM3e memory, packing 141GB and delivering 4.8TB/s of memory bandwidth, nearly double the capacity and 2.4x the bandwidth of its predecessor, the Nvidia A100. Put together in an NVIDIA HGX H200 system, the GPUs provide up to 1.1TB of HBM3e memory, a substantial jump over the previous HGX platform.
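For readers who want to see where the 1.1TB figure comes from, a quick back-of-the-envelope calculation is shown below. This is a minimal sketch, not vendor-supplied code: the per-GPU numbers are the 141GB and 4.8TB/s cited above, and the eight-GPU board is assumed from the eight-way HGX configuration mentioned later in this article; the aggregate bandwidth line is simply the per-GPU figure multiplied out for illustration.

```python
# Back-of-the-envelope check of the aggregate HGX H200 figures quoted above.
PER_GPU_MEMORY_GB = 141       # HBM3e capacity per H200 (from the article)
PER_GPU_BANDWIDTH_TBS = 4.8   # HBM3e bandwidth per H200 (from the article)
GPUS_PER_BOARD = 8            # assumed eight-way HGX H200 configuration

total_memory_tb = PER_GPU_MEMORY_GB * GPUS_PER_BOARD / 1000
total_bandwidth_tbs = PER_GPU_BANDWIDTH_TBS * GPUS_PER_BOARD

print(f"Aggregate HBM3e capacity:   ~{total_memory_tb:.1f} TB")    # ~1.1 TB
print(f"Aggregate memory bandwidth:  {total_bandwidth_tbs:.1f} TB/s (illustrative)")
```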
The Nvidia H200 can also be paired with an NVIDIA Grace CPU over the ultra-fast NVLink-C2C interconnect to create the GH200 Grace Hopper Superchip with HBM3e, aimed at giant-scale HPC and AI applications.
Nvidia says H200-powered systems from server manufacturers and cloud service providers are expected to begin shipping in the second quarter of 2024, while server boards in four- and eight-way configurations will come from server makers including ASRock Rack, ASUS, Dell Technologies, Eviden, GIGABYTE, Hewlett Packard Enterprise, Ingrasys, Lenovo, QCT, Supermicro, Wistron and Wiwynn.
Nvidia was keen to note that Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be among the first cloud service providers to deploy H200-based instances starting next year, in addition to CoreWeave, Lambda, and Vultr.
“To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory. With NVIDIA H200, the industry’s leading end-to-end AI supercomputing platform just got faster to solve some of the world’s most important challenges,” said Ian Buck, vice president of hyperscale and HPC at NVIDIA.