IBM says the ESS 3500 storage devices will improve AI training by up to 70%.

IBM has added a new member to its Spectrum Scale Enterprise Storage Server (ESS) portfolio that features a faster controller CPU and more throughput, and that is designed to work with Nvidia’s DGX dense compute servers for AI training.

The new ESS 3500 is a 2U design with 24 drive bays and a maximum raw capacity of 368TB. With LZ4 compression, a first for the series, it can reach an effective capacity of up to 1PB. The ESS 3500 can deliver up to 91GB/s of throughput, up from 80GB/s on the older models.

The 3500 runs Spectrum Scale, IBM’s scale-out parallel file system that spans on-premises, cloud, and edge networks. It uses dual active controllers, each with a 48-core AMD Epyc processor and either 100Gbit Ethernet or 200Gbit HDR InfiniBand ports.

The 3500 directly targets Nvidia’s DGX dense compute systems, which are all GPUs and memory but no storage. It does this through Nvidia’s GPUDirect Storage technology, which creates a direct data path between GPUs and storage via NVMe or NVMe over Fabrics (NVMe-oF).

Normally, data needs to be loaded into the CPU and main memory before being moved to the GPU for processing. GPUDirect allows the system to bypass the CPU and main memory completely and provides a direct connection between storage and GPU memory.

IBM claims that auto parts maker Continental was able to improve AI training time for self-driving vehicles by as much as 70% using IBM Spectrum Scale and the IBM ESS 3500 with a DGX system.

The ESS 3500 is available now.