Nvidia released a video on Thursday unveiling the architecture of Eos, its latest enterprise-focused supercomputer, built for large-scale AI development in data centers. Eos is Nvidia’s fastest AI supercomputer and currently ranks 9th on the Top500 list of supercomputers, which is ordered by FP64 performance; it likely leads in pure AI tasks. The machine, which Nvidia uses internally, offers a look at the technology driving the company’s AI research and development.
At its core, Eos is powered by 576 DGX H100 systems, each featuring eight Nvidia H100 GPUs optimized for artificial intelligence and high-performance computing workloads. With a total of 1,152 Intel Xeon Platinum 8480C processors (56 cores per CPU) and 4,608 H100 GPUs, Eos achieves an Rmax of 121.4 FP64 petaFLOPS for HPC tasks and 18.4 FP8 exaFLOPS for AI tasks.
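The totals in this paragraph can be cross-checked from the per-system figures. The following is a minimal sketch of that arithmetic, not anything published by Nvidia; the constant and function names are illustrative, and the per-GPU FP8 value is simply the quoted aggregate divided evenly across the GPUs.

```python
# Figures as stated in the article.
DGX_SYSTEMS = 576
GPUS_PER_SYSTEM = 8
CPU_COUNT = 1152
CORES_PER_CPU = 56
FP8_EXAFLOPS = 18.4


def eos_totals() -> dict:
    """Derive aggregate counts from the per-system figures."""
    gpus = DGX_SYSTEMS * GPUS_PER_SYSTEM
    return {
        "gpus": gpus,
        "cpus_per_system": CPU_COUNT // DGX_SYSTEMS,
        "cpu_cores": CPU_COUNT * CORES_PER_CPU,
        # 1 exaFLOPS = 1,000 petaFLOPS; assumes an even split across GPUs.
        "fp8_petaflops_per_gpu": FP8_EXAFLOPS * 1000 / gpus,
    }


if __name__ == "__main__":
    for name, value in eos_totals().items():
        print(f"{name}: {value}")
```

The derived GPU count matches the 4,608 stated above, the CPU count implies two processors per DGX H100 system, and the quoted FP8 figure works out to roughly 4 petaFLOPS per GPU.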
Eos is designed for AI workloads and scalability, using Nvidia’s Mellanox Quantum-2 InfiniBand with In-Network Computing technology to deliver data transfer speeds of up to 400 Gb/s. That bandwidth is crucial for training large AI models efficiently and for scaling the system.
Complementing its hardware, Eos runs software tailored for AI development and deployment, which broadens its use across applications from generative AI models like ChatGPT to AI factories. The integrated software stack includes AI-specific tools for orchestration and cluster management; accelerated compute, storage, and network libraries; and an AI-optimized operating system.
Drawing on the expertise gained from previous Nvidia DGX supercomputers like Saturn 5 and Selene, Eos exemplifies Nvidia’s proficiency in AI technology. Described as an “AI factory,” Eos is meant to let enterprises tackle ambitious projects and realize their AI goals, both now and in the future.
The exact cost of Eos remains undisclosed, and pricing for Nvidia’s DGX H100 systems is confidential and depends on factors such as volume. Given that each Nvidia H100 GPU may cost $30,000 to $40,000 depending on volume, Eos represents a significant investment in cutting-edge AI infrastructure.
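For a sense of scale, the per-GPU price range can be multiplied across the 4,608 GPUs. This is only a rough sketch under stated assumptions: it covers the GPUs alone (no CPUs, networking, storage, or facilities), it treats the quoted range as applying uniformly to every unit, and the function name is illustrative.

```python
GPU_COUNT = 4608          # H100 GPUs in Eos, per the article
PRICE_LOW_USD = 30_000    # low end of the quoted per-GPU range
PRICE_HIGH_USD = 40_000   # high end of the quoted per-GPU range


def gpu_cost_range(count: int, low: int, high: int) -> tuple[int, int]:
    """Return the (low, high) total cost for `count` GPUs, in USD."""
    return count * low, count * high


if __name__ == "__main__":
    low, high = gpu_cost_range(GPU_COUNT, PRICE_LOW_USD, PRICE_HIGH_USD)
    print(f"GPUs alone: ${low / 1e6:.1f}M to ${high / 1e6:.1f}M")
```

Under these assumptions the GPUs alone would come to roughly $138 million to $184 million, which is an illustration of magnitude and not a reported price.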