Intel Xeon Max 9480/9468 Show Significant Uplift In HPC & AI Workloads With HBM2e

Written by Michael Larabel in Processors on 28 June 2023 at 05:00 PM EDT. Page 2 of 7. 26 Comments.
OpenFOAM benchmark with settings of Input: drivaerFastback, Large Mesh Size, Execution Time. Xeon Max 9480 2P: HBM Only was the fastest.

With OpenFOAM, the leading open-source computational fluid dynamics (CFD) software, there were massive time savings and a significant leap in performance in HBM-only mode. These results show the significant potential of Intel's HBM2e-enabled server processors when the dataset/workload fits within the 64GB of HBM2e per socket.

The IPMI-reported power consumption of the Supermicro server was similar across the tested modes while running OpenFOAM CFD.

OpenFOAM benchmark with settings of Input: drivaerFastback, Medium Mesh Size, Mesh Time. Xeon Max 9468 2P: HBM Only was the fastest.
OpenFOAM benchmark with settings of Input: drivaerFastback, Medium Mesh Size, Execution Time. Xeon Max 9480 2P: HBM Only was the fastest.

The OpenFOAM benefits from Xeon Max are very substantial and interesting to see for this open-source CFD solution. That said, with the Xeon Max 9480 topping out at 56 cores, it will be interesting in a future article to look at how much benefit the HBM2e provides relative to higher core count Sapphire Rapids (non-Max) processors or the competition.

NAS Parallel Benchmarks benchmark with settings of Test / Class: SP.C. Xeon Max 9480 2P: HBM Only was the fastest.

The Xeon Max processors also enjoyed a nice lift in performance-per-Watt thanks to the HBM2e memory.
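As a rough illustration of how a performance-per-Watt figure is derived for a throughput-style result, here is a minimal Python sketch. The function names and the numbers are placeholders made up for the arithmetic, not measurements from this article.

```python
def perf_per_watt(throughput: float, avg_watts: float) -> float:
    """Throughput (higher is better) divided by average power draw in Watts."""
    if avg_watts <= 0:
        raise ValueError("average power must be positive")
    return throughput / avg_watts


def relative_gain_percent(new: float, old: float) -> float:
    """Percentage change of `new` relative to `old`."""
    return (new / old - 1.0) * 100.0


# Placeholder inputs only: NOT results from the benchmarks on this page.
ddr5_only = perf_per_watt(throughput=1000.0, avg_watts=500.0)
hbm_only = perf_per_watt(throughput=1500.0, avg_watts=500.0)

print(f"Illustrative perf-per-Watt change: "
      f"{relative_gain_percent(hbm_only, ddr5_only):.1f}%")
```

When the average power is about the same between modes, as in this made-up case, the performance-per-Watt gain simply tracks the raw performance gain.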

miniBUDE benchmark with settings of Implementation: OpenMP, Input Deck: BM1. Xeon Max 9468 2P: HBM Only was the fastest.
miniBUDE benchmark with settings of Implementation: OpenMP, Input Deck: BM2. Xeon Max 9468 2P: HBM Only was the fastest.
nekRS benchmark with settings of Input: TurboPipe Periodic. Xeon Max 9468 2P: HBM Only was the fastest.

In many of the common HPC workloads I benchmark at Phoronix, the Xeon Max processors showed a significant advantage when making use of the HBM2e memory, provided the workload fits within the 128GB of HBM2e across the two sockets, or a bit more than 1GB per core.
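The per-core figure follows from simple arithmetic, sketched below in Python. The 64GB per socket and 56 cores for the Xeon Max 9480 come from this article. The 48-core count for the Xeon Max 9468 is taken from Intel's published specifications rather than this page, and the helper names are hypothetical.

```python
HBM_PER_SOCKET_GB = 64  # HBM2e capacity per Xeon Max socket
SOCKETS = 2             # the tested servers were 2P configurations


def hbm_per_core_gb(cores_per_socket: int) -> float:
    """HBM2e capacity available per physical core."""
    return HBM_PER_SOCKET_GB / cores_per_socket


def fits_in_hbm(working_set_gb: float, sockets: int = SOCKETS) -> bool:
    """Rough check only: ignores OS and runtime overhead, which also
    live in HBM when running in HBM-only mode."""
    return working_set_gb <= sockets * HBM_PER_SOCKET_GB


# The 9468 core count is an assumption not stated on this page.
for name, cores in (("Xeon Max 9480", 56), ("Xeon Max 9468", 48)):
    print(f"{name} 2P: {hbm_per_core_gb(cores):.2f} GB of HBM2e per core")
```

Under these assumptions the sketch prints roughly 1.14 GB per core for the 9480 and 1.33 GB per core for the 9468, in line with the "a bit more than 1GB per core" figure.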

In HBM-only mode, the overall AC system power draw as monitored via IPMI showed some power savings thanks to not having to power the sixteen DDR5 DIMMs.

But when engaging the HBM-only mode, there was also slightly higher CPU power use (monitored via the RAPL/PowerCap sysfs interface) for many of the HPC benchmarks, so the net reduction is not as great as removing sixteen DDR5 DIMMs outright would suggest.
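For readers unfamiliar with the RAPL/PowerCap interface, here is a minimal Python sketch of how CPU package power can be sampled from sysfs on Linux. It is not the monitoring code used for these benchmarks. It assumes the package zones appear as `intel-rapl:N` directories under `/sys/class/powercap`, and that `energy_uj` is readable, which on recent kernels typically requires root.

```python
import re
import time
from pathlib import Path

RAPL_ROOT = Path("/sys/class/powercap")


def average_watts(start_uj: int, end_uj: int, seconds: float,
                  max_range_uj: int) -> float:
    """Average power from two cumulative energy readings in microjoules."""
    delta = end_uj - start_uj
    if delta < 0:  # the counter wrapped around between the two samples
        delta += max_range_uj
    return delta / 1e6 / seconds


def package_zones(root: Path = RAPL_ROOT) -> list:
    """Top-level package zones only (intel-rapl:0, intel-rapl:1, ...)."""
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir()
                  if re.fullmatch(r"intel-rapl:\d+", p.name))


def read_uj(zone: Path, filename: str) -> int:
    return int((zone / filename).read_text())


if __name__ == "__main__":
    zones = package_zones()
    if not zones:
        print("No RAPL package zones found under", RAPL_ROOT)
    try:
        before = [read_uj(z, "energy_uj") for z in zones]
        t0 = time.monotonic()
        time.sleep(1.0 if zones else 0.0)
        elapsed = time.monotonic() - t0
        for zone, start in zip(zones, before):
            watts = average_watts(start, read_uj(zone, "energy_uj"), elapsed,
                                  read_uj(zone, "max_energy_range_uj"))
            print(f"{zone.name}: {watts:.1f} W")
    except OSError as err:  # e.g. PermissionError when not running as root
        print("Could not read RAPL counters:", err)
```

Because RAPL only covers the CPU package (and, where exposed, DRAM) domains, its readings are not directly comparable to the IPMI-reported AC system power, which is why the two measurements can move in different directions.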

