Intel Advanced Matrix Extensions [AMX] Performance With Xeon Scalable Sapphire Rapids

Written by Michael Larabel in Processors on 16 January 2023 at 04:00 PM EST. Page 4 of 6.

The oneDNN 3.0 benchmarks showed how BF16 performance can benefit from Intel AMX for this low-level open-source library, but what about at a higher level for AI with OpenVINO? That's the next portion of today's benchmarks providing this early look at the Advanced Matrix Extensions performance potential. The open-source OpenVINO 2022.3 release was used for benchmarking, again making use of oneDNN's environment variable for CPU dispatch control of different ISA levels. As with oneDNN and the test configurations I've already been benchmarking for years, the OpenVINO testing was done with models already in use and didn't undergo any special optimizations or other changes compared to how OpenVINO is normally benchmarked.
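For readers wanting to reproduce this kind of A/B comparison, oneDNN's documented `ONEDNN_MAX_CPU_ISA` environment variable caps the highest instruction set the library will dispatch to, which is how AMX can be excluded without recompiling anything. Below is a minimal sketch of driving a workload at two ISA levels; the `benchmark_app` invocation and model path shown in the comments are hypothetical placeholders, not the exact commands used for this article.

```python
import os
import subprocess

# oneDNN reads ONEDNN_MAX_CPU_ISA at library load time to cap the highest
# ISA level its CPU dispatcher may select. The value names below come from
# oneDNN's documented CPU dispatcher controls.
ISA_LEVELS = {
    "avx512": "AVX512_CORE_BF16",  # AVX-512 with BF16, AMX excluded
    "amx":    "AVX512_CORE_AMX",   # additionally allow AMX tile instructions
}

def run_benchmark(isa_key: str, command: list[str]) -> None:
    """Launch a workload with oneDNN capped at the given ISA level."""
    env = dict(os.environ, ONEDNN_MAX_CPU_ISA=ISA_LEVELS[isa_key])
    subprocess.run(command, env=env, check=True)

# Hypothetical usage with OpenVINO's benchmark_app and a face detection model:
# run_benchmark("avx512", ["benchmark_app", "-m", "face-detection.xml"])
# run_benchmark("amx",    ["benchmark_app", "-m", "face-detection.xml"])
```

Running the same command twice with only this variable changed isolates the AMX contribution, which is the methodology the results on this page reflect.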

With the OpenVINO 2022.3 release and the face detection model, the throughput nearly doubled thanks to AMX compared to just using AVX-512...

With this particular configuration the latency was roughly the same.

Like with the oneDNN 3.0 benchmarks, the combined power consumption of the two Xeon Platinum 8490H processors was lower when using AMX than with AVX-512 alone. The out-of-the-box OpenVINO 2022.3 run averaged 615 Watts with AMX in use compared to around 640 Watts when just engaging AVX-512.

Again, this translated into some minor CPU thermal/cooling benefits with AMX as well.

With the weld porosity model, the AMX-enabled run saw the performance go up by 2.6x compared to only using AVX-512 on Sapphire Rapids...

The latency was also more than halved with AMX in use.
