Initial Benchmarks Of The Intel Downfall Mitigation Performance Impact

Written by Michael Larabel in Software on 9 August 2023 at 04:00 PM EDT. Page 3 of 5. 46 Comments.

Aside from Intel's own oneAPI software components exhibiting slower performance, some AI workloads were also showing slower performance as a result of the Downfall microcode mitigation.

Neural Magic DeepSparse benchmark with settings of Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.
Neural Magic DeepSparse benchmark with settings of Model: ResNet-50, Sparse INT8, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.
Neural Magic DeepSparse benchmark with settings of Model: CV Detection, YOLOv5s COCO, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.
Neural Magic DeepSparse benchmark with settings of Model: BERT-Large, NLP Question Answering, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.
Neural Magic DeepSparse benchmark with settings of Model: CV Detection, YOLOv5s COCO, Sparse INT8, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.
Neural Magic DeepSparse benchmark with settings of Model: NLP Text Classification, DistilBERT mnli, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.

Neural Magic's DeepSparse AI inference software, which touts "offering GPU-class performance on CPUs", was impacted by the Downfall microcode. For a number of the models, both throughput and latency were worse off as a result of the new microcode on the two-socket Xeon Platinum 8380 server.

Neural Magic DeepSparse benchmark with settings of Model: BERT-Large, NLP Question Answering, Sparse INT8, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.
Neural Magic DeepSparse benchmark with settings of Model: NLP Text Classification, BERT base uncased SST2, Scenario: Asynchronous Multi-Stream. 0xd000390 was the fastest.

Some of the slowdowns as a result of the new Ice Lake microcode were quite significant.
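For readers wanting to confirm which microcode revision their own system is running before comparing against these results, the revision is exposed per-core in /proc/cpuinfo on x86 Linux. A minimal sketch, run here against a hypothetical sample of that file's output (the sample text is illustrative, not captured from the test system):

```shell
# Hypothetical /proc/cpuinfo excerpt for illustration; on a live x86 Linux
# system you would read the real file instead (see the note below).
cpuinfo_sample="processor : 0
vendor_id : GenuineIntel
model name : Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
microcode : 0xd000390"

# Extract the microcode revision field (first match only).
printf '%s\n' "$cpuinfo_sample" | awk -F': ' '/^microcode/ {print $2; exit}'
```

On a live system the equivalent is `grep -m1 '^microcode' /proc/cpuinfo`; a revision of 0xd000390 here corresponds to the pre-Downfall Ice Lake microcode that was the faster configuration in these benchmarks.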

NCNN benchmark with settings of Target: CPU, Model: vgg16. 0xd000390 was the fastest.
NCNN benchmark with settings of Target: CPU, Model: googlenet. 0xd000390 was the fastest.
NCNN benchmark with settings of Target: CPU, Model: resnet18. 0xd000390 was the fastest.

Another AI software package impacted by yesterday's microcode update was Tencent's NCNN inference framework.

QMCPACK benchmark with settings of Input: simple-H2O. 0xd000390 was the fastest.
QMCPACK benchmark with settings of Input: FeCO6_b3lyp_gms. 0xd000390 was the fastest.

Another area impacted by the Downfall mitigations was the QMCPACK Quantum Monte Carlo code, which also showed measurable slowdowns with the new CPU microcode.
