Following Retbleed, The Combined CPU Security Mitigation Impact For AMD Zen 2 / Ryzen 9 3950X


    Phoronix: Following Retbleed, The Combined CPU Security Mitigation Impact For AMD Zen 2 / Ryzen 9 3950X

    Following the July disclosure of the Retbleed CPU security vulnerability affecting older processors and an AMD change made in August, here is a fresh look at the performance impact of the Retbleed mitigations on Linux, including the cost of opting for the IBPB-based Retbleed mitigation, as well as the accumulated CPU security mitigation impact for Zen 2 with the flagship Ryzen 9 3950X processor.
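
    For reference, the mitigation mode being benchmarked is selected at boot time. A minimal sketch, assuming the standard Linux sysfs and /proc/cmdline interfaces, for checking what the running kernel reports for Retbleed and whether parameters such as `retbleed=ibpb`, `retbleed=off`, or `mitigations=off` were requested:

    ```python
    #!/usr/bin/env python3
    """Print the kernel's reported Retbleed state and the relevant boot parameters."""
    from pathlib import Path

    # The kernel reports "Not affected", "Vulnerable", or the active mitigation here.
    vuln = Path("/sys/devices/system/cpu/vulnerabilities/retbleed")
    print("retbleed:", vuln.read_text().strip() if vuln.exists() else "not reported")

    # Boot parameters such as retbleed=ibpb, retbleed=off or mitigations=off show up here.
    params = [p for p in Path("/proc/cmdline").read_text().split()
              if p.startswith(("retbleed=", "mitigations="))]
    print("boot parameters:", " ".join(params) or "(kernel defaults)")
    ```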

  • #2
    Michael, can you please recheck the results for SVT-AV1 1080p? I find it highly unlikely that nosmt performs better than unmitigated in this one instance.

    • #3
      These mitigations, combined, can have a seriously bad effect at times. One question I have a hard time getting a straight answer to is: what is the real security risk of running a system with `mitigations=off`? Are there actual attacks in the wild that use these vulnerabilities, or is it still theoretical at this point?

      Maybe these mitigations make sense on a laptop that accesses any kind of websites from whichever WiFi network is available. But what about a compute node in a cluster behind a firewall? (or any number of scenarios between those extremes)

      I wish there was a resource available to help judge when the various mitigations are necessary (e.g., remotely exploitable via JS, or via an ssh server, or via local access only), based on actual exploits, not just potential threats.
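
      For what it's worth, the running kernel at least reports per-vulnerability status (though it says nothing about real-world exploitability). A minimal sketch, assuming the standard Linux sysfs layout:

      ```python
      #!/usr/bin/env python3
      """List every CPU vulnerability the running kernel knows about and its state."""
      from pathlib import Path

      for entry in sorted(Path("/sys/devices/system/cpu/vulnerabilities").iterdir()):
          # Each file holds "Not affected", "Vulnerable", or a mitigation description.
          print(f"{entry.name:30s} {entry.read_text().strip()}")
      ```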

      • #4
        Originally posted by Anux View Post
        Michael, can you please recheck the results for SVT-AV1 1080p? I find it highly unlikely that nosmt performs better than unmitigated in this one instance.
        AVX-intensive workloads tend to perform better without SMT. That's probably because a single thread can typically utilize the core's vector pipelines at near full capacity, so SMT would mean having to share cache without much upside.

        The only real data I have on that is here: https://www.anandtech.com/show/16778...eview-part-2/5

        In the SPECint workloads, more threads is always better. In the SPECfp workloads, the EPYC configurations with 1 thread per core always win vs 2 threads on the same CPU(s). However, in the case of Xeons, the extra threads still seem to help (though not as much as in SPECint). I don't know if that says more about Xeons' cache subsystem or their vector ALUs.
        Last edited by coder; 06 September 2022, 12:34 PM.
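
        One way to test that hypothesis without rebooting with nosmt is the runtime SMT switch in sysfs. A rough sketch, assuming root privileges; the ffmpeg invocation is purely a placeholder for whatever SVT-AV1 benchmark command is actually used:

        ```python
        #!/usr/bin/env python3
        """Rough A/B timing of a workload with SMT on vs. off (run as root)."""
        import subprocess, time
        from pathlib import Path

        SMT = Path("/sys/devices/system/cpu/smt/control")
        # Placeholder workload; substitute the real encoding benchmark command.
        BENCH = ["ffmpeg", "-i", "input.mkv", "-c:v", "libsvtav1", "-f", "null", "-"]

        def run(mode: str) -> float:
            SMT.write_text(mode)              # "off" has the same effect as booting with nosmt
            start = time.perf_counter()
            subprocess.run(BENCH, check=True)
            return time.perf_counter() - start

        for mode in ("on", "off"):
            print(f"SMT {mode}: {run(mode):.1f} s")
        SMT.write_text("on")                  # restore SMT when done
        ```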

        • #5
          Originally posted by coder View Post
          AVX-intensive workloads tend to perform better without SMT. That's probably because a single thread can typically utilize the core's vector pipelines at near full capacity, so SMT would mean having to share cache without much upside.

          The only real data I have on that is here: https://www.anandtech.com/show/16778...eview-part-2/5

          In the SPECint workloads, more threads is always better. In the SPECfp workloads, the EPYC configurations with 1 thread per core always win vs 2 threads on the same CPU(s). However, in the case of Xeons, the extra threads still seem to help (though not as much as in SPECint). I don't know if that says more about Xeons' cache subsystem or their vector ALUs.
          Or rather:

          Schedutil once again screwing the results, which is especially common with video transcoding workloads!

          (Looks like schedutil's throughput worsens with increasing thread counts...)
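
          If the governor is the suspect, it is easy to rule out by rerunning under a different one. A minimal sketch, assuming root and the standard cpufreq sysfs interface, that switches every core from schedutil to, say, performance:

          ```python
          #!/usr/bin/env python3
          """Switch the cpufreq governor on all CPUs, e.g. schedutil -> performance (run as root)."""
          import sys
          from pathlib import Path

          governor = sys.argv[1] if len(sys.argv) > 1 else "performance"
          for gov_file in Path("/sys/devices/system/cpu").glob("cpu[0-9]*/cpufreq/scaling_governor"):
              gov_file.write_text(governor)     # takes effect immediately, no reboot needed
              print(gov_file.parent.parent.name, "->", gov_file.read_text().strip())
          ```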

          • #6
            A few gaming graphs would be good. Or is there no real performance impact for games in general?

            • #7
              Originally posted by guspitts View Post
              These mitigations, combined, can have a seriously bad effect at times. One question I have a hard time getting a straight answer to is: what is the real security risk of running a system with `mitigations=off`? Are there actual attacks in the wild that use these vulnerabilities, or is it still theoretical at this point?

              Maybe these mitigations make sense on a laptop that accesses any kind of websites from whichever WiFi network is available. But what about a compute node in a cluster behind a firewall? (or any number of scenarios between those extremes)

              I wish there was a resource available to help judge when the various mitigations are necessary (e.g., remotely exploitable via JS, or via an ssh server, or via local access only), based on actual exploits, not just potential threats.
              You only run local trusted verified code? There's zero need for mitigations.

              You run someone else's code, e.g. JS? Enable them. You have a shared environment, e.g. you provide shared hosting? Enable as much as humanly possible and/or pin virtual guests to certain physical cores, so that different VMs always run on their own dedicated cores.
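
              For the pinning approach, taskset or libvirt's <vcpupin> is the usual route; a minimal sketch using Python's Linux-only affinity call, with a hypothetical PID and core set:

              ```python
              #!/usr/bin/env python3
              """Pin a guest's process to a dedicated set of physical cores (Linux only)."""
              import os, sys

              # Hypothetical layout: cores 0-3 reserved for this guest; on a 3950X their
              # SMT siblings are typically CPUs 16-19, so leave those idle or give them
              # to the same guest only.
              DEDICATED = {0, 1, 2, 3}

              pid = int(sys.argv[1])                  # PID of the VM (e.g. QEMU) process
              os.sched_setaffinity(pid, DEDICATED)    # restrict scheduling to these CPUs
              print("now allowed on CPUs:", sorted(os.sched_getaffinity(pid)))
              ```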

              Originally posted by loganj View Post
              A few gaming graphs would be good. Or is there no real performance impact for games in general?
              Triple-A games are normally GPU-bound. Older games already run fast enough, so it doesn't matter if mitigations are on or off. Just a waste of time. Lastly, there are basically no games for Linux aside from rare indies.

              • #8
                Originally posted by birdie View Post

                You only run local trusted verified code? There's zero need for mitigations.

                You run someone else's code, e.g. JS? Enable them. You have a shared environment, e.g. you provide shared hosting? Enable as much as humanly possible and/or pin virtual guests to certain physical cores, so that different VMs always run on their own dedicated cores.



                Triple-A games are normally GPU-bound. Older games already run fast enough, so it doesn't matter if mitigations are on or off. Just a waste of time. Lastly, there are basically no games for Linux aside from rare indies.
                Birdie, seriously, educate yourself before spouting off. There are plenty of commercial Linux games available on Steam and GOG, including those often-overrated "AAA" published games.

                Otherwise... yeah, Spectre mitigations usually don't affect game performance very much. However, big caveat: no one has thoroughly explored what security problems GPUs have brought to the table beyond attacking the drivers, either.

                • #9
                  Originally posted by coder View Post
                  AVX-intensive workloads tend to perform better without SMT. That's probably because a single thread can typically utilize the core's vector pipelines at near full capacity, so SMT would mean having to share cache without much upside.
                  If that were the case, we would have seen the same on the 4K samples.

                  Originally posted by Linuxxx View Post
                  Schedutil once again screwing the results, which is especially common with video transcoding workloads!

                  (Looks like schedutil's throughput worsens with increasing thread counts...)
                  Then again, why only 1080p and not 4K? It looks more like a labeling error.

                  • #10
                    Originally posted by Anux View Post
                    If that were the case, we would have seen the same on the 4K samples.

                    Then again, why only 1080p and not 4K? It looks more like a labeling error.
                    At 4K, encoding can scale better to higher core counts.

                    No errors in labeling, it's all automated.
                    Michael Larabel
                    https://www.michaellarabel.com/
