**scottishduck** · 25 June 2021, 12:07 PM

It appears that coding a compiler sensibly leads to it behaving far more predictably.

**loganj** · 25 June 2021, 01:34 PM

"Intel Tiger Lake-H chipset"?

**rene** · 25 June 2021, 04:09 PM

but -Os and -Oz? ;-)

**intelfx** · 25 June 2021, 06:00 PM

Yay, meme flags.

**skeevy420** · 25 June 2021, 07:19 PM

The various -O2 results have me wondering where -Os and -Oz with -march=native and -flto would stack up with the rest and if the old 90s and 00s anecdote of "smaller targeted binaries are faster overall" still trends true or if modern CPUs having more cache renders those settings moot.

Overall these results are what I'd expect to see based on their names and descriptions. I like it when everything works out like that.
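For anyone wanting to try that comparison, here is a minimal Python sketch that only generates the compiler command lines for the flag matrix; it does not run them. The source file name `bench.c`, the output naming scheme, and the helper `build_commands` are hypothetical and chosen for illustration. The flags themselves (`-O2`, `-O3`, `-Os`, `-Oz`, `-march=native`, `-flto`) are standard Clang options.

```python
from itertools import product

# Hypothetical source file name, used only for illustration.
SOURCE = "bench.c"

OPT_LEVELS = ["-O2", "-O3", "-Os", "-Oz"]
EXTRA_FLAG_SETS = [[], ["-march=native"], ["-flto"], ["-march=native", "-flto"]]


def build_commands(compiler="clang", source=SOURCE):
    """Return one compiler command line (as an argv list) per flag combination."""
    commands = []
    for opt, extras in product(OPT_LEVELS, EXTRA_FLAG_SETS):
        # Name each binary after its flags so sizes and timings can be told apart.
        tag = "_".join(f.lstrip("-").replace("=", "-") for f in [opt, *extras])
        commands.append([compiler, opt, *extras, source, "-o", f"bench_{tag}"])
    return commands


if __name__ == "__main__":
    for argv in build_commands():
        print(" ".join(argv))
```

Each printed line could then be run by hand, with the resulting binaries compared for both size and run time.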

**Snaipersky** · 25 June 2021, 08:45 PM

Originally posted by skeevy420

> The various -O2 results have me wondering where -Os and -Oz with -march=native and -flto would stack up with the rest and if the old 90s and 00s anecdote of "smaller targeted binaries are faster overall" still trends true or if modern CPUs having more cache renders those settings moot.
>
> Overall these results are what I'd expect to see based on their names and descriptions. I like it when everything works out like that.

Alpine Linux and Void had significant performance benefits owing to their smaller binaries, but the performance delta was markedly reduced on high-cache processors.

**coder** · 25 June 2021, 11:52 PM

Originally posted by loganj

> "Intel Tiger Lake-H chipset"?

Good catch. The H-series laptop chips need an external southbridge, similar to desktop chips. My guess is that, for Tiger Lake H, Intel just reused the same southbridge that some Rocket Lake boards use, and that explains why it got detected as such.

**coder** · 25 June 2021, 11:59 PM

Originally posted by skeevy420

> The various -O2 results have me wondering where -Os and -Oz with -march=native and -flto would stack up with the rest and if the old 90s and 00s anecdote of "smaller targeted binaries are faster overall" still trends true or if modern CPUs having more cache renders those settings moot.

Not all benchmarks put equal pressure on instruction cache. In cases that are more limited by it, perhaps you could get a net benefit with that combination.

However, in cases where the hotspots are dominated by a small number of loops, aggressive inlining, unrolling, and vectorization are going to be the winning strategy.

**DanglingPointer** · 26 June 2021, 02:01 AM

Would be good to combine the Clang-12 and GCC-11 results.
Also, some sort of final summary at the end would help: the winner by overall mean and the winner by number of first places.
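A rough Python sketch of what such a summary could look like, assuming every score is "higher is better" and that each benchmark is normalized against one baseline configuration before taking a geometric mean. The function names, the configuration labels, and the numbers in the example are all made up for illustration; none of them come from the article's results.

```python
from collections import Counter
from math import exp, log


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return exp(sum(log(v) for v in values) / len(values))


def summarize(results, baseline):
    """Summarize {benchmark: {config: score}} where higher scores are better.

    Returns (geomeans, wins): the geometric mean of each config's score
    relative to `baseline`, and how many benchmarks each config won.
    """
    wins = Counter()
    ratios = {}
    for scores in results.values():
        # First place goes to the highest score in this benchmark.
        wins[max(scores, key=scores.get)] += 1
        for config, score in scores.items():
            ratios.setdefault(config, []).append(score / scores[baseline])
    geomeans = {config: geometric_mean(r) for config, r in ratios.items()}
    return geomeans, wins


if __name__ == "__main__":
    # Placeholder numbers only, NOT taken from any real benchmark run.
    example = {
        "bench_1": {"clang-O2": 100.0, "gcc-O2": 110.0},
        "bench_2": {"clang-O2": 50.0, "gcc-O2": 45.0},
    }
    geomeans, wins = summarize(example, baseline="clang-O2")
    print(geomeans)
    print(dict(wins))
```

Normalizing per benchmark keeps tests with large raw numbers from dominating the mean; benchmarks where lower is better would need their ratios inverted first.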

LLVM Clang 12 Benchmarks At Varying Optimization Levels, LTO