GCC & LLVM Clang Compiler Benchmarks On AMD's EPYC 7601


  • GCC & LLVM Clang Compiler Benchmarks On AMD's EPYC 7601

    Phoronix: GCC & LLVM Clang Compiler Benchmarks On AMD's EPYC 7601

    For squeezing maximum performance out of Linux systems running source-based workloads, most of you know there are often tweaks to be had in the compiler stack. Along with the never-ending advancements in the leading open-source compilers, there can be measurable performance gains between releases, though sometimes not without regressions. With AMD's EPYC line-up still very fresh and the underlying Zen microarchitecture (or "znver1", as the compiler toolchains refer to it) still new to those toolchains, here are a variety of benchmarks under recent releases of the GCC and LLVM Clang compilers.
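
    As a rough sketch of how one might reproduce this kind of per-target comparison at home (this is not the Phoronix Test Suite's methodology; bench.c is a placeholder for your own benchmark source), a small build-and-time harness could look like this:

```python
#!/usr/bin/env python3
"""Minimal sketch of a per -march comparison harness (hypothetical).

Assumption: a single-file benchmark called bench.c whose wall-clock
runtime is the metric of interest."""
import subprocess
import time

COMPILERS = ["gcc", "clang"]                    # both toolchains under test
MARCHES = ["opteron-sse3", "btver2", "znver1"]  # targets discussed in the article

def build(cc: str, march: str, src: str = "bench.c", out: str = "bench") -> None:
    # -O3 plus a specific -march is the usual per-target tuning knob.
    subprocess.run([cc, "-O3", f"-march={march}", "-o", out, src], check=True)

def run(out: str = "./bench") -> float:
    # Time one run of the freshly built binary.
    start = time.perf_counter()
    subprocess.run([out], check=True)
    return time.perf_counter() - start

if __name__ == "__main__":
    for cc in COMPILERS:
        for march in MARCHES:
            build(cc, march)
            print(f"{cc:5s} -march={march:13s} {run():.3f}s")
```

    Sweeping across opteron-sse3, btver2, and znver1 like this is the quickest way to see how much of a gain comes from the generic ISA level versus the Zen-specific tuning.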


  • #2
    Somebody really needs to set up global regression tracking for GCC, it seems they often don't notice some regressions on common open source cases until after release. It'd be nice to have something like the most popular couple hundred packages of a distro built daily, with some automated test for each.
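
    As a rough illustration of what such tracking could look like (a hypothetical sketch, not an existing GCC service; the package list, bench.sh script, and 5% threshold are all made up), a daily job could rebuild a set of packages with the day's compiler, time a benchmark for each, and diff against yesterday's baseline:

```python
#!/usr/bin/env python3
"""Sketch of a daily performance-regression tracker (hypothetical).

Each entry names a package plus a benchmark command; today's timings are
compared against the previous run's stored baseline."""
import json
import subprocess
import time
from pathlib import Path

BASELINE = Path("baseline.json")
THRESHOLD = 0.05  # flag anything more than 5% slower (assumed policy)

# Hypothetical entries: how to build and how to benchmark each package.
PACKAGES = {
    "x264": {"build": ["make", "-C", "x264"], "bench": ["./x264/bench.sh"]},
}

def timed(cmd) -> float:
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

def main() -> None:
    old = json.loads(BASELINE.read_text()) if BASELINE.exists() else {}
    new = {}
    for name, spec in PACKAGES.items():
        subprocess.run(spec["build"], check=True)   # rebuild with today's compiler
        new[name] = timed(spec["bench"])
        if name in old and new[name] > old[name] * (1 + THRESHOLD):
            print(f"REGRESSION: {name} {old[name]:.2f}s -> {new[name]:.2f}s")
    BASELINE.write_text(json.dumps(new, indent=2))

if __name__ == "__main__":
    main()
```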



    • #3
      What's up with BLAKE? That's a 2x difference and doesn't vary within a compiler family. Either GCC is completely missing a trick or there's an ASM switch that didn't get set right for the GCC case.
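
      One quick way to check the "missing a trick" theory is to emit assembly from both compilers and compare how much vector code each produces. A minimal sketch, assuming you have the benchmark's hot translation unit in a file (hot_loop.c is a placeholder) and that the difference would show up as AVX/AVX2 usage:

```python
#!/usr/bin/env python3
"""Sketch: compare how much vector code GCC and Clang emit for the same file.
'hot_loop.c' is a placeholder for the benchmark's hot translation unit."""
import re
import subprocess

SRC = "hot_loop.c"  # hypothetical hot file from the benchmark
FLAGS = ["-O3", "-march=znver1", "-S", "-o", "-"]  # emit assembly to stdout

def vector_insns(cc: str) -> int:
    asm = subprocess.run([cc, *FLAGS, SRC], check=True,
                         capture_output=True, text=True).stdout
    # Crude heuristic: count references to ymm registers (AVX/AVX2 code).
    return len(re.findall(r"%ymm\d+", asm))

for cc in ("gcc", "clang"):
    print(cc, vector_insns(cc), "ymm references")
```

      If one compiler shows essentially zero ymm references while the other is full of them, that would point to a vectorization or build-flag difference rather than anything inherent to the CPU.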



      • #4
        A question: Did x264 make use of all 32 cores?

        Also,

        Originally posted by phoronix View Post
        In other instances, optimizing for Jaguar (btver2) provides most of the benefit with the Zen optimizations (znver1) providing just a slight increase.
        Do you mean decrease? Compared to opteron-sse3 it's still a huge increase to me...



        • #5
          Originally posted by microcode View Post
          Somebody really needs to set up global regression tracking for GCC, it seems they often don't notice some regressions on common open source cases until after release. It'd be nice to have something like the most popular couple hundred packages of a distro built daily, with some automated test for each.
          Well they usually should have such automated tests. I also think that it is really weird not to notice these regressions when you change code... And these are just simple benchmarks lol



          • #6
            Originally posted by oooverclocker View Post

            Well they usually should have such automated tests. I also think that it is really weird not to notice these regressions when you change code... And these are just simple benchmarks lol

            There are a lot of simple benchmarks, though. But, yeah, they could do a better job of performance regression testing. If performance isn't a concern, then what's the point of working on the compiler anymore? Bug fixes? New language support? I guess.



            • #7
              Originally posted by tildearrow View Post
              A question: Did x264 make use of all 32 cores?

              Also,



              Do you mean decrease? Compared to opteron-sse3 it's still a huge increase to me...
              From what I read in other reviews, x264 and especially x265 start showing poor scaling beyond 16 threads, but I'm not actually 100% sure, since the scaling depends a lot on what options you use in the conversion.



              • #8
                Originally posted by willmore View Post
                What's up with BLAKE? That's a 2x difference and doesn't vary within a compiler family. Either GCC is completely missing a trick or there's an ASM switch that didn't get set right for the GCC case.
                I cannot tell you what's up with BLAKE, but I know from experience that compilers can produce code which just happens to be fast without having been specifically optimized that way. The compilers simply don't yet model every aspect of a CPU and thus cannot optimize your code perfectly. It's then a matter of chance whether the resulting code turns out very fast or only suboptimal.

                To give an example: one core within a CPU module can sometimes utilize a second execution unit that it shares with another core of the same module. I.e. when two cores share two integer units, one core can dispatch more integer operations per clock while the other core is idle. This makes it difficult to come up with an instruction scheduling algorithm for a compiler, because the instruction timings become less deterministic and predictable.

                The trend towards more complexity is only getting worse. AMD, for instance, has introduced a neural-network-based branch predictor, and compiler developers then either have to model the same mechanism in their compilers to predict its behaviour or, until that's implemented, pray and hope it works in their favour.



                • #9
                  People complain about H.264 and H.265 encoder scaling problems... why not just transcode 32 videos at once? I mean, sure, editing one video at a time is a pretty common case, but if you are bulk transcoding you should transcode multiple videos in parallel rather than try to make something that tends to be single-threaded split across 32 cores.
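
                  A minimal sketch of that bulk approach, assuming ffmpeg with libx264 is installed; the input directory, job count, and per-encode thread count are placeholders to tune for your machine:

```python
#!/usr/bin/env python3
"""Sketch: transcode many files concurrently instead of chasing per-encode
scaling. ffmpeg/libx264 usage is standard; paths and counts are placeholders."""
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import subprocess

JOBS = 8  # concurrent encodes; tune so JOBS x threads-per-encode ~ core count

def transcode(src: Path) -> None:
    out = src.with_name(src.stem + "_x264.mkv")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(src),
         "-c:v", "libx264", "-threads", "4",  # few threads per encode...
         "-c:a", "copy", str(out)],
        check=True,
    )

if __name__ == "__main__":
    files = sorted(Path("input").glob("*.mkv"))   # placeholder input directory
    with ThreadPoolExecutor(max_workers=JOBS) as pool:
        list(pool.map(transcode, files))          # ...many encodes in flight
```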



                  • #10
                    When I watch an llvm vs gcc match, I'm totally rooting for gcc without any rational reason. I guess it's just because GCC RULES! YEAH! I wanna blow a vuvuzela for gcc!

