[QUOTE=XorEaxEax;155688]While these tests are great (kudos Phoronix!) it's unfortunate that they don't test some of the more advanced optimizations that have come in the later releases. While testing PGO (profile-guided optimization) would be a bit unfair since Clang/LLVM doesn't have this optimization...[/QUOTE]
How would that be unfair? What's the point in comparing either compiler with anything less than its strongest capabilities? If Clang/LLVM doesn't do PGO, that's their problem, nobody else's...
The issue with testing PGO is that you have to train the application, which can introduce all sorts of complications into testing. Ideally the test framework itself would be able to script this, but that's a lot of work.
Originally Posted by XorEaxEax
Yes, the downside with PGO is that it's not just a matter of adding another flag and away we go. The compiler needs to gather data about how the program actually runs, which makes it a two-stage process. First you compile with -fprofile-generate, which inserts a lot of instrumentation code into your program. You then run the program and try to exercise as many parts of the code as possible (not like going through every level in a game, but rather making sure the different code paths get executed). When you exit the instrumented program, it dumps all the gathered data into files, which are then used in the second (final) stage of compilation (-fprofile-use). There the gathered data provides a plethora of information for the compiler to use when judging what, when, and how to optimize.
From my experience PGO usually brings a ~10-20% performance increase on CPU-intensive code, which is a real fine boon, but the two-stage compilation process makes it a non-trivial optimization to use. Hence it's most often applied to projects that really need all the performance they can get: encoders, compressors, emulators, etc.
And like Smitty said, if you plan on using it routinely you should likely make a script to automate it; I know projects like Firefox and x264 do this.
Usually -O3 will lose to -O2 when there is only a megabyte or two of L2 and L3 cache. If the L2 and L3 caches are, say, 128KB, then not only will -O3 lose to -O2, but -O2 will lose to -Os.
Originally Posted by XorEaxEax
I agree, that would be very interesting. The problem though is that they target different architectures.
Originally Posted by Yezu
I have no experience with CodeWarrior, but they seem to target embedded platforms.
I would add PathScale to the list, they have x86 compilers, and used to be our favourite in the past with AMD systems. But lately it is all Intel.
Also the IBM compilers should be good, but not available for x86. So assuming you have access to a POWER or PowerPC machine, you can only compare it to GCC on the same machine.
GCC 4.6 compile times
Maybe I missed it, but I didn't see any mention of using --enable-checking=release or --disable-checking for the GCC 4.6 snapshot build.
By default, snapshots have lots of internal checks enabled, which make compile times MUCH slower. Those checks are disabled for releases. That's presumably the equivalent of Clang's --disable-assertions.
If GCC 4.6 was built with checking left enabled, that would definitely explain its slow compile times.
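For reference, disabling the snapshot self-checks happens at configure time. A minimal sketch (paths and prefix are illustrative, not from the article):

```shell
# Build a GCC snapshot with the expensive internal checks disabled,
# matching what release tarballs do by default.
../gcc-4.6-snapshot/configure \
    --prefix=/opt/gcc-4.6 \
    --enable-checking=release
make && make install
```

Without --enable-checking=release (or --disable-checking), a snapshot-built compiler spends a large fraction of its time in self-verification, which would skew any compile-time benchmark against it.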
I was also wondering if that was the case; those compile times just seem too out of whack otherwise.
Originally Posted by redi
I tested GCC 4.3, 4.4 and 4.5 (shortly before its release) on a Fortran code I use. -O3 beats -O2, and there is a trend of improvement between releases.