
Originally Posted by
russofris
As many Gentoo users have found, compiler flags generally have one of the following effects:
1: No effect at all
2: The compiled binary does not work
3: The compiled binary is slower
4: The compiled binary is faster
In addition, performance increases often require certain combinations of compiler flags, making the tweak more complex than adding "-march".
I once had a bash script for a number of CLI binaries (notably ffmpeg, faac, flac) which would iterate through CFLAGS combinations and compilers (gcc versus icc). After each iteration, the script would run an automated benchmark on the resulting binary. Results were dumped to a file and sorted. The issue that I ran into was that the results would change depending on factors such as platform arch, available memory, and CPU affinity. Other issues involved (pre)linking and libraries, killing misbehaving binaries, memory reclamation, etc.
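For anyone who wants to try the same kind of sweep, here is a rough sketch of the idea in Python rather than bash. It is not the original script: the function names (flag_combinations, build, bench, sweep), the source file name bench.c, and the output path are all made up for illustration. It builds every subset of a list of optional flags on top of a base flag, times one run of each resulting binary, drops builds that fail, crash, or hang, and returns the rest sorted fastest first.

```python
import itertools
import subprocess
import time


def flag_combinations(base, optional):
    """Yield the base flags plus every subset of the optional flags."""
    for r in range(len(optional) + 1):
        for combo in itertools.combinations(optional, r):
            yield list(base) + list(combo)


def build(compiler, flags, source="bench.c", output="./bench_bin"):
    """Compile one (hypothetical) source file; return the binary path or None."""
    cmd = [compiler, *flags, "-o", output, source]
    ok = subprocess.run(cmd, capture_output=True).returncode == 0
    return output if ok else None


def bench(binary, timeout=60.0):
    """Return wall-clock seconds for one run, or None on a crash or hang."""
    start = time.perf_counter()
    try:
        # subprocess.run kills the child process if the timeout expires.
        proc = subprocess.run([binary], capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None
    return time.perf_counter() - start


def sweep(compilers, base, optional, build_fn=build, bench_fn=bench):
    """Try every compiler/flag combination; return results sorted by time."""
    results = []
    for compiler in compilers:
        for flags in flag_combinations(base, optional):
            binary = build_fn(compiler, flags)
            if binary is None:
                continue  # effect 2: the build did not work
            seconds = bench_fn(binary)
            if seconds is not None:
                results.append((seconds, compiler, " ".join(flags)))
    return sorted(results)


# Example call (placeholder flags; pick ones your compiler actually accepts):
# for seconds, compiler, flags in sweep(
#         ["gcc"], ["-O2"], ["-march=native", "-funroll-loops"]):
#     print(f"{seconds:8.3f}s  {compiler}  {flags}")
```

A single timed run per binary is noisy, so in practice you would want to repeat each benchmark several times and pin the process to a CPU before trusting the ordering. The number of combinations also doubles with every optional flag added, so keep the list short.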
Overall, a system with local optimizations performed approximately 50% better on average than a generic "-O2" solution. The problem is that you will never be able to find and fix all of the minor issues caused by the optimizations across all binaries to the level required by a distro. My conclusion was that compiler optimizations are of great benefit to single-task servers (a transcoding server in my case), but are currently out of reach for a general desktop.
Then I bought a Mac....