I talked with Dark Shikari on #x264 about the results. He said A.) PGO would help more with the hand asm, since it apparently does not benefit what the pure C build spends most of its time doing (DSP functions), and B.) I screwed something up with the PGO, because it should be ~1% faster. I did the build correctly from what I can tell (make fprofiled VIDS="videohere.y4m"), but I couldn't be bothered to recompile, retest, etc. to confirm a ~1% performance increase.
Originally Posted by XorEaxEax
Weird, why would PGO help more with the asm version? Should be the exact opposite imo, since the compiler can't optimize that hand-written assembly in any way, but it should at least be able to do some optimizing with the C code. Anyway, I can understand why you wouldn't want to redo the tests, since the argument was about hand-optimized assembly vs compiler-generated code. Maybe I'll give it a shot myself, since I'm curious about what Shikari said. Again, thanks for the benchmarks!
I believe these tests need to be rerun with builds from recent gcc and clang versions, like
gcc4.7 vs clang3.2
gcc4.8svn vs clang3.3svn
I wonder what the results would be.