
Thread: AMD Piledriver/Trinity A10-5800K Compiler Tuning

  1. #1
    Join Date
    Jan 2007
    Posts
    14,359

    AMD Piledriver/Trinity A10-5800K Compiler Tuning

    Phoronix: AMD Piledriver/Trinity A10-5800K Compiler Tuning

    With the initial Linux results for the AMD A10-5800K Trinity APU now out of the way, along with the Radeon HD 7660D graphics performance, this article presents benchmarks looking at the impact of compiler tuning for the Piledriver cores, using the common GCC compiler and testing different CPU micro-architecture targets.

    http://www.phoronix.com/vr.php?view=18002

  2. #2
    Join Date
    Jan 2012
    Posts
    43

    Inline assembly?

    Does anyone know how many of the programs tested in the article use inline assembly? I'm not very familiar with any of them. That would surely taint the meaningfulness of testing different compiler switches--when the core of the code is an 'as given' assembly blob.

  3. #3


    I would like to see a -march=native flag in there. It would show whether GCC is correctly detecting the CPU, and would also show whether the additional information about cache size and layout helps.

    It would be good if developers tried harder to help the compiler auto-vectorise, rather than putting in their own assembly. That way the code would automatically benefit on new architectures. http://locklessinc.com/articles/vectorize/ has some examples of hints that can be given.
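
As a minimal sketch of the kind of hint meant here (the function name `scaled_add` is made up for illustration and is not taken from the linked article): marking pointers as `restrict` tells the compiler the arrays do not overlap, which removes one common obstacle to auto-vectorising a simple loop.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: y[i] += a * x[i].
 * `restrict` promises the compiler that x and y never overlap,
 * so it is free to load and store several elements at once. */
void scaled_add(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

With GCC, `-O3` turns on `-ftree-vectorize`, and newer GCC releases can report which loops were vectorised via `-fopt-info-vec`. Whether this particular loop is actually vectorised depends on the compiler version and the `-march` target.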

  4. #4
    Join Date
    Feb 2012
    Posts
    70


    Is there any actual software that can really benefit from FMA3? AFAIK, scientific software could use FMA3, but I haven't seen any real-world example.

  5. #5
    Join Date
    Jan 2012
    Posts
    43


    Quote Originally Posted by mayankleoboy1 View Post
    Is there any actual software that can really benefit from FMA3? AFAIK, scientific software could use FMA3, but I haven't seen any real-world example.
    Any time you do A=A+B*C, you can benefit from FMA3. That's pretty common in any matrix math--which is used heavily in graphics as well. All FFTs can benefit from FMA3 as well. I wonder if any of these programs link to libraries that could benefit from FMA3. We might not really be seeing the full effect of these different compiler settings if the libraries aren't making use of it as well.
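
A minimal sketch of the A=A+B*C pattern in plain C (the function name `multiply_accumulate` is hypothetical, chosen only for this illustration). Nothing here is FMA-specific in the source; the point is that the loop body is a multiply followed by an add into the same accumulator, which a compiler targeting a CPU with FMA support may fuse into one instruction.

```c
#include <stddef.h>

/* Hypothetical illustration: acc = acc + b[i] * c[i] over two arrays,
 * i.e. the A = A + B*C pattern that shows up in dot products,
 * matrix multiplication and FFT butterflies. */
double multiply_accumulate(size_t n, const double *b, const double *c)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc = acc + b[i] * c[i];   /* candidate for a fused multiply-add */
    return acc;
}
```

Whether GCC emits a fused instruction depends on the target flags (for example `-mfma` or a suitable `-march`) and on its floating-point contraction setting. A fused multiply-add rounds once instead of twice, so results can differ in the last bit from the unfused version.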

    Someone correct me if I'm wrong, but x86, SSE, and AVX all have separate registers, right? So, any code mixing SSE (say, from a library) and AVX (from the calling program) will hit a register copy penalty.

  6. #6
    Join Date
    Oct 2008
    Location
    Romania+Finland
    Posts
    49


    As a simple user, how can I relate what I see in this benchmark to a common distribution -- e.g. the latest Ubuntu? Are the Ubuntu binaries built with any of the benchmarked CPU targets? Since I'm planning to buy an A10-5800K the same day it arrives in my town, it would be interesting to know what difference it would make if I compiled the binaries for my (future) CPU.

  7. #7
    Join Date
    Jun 2009
    Posts
    2,926


    Ubuntu binaries are compiled to run on most processors, pretty generic stuff, with generic optimisations. If you want system-wide improvements, you'll have to compile your own system, Gentoo or Arch style, but in reality, the practical gains of doing this are moderate.

    What you CAN do is compile specific software that you need to optimize, like your scientific software, or video encoder or something similar that's processor-intensive. This is really worth doing.

  8. #8


    Quote Originally Posted by geamandura View Post
    As a simple user, how can I relate what I see in this benchmark to a common distribution -- e.g. the latest Ubuntu? Are the Ubuntu binaries built with any of the benchmarked CPU targets? Since I'm planning to buy an A10-5800K the same day it arrives in my town, it would be interesting to know what difference it would make if I compiled the binaries for my (future) CPU.
    Most binary distros are quite conservative with build options, so they won't turn on most of these optimisations. The 64-bit editions will generally run on any x86-64 CPU, so the most you can assume is SSE2. For Debian-based systems there is apt-build, which can rebuild packages.

    If you are interested in rebuilding lots of packages, then you might want to look at Gentoo (or a derivative). But be aware that Gentoo stable currently has GCC 4.5, and only 4.6 in unstable. GCC 4.7 is hard masked ( http://packages.gentoo.org/package/sys-devel/gcc ).
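
One way to see what a given set of build flags lets the compiler assume is to check GCC's predefined feature macros. The sketch below is only an illustration (the function name `print_enabled_extensions` is made up); it prints the extensions enabled for the current compilation and returns how many it found.

```c
#include <stdio.h>

/* Report which instruction-set extensions the compiler was allowed to use
 * for this translation unit, based on GCC's predefined macros.
 * Returns the number of extensions found among the four checked here. */
int print_enabled_extensions(void)
{
    int count = 0;
#ifdef __SSE2__
    puts("SSE2");
    count++;
#endif
#ifdef __AVX__
    puts("AVX");
    count++;
#endif
#ifdef __FMA__
    puts("FMA3");
    count++;
#endif
#ifdef __FMA4__
    puts("FMA4");
    count++;
#endif
    return count;
}
```

Built with default flags on x86-64 this would typically report only SSE2, while a build with `-march=bdver2` or `-march=native` on a Piledriver system should list more. The same macros can be inspected without a program via `gcc -march=native -dM -E - </dev/null`.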

  9. #9
    Join Date
    Mar 2009
    Location
    in front of my box :p
    Posts
    769


    Good news for Gentoo users, I think.
    And it is interesting to see that it helps in most scenarios, while some others don't seem to be affected.

  10. #10
    Join Date
    Sep 2007
    Posts
    294

    Generic tuning?

    Hello!

    @Michael: How about comparing these results with the results of a run with -mtune=generic, which is used in standard distributions? That way one might get a glimpse of how well Bulldozer will perform there.

    Best,

    Olaf
