Thread: GCC 4.8 Release Brings Improved C++11, Optimizations

  1. #11

    Quote Originally Posted by fenrus View Post
    eh how?
    if you think that the compiler can insert a better prefetchw than the hardware prefetchers.. please speak up with an example...


    (disclaimer: I work for Intel on Linux, and also have my own hobby OS that I build in my spare time... I look at compilers and compiler options a lot ;-) )
    I think it is hypothetically possible that the GCC authors might create an optimization pass that uses prefetchw in a useful way at some point in the future. However, I have no example code. If any such code existed, I imagine that Intel would use it to improve the design of their next chip. That is the fate of all such microarchitecture-specific optimizations.

    With that said, I doubt that it is possible for anyone outside of Intel to produce such code for unreleased products without assistance from Intel in the form of either an engineering sample of the chip or extremely accurate emulation software. You likely knew that though.

  2. #12
    Join Date
    Jul 2008
    Location
    Berlin, Germany
    Posts
    822

    Quote Originally Posted by ryao View Post
    the only people that will benefit from it are those building software themselves (i.e. Gentoo users).
    Not necessarily, you can build several code paths and switch at runtime between them.
    Also some Ubuntu users have recognized that you can get dramatic performance increases in certain situations by rebuilding specific packages optimized for their CPU with apt-build:

    (Gentoo is running in a VM there so the results are not 100% comparable)

  3. #13
    Join Date
    Mar 2013
    Posts
    21

    Quote Originally Posted by chithanh View Post
    Not necessarily, you can build several code paths and switch at runtime between them.
    Also some Ubuntu users have recognized that you can get dramatic performance increases in certain situations by rebuilding specific packages optimized for their CPU with apt-build:

    (Gentoo is running in a VM there so the results are not 100% comparable)

    Oh, compiling for a new enough CPU is a huge gain at times (just see the graphs of my distro that I posted earlier in this thread).
    prefetchw and prefetch are not part of that, however.

  4. #14
    Join Date
    Mar 2013
    Posts
    21

    Quote Originally Posted by chithanh View Post
    Not necessarily, you can build several code paths and switch at runtime between them.
    Also some Ubuntu users have recognized that you can get dramatic performance increases in certain situations by rebuilding specific packages optimized for their CPU with apt-build:

    (Gentoo is running in a VM there so the results are not 100% comparable)
    there is absolutely room for real performance improvements using compiler flags (see url I posted way earlier in the thread).
    prefetch/prefetchw is not part of that however.

  5. #15

    Quote Originally Posted by chithanh View Post
    Not necessarily, you can build several code paths and switch at runtime between them.
    As far as I know, GCC does not support that. ICC does though.

    Quote Originally Posted by fenrus View Post
    there is absolutely room for real performance improvements using compiler flags (see url I posted way earlier in the thread).
    prefetch/prefetchw is not part of that however.
    You cannot prove that. Say that I had a program that could solve NP-complete problems in polynomial time and a description of Haswell. I am certain that the two could be combined to find ways to make programs perform better on Haswell with the use of prefetchw.

    With that said, it has been demonstrated that doing tricks with prefetching can improve performance in certain areas:

    http://arstechnica.com/business/2012...o-collaborate/

  6. #16
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    5,103

    Not sure about the compiler adding it, but I have gotten a 10% increase in throughput by manually adding a select few __builtin_prefetch calls to my code.
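
For readers who have not used the builtin, here is a minimal sketch of what a manual `__builtin_prefetch` call can look like in C with GCC. It is not the poster's code. The function name `gather_sum` and the constant `PREFETCH_DISTANCE` are hypothetical, and the distance of 8 is an assumption that would need tuning by measurement. The indirect access pattern is chosen because hardware prefetchers tend to have trouble predicting it.

```c
#include <stddef.h>

/* Assumption: how many iterations ahead to prefetch. Tune by measurement. */
#define PREFETCH_DISTANCE 8

/* Hypothetical helper: returns the sum of values[idx[i]] for i in [0, n). */
long gather_sum(const long *values, const size_t *idx, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, rw (0 = read, 1 = write),
             * temporal locality hint (0 = none .. 3 = high). */
            __builtin_prefetch(&values[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
        total += values[idx[i]];
    }
    return total;
}
```

The prefetch is only a hint and does not change the result, so the function returns the same sum with or without it. Passing 1 as the second argument signals write intent, which on x86 targets that support it may be lowered to a prefetchw instruction. Whether any of this helps depends on the workload and the CPU, so it is worth benchmarking both variants.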

  7. #17
    Join Date
    Jul 2008
    Location
    Berlin, Germany
    Posts
    822

    Quote Originally Posted by ryao View Post
    Quote Originally Posted by chithanh View Post
    you can build several code paths and switch at runtime between them.
    As far as I know, GCC does not support that. ICC does though.
    MPlayer, for example, supports runtime CPU detection even when built with GCC. That is less optimal than building for a specific CPU, but it works.
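
As an illustration of building several code paths and switching between them at runtime, here is a minimal hand-written sketch in C. It assumes GCC 4.8 or later (or a compatible compiler) and uses the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins together with the `target` function attribute on x86. The names `sum_generic`, `sum_avx2`, and `select_sum` are hypothetical, and this is not a description of how MPlayer implements its detection.

```c
#include <stddef.h>

typedef long (*sum_fn)(const long *, size_t);

/* Baseline path: compiled for whatever -march the build uses. */
long sum_generic(const long *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

#if defined(__x86_64__) || defined(__i386__)
/* Same source, but the compiler is allowed to use AVX2 in this function. */
__attribute__((target("avx2")))
long sum_avx2(const long *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}
#endif

/* Picks a code path once, based on the CPU the program is running on. */
sum_fn select_sum(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2;
#endif
    return sum_generic;
}
```

A caller would typically run `sum_fn sum = select_sum();` once at startup and then call `sum(data, n)` in the hot path. The cost of this approach is an indirect call and a larger binary, which fits the remark above that runtime detection is less optimal than building for one specific CPU.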

  8. #18
    Join Date
    Nov 2012
    Posts
    164

    Quote Originally Posted by ryao View Post
    I think it is hypothetically possible that the GCC authors might create an optimization pass that uses prefetchw in a useful way at some point in the future. However, I have no example code. If any such code existed, I imagine that Intel would use it to improve the design of their next chip. That is the fate of all such microarchitecture-specific optimizations.
    That code already exists. Remember that AMD chips have had a prefetch instruction since the K6, and so have most non-x86 architectures. This is not new stuff; it is only new for Intel CPUs.
