GCC 12 Enables Auto-Vectorization For -O2 Optimization Level

  • GCC 12 Enables Auto-Vectorization For -O2 Optimization Level

    Phoronix: GCC 12 Enables Auto-Vectorization For -O2 Optimization Level

    A month ago there was talk of GCC developers enabling the vectorizer at the common "-O2" optimization level, and now that change has landed in the GCC 12 development code-base...

  • #2
    I wonder if this change will be backported to previous stable GCC releases. Unless the capability wasn't there before, it would be cool to improve -O2 for all of the previous GCC 8-11 versions.


    • #3
      Originally posted by perpetually high
      I wonder if this change will be backported to previous stable GCC releases. Unless the capability wasn't there before, it would be cool to improve -O2 for all of the previous GCC 8-11 versions.
      That's unlikely to happen, as this option changes the behavior significantly.
      Reading the commit, it enables the following extra options at -O2:
      Code:
      -ftree-loop-vectorize -ftree-slp-vectorize -fvect-cost-model=very-cheap
      You can always manually enable it.
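
As a rough sketch of what enabling it by hand looks like (the function name `add_arrays`, the size `N`, and the file name `add.c` are made up for illustration), a loop like the one below is the kind of candidate these options target. Something along the lines of `gcc -O2 -ftree-loop-vectorize -ftree-slp-vectorize -fvect-cost-model=very-cheap -fopt-info-vec -c add.c` should report whether it was vectorized. As far as I know, the `very-cheap` value only appeared in GCC 11, so older releases would probably need `-fvect-cost-model=cheap` instead.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the GCC commit. */
enum { N = 1024 };  /* trip count known at compile time */

/* Element-wise add. `restrict` promises the arrays do not overlap,
   so the compiler needs no runtime alias check before vectorizing. */
void add_arrays(int *restrict dst, const int *restrict a, const int *restrict b)
{
    for (size_t i = 0; i < N; ++i)
        dst[i] = a[i] + b[i];
}
```

Whether the loop is actually vectorized depends on the GCC version and target, so the `-fopt-info-vec` output is the thing to trust.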


      • #4
        Originally posted by maarten
        That's unlikely to happen, as this option changes the behavior significantly.
        Yeah, backports are usually done only for regressions and wrong-code bugs, not for slight performance improvements.

        Reading the commit, it enables the following extra options at -O2:
        Code:
        -ftree-loop-vectorize -ftree-slp-vectorize -fvect-cost-model=very-cheap
        You can always manually enable it.
        Indeed. And if you care that much about smallish performance improvements, you have likely already figured out a good set of compile options for your application on your hardware.


        • #5
          Originally posted by perpetually high
          I wonder if this change will be backported to previous stable GCC releases. Unless the capability wasn't there before, it would be cool to improve -O2 for all of the previous GCC 8-11 versions.
          There may have been some changes to the vectorization code since some of those releases. Too much for me to look through to see if that's true, but if it is, just backporting this change won't be enough :P


          • #6
            I've been using "-O2 -ftree-vectorize" for many years now. Just use that. There's no reason to stick to the "cheap model" from what I can tell, unless you really, really care about every last millisecond in compile times.
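
To make the cost-model difference concrete, here is a minimal sketch (the function name `scale_runtime` is hypothetical). According to the GCC manual, the very-cheap model only allows vectorization if the vector code would entirely replace the scalar code. A loop like this one, where the trip count is only known at run time and the pointers might overlap, usually needs a scalar tail loop and possibly a runtime overlap check, so I would expect the less restrictive models to be more willing to vectorize it. That is an assumption worth checking with `-fopt-info-vec-missed`, since the behavior can differ between GCC versions and targets.

```c
#include <stddef.h>

/* Hypothetical example: n is only known at run time, and dst/src are
   not marked restrict, so the compiler cannot assume they are disjoint. */
void scale_runtime(float *dst, const float *src, size_t n, float k)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * k;
}
```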


            • #7
              Too bad it will mostly only work on integer code unless we also enable some flexibility in how floating point is handled, similar to what icc does by default. fast-math is too invasive for a default, but we could at least have -fno-math-errno and -fno-rounding-math, which just disable some rarely used global variables, and especially -fno-signed-zeros and -fno-trapping-math, which disable stuff not really accessible from C/C++ anyway. Those features break most optimizing operations on FP. Though you could argue it is the Intel compiler that is in the wrong, since at least some of those change official C/C++ behavior.
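
A small sketch of the integer versus floating point difference, using two hypothetical functions, `sum_ints` and `sum_floats`. The unsigned integer sum can be split into several partial sums without changing the result, so it is a straightforward vectorization candidate. The float sum has a fixed left-to-right rounding order, and to my knowledge GCC only reorders it under -fassociative-math, which the manual says requires -fno-signed-zeros and -fno-trapping-math to be in effect.

```c
#include <stddef.h>

/* Unsigned addition wraps and is associative, so summing in a
   different order (for example in vector lanes) gives the same result. */
unsigned sum_ints(const unsigned *src, size_t n)
{
    unsigned acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += src[i];
    return acc;
}

/* Float addition is not associative: reordering can change the rounding,
   so strict semantics force the additions to happen one after another. */
float sum_floats(const float *src, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; ++i)
        acc += src[i];
    return acc;
}
```

Comparing the `-fopt-info-vec` output for both functions with and without the extra flags should show whether this holds on a given GCC version and target.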
              Last edited by carewolf; 09 October 2021, 03:42 AM.


              • #8
                Originally posted by RealNC
                I've been using "-O2 -ftree-vectorize" for many years now. Just use that. There's no reason to stick to the "cheap model" from what I can tell, unless you really, really care about every last millisecond in compile times.
                I was wondering what applications would see a noticeable improvement?


                • #9
                  Originally posted by cl333r

                  I was wondering what applications would see a noticeable improvement?
                  If only there were a website in the world that did benchmarks...


                  • #10
                    Originally posted by RealNC
                    I've been using "-O2 -ftree-vectorize" for many years now. Just use that. There's no reason to stick to the "cheap model" from what I can tell, unless you really, really care about every last millisecond in compile times.
                    This option makes the binary code significantly fatter while not always making it faster. I used it many years ago, then actually tested a number of applications, found no significant improvements in most of them and disabled it for good.
