Thread: Optimizing Mesa Performance With Compiler Flags

  1. #1
    Join Date
    Jan 2007
    Posts
    14,378

    Phoronix: Optimizing Mesa Performance With Compiler Flags

    Compiler tuning can lead to performance improvements for many computational benchmarks by toying with the CFLAGS/CXXFLAGS, but is there much to gain from optimizing your Mesa build? Here are some benchmark results...

    http://www.phoronix.com/vr.php?view=MTI4NTY

  2. #2

    I would be very interested in how -O1, -O2, and -O3 compare to -Os (optimize for size). When code is smaller, you get fewer cache misses, which leads to faster execution. Ruby is known to run faster with -Os.

  3. #3
    Join Date
    Mar 2011
    Posts
    90

    Quote Originally Posted by ncopa View Post
    I would be very interested in how -O1, -O2, and -O3 compare to -Os (optimize for size). When code is smaller, you get fewer cache misses, which leads to faster execution. Ruby is known to run faster with -Os.
    My friend, please provide a benchmark or two to back that claim up. I have found no tests for Ruby using -Os. I would welcome a link to those tests.
    PS: In that search I found the "Falcon patch", which made Ruby much faster regardless of the flags.

  4. #4
    Join Date
    Jun 2012
    Posts
    8

    I guess the bottleneck of most video games is not OpenGL, unless the game is designed for high-end graphics cards. Check this with any profiler: gl... calls are almost unnoticeable among game physics and logic. Compiling the actual software and main libraries instead of the driver could give a very different result.

  5. #5
    Join Date
    Apr 2011
    Posts
    35

    So the flags do exactly what the manpage says: -O2 is a good, stable optimization level, while -O3 needs more compile time and may or may not improve the resulting binary, so it is mostly a waste of energy and time (unless you like playing and consider compiling Linux with all flag permutations a game). I would only enable it for single applications if I were not satisfied with -O2 (it seemed that ffmpeg gained a little performance from -O3, but I did not benchmark this).
    In my experience, in most cases -O3 does not improve performance noticeably (as in the article), and additionally the -Os and -O3 flags can break programs with unexpected segfaults.
    So the only compile flags I have used for years are -march=..., -O2 and, for gcc, -pipe.
    For software it is better anyway to use efficient algorithms to solve a problem. No compiler optimization can turn an exponential algorithm into a linear one; it just produces slightly better exponential code (or not).
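
The point about algorithms versus compiler flags can be illustrated with a small sketch. The example below is not from the thread: it uses Fibonacci numbers as a stand-in problem and Python only for brevity, and the function names are made up for illustration. It counts the work done by a naive recursive version (exponential in n) and compares it with a simple loop (linear in n). An optimizing compiler might shave a constant factor off each call, but it would not change how the call count grows.

```python
def fib_exponential(n, counter):
    """Naive recursion: the number of calls grows exponentially with n."""
    counter[0] += 1  # count every call so the growth is visible
    if n < 2:
        return n
    return fib_exponential(n - 1, counter) + fib_exponential(n - 2, counter)


def fib_linear(n):
    """Iterative version: exactly n loop iterations."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    for n in (10, 20, 25):
        calls = [0]
        # Both versions compute the same value; only the amount of work differs.
        assert fib_exponential(n, calls) == fib_linear(n)
        print(f"n={n}: recursion made {calls[0]} calls, the loop ran {n} iterations")
```

Running it shows the call count of the recursive version pulling away from the loop's iteration count as n grows, which is a gap no optimization level is likely to close.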

  6. #6
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    4,995

    There are quite big differences between O2 and O3 with some software, especially if it's C++ with templates.

    Bullet physics was close to 10x slower with O2, same result with Os, when compared to O3 last I tested.

  7. #7
    Join Date
    Aug 2011
    Location
    Hillsboro, Oregon
    Posts
    129

    Quote Originally Posted by Lockal View Post
    I guess the bottleneck of most video games is not OpenGL, unless the game is designed for high-end graphics cards. Check this with any profiler: gl... calls are almost unnoticeable among game physics and logic. Compiling the actual software and main libraries instead of the driver could give a very different result.
    Not in my experience. I've run a lot of benchmarks and games, and 'sysprof' often shows that _mesa_* calls (which are the actual implementation of the gl* calls) account for a very noticeable percentage.

  8. #8
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    4,995

    I've always* built Mesa with -O3 and not once had an issue that was because of that.

    * not built git in the last 3-4 months since it requires newer autofoo and I'm too lazy.

  9. #9

    Quote Originally Posted by ryszardzonk View Post
    My friend, please provide a benchmark or two to back that claim up.
    Use the link in my post. (click on "known").

  10. #10

    Quote Originally Posted by ncopa View Post
    I would be very interested in how the -O1, -O2, -O3 compares to -Os (optimize for size). When code is smaller you get fewer cache misses which leads to faster execution. Ruby is known to run faster with -Os.
    It does not matter if this code is not a bottleneck.
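
The bottleneck argument that comes up several times in this thread can be put in numbers with Amdahl's law: if a fraction p of the run time is spent in the code being optimized and that code becomes s times faster, the overall speedup is 1 / ((1 - p) + p / s). The sketch below is only an illustration. The function name is made up, and the fractions and the 20% local gain are hypothetical values, not measurements from the article or from any poster.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of the run time
    becomes `local_speedup` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


if __name__ == "__main__":
    # Hypothetical shares of run time spent in the driver, and a
    # hypothetical 20% local gain from a different optimization level.
    for fraction in (0.05, 0.25, 0.50):
        gain = overall_speedup(fraction, 1.2)
        print(f"driver share {fraction:.0%}: overall speedup {gain:.3f}x")
```

Under these assumed numbers, the same local gain matters far more when the driver takes half of the run time than when it takes a twentieth, which is why the profiler results mentioned earlier in the thread decide how much the flags can help.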
