
OpenMandriva Appears To Be Experimenting With Profile Guided Optimizations


  • OpenMandriva Appears To Be Experimenting With Profile Guided Optimizations

    Phoronix: OpenMandriva Appears To Be Experimenting With Profile Guided Optimizations

    OpenMandriva has been toying with some performance optimizations in recent times like preferring the LLVM Clang compiler over GCC, spinning an AMD Zen "znver1" optimized version of the OS/packages, and apparently now exploring possible Profile Guided Optimizations...


  • #2
    OpenMandriva already uses LTO almost everywhere. PGO is trickier because it's usually hard to get profile data that matches real-world use cases. For compressors/decompressors it's fairly easy, but PGO for a package like Qt (usually used interactively) or Mesa (generate profile data by replaying a scene from game X and game X will work great, but you may hurt game Y and CAD application Z a lot...) is a lot harder.
    We're starting by adding PGO to packages where it's easy to do without much risk of regressions, but of course we will try to get beyond the obvious ones.


    • #3
      Originally posted by berolinux
      OpenMandriva already uses LTO almost everywhere. PGO is trickier because it's usually hard to get profile data that matches real-world use cases. For compressors/decompressors it's fairly easy, but PGO for a package like Qt (usually used interactively) or Mesa (generate profile data by replaying a scene from game X and game X will work great, but you may hurt game Y and CAD application Z a lot...) is a lot harder.
      We're starting by adding PGO to packages where it's easy to do without much risk of regressions, but of course we will try to get beyond the obvious ones.
      I've always been curious about PGO and how they'd generate profile data for some scenarios. All I know is PGO won't let me compile Firefox on /tmp. Made it eat up 30GB of space...really thought I covered my ass with a 30GB /tmp.

      You think some optimizing freak will read your post and think "I wonder if I can compile Mesa using PGO and scenes from Dirt Rally and have a DR optimized Mesa"? Guess what game I'm playing. My gods...just imagine if that actually worked well and all the freakin benchmarks Michael would have to run...not to mention bug reports like: We've determined that Mesa 19.2.4 with F1 2018 PGO optimizations was the cause of the new Witcher 3 hair bug.


      • #4
        Originally posted by skeevy420
        I've always been curious about PGO and how they'd generate profile data for some scenarios. All I know is PGO won't let me compile Firefox on /tmp. Made it eat up 30GB of space...really thought I covered my ass with a 30GB /tmp.
        Yes, PGO data can get pretty big -- the instrumented build has to keep counters for everything that gets executed, after all.

        Originally posted by skeevy420
        You think some optimizing freak will read your post and think "I wonder if I can compile Mesa using PGO and scenes from Dirt Rally and have a DR optimized Mesa".
        That's doable if you have plenty of storage. I haven't tried this yet, but it should be pretty much the same as what we're doing for mozjpeg (where we record what it takes to decode and encode KDE desktop backgrounds -- should be enough variance to teach it about both photos and artificial patterns).

        If you're using clang (it's similar, but not 100% the same, with gcc):
        1. Build the version of Mesa you want to use with -fprofile-instr-generate
        2. Install it (or put it in LD_LIBRARY_PATH to make sure it's used instead of the system Mesa)
        3. export LLVM_PROFILE_FILE=code-%p.profclangr
        4. Run the game and play
        5. Quit
        6. unset LLVM_PROFILE_FILE
        7. llvm-profdata merge --output=code.profclangd *.profclangr
        8. Rebuild Mesa, this time with -fprofile-instr-use=/path/to/code.profclangd instead of -fprofile-instr-generate
        9. Install it
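
For anyone who wants to try the loop before committing to a full Mesa build, here is a rough sketch of the same steps on a toy C program. It is only an illustration under assumptions: hot.c, pgo_flag and pgo_demo are made-up names, the toy loop stands in for "run the game and play", and it assumes clang and llvm-profdata are on PATH.

```shell
#!/bin/sh
# Sketch of the clang PGO loop described above, on a toy program.
# hot.c, pgo_flag and pgo_demo are hypothetical names for illustration;
# they stand in for Mesa and for "run the game and play".

# Print the compiler flag for a PGO phase: "generate" or "use <profile>".
pgo_flag() {
    case "$1" in
        generate) printf '%s\n' "-fprofile-instr-generate" ;;
        use)      printf '%s\n' "-fprofile-instr-use=$2" ;;
        *)        printf 'unknown phase: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Run the whole loop in a scratch directory (subshell, so cd and set -e
# do not leak into the caller). Requires clang and llvm-profdata.
pgo_demo() (
    set -e
    work=$(mktemp -d)
    cd "$work"

    # Toy workload (hypothetical): one hot loop with a biased branch.
    printf '%s\n' \
        '#include <stdio.h>' \
        'int main(void) {' \
        '    unsigned long sum = 0;' \
        '    for (unsigned long i = 0; i < 100000000UL; i++)' \
        '        sum += (i % 3 == 0) ? i : 1;' \
        '    printf("%lu\n", sum);' \
        '    return 0;' \
        '}' > hot.c

    # Phase 1: instrumented build, then a training run.
    # %p in LLVM_PROFILE_FILE expands to the process ID.
    clang -O2 "$(pgo_flag generate)" hot.c -o hot
    LLVM_PROFILE_FILE="code-%p.profclangr" ./hot > /dev/null

    # Merge the raw profiles into the indexed file the compiler reads.
    llvm-profdata merge --output=code.profclangd code-*.profclangr

    # Phase 2: rebuild using the merged profile.
    clang -O2 "$(pgo_flag use "$work/code.profclangd")" hot.c -o hot
    ./hot > /dev/null
)
```

Calling pgo_demo runs both phases end to end. For a real package the two flags would go into whatever compiler and linker flags its build system accepts; the instrumented build needs the flag at link time as well so the profiling runtime gets pulled in.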

        I'd definitely be interested in knowing if it made a difference at all (I'd guess it won't make that much of a difference given most of the rendering will happen in the GPU anyway), and if [or to what extent] it hurts performance with other games, desktops or CAD applications.

        Originally posted by skeevy420
        My gods...just imagine if that actually worked well and all the freakin benchmarks Michael would have to run...not to mention bug reports like: We've determined that Mesa 19.2.4 with F1 2018 PGO optimizations was the cause of the new Witcher 3 hair bug.
        And someone will hack up a distro PGO-ed for Phoronix Test Suite (or other specific benchmarks) instead of the real world in 5... 4... 3... 2...


        • #5
          Originally posted by berolinux
          And someone will hack up a distro PGO-ed for Phoronix Test Suite (or other specific benchmarks) instead of the real world in 5... 4... 3... 2...
          You're being paranoid, sorry and no offence.

          Chances are anyone who did this might actually produce a fast distro as a result, because PTS tests a lot of different components using multiple benchmarks. If you wanted to cheat here, you'd have to modify a lot of your distro, which may actually benefit many other use cases as well. The compilers themselves use heuristics and make guesses for code paths that haven't been taken, so those still get some optimisation. Many of PTS's benchmarks try to give good coverage of the things they measure, and cheating there without actually giving something useful back should be increasingly difficult. PTS isn't just one benchmark, it's a whole lot of them really.


          • #6
            Originally posted by sdack
            You're being paranoid, sorry and no offence.

            Chances are anyone who did this might actually produce a fast distro as a result, because PTS tests a lot of different components using multiple benchmarks. If you wanted to cheat here, you'd have to modify a lot of your distro, which may actually benefit many other use cases as well. The compilers themselves use heuristics and make guesses for code paths that haven't been taken, so those still get some optimisation. Many of PTS's benchmarks try to give good coverage of the things they measure, and cheating there without actually giving something useful back should be increasingly difficult. PTS isn't just one benchmark, it's a whole lot of them really.
            ~~~~~~~~~~~~~~~~~The Joke~~~~~~~~~~~~~~~~~>>


            You


            • #7
              Some packages in SUSE's Tumbleweed are also built with profile feedback (clearly not Firefox and Chrome, as noted in an earlier post, because the build system prevents them from starting, which will hopefully be solved soon). These are mostly things that are easy to train and that improve the overall build time of the distro, like bash, python, etc.
