The Performance Cost To A Proposed Fedora 37 CFLAGS/CXXFLAGS Change

  • The Performance Cost To A Proposed Fedora 37 CFLAGS/CXXFLAGS Change

    Phoronix: The Performance Cost To A Proposed Fedora 37 CFLAGS/CXXFLAGS Change

    Last week a Fedora 37 change proposal emerged that aims to improve the profiling and debugging of Fedora packages, but with possible performance costs. The proposal would add "-fno-omit-frame-pointer" to the default CFLAGS/CXXFLAGS when building packages, so that the frame pointer is always available to debuggers and profilers working with the stock Fedora packages. Unfortunately, it can come with significant performance costs, as these benchmarks show.
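
    For readers wondering why an always-available frame pointer helps profilers, here is a minimal toy sketch in Python, not real profiler code. It assumes the common x86-64 convention where each stack frame stores the caller's saved frame pointer, followed one word later by the return address. The `memory` dict, all addresses, and the function names are made up for illustration.

```python
WORD = 8  # assumed word size in bytes (x86-64)


def walk_frame_pointers(memory, fp, max_depth=64):
    """Collect return addresses by following the chain of saved frame pointers.

    memory: dict mapping address -> word (toy stand-in for stack memory)
    fp:     current frame pointer value; 0 marks the end of the chain
    """
    trace = []
    while fp != 0 and len(trace) < max_depth:
        saved_fp = memory[fp]            # [fp]        holds the caller's frame pointer
        return_addr = memory[fp + WORD]  # [fp + WORD] holds the return address
        trace.append(return_addr)
        fp = saved_fp                    # step up to the caller's frame
    return trace


# Hypothetical stack: leaf() was called by work(), which was called by main().
memory = {
    0x7000: 0x7040, 0x7008: 0x401200,  # leaf's frame: saved fp of work, return into work
    0x7040: 0x7080, 0x7048: 0x401100,  # work's frame: saved fp of main, return into main
    0x7080: 0x0,    0x7088: 0x401000,  # main's frame: chain ends here
}

backtrace = walk_frame_pointers(memory, fp=0x7000)
print([hex(addr) for addr in backtrace])
```

    When the compiler omits the frame pointer, that register is used as a general-purpose register and this chain does not exist, so a profiler has to fall back to something heavier such as DWARF unwind information.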


  • #2
    now, if we Just in Time or Ahead of Time optimize for high performance vs. debugging and available SIMD variants (e.g. via pre-combined and partially pre-optimized bitcode) locally instead of statically compiling most software for some least common denominator, ... https://www.youtube.com/watch?v=N09IZvuyWB4


    • #3
      *Starts looking at benchmarks*
      "Well that's not so bad"
      *Scrolls down*
      "Jeez"
      *Clicks next page*
      "Oh shit"
      *Clicks next page*
      "I wonder if '-O3 -fno-omit-frame-pointer' would 'mitigate' that?"


      • #4
        Originally posted by rene
        now, if we Just in Time or Ahead of Time optimize for high performance vs. debugging and available SIMD variants (e.g. via pre-combined and partially pre-optimized bitcode) locally instead of statically compiling most software for some least common denominator, ... https://www.youtube.com/watch?v=N09IZvuyWB4
        Personally I think hwcaps is a better approach. Just offer more "least common denominator" variants for relevant software. It just needs to reach (more) Linux distributions.

        Everything else is just too much bloat and overhead.
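
        A rough sketch of what the hwcaps approach amounts to, as a toy model in Python rather than the dynamic loader's actual code. It assumes the glibc-hwcaps layout with `x86-64-v2`, `x86-64-v3` and `x86-64-v4` subdirectories; the library name `libexample.so` and the function `pick_library` are hypothetical.

```python
# Microarchitecture levels, best first; the baseline build is the fallback.
LEVELS = ["x86-64-v4", "x86-64-v3", "x86-64-v2"]


def pick_library(available, cpu_supported, libname):
    """Return the path of the best variant of libname that this CPU can run.

    available:     dict mapping level -> set of library names shipped for it
    cpu_supported: set of levels the running CPU supports
    """
    for level in LEVELS:
        if level in cpu_supported and libname in available.get(level, set()):
            return f"glibc-hwcaps/{level}/{libname}"
    return libname  # no optimized variant usable, use the baseline build


# Hypothetical distribution that ships a v3 and a v2 variant of one library.
shipped = {
    "x86-64-v3": {"libexample.so"},
    "x86-64-v2": {"libexample.so"},
}

modern_cpu = {"x86-64-v2", "x86-64-v3"}
old_cpu = set()

print(pick_library(shipped, modern_cpu, "libexample.so"))
print(pick_library(shipped, old_cpu, "libexample.so"))
```

        The selection happens once at load time and costs a directory lookup, which is the "no runtime bloat" argument; the price is extra package size for every variant shipped.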


        • #5
          Implement ORC or parse DWARF in the kernel. Performance profiling is not something 98% of users would need, and the 2% do not deserve the greenhouse emissions the 14% perf hit would cause.


          • #6
            Originally posted by -MacNuke-

            Personally I think hwcaps is a better approach. Just offer more "least common denominator" variants for relevant software. It just needs to reach (more) Linux distributions.

            Everything else is just too much bloat and overhead.
            Well, few will ship hwcap variants. They are also pretty coarse, and do not even cover cases like this. With some Scalable Vector byte code it could even be forward compatible, optimizing vector code even for next year's SIMD extensions, just like RISC-V V and ARM SVE do in hardware.


            • #7
              Originally posted by rene
              Well, few will ship hwcap variants. They are also pretty coarse, and do not even cover cases like this. With some Scalable Vector byte code it could even be forward compatible, optimizing vector code even for next year's SIMD extensions, just like RISC-V V and ARM SVE do in hardware.
              We are talking about open source software here. There is no need to be forward compatible. If something new comes, just recompile.

              JIT and AOT are not perfect either. You assume the bytecode generator to be perfect. You assume the bytecode fits the target architecture perfectly. You assume the JIT and/or AOT phase to be so fast that the performance benefit does not eat itself. You assume the resulting code is faster than compiled directly.

              Now you start up a Linux distribution and sit there waiting for it to be compiled. For every update. You start up a container and sit there again waiting for it to be compiled... for every container base image variant and container system (docker, flatpak, snap, nspawn, ...) and image update. Your compiler toolchain updates -> Compile everything again.

              Yeah no... too many ifs, too much runtime bloat, too much overhead. Just use Gentoo if you want something like that.


              • #8
                Just... WHY?

                You have release builds and you have debug builds. And a whole range in between if need be.
                What you push down the throat of your users should be release builds with the best possible performance. Adding a single flag to make debugging easier is just a stupid argument. If you want to debug, well, recompile with those flags. It's that simple.

                Glad I left the Fedora bandwagon many years ago!


                • #9
                  I don't mind some performance cost when there is an actual, practical reason behind it but this is just bonkers. What exactly is FB doing that they need to profile the entire userspace? Sampling profilers have limited accuracy anyway and if your code spends most of the time in external libraries, then you are either doing it right or you need to switch to better libraries.


                  • #10
                    Are they out of their minds?

                    "Fedora will add -fno-omit-frame-pointer to the default C/C++ compilation flags, which will improve the effectiveness of profiling and debugging tools."

                    You wanna profile and debug - build a debug package. End users shouldn't suffer and get their systems slowed down because some developer has an itch.

                    You may as well build everything with `-O -g` and not strip debug symbols. What? Too much bloat? Too slow? Who cares! Debugging all the way!
