Following last month's initial benchmarks of the AMD "znver3" support that landed in the GCC 11 compiler, a premium supporter requested AMD Zen 3 benchmarks at more compiler optimization levels. Here are those numbers for anyone wanting to pursue aggressive compiler optimizations on a shiny AMD Ryzen 9 5950X.
This week marks three years since Spectre and Meltdown were made public, ushering in a wave of CPU security disclosures and mitigations that often resulted in measurable performance hits. Here is a look at how the performance costs stand today with various new and older Intel CPUs as well as AMD processors. This article looks at the current performance costs under Linux with the default mitigations and then with the relevant mitigations disabled at run-time for each of the processors under test, using an up-to-date Ubuntu 20.10 paired with the new Linux 5.10 LTS kernel.
Phoronix Test Suite 10.2 is available today as the latest quarterly (Q1-2021) feature update to our open-source automated benchmarking framework for Linux, macOS, Solaris, Windows, and BSD platforms.
Back on Christmas I wrote about Linux 5.11 regressing for AMD performance on Zen 2 and newer systems where the just-added CPU frequency invariance support was often hurting various workloads when using the default "Schedutil" scheduler utilization frequency scaling governor. Since then and through the holidays I have been carrying out many more benchmarks looking at the Linux 5.11 performance with a particular focus on the AMD desktop/server platforms.
There are many new features in the Linux 5.11 kernel that is presently under development, but one I've been especially curious about is the Intel "workload hints" support that can be passed via its thermal framework. This is about providing the system with hints about the workloads being run in order to optimize the thermal/power properties.
The Linux 5.11 merge window has been open for the past two weeks following the debut of Linux 5.10 but is set to close today. A lot of new features and exciting improvements were merged for Linux 5.11. It is somewhat of a bumpy ride at the moment, but it should be buttoned up and ready for its stable release come February.
It's not the Grinch in 2020 that stole Christmas, but the Schedutil CPU frequency scaling governor on the in-development Linux 5.11 kernel that is thrashing performance for AMD Zen 2 and newer. Distributions like Ubuntu, Fedora, and Manjaro are beginning to use CPUFreq Schedutil by default on newer kernels, which leads to a very bad initial/out-of-the-box experience with the current behavior on the early Linux 5.11 code.
At the start of the month AMD released AOCC 2.3 as the newest version of the AMD Optimizing C/C++ Compiler. AOCC is one of several LLVM/Clang downstream versions maintained by the company with this one being about delivering flagship AMD Zen family compiler support. From an AMD EPYC 7002 "Rome" series processor I recently wrapped up fresh benchmarks of AOCC 2.3 against the current GCC 10 and Clang 11 compiler releases.
Last week AMD published their Zen 3 support for the GCC compiler. That initial support, which has already been merged into GCC 11, flips on the newly supported instructions but does not yet offer any tuned scheduler model or other optimizations compared to the existing Zen 2 path. In any case, here is a look at the performance changes when building the open-source benchmarks under test with "znver3" compared to the prior Zen 2 and Zen 1 targets along with generic x86_64, and also at the performance when targeting Intel's Skylake and Haswell processors.
Last week I provided some benchmarks looking at the IBM POWER9 mitigation that requires the L1 data cache to be flushed upon entering the kernel and on user accesses due to a recently disclosed vulnerability. POWER9 allows speculatively operating on validated data in the L1 cache, but when it comes to incompletely validated data paired with other side channels it could lead to local users potentially obtaining improper access to data in the L1 data cache. When benchmarking a POWER9 4c/16t CPU the overall impact was fairly modest. Since then I have also fired up some benchmarks on a large POWER9 server with 44 cores / 176 threads to see the performance impact of this default Linux kernel change.
With this month's release of Chrome 87 having more performance improvements while Firefox 83 debuted with its "Warp" JavaScript improvements, it's a good time for some fresh Linux web browser benchmarks of these two main options. Plus with Firefox 84 to begin enabling WebRender by default in some Linux configurations, there is also a fresh run of Firefox with WebRender enabled.
One area not talked about much for Intel's latest Tiger Lake processors is the hardened CPU security mitigations against the various speculative execution vulnerabilities to date. What's peculiar about Tiger Lake though is that disabling the configurable mitigations can actually result in worse performance than the default mitigated state. At least that's what we are seeing so far with the Core i7 1165G7 on Ubuntu 20.10 Linux, which is the opposite of what we have been seeing on prior generations of hardware.
Last week a new vulnerability was made public for IBM POWER9 processors resulting in a mitigation of the processor's L1 data cache needing to be flushed between privilege boundaries. Due to the possibility of local users being able to obtain data from the L1 cache improperly when this CVE is paired with other side channels, the Linux kernel for POWER9 hardware is flushing the L1d on entering the kernel and on user accesses. Here are some preliminary benchmarks looking at how this security change impacts the overall system performance.
Now that the Samsung-contributed open-source exFAT file-system kernel driver has matured quite nicely since being merged earlier this year as a replacement for the short-lived staging exFAT driver based on an older code-base, here is a look at how exFAT is performing on the Linux 5.9 kernel compared to EXT4 and F2FS as well as the existing exFAT FUSE file-system implementation.
Making use of "-march=tigerlake" for building optimized binaries catering to Intel's latest-generation processors is well worth it on the likes of GCC 11. The new instruction set extensions on Tiger Lake deliver more uplift than we have seen out of recent Intel generations, and comparing the different "-march=" targets shows significant performance benefits if you don't mind compiling your own software from source.
The Linux 5.10 merge window is set to close this afternoon, followed by around seven weeks' worth of release candidates before the stable kernel release in December. As usual here is our look at the many new features set to premiere with this next version of the Linux kernel.
With Ubuntu 20.10 due for release this week I have begun testing near-final Ubuntu 20.10 builds on many more systems in the lab. Larger than our normal distribution/OS comparisons, here is the culmination of running hundreds of benchmarks (366 tests to be exact) under both Ubuntu 20.04 LTS with all available updates and then again on the Ubuntu 20.10 development state while testing on Intel Comet Lake.
Phoronix Test Suite 10.0-Finnsnes is now officially available as the latest major feature release for our open-source, cross-platform automated benchmarking software that now has more than six hundred tests/benchmarks available for fully-automated testing. With Phoronix Test Suite 10.0 also comes a significant overhaul to OpenBenchmarking.org, the biggest since its debut back in 2011 alongside Phoronix Test Suite 3.0.
Of the many new features in Linux 5.9 with its debut set for this weekend, one of the performance-related changes is Intel FSGSBASE support finally being mainlined. A half-decade after the Linux patches first appeared for this feature present in Intel CPUs going back to Ivy Bridge, the mainline kernel is now patched for this feature that can help out I/O and other context switching heavy workloads. Given that many of the same workloads were negatively impacted by the CPU security mitigations of recent years, here is a look at the current mitigated vs. unmitigated performance difference on the Linux 5.9 kernel with an Intel Core i9 9900K CPU, as a reference for the mitigation impact on recent versions of the Linux kernel.
Following last week's news of Firefox Nightly flipping on their new JIT "Warp" update, I was eager to run fresh benchmarks of the current Firefox releases compared to Google Chrome under Ubuntu Linux.
After being announced at the end of 2018 and going into beta last year, oneAPI 1.0 is now official for this open-source, standards-based unified programming model designed to support Intel's range of hardware from CPUs to GPUs to other accelerators like FPGAs. Intel's oneAPI initiative has been one of several exciting software efforts led by the company in recent years while it continues to serve as one of the world's largest contributors to open-source software.
One of the most frequent questions received at Phoronix in recent times is whether the "schedutil" governor is ready for widespread use and if it can compare in performance to, well, the "performance" governor on AMD Linux systems. Here are some benchmarks of an AMD Ryzen 9 3900XT using the latest Linux 5.9 development kernel to look at the performance differences between the CPUFreq governor options of Ondemand, Powersave, Performance, and Schedutil.
The Linux 5.0 to 5.9 kernel benchmarks on AMD EPYC showed the in-development Linux 5.9 kernel regressing in some workloads, and bisecting that issue brought up a performance regression over page lock fairness. A solution for Linux 5.9 has now landed.
Last week we reported on a Linux 5.9 kernel regression following benchmarks from Linux 5.0 to 5.9 that showed a sharp drop with the latest development kernel. That kernel regression was bisected to code introduced by Linus Torvalds at the start of the Linux 5.9 kernel cycle. Unfortunately it's not a trivial problem and it is still being analyzed to come up with a proper solution. So the short story is that it's a work-in-progress, while this article has some additional insight and benchmarks done over the course of the past few days.
The Linux 5.0 to 5.9 kernel benchmarking posted this week showed TensorFlow Lite running slower since the Linux 5.5 kernel... On top of looking at the new Linux 5.9 regressions, I also spent some time bisecting and figuring out what happened for TensorFlow Lite last year that has, at least for the system under test, caused it to run slower on all of the kernel releases this year, as shown in that article.
Recently carrying out some benchmarks of all major kernel releases from Linux 5.0 through Linux 5.9 ended up yielding some surprising performance changes with the in-development 5.9 kernel. Here are details on this historical look at the kernel performance and what's going on with the Linux 5.9 kernel slowdowns.
Following the debut of the big Blender 2.90 release and subsequently updating it for the Phoronix Test Suite / OpenBenchmarking.org, here is a deep dive into the Blender 2.90 performance... A number of areas are being looked at with the initial Blender 2.90 benchmarks, from the performance on various CPUs and GPUs, to Blender 2.82 vs. 2.90, to the Windows vs. Linux performance for Blender 2.90 with various means of acceleration.
At the end of June AMD quietly released a new version of the AMD Optimizing C/C++ Compiler. Having noticed the new release this week, here are some benchmarks of AOCC 2.2 up against LLVM Clang 10 and GCC 10 with Ubuntu Linux while running from an AMD EPYC 7742 2P server to look at the performance gains possible with the compiler optimizations.
As alluded to previously, a major overhaul of OpenBenchmarking.org has been in the works for a number of months now including a completely brand new analytics engine as part of the Phoronix Test Suite 10.0 development with its release due out later this year. With the new OpenBenchmarking.org now in good enough shape at least for the internal infrastructure, this new version is being opened up to the public today while over the weeks ahead more features will continue to be flipped on.
Linux 5.9-rc1 is set to be released this evening, marking the end of the two-week-long merge window where new features are introduced for the cycle.
761 software articles published on Phoronix.