Multi-Core Scaling Performance Of AMD's Bulldozer

Written by Michael Larabel in Processors on 26 October 2011 at 01:00 AM EDT. Page 3 of 7.
Scaling Bulldozer, Gulftown, Sandy, Shanghai

Starting off is the C-Ray ray-tracing benchmark, which is a Phoronix favorite as it always manages to do an amazing job at stressing all available CPU cores and is compatible with numerous platforms. The C-Ray results show the FX-8150 Bulldozer not scaling quite as well as the Intel Core i7 990X, Intel Core i5 2500K, and dual AMD Opteron 2384 systems.

When moving from one core to two, the C-Ray results on the Core i7 990X and the dual Opteron 2384s were exactly twice as fast, the Core i5 2500K was 1.94x faster, and the FX-8150 was only 1.63x faster. However, Bulldozer was at least ahead of the mobile Core i7 2630QM Sandy Bridge (a 2GHz quad-core with Hyper Threading), which was scaling very poorly. With four cores enabled, the FX-8150 was 3.26x faster than the single-core configuration, while the Core i5 2500K hit 3.68x and the Intel Core i7 990X Gulftown hit 4.01x.

When running at eight threads, the FX-8150 was 5.98x faster than the single-thread result. The Core i7 990X meanwhile was at just 4.38x, the Core i7 2630QM at 4.21x, and the dual AMD Opteron 2384 quad-core configuration was at 7.96x. The Bulldozer micro-architecture strategy is not as effective as having eight Shanghai cores, but it at least did better than the Intel Sandy Bridge / Gulftown CPUs that were partially utilizing Hyper Threading. Hyper Threading does very poorly with C-Ray and other select workloads, as is shown by the awkward performance of Gulftown when hitting eight and twelve threads.
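To put the C-Ray multipliers on a common footing, one option is to divide each speedup by the number of enabled cores/threads, giving a parallel efficiency where 1.0 would be perfect linear scaling. The sketch below is only an illustration built from the figures quoted above; the names `C_RAY_SPEEDUP`, `parallel_efficiency`, and `efficiency_table` are made up for this example and are not part of any benchmark tooling.

```python
# Speedup figures quoted in the text for C-Ray, relative to each
# system's own single-core run. Keys are enabled cores/threads.
C_RAY_SPEEDUP = {
    "FX-8150": {2: 1.63, 4: 3.26, 8: 5.98},
    "Core i7 990X": {2: 2.00, 4: 4.01, 8: 4.38},
    "Core i5 2500K": {2: 1.94, 4: 3.68},
    "2x Opteron 2384": {2: 2.00, 8: 7.96},
}


def parallel_efficiency(speedup: float, threads: int) -> float:
    """Return speedup divided by thread count (1.0 = perfect linear scaling)."""
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return speedup / threads


def efficiency_table(results):
    """Map each CPU to {threads: efficiency} for every quoted data point."""
    return {
        cpu: {n: parallel_efficiency(s, n) for n, s in points.items()}
        for cpu, points in results.items()
    }


if __name__ == "__main__":
    for cpu, row in efficiency_table(C_RAY_SPEEDUP).items():
        cells = ", ".join(f"{n} threads: {eff:.1%}" for n, eff in row.items())
        print(f"{cpu}: {cells}")
```

By this measure the FX-8150 at eight threads works out to roughly 75% efficiency (5.98 / 8), compared with about 99.5% for the dual Opteron 2384 system and about 55% for the Core i7 990X, which only has six physical cores and is leaning on Hyper Threading at that point.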


With Smallpt, a lightweight and furiously fast multi-threaded path tracing test, the AMD FX-8150 multi-core scaling is right in line with the Sandy Bridge, Gulftown, and Shanghai processors. The only processor struggling to compete was the mobile Sandy Bridge (Core i7 2630QM). Hyper Threading tends to tarnish the 8/12 thread results for the Core i7 990X Extreme, but aside from that the Core i5 2500K, dual Opteron 2384, Core i7 990X, and FX-8150 results all scaled approximately the same. When testing four threads, the FX-8150 was 4.05x faster than its single-thread result while the Core i5 2500K was just 3.63x faster. With eight threads (fully utilizing the FX-8150), the improvement was 6.05x over the single-thread result, while the dual Opteron 2384 was at 8.02x and the Core i7 990X at 6.11x.
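Another way to read the FX-8150 Smallpt numbers is to look at how much speedup each additional thread contributes between the quoted measurement points. The following is a rough sketch using only the 4-thread and 8-thread figures above plus an implied 1.0x single-thread baseline; `marginal_speedup` and `FX_8150_SMALLPT` are hypothetical names chosen for this illustration.

```python
def marginal_speedup(points):
    """Return speedup gained per added thread between consecutive data points.

    `points` maps thread count -> speedup over the single-thread run.
    A 1-thread baseline of 1.0x is assumed when it is not supplied.
    """
    data = {1: 1.0, **points}
    counts = sorted(data)
    return {
        (lo, hi): (data[hi] - data[lo]) / (hi - lo)
        for lo, hi in zip(counts, counts[1:])
    }


# FX-8150 Smallpt speedups quoted in the text.
FX_8150_SMALLPT = {4: 4.05, 8: 6.05}

if __name__ == "__main__":
    for (lo, hi), gain in marginal_speedup(FX_8150_SMALLPT).items():
        print(f"{lo} -> {hi} threads: +{gain:.2f}x per added thread")
```

On these figures each of the first four threads adds about a full 1.0x, while each thread from the fifth to the eighth adds about 0.5x.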

