Michael, can you also test core scaling with one core per module disabled? That way we could compare IPC with CMT enabled and disabled, mainly with four threads in these two cases:
1) 2 modules active, 2 cores per module active = 4 threads with CMT
2) 4 modules active, 1 core per module active = 4 threads without CMT
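For anyone who wants to approximate these two cases on Linux without BIOS options, here is a rough sketch using CPU affinity. It assumes the kernel numbers the two cores of a module adjacently (CPUs 0-1 are module 0, CPUs 2-3 are module 1, and so on); that layout is an assumption, so check it with `lscpu -e` first. The helper names `cpu_set` and `pin_current_process` are made up for illustration. Pinning only restricts where the benchmark runs, so it is not identical to actually disabling cores in the BIOS.

```python
import os

def cpu_set(active_modules, active_cores_per_module, cores_per_module=2):
    """Return the logical CPU ids to use.

    Assumes (hypothetically) that sibling cores of a module are numbered
    adjacently: module m owns CPUs m*cores_per_module .. m*cores_per_module+1.
    """
    return {
        module * cores_per_module + core
        for module in range(active_modules)
        for core in range(active_cores_per_module)
    }

# Case 1: 2 modules, both cores each -> 4 threads with CMT sharing
CASE_CMT = cpu_set(2, 2)       # {0, 1, 2, 3}

# Case 2: 4 modules, one core each -> 4 threads without CMT sharing
CASE_NO_CMT = cpu_set(4, 1)    # {0, 2, 4, 6}

def pin_current_process(cpus):
    """Restrict the calling process (and its future children) to `cpus`.

    Linux only; os.sched_setaffinity is not available on every platform.
    """
    os.sched_setaffinity(0, cpus)
```

Calling `pin_current_process(CASE_CMT)` or `pin_current_process(CASE_NO_CMT)` before launching a four-thread benchmark from the same process should give the two configurations, as long as the numbering assumption holds on the machine.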
Nice article.
A comparison to the X6 1090T would be interesting, to see whether AMD was able to address the bottlenecks that plagued Bulldozer cores due to sharing resources with the other core in the same module.
I remember seeing similar tests back when the first Bulldozer came out, and IIRC some motherboards allow you to do this directly from the BIOS. A user reported an IPC increase of about 6% in Cinebench with the FX-8320, but I couldn't find any review that tested this scenario in depth.