New SMP build
fluffy_bunny
05-16-2008, 04:49 PM
Hi all
I'm in the process of replacing my Dual SMP AMD MPX system at the moment. I plan on staying with SMP, but with multi-core now, of course!
My plan is to build a dual Quad Xeon system with either 4 or 8GB of RAM. I run 50/50 Win XP Pro and a custom Linux distro based loosely on LFS. As the days go by I spend more time with Linux than I do with Windows.
I have the following on my shopping list so far:
Tyan S5396A2NRF i5400, S771 x 2, PCI-E (x16), DDR2 ECC 533/667 MHz, SATA II, SATA RAID, E-ATX/SSI
Intel Xeon E5420A Quad Core, S771, Harpertown Core, 2.5GHz, FSB 1333MHz, 12MB Cache
Crucial CT2KIT25672AF667 4GB kit (2GBx2), 240-pin FB-DIMM DDR2 PC2-5300
Gainward Bliss 8800 GT 1024MB GS NVIDIA 8800 GT, 1024MB TV, DVI-DVI GS PCI-E 2.0, Mem 1900MHz GDDR3, GPU 650MHz
Enermax EG1000EWLDXX 1000W V2 Galaxy Modular PSU 85% Efficiency EPS12v Triple Quad +24 Rails Silent x2 Fan
Silverstone SST-TJ10S Temjin Aluminium Tower Chassis in silver, RoHS
I'd be pleased to read your comments, good and bad. I'm particularly interested if you notice anything glaringly obvious. For example, it's only through reading the Phoronix reviews that I realised that I had to get an SSI-compatible case to support the weighty XEON heatsinks (thank you Phoronix!).
apaige
05-16-2008, 05:52 PM
Don't Barcelona-class AMD processors scale better, as far as SMP goes? They have more memory bandwidth, too. My Phenom's pbzip2 results are much better than those of Intel CPUs. The latter fare better in other areas, notably where L2 cache size matters, but since you're building a dual-quad rig…
Redeeman
05-16-2008, 10:12 PM
Don't Barcelona-class AMD processors scale better, as far as SMP goes? They have more memory bandwidth, too. My Phenom's pbzip2 results are much better than those of Intel CPUs. The latter fare better in other areas, notably where L2 cache size matters, but since you're building a dual-quad rig…
Those are some rather interesting results...
http://global.phoronix-test-suite.com/index.php?k=profile&u=redeeman-6636-9602-14335
http://global.phoronix-test-suite.com/index.php?k=profile&u=kte-17851-11474-15615
http://global.phoronix-test-suite.com/index.php?k=profile&u=khurios2000-17383-27337-1173
apaige
05-17-2008, 03:07 AM
Same test (pts 0.5.0), Phenom 9600 (2.3GHz), 4 cores, gcc 4.3.0:
Parallel BZIP2 v1.0.2 - by: Jeff Gilchrist [http://compression.ca]
[July 25, 2007] (uses libbzip2 by Julian Seward)
# CPUs: 4
BWT Block Size: 500k
File Block Size: 900k
-------------------------------------------
File #: 1 of 1
Input Name: bigfile
Output Name: bigfile.bz2
Input Size: 691505952 bytes
Compressing data...
Output Size: 425971060 bytes
-------------------------------------------
Wall Clock: 38.688440 seconds
38.7s versus 57s for kte's Phenom 9850 @2.7GHz and 69.8s for khurios' Phenom 9500 @2.2GHz. Maybe his hard drive was the bottleneck, maybe his RAM was set in ganged mode, maybe the TLB bug patch was enabled, I don't know. GCC 4.3.x also provides performance gains with recent processors such as the Phenom and the Core 2 Duo/Quad CPUs.
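To put those wall-clock figures on a common footing, here is a rough Python sketch that converts them to throughput. It assumes all three runs compressed the same 691505952-byte file, which may not hold given how often the benchmark file changed between pts releases; the function name and dictionary are made up for illustration.

```python
# Sizes copied from the pbzip2 output above.
INPUT_BYTES = 691_505_952
OUTPUT_BYTES = 425_971_060

# Wall-clock seconds as quoted in this post.
# ASSUMPTION: every system compressed the same input file.
wall_clock = {
    "Phenom 9600 @2.3GHz": 38.688440,
    "Phenom 9850 @2.7GHz (kte)": 57.0,
    "Phenom 9500 @2.2GHz (khurios)": 69.8,
}


def throughput_mb_s(input_bytes: int, seconds: float) -> float:
    """Input megabytes (10^6 bytes) consumed per second of wall-clock time."""
    return input_bytes / seconds / 1e6


ratio = OUTPUT_BYTES / INPUT_BYTES
print(f"compressed size: {ratio:.1%} of input")
for name, secs in wall_clock.items():
    print(f"{name}: {throughput_mb_s(INPUT_BYTES, secs):.1f} MB/s")
```

Throughput only makes the gap easier to read; it says nothing about whether the disk, the memory mode or the TLB patch caused it.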
But while 20.6s for your C2Q @3.2GHz is nothing to sneeze at, I've seen very different results for more comparable CPUs such as the Q6600. It's hard to find comparable data because the benchmark file changed so often - it'll be easier once pts 1.0 comes out with a definitive file.
Anyway, I don't have any experience with SMP systems with more than 4 cores. I just read that while AMD has had very little success on the consumer PC front, it currently has an edge in the server and HPC markets. http://www.anandtech.com/weblog/showpost.aspx?i=443
I guess what's important is to clearly identify the target usage, and determine which offering would perform better in the relevant scenarios. The OP hasn't stated what those would be.
Redeeman
05-17-2008, 08:57 AM
AMD does have more bandwidth available yes.
apaige
05-17-2008, 09:27 AM
Once more, there are very odd results in the PTS database.
PTS 0.7.0 (latest), multicore benchmarks: Phenom 9850 @2.50GHz (http://global.phoronix-test-suite.com/index.php?k=profile&u=root-3022-7554-26766) vs. Core 2 Quad Q6600 @3.38GHz (http://global.phoronix-test-suite.com/index.php?k=profile&u=dave-30041-21377-23986). The much higher-clocked Intel CPU is slower than the Phenom in ALL benchmarks. That's got to be wrong. And while the p7zip bench result is only mildly lower, the other ones are MUCH lower. There's gotta be a bottleneck somewhere (the hard drive perhaps?). EDIT: then again, the p7zip benchmark doesn't involve the hard drive at all.
Even the Phenom results are somewhat surprising. My lower-clocked Phenom (200MHz per core slower) scores about 6000 (vs. 4653) with the benchmark settings (i.e. p7zip compiled without optimizations), while the OpenSSL result somewhat matches my expectations (162 on mine vs. 167). With optimizations (gcc 4.3.0, -O3 -march=amdfam10 and the special amd64 makefile present in the package), p7zip scores 6891. Which brings me to another issue I'd like to mention: the lack of optimization in pts builds (but I guess that's for another thread).
Many benchmarks use tempfiles. That's one of the biggest drawbacks when you compare CPU speed.
apaige
05-17-2008, 11:25 AM
Many of those could be modified to output to /dev/null (that's the case now for the audio encoding profiles). Still, how do you explain the p7zip discrepancies? That benchmark doesn't use tempfiles, as far as I can tell.
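As a rough illustration of the /dev/null idea, the Python sketch below times a compression whose input is generated in memory and whose output is discarded, so the hard drive never enters the measurement. It uses Python's built-in bz2 module, not pbzip2 or any actual PTS profile, and the payload and function name are hypothetical.

```python
import bz2
import os
import time


def time_compress_to_devnull(data: bytes, level: int = 9) -> float:
    """Compress data in memory, discard the result, return elapsed seconds."""
    start = time.perf_counter()
    compressed = bz2.compress(data, compresslevel=level)
    # os.devnull is /dev/null on Linux; nothing reaches the disk.
    with open(os.devnull, "wb") as sink:
        sink.write(compressed)
    return time.perf_counter() - start


# Hypothetical payload: 1 MiB of random bytes held in RAM (no input file either).
payload = os.urandom(1 << 20)
elapsed = time_compress_to_devnull(payload)
print(f"{len(payload)} bytes compressed in {elapsed:.3f}s without touching the disk")
```

Random bytes barely compress, so a real test would want representative data; the point here is only that neither the read nor the write side depends on storage.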
p7zip is not a good multicore benchmark; a dual-core E6600 @ 3.2GHz already reaches 3800 MIPS. Maybe the C2Q got too hot and throttled down.
deanjo
05-17-2008, 12:41 PM
There are a few things I notice about the p7zip benchmark that can dramatically affect its performance.
-scheduler choice
-optimization builds
-tlb workarounds for phenoms
-assembly or non assembly compiles
-motherboard chipsets
Also, the Phenom system in the link apaige posted is using an RT kernel.
Max Spain
05-17-2008, 11:31 PM
That's all well and good, but the OP is considering building a dual quad-core processor system, and since Intel has its FSB and AMD has HT, HT wins hands down as far as bandwidth is concerned. That advantage could change depending on intended workload, but until Intel releases Nehalem with QuickPath, they can't compete on multiprocessor systems where lots of data needs to be shared.
So here is a nice benchmark with an OC Q6600:
http://global.phoronix-test-suite.com/index.php?k=profile&u=makke-16163-22205-28735
Compared against Phenom - this time the values are more logical...
deanjo
05-18-2008, 06:30 AM
Unfortunately, none of the PTS tests are of much relevance when we start talking about server use that would exploit SMP systems and show the bottlenecks that the FSB has. Start getting some VM / SQL / Apache and similar tests in there and then you could potentially start seeing the difference.
fluffy_bunny
05-18-2008, 07:05 AM
Thanks for the many replies.
Hmm, I'm wondering now if I should hold out for Nehalem - according to this Wiki http://en.wikipedia.org/wiki/Nehalem_(CPU_architecture) it's as big a change of architecture as the PPro was. If that's true, it's very significant.
I'd totally forgotten that Intel still doesn't use point to point communication - yet.
I may have to go back and investigate an AMD option based on the comments here:D
deanjo
05-18-2008, 10:28 AM
Thanks for the many replies.
Hmm, I'm wondering now if I should hold out for Nehalem - according to this Wiki http://en.wikipedia.org/wiki/Nehalem_(CPU_architecture) it's as big a change of architecture as the PPro was. If that's true, it's very significant.
I'd totally forgotten that Intel still doesn't use point to point communication - yet.
I may have to go back and investigate an AMD option based on the comments here:D
Until AMD has its answer to Nehalem, you can be guaranteed that Intel will price those at insane dollars.
Redeeman
05-18-2008, 08:52 PM
well... from these results:
http://global.phoronix-test-suite.com/?k=profile&u=redeeman-3588-5507-25823
one can certainly conclude that if one is going for a 7zip box, Intel is the way to go, but for OpenSSL performance, AMD is the way to go.
apaige
05-19-2008, 06:49 AM
Sigh, I'm tired of comparing apples to oranges. I'd like to see a Phenom 9750 against a Q6600 (at stock speed, which is 2.4GHz for both) and/or a Phenom 9850 against a Q9300 (both 2.5GHz).
apaige
05-19-2008, 07:45 AM
I don't see the point in overclocking when the goal is to compare two competing products. Besides, the Phenom doesn't overclock as well as Intel CPUs, and few people overclock their processors in the first place. Moreover, it requires somewhat expensive equipment (high-end motherboard, third-party cooler, etc…), and it's not exactly simple. When I see the comments on hardware articles, I get the feeling people who read them only care about overclocking their systems… Me, I want to run my CPU at stock speeds and use Cool N' Quiet/SpeedStep, in a reasonably quiet PC. Too bad my motherboard doesn't lower voltages when throttling the CPU's frequency (I've mailed Gigabyte about this, I'm still waiting for an answer).
I use stock Intel cooler for 2.4 -> 3.2 with E6600, standard P35 Gigabyte board - not the cheapest, but not "highend" like X48 or so. What board do you use and which cpu?
apaige
05-19-2008, 08:27 AM
Gigabyte GA-M61P-S3 (rather low-end, AM2-only socket) with a Phenom 9600 BE (started with an Athlon 64 X2 4000+ EE). I wasn't looking for the Black Edition specifically; I asked for a 9600, got a 9600 BE for the same price.
Do you use the latest BETA BIOS with setup defaults loaded? Be sure not to use automatic performance enhance in the MIT menu (use standard, not turbo).
apaige
05-19-2008, 09:16 AM
Yeah, latest BIOS. I wrote Gigabyte on May 9th, still no answer.
Anyway, here's a better setup to compare my CPU to: http://global.phoronix-test-suite.com/index.php?k=profile&u=edman007-6118-16404-13644
2 x Intel Xeon CPU E5410 @ 2.32GHz (Total Cores: 8). Write cacheing enabled (whatever that means, but sure helps with the pbzip2 bench). It's also close to what the OP intended to buy.
I get similar results in the FLAC and gzip benchmarks (12.7s for me vs. 12.3s and 60s vs. 64s), while his OpenSSL results are still disappointing (240 for him with 8 cores vs. 162 for me with 4 cores). His pbzip2 results are much better than what I've seen with other Intel benchmarks so far though: 32s with 8 cores vs. 73s for me. That's a huge increase in performance compared to the same setup but without "Write cacheing": 74s. http://global.phoronix-test-suite.com/index.php?k=profile&u=edman007-27825-20598-18993
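To make the "disappointing" OpenSSL scaling concrete, here is a small Python sketch that divides each quoted score by its core count and works out the pbzip2 gain from "Write cacheing". The numbers come from the post above; dividing by cores is a crude assumption (it treats the benchmark as scaling linearly), and the units of the OpenSSL score are not stated in the thread.

```python
# (OpenSSL score, core count) as quoted in the post above.
openssl_results = {
    "2 x Xeon E5410 (8 cores)": (240, 8),
    "Phenom 9600 (4 cores)": (162, 4),
}


def per_core(score: float, cores: int) -> float:
    """Score divided evenly across cores; ASSUMES linear scaling."""
    return score / cores


for name, (score, cores) in openssl_results.items():
    print(f"{name}: {per_core(score, cores):.1f} per core")

# pbzip2 wall-clock seconds on the Xeon box, without and with "Write cacheing".
PBZIP2_WITHOUT_CACHE_S = 74
PBZIP2_WITH_CACHE_S = 32
speedup = PBZIP2_WITHOUT_CACHE_S / PBZIP2_WITH_CACHE_S
print(f"pbzip2 speedup from write caching: {speedup:.2f}x")
```

A speedup that large from a storage setting alone supports the suspicion that pbzip2 on PTS is measuring the disk as much as the CPU.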
Overall it feels like his system is much better configured than other Intel systems on PTS. For one thing, FLAC should be substantially faster than LAME and Ogg Vorbis, which is the case with his system, but not with the others. I wonder what makes the difference?
Redeeman
05-19-2008, 11:11 AM
#17 apaige:
"Sigh, I'm tired of comparing apples to oranges. I'd like to see a Phenom 9750 against a Q6600 (at stock speed, which is 2.4GHz for both) and/or a Phenom 9850 against a Q9300 (both 2.5GHz)."
The Q9300 is not a fair comparison, since it doesn't match the price and it is a very nerfed Intel - the Q9450 has 12MB cache and costs around the same as the 9850.
#19:
Well, it is in fact QUITE simple to overclock the Intels: all I did was set the FSB to 400, and I instantly went from 2.66 -> 3.2GHz. The stock cooler can keep it within safety limits, as there is no need to raise the voltage. Also, a standard P35 motherboard can do this easily.
#23:
hmm.. those pbzip2 results are very low.. i just redid mine with pts 0.7:
http://global.phoronix-test-suite.com/index.php?k=profile&u=redeeman-7593-23982-30604
and with 0.5:
http://global.phoronix-test-suite.com/index.php?k=profile&u=redeeman-6636-9602-14335
apaige
05-19-2008, 11:25 AM
The Q9300 is not a fair comparison, since it doesn't match the price and it is a very nerfed Intel - the Q9450 has 12MB cache and costs around the same as the 9850.
Erm, excuse me, but the lowest prices I could find in France are 315 euros for the C2Q Q9450 and 179 euros for the Phenom 9850, while I could find the Q6600 at 141 euros and the Q9300 at 208 euros… Also, stock frequencies aren't even the same (2.667GHz and 2.5GHz respectively). I don't know where you got your prices from, but the Q9450 is a lot more expensive than all of the aforementioned processors from both Intel and AMD.
As for the C2Q Q9300 being a "nerfed" processor (I assume that means crippled?), although I can't compare myself, hardware review sites have found its performance to be on par (sometimes slightly better) with the similarly clocked Q6700. Again, compare apples to apples and oranges to oranges.
Edit: Release prices are $266 for both the 95W Q6600 and the Q9300, $316 for the Q9450 and $235 for the Phenom 9850 BE.
deanjo
05-19-2008, 11:54 AM
Well, here is a Phenom 9850 universe test. Keep in mind that this particular 9850 is running on an AM2 board which does not support HT3. Unfortunately I won't be able to post the results for the same processor in an Asus M3N-HT Deluxe board (780a chipset) until it's finished doing some fluid dynamics analysis (ETA to finish: next Saturday).
http://global.phoronix-test-suite.com/index.php?k=profile&u=dean-25316-17291-32168
Here is a Q6600 running at stock speed as well. With the exception of the mplayer compile (a bug that has been reported but is not fixed yet), it should give you some idea of what you're looking at.
http://global.phoronix-test-suite.com/index.php?k=profile&u=kabage-26266-14772-8279
I'd like to see Phenom 9950 (2,66 GHz with higher FSB than other Phenoms, coming very soon) vs. Q6700 (2,66 GHz) reviews.
deanjo
05-19-2008, 12:52 PM
I'd like to see Phenom 9950 (2,66 GHz with higher FSB than other Phenoms, coming very soon) vs. Q6700 (2,66 GHz) reviews.
Should be easily simulated by overclocking the 9850BE to 2.6GHz. The HT link speeds are identical on the 9850 and 9950. Just setting the multiplier to 13 instead of 12.5 would give the results you are looking for.
apaige
05-19-2008, 01:03 PM
Well, here is a Phenom 9850 universe test. Keep in mind that this particular 9850 is running on an AM2 board which does not support HT3.
Same here (AM2 socket, MCP61/nForce 430 chipset).
Here is a Q6600 running at stock speed as well. With the exception of the mplayer compile (a bug that has been reported but is not fixed yet), it should give you some idea of what you're looking at.
Sounds about right, although compilation times can't really be compared, since the Phenom benchmarks use GCC 4.3.1 and the Q6600 benchmarks use GCC 4.1.2. One thing about the Phenom pbzip2 results though: my 9600 scores 72s vs. 86s for the 9850. I guess there are indeed a lot of factors that come into play with that app. GCC 4.3.x should improve performance on the Q6600 as well.
Should be easily simulated by overclocking the 9850BE to 2.6GHz. The HT link speeds are identical on the 9850 and 9950. Just setting the multiplier to 13 instead of 12.5 would give the results you are looking for.
2.66 GHz. And the FSB is 266, instead of 200.
Redeeman
05-19-2008, 01:08 PM
#25:
hmm you are right, it appears the shops here just lowered prices for 9850 a few weeks ago - to the same price as q9300 here.
but the Q9300 seems to be almost as fast as the Q9450 for many things:
http://global.phoronix-test-suite.com/index.php?k=profile&u=redeeman-7115-5894-24995
deanjo
05-19-2008, 01:37 PM
2.66 GHz. And the FSB is 266, instead of 200.
Umm, no. First of all, there is no FSB on AMD procs (hasn't been for a long time). The ONLY differences between the 9850 and the 9950 are the multiplier, which is 13 instead of 12.5, and that the 9950 is a 140 Watt part.
http://www.sharkyextreme.com/hardware/cpu/article.php/3742726
AMD has no FSB? I think you would have to live in a parallel universe for this to be true ;)
deanjo
05-19-2008, 01:57 PM
AMD has no FSB? I think you would have to live in a parallel universe for this to be true ;)
They don't; they use HyperTransport, which is a replacement for the FSB.
Call it what you want, you always have an external clock, and the internal clock is external * multiplier. Usually you call the external clock FSB.
deanjo
05-19-2008, 02:54 PM
Call it what you want, you always have an external clock, and the internal clock is external * multiplier. Usually you call the external clock FSB.
Base clock has nothing to do with FSB. FSB is the link, not the clock. HyperTransport replaces the FSB, therefore you cannot have an FSB speed on a system that does not have a front side bus.
apaige
05-19-2008, 03:12 PM
HyperTransport != Front Side Bus. The CPU multiplier on AMD processors refers to the HT "base clock" (whatever that means) which is set at 200MHz for all models (not 266). Thus raising the CPU multiplier doesn't involve raising that base clock value, and the HT frequency remains at 2GHz for both the 9850 and the 9950 (1.8GHz for all other Phenom CPUs except the "power efficient" 9100e, set at 1.6GHz). The only thing that changes is the CPU's frequency (which would be 200 x 13 = 2.6GHz exactly, not 2.66).
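The arithmetic behind that last point is simple enough to check in a few lines of Python. The 200MHz base clock and the 12.5 and 13 multipliers are the values quoted in this thread; the function name is made up for illustration.

```python
HT_BASE_CLOCK_MHZ = 200  # base clock quoted above, the same for all Phenom models


def core_clock_mhz(multiplier: float, base_mhz: float = HT_BASE_CLOCK_MHZ) -> float:
    """CPU core frequency = base clock x CPU multiplier."""
    return base_mhz * multiplier


# Multipliers as given earlier in the thread for the 9850 and the 9950.
multipliers = {"Phenom 9850": 12.5, "Phenom 9950": 13.0}
for model, mult in multipliers.items():
    print(f"{model}: {core_clock_mhz(mult):.0f} MHz")
```

With a fixed 200MHz base, a 13x multiplier gives 2600MHz rather than 2666MHz, which is why the 9950 cannot be a 2.66GHz part without a different base clock.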
deanjo
05-19-2008, 11:05 PM
Phenom 9850 vs Q6600 @ stock speeds.
Thanks kabage.
http://global.phoronix-test-suite.com/index.php?k=profile&u=kabage-28961-5738-19437
Redeeman
05-20-2008, 12:35 PM
deanjo, I do not understand, there are no results on that benchmark?
deanjo
05-20-2008, 12:49 PM
deanjo, I do not understand, there are no results on that benchmark?
Huh? It shows the complete universe set of benchmarks. The link is good.
apaige
05-20-2008, 01:26 PM
I think it's safe to assume Redeeman smokes marijuana :P
deanjo
05-20-2008, 01:36 PM
I think it's safe to assume Redeeman smokes marijuana :P
Left click dammit! I said left click! :p
Redeeman
05-20-2008, 06:28 PM
or perhaps it is you two, that are on some rather heavy drugs, given that you apparently can see results here:
http://img160.imageshack.us/my.php?image=theuniversemf5.png
deanjo
05-20-2008, 07:39 PM
or perhaps it is you two, that are on some rather heavy drugs, given that you apparently can see results here:
http://img160.imageshack.us/my.php?image=theuniversemf5.png
Dude, something is screwed up with your browser.
http://i30.photobucket.com/albums/c316/deanjo/yoursystemisscrewed1.jpg
apaige
05-20-2008, 09:03 PM
Some kind of adblocker gone wild?
Redeeman
05-20-2008, 09:19 PM
It works nicely with the other results, so I would not imagine it is my browser...
deanjo
07-01-2008, 11:52 AM
As predicted here
2.66 GHz. And the FSB is 266, instead of 200.
Umm, no. First of all, there is no FSB on AMD procs (hasn't been for a long time). The ONLY differences between the 9850 and the 9950 are the multiplier, which is 13 instead of 12.5, and that the 9950 is a 140 Watt part.
http://www.sharkyextreme.com/hardware/cpu/article.php/3742726
The 9950 is released today:
http://products.amd.com/en-us/DesktopCPUFilter.aspx
SPECS:
Processor: AMD Phenom™ X4 Quad-Core
Model Number: 9950
Frequency (MHz): 2600
L2 Cache Size (KB): 512
Socket: AM2+
Stepping: B3
Manufacturing Tech (CMOS): 65nm SOI
Wattage (W): 140
System Bus (MHz): 4000