Thread: New SMP build

  1. #1
    Join Date
    May 2008
    Location
    London
    Posts
    3

    New SMP build

    Hi all

    I'm in the process of replacing my Dual SMP AMD MPX system at the moment. I plan on staying with SMP, but with multi-core now, of course!

    My plan is to build a dual quad-core Xeon system with either 4 or 8GB of RAM. I run 50/50 Win XP Pro and a custom Linux distro based loosely on LFS. As the days go by I spend more time with Linux than I do with Windows.

    I have the following on my shopping list so far:

    Tyan S5396A2NRF i5400, S771 x 2, PCI-E (x16), DDR2 ECC 533/667 MHz, SATA II, SATA RAID, E-ATX/SSI
    Intel Xeon E5420A Quad Core, S771, Harpertown Core, 2.5GHz, FSB 1333MHz, 12MB Cache
    Crucial CT2KIT25672AF667 4GB kit (2GBx2), 240-pin FB-DIMM DDR2 PC2-5300
    Gainward Bliss 8800 GT 1024MB GS NVIDIA 8800 GT, 1024MB TV, DVI-DVI GS PCI-E 2.0, Mem 1900MHz GDDR3, GPU 650MHz
    Enermax EG1000EWLDXX 1000W V2 Galaxy Modular PSU 85% Efficiency EPS12v Triple Quad +24 Rails Silent x2 Fan
    Silverstone SST-TJ10S Temjin Aluminium Tower Chassis in silver, RoHS


    I'd be pleased to read your comments, good and bad. I'm particularly interested if you notice anything glaringly obvious that I've missed. For example, it's only through reading the Phoronix reviews that I realised that I had to get an SSI-compatible case to support the weighty Xeon heatsinks (thank you Phoronix!).

  2. #2
    Join Date
    Apr 2008
    Posts
    126

    Don't Barcelona-class AMD processors scale better, as far as SMP goes? They have more memory bandwidth, too. My Phenom's pbzip2 results are much better than those of Intel CPUs. The latter fare better in other areas, notably where L2 cache size matters, but since you're building a dual-quad rig…
    Last edited by apaige; 05-16-2008 at 05:54 PM.

  3. #3
    Join Date
    Oct 2007
    Posts
    370

    Quote: Originally posted by apaige
    Don't Barcelona-class AMD processors scale better, as far as SMP goes? They have more memory bandwidth, too. My Phenom's pbzip2 results are much better than those of Intel CPUs. The latter fare better in other areas, notably where L2 cache size matters, but since you're building a dual-quad rig…

    Those are some rather interesting results...
    http://global.phoronix-test-suite.co...636-9602-14335
    http://global.phoronix-test-suite.co...51-11474-15615
    http://global.phoronix-test-suite.co...383-27337-1173

  4. #4
    Join Date
    Apr 2008
    Posts
    126

    Same test (pts 0.5.0), Phenom 9600 (2.3GHz), 4 cores, gcc 4.3.0:
    Code:
    Parallel BZIP2 v1.0.2 - by: Jeff Gilchrist [http://compression.ca]
    [July 25, 2007]             (uses libbzip2 by Julian Seward)
    
             # CPUs: 4
     BWT Block Size: 500k
    File Block Size: 900k
    -------------------------------------------
             File #: 1 of 1
         Input Name: bigfile
        Output Name: bigfile.bz2
    
         Input Size: 691505952 bytes
    Compressing data...
        Output Size: 425971060 bytes
    -------------------------------------------
    
         Wall Clock: 38.688440 seconds
    38.7s versus 57s for kte's Phenom 9850 @2.7GHz and 69.8s for khurios' Phenom 9500 @2.2GHz. Maybe his hard drive was the bottleneck, maybe his RAM was set in ganged mode, maybe the TLB bug patch was enabled, I don't know. GCC 4.3.x also provides performance gains with recent processors such as the Phenom and the Core 2 Duo/Quad CPUs.
    But while 20.6s for your C2Q @3.2GHz is nothing to sneeze at, I've seen very different results for more comparable CPUs such as the Q6600. It's hard to find comparable data because the benchmark file changed so often - it'll be easier once pts 1.0 comes out with a definitive file.
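
    For anyone who wants to compare runs in throughput terms rather than wall-clock seconds, here is a minimal sketch in Python. Only the byte counts and the wall-clock time are taken from the pbzip2 output above; the function and variable names are made up for illustration, and throughput is only comparable between runs that used the same input file.

```python
# Figures copied from the pbzip2 output above; everything else is illustrative.
INPUT_BYTES = 691505952
OUTPUT_BYTES = 425971060
WALL_CLOCK_S = 38.688440

MIB = 1024 * 1024


def throughput_mib_s(n_bytes: int, seconds: float) -> float:
    """Input data processed per second of wall-clock time, in MiB/s."""
    return n_bytes / MIB / seconds


def size_fraction(out_bytes: int, in_bytes: int) -> float:
    """Compressed size as a fraction of the input size."""
    return out_bytes / in_bytes


rate = throughput_mib_s(INPUT_BYTES, WALL_CLOCK_S)
fraction = size_fraction(OUTPUT_BYTES, INPUT_BYTES)

print(f"throughput: {rate:.1f} MiB/s")
print(f"compressed size: {fraction:.1%} of input")
```

    The same two functions can be fed the numbers from any other run, provided the benchmark file is identical.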

    Anyway, I don't have any experience with SMP systems with more than 4 cores. I just read that while AMD has had very little success on the consumer PC front, it currently has an edge in the server and HPC markets. http://www.anandtech.com/weblog/showpost.aspx?i=443
    I guess what's important is to clearly identify the target usage, and determine which offering would perform better in the relevant scenarios. The OP hasn't stated what those would be.

  5. #5
    Join Date
    Oct 2007
    Posts
    370

    AMD does have more bandwidth available, yes.

  6. #6
    Join Date
    Apr 2008
    Posts
    126

    Once more, there are very odd results in the PTS database.
    PTS 0.7.0 (latest), multicore benchmarks: Phenom 9850 @2.50GHz vs. Core 2 Quad Q6600 @3.38GHz. The much higher-clocked Intel CPU is slower than the Phenom in ALL benchmarks. That's got to be wrong. And while the p7zip bench result is only mildly lower, the other ones are MUCH lower. There's gotta be a bottleneck somewhere (the hard drive perhaps?). EDIT: then again, the p7zip benchmark doesn't involve the hard drive at all.

    Even the Phenom results are somewhat surprising. My lower-clocked Phenom (200MHz per core slower) scores about 6000 (vs. 4653) with the benchmark settings (i.e. p7zip compiled without optimizations), while the OpenSSL result somewhat matches my expectations (162 on mine vs. 167). With optimizations (gcc 4.3.0, -O3 -march=amdfam10 and the special amd64 makefile present in the package), p7zip scores 6891. Which brings me to another issue I'd like to mention: the lack of optimization in pts builds (but I guess that's for another thread).
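
    To make the gap easier to see, the p7zip scores quoted above can be normalised per GHz. This is a rough sketch only: the 2.3GHz clock is inferred from "200MHz per core slower" than the 2.50GHz Phenom 9850, the 6000 figure is approximate ("about 6000"), and the helper names are hypothetical.

```python
def per_ghz(score: float, clock_ghz: float) -> float:
    """Benchmark score divided by clock speed, for a crude clock-for-clock comparison."""
    return score / clock_ghz


def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of `new` over `old`."""
    return (new - old) / old


# Scores quoted in the post; clocks as described in the lead-in (assumptions).
db_phenom_9850 = per_ghz(4653, 2.5)   # PTS database result, Phenom 9850 @2.50GHz
own_phenom = per_ghz(6000, 2.3)       # "about 6000" on the lower-clocked Phenom
opt_gain = relative_gain(6891, 6000)  # optimized build vs. default build

print(f"PTS database Phenom 9850: {db_phenom_9850:.0f} per GHz")
print(f"Lower-clocked Phenom:     {own_phenom:.0f} per GHz")
print(f"Gain from optimized build: {opt_gain:.1%}")
```

    Per-GHz scaling ignores memory and chipset effects, so treat it as a sanity check rather than a benchmark in itself.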
    Last edited by apaige; 05-17-2008 at 09:53 AM.

  7. #7
    Join Date
    Aug 2007
    Posts
    6,613

    Many benchmarks use tempfiles. That's one of the biggest drawbacks when you compare CPU speed.

  8. #8
    Join Date
    Apr 2008
    Posts
    126

    Many of those could be modified to output to /dev/null (that's the case now for the audio encoding profiles). Still, how do you explain the p7zip discrepancies? That benchmark doesn't use tempfiles, as far as I can tell.
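
    As a rough illustration of the /dev/null idea, here is a minimal Python sketch that keeps the input in memory and discards the output, so disk speed stays out of the measurement. It is not how PTS implements its profiles; it uses Python's single-threaded bz2 module rather than pbzip2, and the function name and synthetic payload are assumptions for the example.

```python
import bz2
import os
import time


def time_compress_to_devnull(data: bytes, level: int = 9) -> float:
    """Compress in-memory data with bz2, discard the result, return elapsed seconds."""
    start = time.perf_counter()
    compressed = bz2.compress(data, compresslevel=level)
    # os.devnull is the platform's null device (/dev/null on Linux).
    with open(os.devnull, "wb") as sink:
        sink.write(compressed)
    return time.perf_counter() - start


# Small synthetic payload held in memory, so no input file is read from disk.
payload = os.urandom(4096) * 64
elapsed = time_compress_to_devnull(payload)
print(f"compressed {len(payload)} bytes in {elapsed:.4f} s")
```

    With both ends of the pipeline off the disk, what remains is CPU, memory and scheduler behaviour.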

  9. #9
    Join Date
    Aug 2007
    Posts
    6,613

    p7zip is not a good multicore benchmark; a dual-core E6600 @3.2GHz already reaches 3800 MIPS. Maybe the C2Q got too hot and throttled down.

  10. #10
    Join Date
    May 2007
    Location
    Third Rock from the Sun
    Posts
    6,583

    There are a few things I noticed about the p7zip benchmark that can dramatically affect its performance.

    -scheduler choice
    -optimization builds
    -TLB workarounds for Phenoms
    -assembly or non assembly compiles
    -motherboard chipsets

    Also, the Phenom system listed by apaige in the link is using an RT kernel.
    Last edited by deanjo; 05-17-2008 at 12:45 PM.
