
Thread: Linux vs Solaris - scalability, etc

  1. #11
    Join Date
    Nov 2008
    Posts
    418


    Quote Originally Posted by TheOrqwithVagrant View Post
    Lots of interesting stuff.


    1. The SGI Altix UV and Altix 4000/3000 systems are NOT distributed memory systems, they are shared memory systems. They are the same "class" of multiprocessor servers as what you (Kebabbert) would refer to as "large enterprise SMP systems", such as the HP SuperDome, Oracle T & M-series enterprise servers, etc.
    Ok, this is interesting. Forgive me for saying this, but I am used to debating with Kraftman, and as you have seen, those "debates" tend to be quite... strange. I am not used to debating with sane Linux supporters. I have some questions about your interesting and informative post:

    1) You talk about the SGI Altix systems, and call them SMP servers. In that case, why don't people run typical SMP workloads on them, such as databases? They mostly run HPC workloads.

    2) Why do a lot of links refer to the Altix systems as HPC servers?

    3) I read a link about these SGI Altix servers, and the engineers at SGI said "it was easier to construct the hardware than to convince Linus Torvalds to allow Linux to scale to Altix 2048 cores". If it is that easy to construct a true SMP server - is this not strange? I know it is easy to construct a cluster. But to construct an SMP server in less time than it takes to convince Linus Torvalds?!

    4) I suspect the Altix systems are cheaper than an IBM P795 AIX server. Why don't people buy a cheap 2048-core Altix server to run SMP workloads, instead of buying a 32-CPU IBM P795 server to run SMP workloads? The old 32-CPU IBM P595 server used for the old TPC-C record cost 35 million USD, list price. What do the Altix servers cost?

    5) Why don't IBM, Oracle and HP just insert 16,384 CPUs into their SMP servers, if it is as easy as you claim to build large SMP servers? Why are all of them still stuck at 32-64 CPUs? SGI has gone into 1000s of cores, but everyone else with their Mature Enterprise Unix systems is stuck at 64 CPUs. Why is that?
    Last edited by kebabbert; 11-15-2011 at 08:49 AM.

  2. #12
    Join Date
    Nov 2008
    Posts
    418


    Holy Cow! ANOTHER sane Linux supporter. I did not even know they existed! I don't know if you have seen the usual Linux supporters here, but the debates with them.... are quite strange.

    You are the fourth person I am debating with, so this might take time as I have work to do. I will use Round Robin, or I should just drop the FUDers and talk only with the sane Linux people.



    Quote Originally Posted by jabl View Post
    No, this is wrong. It's not about "SMP servers" vs. "HPC servers". It's more about which kind of workloads the OS has been tuned for. The Linux VM subsystem has seen extensive scalability work in order to run (mostly) HPC style workloads on gigantic CC-NUMA machines. OTOH, database workloads are very different, and there hasn't been as much interest in scalability work that would benefit these kinds of workloads (although not too long ago a big bunch of "VFS scalability" work was integrated into the mainline kernel, so this situation is changing). For various reasons, the proprietary Unix vendors have had different priorities.
    This is strange. As Ted Ts'o said, Linux kernel devs did not have access to big SMP servers. It could be interpreted as the reason why Linux is not targeting SMP servers.

    You say that there is not much interest in SMP workloads such as databases. That is strange. RedHat, Suse, Novell etc. all want to make money. The big money is in Enterprise workloads, such as databases. We all know that Oracle made a fortune in databases.

    Thus, I am not convinced by your explanation that "there has not been much interest". My interpretation is that Linux kernel devs cannot handle that kind of workload; they have neither the servers nor the experience of building such servers. Oracle, IBM and HP have.

    Building a cluster is easy, and Linux kernel devs can develop for them. But not for SMP servers costing many millions of USD.

  3. #13
    Join Date
    Aug 2010
    Location
    Sweden
    Posts
    30


    Quote Originally Posted by kebabbert View Post
    You talk about "ZFS being the fastest filesystem"; that is not true. ZFS is the most _advanced and secure_ filesystem, but ZFS is actually quite slow. The reason is that ZFS protects your data, does checksum calculations all the time, and burns a lot of CPU to do that. If you do an MD5 or SHA-256 on every block, then everything will be slow, yes? Have you done an MD5 checksum on a file? It takes time, yes?
    ..........

    Sure, on many disks ZFS is faster, because it scales better on 128 disks and more. Solaris has no problem using many CPUs, and 128 disks burn a lot of CPU, but that load is distributed across many CPUs. Thus, you need many disks and a lot of CPU for ZFS to be fastest.
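
    To put a rough number on the per-block checksum cost described in the quote, here is a minimal Python sketch. It is not ZFS code and makes no claim about which checksum ZFS actually uses; it simply hashes every block of an in-memory buffer with SHA-256, the case the quote mentions. The 128 KiB block size, the 32 MiB buffer and the helper name `checksum_blocks` are assumptions chosen for illustration.

```python
import hashlib
import os
import time

BLOCK_SIZE = 128 * 1024   # assumed block size, for illustration only
NUM_BLOCKS = 256          # 256 blocks = 32 MiB of data in total


def checksum_blocks(data: bytes, block_size: int) -> list[str]:
    """Return the SHA-256 hex digest of each block_size chunk of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


# Random bytes stand in for file contents; no disk I/O is measured here.
data = os.urandom(BLOCK_SIZE * NUM_BLOCKS)

start = time.perf_counter()
digests = checksum_blocks(data, BLOCK_SIZE)
elapsed = time.perf_counter() - start

mib = len(data) / (1024 * 1024)
print(f"Hashed {len(digests)} blocks ({mib:.0f} MiB) in {elapsed:.3f} s")
print(f"Throughput: {mib / elapsed:.0f} MiB/s on one core")
```

    The throughput printed depends entirely on the CPU, so no figure is given here. Comparing it against the sequential read speed of the disks in question shows whether checksumming or the disks would be the bottleneck, and why spreading the work over many cores matters once there are many disks.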

  4. #14
    Join Date
    Nov 2008
    Posts
    418


    Quote Originally Posted by marwi509 View Post
    ..........
    Ok, you are right. I did not mean
    "ZFS needs lots of CPUs and disks to be the fastest filesystem"

    I meant,
    "A ZFS server needs lots of CPUs and disks to achieve the highest performance"

    Because ZFS scales well on Solaris, it can use many disks and CPUs. There is some point where Linux stops scaling well; at that point ZFS takes over and continues to scale. Thus, you can reach high ZFS performance if you have enough disks and CPUs. But the ZFS filesystem does not become faster, only the server becomes faster.

    On few disks and few CPUs, I expect Linux to be faster. On many disks and many CPUs, I expect Solaris solutions to scale better and be the fastest, as shown by different benchmarks.

    Thanks for pointing that out. You know, I am writing a lot of text here, and sometimes the wording is less than optimal. But thanks.

  5. #15


    Quote Originally Posted by kebabbert View Post
    Ok, this is interesting. Forgive me for saying this, but I am used to debating with Kraftman, and as you have seen, those "debates" tend to be quite... strange. I am not used to debating with sane Linux supporters. I have some questions about your interesting and informative post:

    1) You talk about the SGI Altix systems, and call them SMP servers. In that case, why don't people run typical SMP workloads on them, such as databases? They mostly run HPC workloads.

    2) Why do a lot of links refer to the Altix systems as HPC servers?
    For the same reason Oracle/Sun are _not_ referring to their servers as HPC servers, even though they could be (and have been) used for that - it's not the company's primary market. SGI has traditionally had two primary markets: HPC and graphics. The graphics side has largely disappeared because commodity graphics cards have become so insanely powerful that there really is no market left for specialized million-dollar visualization. Likewise, their HPC business took a big hit because for the majority of HPC workloads, a distributed memory cluster solution is often the much better value proposition. However, there are still some HPC workloads out there that simply don't scale well on distributed memory systems, where a huge multi-terabyte dataset needs to be kept all in memory at once and be accessible to all threads. This is the rather tiny niche market for giant shared memory HPC systems like this, but fortunately for SGI, they are the ONLY ones to go to for this type of system, which means they are pretty much guaranteed to sell a few 1000+ core megasystems to entities such as the DoD, NSA and NASA - and a few of those systems sold is enough to recoup development costs and keep the business afloat.

    Quote Originally Posted by kebabbert View Post
    3) I read a link about these SGI Altix servers, and the engineers at SGI said "it was easier to construct the hardware than to convince Linus Torvalds to allow Linux to scale to Altix 2048 cores". If it is that easy to construct a true SMP server - is this not strange? I know it is easy to construct a cluster. But to construct an SMP server in less time than it takes to convince Linus Torvalds?!

    If you look at the little SGI history section in my previous post, you'll realize that by the time SGI was building the Altix version that scaled to 2048 CPUs, they already had 10+ years of experience with the NUMAlink architecture. Scaling the Altix up from 512 to 2048 CPUs probably wasn't all that hard compared to, say, designing the first Origin 2000 systems, which was truly groundbreaking. On the other hand, Linus is very protective of the Linux "core", and whenever a company tries to merge changes to the core that benefit no one but them, they are going to have to make damn sure that those changes don't _hurt_ any other use case before they get merged into mainline. In the case of scaling to enormous CPU counts, there was a lot of (legitimate) concern about flaws in the early patches which hurt performance on "regular size" servers and embedded systems. Another example of an "epic struggle" with Linus is the Xen project's effort to get dom0 support into mainline Linux, which took 3+ years before the patches were of sufficient quality to be allowed in. Also, Linus's position nowadays, in his own words, is to be "the guy who says 'No'" - basically, he's the "quality control" for patches to the core of the Linux kernel. He does very little new development himself these days.

    Quote Originally Posted by kebabbert View Post
    4) I suspect the Altix systems are cheaper than an IBM P795 AIX server. Why don't people buy a cheap 2048-core Altix server to run SMP workloads, instead of buying a 32-CPU IBM P795 server to run SMP workloads? The old 32-CPU IBM P595 server used for the old TPC-C record cost 35 million USD, list price. What do the Altix servers cost?
    It's hard to find pricing information for the Altix UV, but for example, a 1152 CPU configuration cost about $3.5 million. It might be "cheap" per CPU compared to a P795 (which I believe is about $4.5 million fully decked out with 256 cores), but that's to be expected since the Altix UV uses commodity Xeon CPUs whereas the P795 has POWER7 CPUs. "Non-HPC" workloads like databases, Java app servers and the like are unlikely to be tuned to scale beyond the largest SMP configurations sold by the standard enterprise vendors, so a system bigger than 256 CPUs might simply be wasted on "standard" enterprise SMP workloads. That said, SGI does seem to be cautiously trying to "branch out" their market with the publication of the SPECjbb benchmarks and becoming an officially supported platform for Oracle 11g R2 last year. I haven't seen any database benchmarks yet, but I'll be very curious to see those numbers.
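
    The per-CPU comparison above can be worked through with a few lines of Python. This is only a back-of-the-envelope sketch using the two approximate figures quoted in the post; the post counts "CPUs" for the Altix UV and "cores" for the P795, so the units are not strictly comparable, and the constant and function names are made up for illustration.

```python
# Figures quoted in the post above. Both are rough, and the post mixes
# "CPUs" and "cores", so treat the result as order-of-magnitude only.
ALTIX_UV_PRICE_USD = 3_500_000   # "about $3.5 million"
ALTIX_UV_UNITS = 1152            # "1152 CPU configuration"
P795_PRICE_USD = 4_500_000       # "about $4.5 million fully decked out"
P795_UNITS = 256                 # "256 cores"


def price_per_unit(total_usd: int, units: int) -> float:
    """Return the average price per processing unit in USD."""
    return total_usd / units


altix = price_per_unit(ALTIX_UV_PRICE_USD, ALTIX_UV_UNITS)
p795 = price_per_unit(P795_PRICE_USD, P795_UNITS)

print(f"Altix UV: ~${altix:,.0f} per unit")
print(f"P795:     ~${p795:,.0f} per unit")
print(f"Ratio:    ~{p795 / altix:.1f}x")
```

    With these inputs the Altix UV works out to roughly $3,000 per unit against roughly $17,600 for the P795, a ratio of a little under 6x, which is consistent with the "cheap per CPU" remark.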

    Quote Originally Posted by kebabbert View Post
    5) Why don't IBM, Oracle and HP just insert 16,384 CPUs into their SMP servers, if it is as easy as you claim to build large SMP servers? Why are all of them still stuck at 32-64 CPUs? SGI has gone into 1000s of cores, but everyone else with their Mature Enterprise Unix systems is stuck at 64 CPUs. Why is that?
    It's "easy" for SGI because they've been doing it for a long time, and they designed their entire architecture towards the goal of being able to scale up almost indefinitely. This was their target market, and they knew at the time of these decisions that they already had customers for systems of that size. The market for extremely large shared memory systems is not really growing, hence it's not money well spent for Oracle, HP or IBM to change their existing architectures toward this goal and try to "muscle in" on the niche that's pretty much owned by SGI. Most of the other "big iron" vendors are increasing their "max cores per server" largely as a side effect of the trend towards fitting more cores/threads on one CPU die, rather than by changing their large servers to support more CPU sockets; hence they get their upward scaling "for free" due to the trends in CPU manufacturing and design.
    Second - there's a very good chance that SGI owns a bucketload of patents on this type of scalable architecture, and if Oracle or IBM tried to design a modular "infinitely expandable" shared memory design like SGI's, they'd be walking into a patent minefield.

    Finally - from what I see at actual customer sites, the large systems these days are VERY rarely used to run a single OS image; most are partitioned, or running hypervisors with a ton of VMs. Hence, the market outside of HPC for single OS image servers larger than what the M9000 and P795 can provide is really vanishingly small. That said, the Altix UV (as opposed to the older Itanium-based Altixes, which were MUCH more expensive) does put SGI in a position where it can actually be competitive for the enterprise side of that market, and it will be interesting to see whether they have any success with this.

  6. #16
    Join Date
    Nov 2011
    Posts
    24


    Quote Originally Posted by kebabbert View Post
    This is strange. As Ted Ts'o said, Linux kernel devs did not have access to big SMP servers. It could be interpreted as the reason why Linux is not targeting SMP servers.
    I have no idea what your point is here. Those that have an interest in scalability work certainly do have access to big NUMA systems (as I and others explained before, "big SMP" doesn't really exist anymore). But, the vast majority of the market and developer interest is towards smaller systems, yes. That's hardly surprising.

    You say that there is not much interest in SMP workloads such as databases.
    No, I said that there hasn't been that much scalability work for database type workloads, compared to, say, the effort SGI and others have put into scaling the VM subsystem. As I also mentioned, this is slowly changing, e.g. the recent VFS scalability work.

    That is strange. RedHat, Suse, Novell etc. all want to make money. The big money is in Enterprise workloads, such as databases. We all know that Oracle made a fortune in databases.
    Indeed, a large portion of RH et al. revenue has come from replacing legacy Unix systems for running databases. But the vast majority of that market has been replacing modest sized servers, rather than humongous servers worth millions each. Linux does just fine in that market. Heck, unless you've been living under a rock for the past 20 years, you should be well aware that Linux, Windows, and x86(-64) have all but obliterated the Unix workstation and server market, forcing the remaining proprietary Unix vendors into an increasingly stratified high-end niche.

    I expect that the current trend will continue, that is, the high-end market gets eaten from below by Windows and Linux, which reduces the profitability of the proprietary vendors. It will be interesting to see who will be the first one to throw in the towel.

    My interpretation is that Linux kernel devs cannot handle that kind of workload; they have neither the servers nor the experience of building such servers. Oracle, IBM and HP have.

    Building a cluster is easy, and Linux kernel devs can develop for them. But not for SMP servers costing many millions of USD.
    LOL. Fanboy much?

  7. #17
    Join Date
    Nov 2008
    Posts
    418


    Ok, thanks for your interesting and informative answer. However, I do not really feel that I have got short, clear answers to my questions. Let me try to recap, to see if I understand you correctly.



    1) SGI Altix systems are mostly running HPC workloads. Why?
    Answer: Altix is not interested in SMP workloads, however Altix is an SMP system and can handle SMP without problems.

    2) Why do many refer to Altix systems as HPC?
    Answer: Altix systems are SMP but are mostly used for HPC. Thus, out of tradition people say Altix is HPC.

    3) Why is it easier to construct a 2048 core SMP, than to convince Linus?
    Answer: SGI has been doing SMP for a long time. It is easy for SGI to do 2048 core SMP, because SGI are experts on SMP.

    4) Why are traditional Unix SMP systems with 64 cpus (IBM/Solaris/HP-UX) much more expensive than Altix 2048 core?
    Answer: SGI is using x86 cpus, which are much cheaper.

    5) Why don't traditional Unix SMP vendors just insert 16,384 CPUs if you claim it is easy to build an SMP server?
    Answer: IBM/Sun/HP cannot just insert 16,384 CPUs without violating SGI patents.



    Have I understood this correctly? You wrote a lot of text, so I tried to condense what you wrote.

  8. #18


    Don't let Kebabbert fool you. It is worth getting the facts about who Kebabbert is:

    http://forums.theregister.co.uk/foru..._old_iron_mia/

    He was rewarded by the head of Sun for his diligent posting a few years ago (article in Swedish):
    http://www.idg.se/2.1085/1.202161/st...a-suns-julfest
    I'll reply to your FUD later, Kebb.

  9. #19
    Join Date
    Sep 2006
    Posts
    714


    Linux supporters say that Linux scales excellently; they say Linux scales to 1,000s of cores. So what is the deal, does Linux scale badly or what?
    When the Solaris folks say that Linux does not scale they are talking about I/O, not processing power.

    As far as HPC is concerned, Linux is utterly dominant, in both clusters and large single-image systems. Even the systems in the Top500 that claim to be running Windows or whatever are actually dual booting Linux and run their production workloads on Linux. Many systems started off as Linux/proprietary Unix hybrids, but now are just running Linux. Stuff like that. The reason for this is that Linux has had a massive amount of research put into it, developers have become very comfortable with it, it's cheap, it's open source, and it stays out of the way. All these things are critical for HPC.

    For database work and I/O it's _completely_ different. Totally different ballgame than HPC. Linux still needs some maturing to get up to the same level as Solaris in large databases and stability under high loads. It'll get there, but it just takes time.

  10. #20
    Join Date
    Nov 2008
    Posts
    418


    Quote Originally Posted by kraftman View Post
    Don't let Kebabbert fool you. It is worth getting the facts about who Kebabbert is:

    http://forums.theregister.co.uk/foru..._old_iron_mia/
    I am flattered that you invest lots of energy and time to learn more about me, Kraftman. Do you think of me, often? If you want to know more things about me, you can just ask me instead of googling around.

    Again, I have never tried to hide or conceal that I was invited to a Christmas dinner by Sun a few years ago. You know, Microsoft also invited MS supporters to their Christmas party. And other companies too.

    It is well known that I was invited to a Christmas party, so why should I hide that? Do you think I am trying to hide that? If I really wanted to hide that, I would never have let them take pictures of me, and I would have refused an interview. Instead I would have silently gone to the party, without saying anything in the media. I have explained this to you before.



    I'll reply to your FUD later, Kebb.
    Please do. You have 4 questions from me, just as I had questions to "TheOrqwithVagrant" and he had questions to me, and we answered each other. Politely, without calling me "Idiot, FUD, Troll" and all the other things you frequently say. Do you want me to quote you, on this?

    And again, if you want to know more things about me, you can just ask me instead of trying to look me up. It is easier, and saves you time. I would not lie, you know that. You have never ever quoted me lying, even though I have asked you many times to cite a lie from me. I can always back up my claims with links and proofs; that is important to mathematicians. A liar does not care to back up his claims. But I care. Of course, if you ask too personal questions, then I might not answer, but in that case I will say so: "I prefer not to answer that question, as it is too personal". I will not make up a false answer.

    So go ahead. What do you want to know about me? *flattered*
    Last edited by kebabbert; 11-16-2011 at 09:08 AM.
