In all of the benchmarks I've seen so far, LLVM's so-called "wins" are within about 5%, i.e. small enough to be lost in run-to-run noise. Furthermore, not all of GCC's wins are due solely to the absence of OpenMP in LLVM; some of the benchmarked programs don't use OpenMP at all, yet still show a 20% delta favoring GCC. Clearly, LLVM has a long way to go. The problem is that the last mile is 10 times harder than the previous hundred miles; the last 10 feet are 10 times harder than the last mile; the last inch is 10 times harder than the last foot; and so on (the analogy works better with the metric system but I'm too lazy to edit what I typed).
Unless LLVM literally copies and pastes much of the micro-optimization work from GCC, there is no reason to think they will implement those expensive optimizations in any sort of reasonable timeframe. Look how long it took GCC to develop them. Moreover, there wasn't any other good open-source competition to GCC at the time the GCC devs were developing those optimizations, so they pretty much had to do them from scratch. Now, you might say that LLVM developers could just study the overall algorithmic approach GCC takes to optimizing so well and base their own optimizations on that. But by your own admission, the internal architectures of GCC and LLVM are wildly different. So LLVM will not be able to easily copy and paste from GCC, even if the license permitted it, because of that difference in architecture. In short, LLVM will have to do things "mostly" from scratch, whereas GCC had to invent its optimizations "entirely" from scratch.
Anyway, the whole point of a compiler is that it is a tool; it's not the end product in itself. Complex internals vs. well-documented and elegant internals adds no value to the end product either way. Having a compiler that produces fast code (or small code, depending on your needs) is a value-added attribute. Having a compiler that is merely well-designed is not, by itself, a value-added attribute. If forced to choose between a compiler with more/better value-added attributes and one without them, it should be a no-brainer for anybody who has ever taken a business class, or for anyone whose goal is to deliver high-quality products to their customers, whoever those customers are.
I recall Michael posting an article/video a few months ago about some developer working on advanced tooling using LLVM, and I remember being pretty impressed. I mean, if you take this kind of thing to its logical extreme, C++ could almost start to approach the maintainability and productivity of Java, which is a huge feat for such a terrible language. So why don't we invest all those man-hours into the slow LLVM/Clang to make it as easy to develop with as Java, just so we can say C++ is the best? Meanwhile our "C++" will be almost as slow as Java because we aren't using a compiler that has fully explored runtime performance optimization.
Now, more than ever, GCC is poised to produce better diagnostics than it has in the past. With the introduction of C++ into GCC's source code, internal APIs are being rewritten in an object-oriented manner, replacing old spaghetti code with a layered architecture that at least belongs in the same discussion as LLVM's, even if LLVM's is "even more layered" or "even better designed".
The point is, while LLVM is trying to catch up to GCC's performance, GCC is trying to catch up to LLVM's usefulness to developers. People are working on both sides to shore up each compiler's weaknesses. Claiming that GCC's complex/unmaintainable internals make it unable to match LLVM in the long run is plain wrong, if for no other reason than that GCC's internals are actively being rewritten as we speak. And (hopefully) they'll keep all of the optimizations they have today, just resting on cleaner object-oriented abstractions, split out into more shared libraries, etc., to make the code more maintainable.
I disagree, however, that the point of a compiler is to provide good diagnostics. I would rather the compiler focus on what it does best -- compiling -- and run a different, separate tool when I want to be told why my code is bad.
Actually, I'm a huge fan of clang analyzer. That is exactly the kind of application that LLVM is best at. All hail the open source equivalent to Coverity, which I hope will in time produce even better diagnostics than Coverity itself, and then some.
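To make that concrete, here's a minimal sketch -- my own toy example, not taken from any real project, with a hypothetical file name -- of the kind of defect the analyzer's path-sensitive checks are good at catching even though the code compiles cleanly:

    #include <stdio.h>
    #include <stdlib.h>

    // Toy example: the early return leaks 'buf', and the malloc result
    // is never checked for NULL. An ordinary compile is typically silent
    // about both; a path-sensitive analyzer can flag them.
    static int sum_first_two(int n)
    {
        int *buf = static_cast<int *>(malloc(n * sizeof(int)));
        if (n < 2)
            return -1;              // leaks 'buf' on this path
        buf[0] = 1;                 // possible NULL dereference
        buf[1] = 2;
        int result = buf[0] + buf[1];
        free(buf);
        return result;
    }

    int main()
    {
        printf("%d\n", sum_first_two(5));
        return 0;
    }

Something like `clang++ --analyze example.cpp`, or `scan-build make` over a whole build, is how you'd drive it -- and the point is that none of this needs to live inside the compile-and-link step.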
In my perfect world, clang would -- as you said somewhere above -- not even be able to codegen and link a finished binary. Its sole focus would be on helping developers improve their code by flagging problems: outright errors, inadvisable practices, slow constructs, non-standard-compliant code, and so on. Clang could very easily fit into the open source ecosystem this way.
If Clang DOES fully catch up with GCC on the performance of compiled code, that's great -- but I think it's unlikely, because Clang is much more valuable if effort is concentrated on its diagnostics, which you yourself admitted are the most important aspect of a developer tool (not to be confused with a compiler). Clang has been focused on being a developer tool from the get-go, so why not just push that angle and leave the release builds to GCC?
First, as a premise: the year is 2012. Let's not get ahead of ourselves here.
- Are you compiling a release build? Then use GCC!
- Are you already familiar with GCC's error messages and know what to do when you see one? Then use GCC!
- Are you stumped by an error, or looking to improve the quality of your code? Then use clang or clang-analyzer!
- Are you writing a graphics driver and don't have the manpower / time to develop your own optimizing shader compiler? Then use LLVM! (A rough sketch of what that looks like follows this list.)
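For that last bullet, here's a rough sketch of what reusing LLVM as a backend looks like: your driver translates its own shader IR into LLVM IR (here just a hard-coded "add two floats" stand-in) and then hands it to LLVM's optimizers and code generators. This is my own illustrative snippet against the LLVM C++ API as I understand it, not anything from an actual driver, so treat the function name and details as approximate:

    // Build a tiny "shader" function in LLVM IR instead of writing
    // an optimizing code generator from scratch.
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    int main() {
        llvm::LLVMContext ctx;
        llvm::Module mod("toy_shader", ctx);
        llvm::IRBuilder<> builder(ctx);

        // float blend(float a, float b) { return a + b; }
        llvm::Type *f32 = builder.getFloatTy();
        llvm::FunctionType *fnTy =
            llvm::FunctionType::get(f32, {f32, f32}, /*isVarArg=*/false);
        llvm::Function *fn = llvm::Function::Create(
            fnTy, llvm::Function::ExternalLinkage, "blend", &mod);

        llvm::BasicBlock *entry = llvm::BasicBlock::Create(ctx, "entry", fn);
        builder.SetInsertPoint(entry);

        llvm::Function::arg_iterator args = fn->arg_begin();
        llvm::Value *a = &*args++;
        llvm::Value *b = &*args;
        builder.CreateRet(builder.CreateFAdd(a, b, "sum"));

        // A real driver would feed 'mod' to LLVM's optimization passes
        // and a target backend (or a JIT) here; we just print the IR.
        mod.print(llvm::outs(), nullptr);
        return 0;
    }

That's the whole pitch for the driver case: the hard, expensive parts (instruction selection, register allocation, the optimization pipeline) come from the library, and the driver only has to produce IR.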
Now, the year is 2014. Oh, wait -- I was about to write a similar list for what the situation will be in 2014, but then I realized I've misplaced my crystal ball. Maybe you have it over there and can find it for me, elanthis?