True. I guess after LLVMPipe is mostly feature complete, they might do a few optimisation runs, and we will probably see some improvements.
And what will happen when the Gallium3D developers decide to deprecate TGSI, or use an LLVM front-end?
I believe that discussion came up several times recently.
There will be less translation going on, so possibly even more speedups?
Could you use this renderer as a way of defining the LLVM-providing framework for Gallium?
Hmm, would be cool, I think.
It seems like LLVM at this point would be an attractive failover pipe for hardware pipes being written. That way, even a pretty basic GPU pipe could accelerate enough to make the system usable (e.g. desktop composition, really basic games, video). Is anyone doing this or thinking about it?
Good article. The first sentence is quite awkward, but the topic of the article itself is pretty interesting.
Also, the new graphs, especially the bar graphs, look great. Nice work.
The idea is to use it for older chips without vertex shaders to do fast vertex processing.
Originally Posted by TechMage89
RE: Hyper Threading
Correct me if I'm wrong, but doesn't Hyper-Threading essentially require a thread to stall temporarily (due to a cache miss or bad branch prediction) in order for there to be a performance gain? Basically it uses the downtime caused by one thread's stall to run another thread scheduled on the same physical core. So, if a thread stalls frequently, you frequently get a payoff.
But haven't we been working to avoid situations where a thread stalls in compilers, kernels, and other performance-critical software? Certainly we can't avoid them all, but doesn't it stand to reason that the better GCC/LLVM, the kernel, etc. get, the less benefit you'll see from Hyper-Threading?
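To make the intuition concrete, here's a toy back-of-the-envelope model (my own made-up numbers, not a real CPU model): assume a thread stalls with probability p each cycle, and an SMT sibling can issue during those stall cycles, so a two-thread core only wastes a cycle when both threads stall at once. The payoff shrinks as p drops:

```python
# Toy SMT payoff model (hypothetical, not a real microarchitecture model):
# a thread stalls with probability p each cycle; an SMT sibling fills the
# stall cycles, so the core wastes a cycle only when BOTH threads stall.

def core_utilization(p1, p2=None):
    """Fraction of cycles the core does useful work."""
    if p2 is None:              # one thread running alone
        return 1.0 - p1
    return 1.0 - p1 * p2        # both threads must stall to waste a cycle

def smt_speedup(p):
    """Throughput of two identical threads on one SMT core vs. one thread."""
    return core_utilization(p, p) / core_utilization(p)

for p in (0.4, 0.2, 0.05):
    print(f"stall prob {p}: SMT speedup {smt_speedup(p):.2f}x")
```

Under this (admittedly crude) model, a stall-heavy thread (p = 0.4) gets a 1.40x boost from SMT, while a well-optimized one (p = 0.05) gets only about 1.05x, which is exactly the "better compilers mean less HT benefit" effect.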
Also, as the number of physical cores goes up, you won't see as much of a benefit unless your software scales with it. If you have 6 cores, for example, you need to be able to peg all 6 and be hungry for more for Hyper-Threading to matter. Unless I'm doing a "make -j4", my processors are mostly sitting idle and scaled down to 800 MHz. If I'm gaming, I may see activity on up to two cores, with most of the heavy lifting done on the GPU.
So I guess I find myself wondering: if it came down to just HT as the distinguishing bullet point, would I be just as well off choosing a cheaper processor without it? HT was just so much more interesting when single-core systems were more common.
Another thought... if we knew the likelihood of a thread stalling within a time slice, we could employ some interesting scheduling for even greater performance than assuming all threads stall at around the same frequency. Advanced PGO for Intel SMT? Of course, it would be a very architecture-specific optimization, and perhaps not attractive to developers aiming to reach more than just the Intel crowd.
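A rough sketch of what that stall-aware co-scheduling could look like (everything here is hypothetical: the stall probabilities would have to come from profiling, and I'm assuming a toy model where a pair of SMT siblings wastes cycles at rate p_i * p_j). Under that assumption, pairing the stall-heaviest thread with the stall-lightest one minimizes total wasted cycles:

```python
# Hypothetical stall-aware SMT pairing (names and numbers are made up).
# Assumption: siblings sharing a core waste cycles at rate p_i * p_j,
# so total utilization over all pairs is sum(1 - p_i * p_j).  Minimizing
# the sum of products means matching high-stall with low-stall threads.

def pair_threads(stall_probs):
    """Greedy pairing: sort by stall probability, match ends inward."""
    s = sorted(stall_probs)
    return [(s[i], s[-1 - i]) for i in range(len(s) // 2)]

def total_utilization(pairs):
    """Total useful-cycle fraction across all SMT pairs."""
    return sum(1.0 - p * q for p, q in pairs)

pairs = pair_threads([0.05, 0.4, 0.1, 0.3])
print(pairs, f"utilization {total_utilization(pairs):.3f}")
```

With those sample numbers the ends-inward pairing, (0.05, 0.4) and (0.1, 0.3), yields higher total utilization than pairing the two stall-heavy threads together, which is the scheduling win the PGO idea is after.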
Very interesting results.
What has my mind spinning at the moment is the performance one could expect out of Bulldozer. Yeah, I know the compiler technology is immature and the structure of Bulldozer is radically different. The thing is, this software reveals weaknesses in Intel's Hyper-Threading, so why not reveal the weaknesses in Bulldozer too?
Maybe the infrastructure isn't up to the challenge yet, but it would be nice to see Intel's performance in context, that is, in comparison to AMD's latest. Actually, when this gets running on ARM hardware, I'd like to see that too. The goal isn't to declare a winner but to better understand how these systems work in relation to each other. A generic GPU card thrown into the mix wouldn't hurt either.