I expect OpenCL performance is going to be driven by two things: how efficiently the shader compiler packs multiple instructions into a VLIW instruction word, and memory manager maturity. Both are hard but not impossible, and I *think* the OpenCL use cases should be less varied than OpenGL's and easier to deal with. Other than that, the raw power of the GPU should drive things.
That was what I was hoping for.

Remember that there hasn't been a lot of optimization on the Gallium3D code base yet, just a push to get to classic Mesa levels so that r300g can replace r300.
What do you mean by 'the' Gallium3D code base? KMS+Mesa? KMS+rxxxg+Mesa? Or everything related to an r300 GPU that makes 3D work?