This first point in Bridgeman's post is what I was talking about.

These steps made code optimized for the previous architecture run like crap on the new one.
I don't understand. I was talking about a new shader compiler for the new architecture in the open-source graphics stack. You seem to be talking about proprietary OpenCL.