OK, makes sense. But shouldn't the programmer use inline functions or macros in this case?
I guess I will add the inline parameter to my CXXFLAGS, and to the CFLAGS of individual C packages.
The effect of function inlining varies a lot between programs. In some cases, it gives huge speedups. Other times, it just results in slower performance and greater memory use. It can also depend on how large your CPU cache is.
You can even manually limit how deep the compiler will inline. Firefox does this, for example: the default -O3 inlining was too much, but by limiting the amount of inlining they could still turn on -O3 and get better results than plain old -O2.
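To make the macro-versus-inline question a bit more concrete, here is a rough C sketch. All names in it (SQUARE_MACRO, square_inline, square_noinline, next_value, check_examples) are made up for illustration, and `__attribute__((noinline))` is a GCC/Clang extension rather than standard C. It only shows the semantic difference between the two approaches; it says nothing about which is faster in a given program.

```c
#include <assert.h>

/* Macro: expanded textually, so the argument expression is evaluated twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Inline function: 'inline' is only a hint; the compiler decides at each
   optimization level whether the call is actually inlined. */
static inline int square_inline(int x) { return x * x; }

/* GCC/Clang extension: this function is never inlined, whatever -O level
   is in use. Handy for keeping a large, rarely used function out of line. */
__attribute__((noinline)) static int square_noinline(int x) { return x * x; }

/* Hypothetical helper with a side effect: counts how often it is called. */
static int calls = 0;
static int next_value(void) { return ++calls; }

/* Hypothetical self-check demonstrating the difference. */
void check_examples(void)
{
    assert(square_inline(4) == 16);
    assert(square_noinline(5) == 25);
    assert(SQUARE_MACRO(3) == 9);

    calls = 0;
    (void)square_inline(next_value());  /* argument evaluated once */
    assert(calls == 1);

    calls = 0;
    (void)SQUARE_MACRO(next_value());   /* argument evaluated twice */
    assert(calls == 2);
}
```

On the compiler side, GCC has an -finline-limit=N option that can be combined with -O3 to rein in automatic inlining. Which value of N helps, if any, depends on the program and has to be found by benchmarking; no particular value is implied here.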
The question is indeed whether Mesa is the rate-limiting step (aka the bottleneck) in the whole system here. But it won't hurt to keep my Gentoo CFLAGS as they are: mainly -march set and -O2. In a few cases I actually use -Os, for VIA CPUs or AMD's old Geode LX. A few packages might not like too much messing with CFLAGS, though.
It's much more likely to be a bottleneck with faster GPUs and lower resolutions. Michael testing an IGP at 1080p probably isn't going to show a lot.
I guess the bottleneck of most video games is not OpenGL, unless the game is designed for high-end graphics cards. Check this with any profiler: gl... calls are almost unnoticeable among the game physics and logic. Recompiling the actual software and its main libraries, instead of the driver, could give a very different result.
Not in my experience. I've run a lot of benchmarks and games, and 'sysprof' often shows that _mesa_* calls (which are the actual implementation of the gl* calls) account for a very noticeable percentage of the time.