OK... You have a series of things to do that altogether make up the pipeline, right? And sending data back and forth between the GPU and the CPU takes too much time, right?
Originally Posted by Drago
So what you want to do is have the data go from the CPU to the GPU and never back.
So if what you're doing needs a CPU fallback, it might be faster to run the pipeline on the CPU for as long as possible, until the part that has to happen on the CPU is done, and only then hand the rest of the pipeline over to the GPU.
Now imagine you 'just' have to do a single thing on the CPU, like floating point, but floating-point calculations occur everywhere from the geometry to the post-processing effects. Then you either need to do the entire pipeline on the CPU, or send everything back and forth a billion times before arriving at a single frame in the GPU's framebuffer...
In other words, a single patent makes it impossible...
PS: I don't even know what we were talking about anymore now :P
Last edited by V!NCENT; 07-19-2011 at 02:34 PM.
I guess the actual problem is that dynpm does not really know the workload. It simply polls the queue every 100ms. If by pure chance it happens to be empty enough times, downclocking is initiated.
Originally Posted by curaga
Usually it works, but under my test case that algorithm does not seem to work at all! I even tried adjusting the values for less aggressive downclocking, but it didn't help. It wouldn't be much better if it downclocked every 5 seconds instead of every 400ms or so.
I'll try something else...
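For the record, the polling heuristic being described could be sketched roughly like this. All names, the threshold, and the reset behaviour are my own illustration, not the actual radeon code:

```c
#include <stdbool.h>

#define POLL_INTERVAL_MS 100  /* queue is sampled every 100 ms */
#define EMPTY_THRESHOLD  4    /* consecutive empty polls before downclocking */

/* Hypothetical counter state for the illustration. */
static int empty_polls = 0;

/* Called once per poll tick; queue_busy says whether the command
 * queue had work when sampled. Returns true when the heuristic
 * decides to downclock. */
bool dynpm_poll(bool queue_busy)
{
    if (queue_busy) {
        empty_polls = 0;        /* any work resets the counter */
        return false;
    }
    if (++empty_polls >= EMPTY_THRESHOLD) {
        empty_polls = 0;
        return true;            /* enough idle samples: downclock */
    }
    return false;
}
```

The failure mode follows directly: a workload that merely happens to be between frames at each 100 ms sample looks idle to a counter like this, so the GPU gets downclocked mid-game.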
Indeed, the nvidia proprietary driver checks the state every 15 seconds. Seems a bit too long to me...
Originally Posted by ahlaht
Why not simply use a sliding window and upclock/downclock based on the average load over a couple of seconds? I mean, GPU-intensive apps usually run longer than a few milliseconds, so you still get huge power savings even if you're switching a few seconds later.
Or is this really obvious and I'm missing something?
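A minimal sketch of that sliding-window idea, assuming one load sample per 100 ms tick. The window size and both thresholds are made-up values for illustration:

```c
#define WINDOW          20     /* samples; ~2 seconds at 100 ms each */
#define UP_THRESHOLD    0.75f
#define DOWN_THRESHOLD  0.25f

static float samples[WINDOW];
static int   head  = 0;
static int   count = 0;

/* Feed one load sample (0.0 = idle, 1.0 = fully busy); returns
 * +1 to upclock, -1 to downclock, 0 to stay put. */
int window_decide(float load)
{
    samples[head] = load;
    head = (head + 1) % WINDOW;
    if (count < WINDOW) {
        count++;
        return 0;               /* not enough history yet */
    }

    float sum = 0.0f;
    for (int i = 0; i < WINDOW; i++)
        sum += samples[i];
    float avg = sum / WINDOW;

    if (avg > UP_THRESHOLD)
        return +1;
    if (avg < DOWN_THRESHOLD)
        return -1;
    return 0;
}
```

A single idle sample barely moves the average, which is exactly what makes this more robust against the "queue happened to be empty at poll time" problem than a raw counter.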
Most desktop effects work perfectly even with low power state on low-end hardware here, but you might have a point with mplayer.
Anyway, sounds great that you've improved it already! When will the patches hit mainstream git?
I seriously think that clocking should not be tied to the load, but instead to the max framerate.
For example, not all load is worth processing at full speed, while some load very much is. Software cannot determine which load is worth what, unless it has a profile. This is actually the same as my proposition.
As a very nice example: the nvidia blob runs Fallout 2 in Wine at MAX PERFORMANCE. And I cannot change this; there is not even a permanent low-power profile.
DRICONF should have a field that activates when you select your suggested dynpm AND vsync is off, where you can set the desired MAX framerate.
If vsync is on, the max framerate should default to the vsync rate (60 or 59.97).
Before that happens, maybe introduce an xorg.conf option, say Option "DynpmMaxFPS" "value".
Maybe this logic:
1) Determine the current GPU load
2) Get the current framerate (tricky)
3) Compare the current framerate to the max framerate
4) If the difference is within an acceptable range, exit
5) Otherwise, estimate how far to shift a gear up or down
6) If shifting up, do it and exit
7) Otherwise, check the shift-down barrier (in ticks; you default to 5 seconds): if it has passed, shift down and exit; else exit.
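Assuming the driver could actually read both numbers, the steps above might look roughly like this. The tolerance, the tick-based 5-second barrier, and all names are illustrative, not real driver code:

```c
#define FPS_TOLERANCE   5    /* step 4: acceptable deviation, in fps */
#define DOWNSHIFT_TICKS 50   /* step 7: ~5 s of 100 ms ticks above target */

static int calm_ticks = 0;   /* ticks spent comfortably above the target */

/* Returns +1 to shift a power gear up, -1 to shift down, 0 to stay.
 * max_fps would come from vsync (60/59.97) or something like the
 * proposed Option "DynpmMaxFPS". */
int dynpm_fps_decide(int current_fps, int max_fps)
{
    int diff = max_fps - current_fps;

    /* Step 4: close enough to the target, do nothing. */
    if (diff <= FPS_TOLERANCE && diff >= -FPS_TOLERANCE) {
        calm_ticks = 0;
        return 0;
    }

    /* Steps 5-6: well below the target, shift up immediately. */
    if (diff > FPS_TOLERANCE) {
        calm_ticks = 0;
        return +1;
    }

    /* Step 7: well above the target; only shift down once the
     * barrier has passed, to avoid oscillating. */
    if (++calm_ticks >= DOWNSHIFT_TICKS) {
        calm_ticks = 0;
        return -1;
    }
    return 0;
}
```

Upshifts are immediate while downshifts are delayed, which matches the asymmetry in the list: dropping frames is visible right away, while staying clocked too high for a few seconds only costs some power.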
Last edited by crazycheese; 07-21-2011 at 10:34 AM.
I suspect it wouldn't work with fast-paced games, but with games like Fallout 2 it might.
Originally Posted by crazycheese
It would certainly be possible to interpolate some more clock frequencies between the existing clock modes and then select an exact match depending on the framerate. Any idea how to get the framerate?