
Thread: AMD Radeon HD 6000 Gallium3D Attempts To Compete With Catalyst

  1. #61
    Join Date
    Aug 2009
    Posts
    2,264

    Default

    Quote Originally Posted by Drago View Post
    Can you please explain this more thoroughly?
    OK... You have a series of things to do that altogether make up the pipeline, right? And sending data back and forth between the GPU and the CPU takes too much time, right?

    So what you want to do is have the data go from the CPU to the GPU and never back.

    So if what you're doing needs a CPU fallback, it might be faster to do the entire pipe on the CPU for as long as possible, until everything the CPU had to do has been done, and only then hand the rest of the pipeline to the GPU.

    Now imagine you 'just' have to do a single thing on the CPU, like floating point, but floating-point calculations occur everywhere from the geometry to the post-processing effects. Then you either need to do the entire pipe on the CPU, or send everything back and forth a billion times before a single frame arrives in the GPU's framebuffer...

    In other words, a single patent makes it impossible...
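
    To make the transfer-cost argument concrete, here's a toy sketch (purely illustrative; the stage layout and costs are made-up numbers, not real driver behaviour) comparing bouncing data back for every CPU stage against splitting the pipeline once:

    Code:
    /* Toy model: why a CPU stage in the middle of the pipeline hurts.
     * All numbers and the stage layout are made up for illustration. */
    #include <stdio.h>

    #define N_STAGES  6
    #define XFER_COST 10 /* hypothetical cost of one CPU<->GPU copy */

    static const int needs_cpu[N_STAGES] = {0, 0, 1, 0, 1, 0};

    int main(void)
    {
        /* Strategy A: ping-pong -- bounce data back for every CPU stage. */
        int ping_pong = 0;
        for (int i = 0; i < N_STAGES; i++)
            if (needs_cpu[i])
                ping_pong += 2 * XFER_COST; /* download + re-upload */

        /* Strategy B: split once -- run everything up to and including
         * the last CPU stage on the CPU, then upload a single time. */
        int last_cpu = -1;
        for (int i = 0; i < N_STAGES; i++)
            if (needs_cpu[i])
                last_cpu = i;
        int split_once = (last_cpu >= 0) ? XFER_COST : 0;

        printf("transfer cost: ping-pong %d vs. split-once %d\n",
               ping_pong, split_once);
        return 0;
    }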

    PS: I don't even know what we were talking about anymore :P
    Last edited by V!NCENT; 07-19-2011 at 02:34 PM.

  2. #62
    Join Date
    May 2011
    Posts
    58

    Default

    Quote Originally Posted by curaga View Post
    I thought that was already done, but it was just too trigger-happy, which is why I suggested a time check.
    I guess the actual problem is that dynpm does not really know the workload. It simply polls the queue every 100 ms. If, by pure chance, it happens to be empty enough times, downclocking is initiated.

    Usually it works, but in my test case that algorithm does not seem to work at all! I even tried adjusting the values for less aggressive downclocking, but it didn't help. It wouldn't be much better if it downclocked every 5 seconds instead of every 400 ms or so.
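
    In other words, the heuristic amounts to something like this (a rough sketch of the behaviour, not the actual radeon_pm.c code; gpu_queue_empty(), set_clock() and the threshold are made-up stand-ins):

    Code:
    #include <stdbool.h>

    enum clock_state { CLOCK_LOW, CLOCK_HIGH };

    bool gpu_queue_empty(void);         /* hypothetical driver query */
    void set_clock(enum clock_state s); /* hypothetical reclock action */

    #define IDLE_POLLS_NEEDED 3         /* assumed threshold */

    static int idle_polls;

    /* Called every 100 ms by a timer. */
    void dynpm_poll(void)
    {
        if (gpu_queue_empty()) {
            /* An empty queue at poll time can be pure chance: the GPU
             * may be busy most of the interval and still look idle here. */
            if (++idle_polls >= IDLE_POLLS_NEEDED)
                set_clock(CLOCK_LOW);   /* downclock */
        } else {
            idle_polls = 0;
            set_clock(CLOCK_HIGH);      /* upclock */
        }
    }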

    I'll try something else...

  3. #63
    Join Date
    Apr 2010
    Posts
    1,946

    Default

    Quote Originally Posted by ahlaht View Post
    I guess the actual problem is that dynpm does not really know the workload. It simply polls the queue every 100 ms. If, by pure chance, it happens to be empty enough times, downclocking is initiated.

    Usually it works, but in my test case that algorithm does not seem to work at all! I even tried adjusting the values for less aggressive downclocking, but it didn't help. It wouldn't be much better if it downclocked every 5 seconds instead of every 400 ms or so.

    I'll try something else...
    Indeed, the nvidia proprietary driver checks the state every 15 seconds. Seems a bit too long to me...

    http://tutanhamon.com.ua/technovodst...A-UNIX-driver/

  4. #64
    Join Date
    May 2011
    Posts
    58

    Default

    Quote Originally Posted by crazycheese View Post
    Indeed, the nvidia proprietary driver checks the state every 15 seconds. Seems a bit too long to me...

    http://tutanhamon.com.ua/technovodst...A-UNIX-driver/
    Thanks for that link. I wouldn't have even considered a delay that long. I think it makes sense on desktops, where power saving is not that important.

    I have determined that the dynpm switching code is much more broken than I originally thought. What it does is almost completely unpredictable. The problem is that sometimes (actually most of the time on my system) the GPU load can be highly random: the GPU can be both idle (fences == 0) and busy (fences >= 3) within milliseconds. According to my testing, the correct action in those cases is to always UPCLOCK. But instead dynpm gets confused and doesn't know what to do. It may even downclock.

    Fortunately I have already made a fix.

    Basically I want dynpm to work like this:
    - Immediately upclock without hesitation when needed (otherwise my special test program does not run correctly)
    - NEVER downclock when it would hurt performance
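
    Roughly, the policy looks like this (a rough sketch of the idea with assumed numbers, not the final patch; fences_pending() and set_clock() are made-up stand-ins):

    Code:
    enum clock_state { CLOCK_LOW, CLOCK_HIGH };

    int fences_pending(void);           /* hypothetical: unsignaled fences */
    void set_clock(enum clock_state s); /* hypothetical reclock action */

    #define DOWNCLOCK_IDLE_MS 5000      /* assumed: required idle time */

    static unsigned long last_busy_ms;

    void dynpm_check(unsigned long now_ms)
    {
        if (fences_pending() > 0) {
            /* Any sign of work: upclock immediately, without hesitation. */
            set_clock(CLOCK_HIGH);
            last_busy_ms = now_ms;
        } else if (now_ms - last_busy_ms >= DOWNCLOCK_IDLE_MS) {
            /* Downclock only after a long, uninterrupted idle period, so
             * a bursty load (idle and busy within milliseconds) never
             * causes a performance-hurting downclock. */
            set_clock(CLOCK_LOW);
        }
    }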

  5. #65
    Join Date
    Jun 2009
    Posts
    2,926

    Default

    Why not simply use a sliding window and upclock/downclock based on the average load after a couple of seconds? I mean, GPU-intensive apps usually run longer than a few milliseconds, so you get huge power savings even if you're switching a few seconds later.

    Or is this really obvious and I'm missing something?
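
    Something like this is what I have in mind (just a sketch; the window size and thresholds are made up, and sample_load() / set_clock() are hypothetical stand-ins):

    Code:
    enum clock_state { CLOCK_LOW, CLOCK_HIGH };

    int sample_load(void);              /* hypothetical: load in percent */
    void set_clock(enum clock_state s); /* hypothetical reclock action */

    #define WINDOW_SLOTS 20             /* 20 samples x 100 ms = 2 s window */

    static int samples[WINDOW_SLOTS];
    static int head;

    /* Called every 100 ms. */
    void window_poll(void)
    {
        samples[head] = sample_load();
        head = (head + 1) % WINDOW_SLOTS;

        int sum = 0;
        for (int i = 0; i < WINDOW_SLOTS; i++)
            sum += samples[i];
        int avg = sum / WINDOW_SLOTS;

        if (avg > 75)
            set_clock(CLOCK_HIGH);      /* sustained load: upclock */
        else if (avg < 25)
            set_clock(CLOCK_LOW);       /* sustained idle: downclock */
        /* in between: keep the current state (hysteresis) */
    }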

  6. #66
    Join Date
    May 2011
    Posts
    58

    Default

    Quote Originally Posted by pingufunkybeat View Post
    Why not simply use a sliding window and upclock/downclock based on the average load after a couple of seconds? I mean, GPU-intensive apps usually run longer than a few milliseconds, so you get huge power savings even if you're switching a few seconds later.

    Or is this really obvious and I'm missing something?
    I think desktop effects and things like that may need faster upclocking. (Not sure; I don't use any.)

    Also, if I play a video clip with "mplayer -vo gl", the sound tends to go out of sync if upclocking is not done fast enough.

    Anyway, I already fixed it. I'm currently trying to find something that does NOT work correctly with my fixes. So far it reclocks perfectly.

  7. #67
    Join Date
    Jun 2009
    Posts
    2,926

    Default

    Most desktop effects work perfectly here even in the low power state on low-end hardware, but you might have a point about mplayer.

    Anyway, sounds great that you've improved it already! When will the patches hit mainstream git?

  8. #68
    Join Date
    Apr 2010
    Posts
    1,946

    Default

    I seriously think that clocking should not be bound to the load, but to the max framerate.

    For example, not all load is worth processing at full speed, and some load very much is. Software cannot determine which load is worth processing unless it has a profile. This is actually the same as my proposition.

    As a very nice example: the nvidia blob runs Fallout 2 in Wine at MAX PERFORMANCE, and I cannot affect this; there is not even a permanent low-power profile.
    DRIconf should have a field that activates when you select your suggested dynpm AND Vsync is off, where you can put the desired MAX framerate.
    If Vsync is on, the max framerate should default to the Vsync rate (60 or 59.97).
    Before that happens, maybe introduce an xorg.conf option, say Option "DynpmMaxFPS" "value".

    Maybe this logic (see the sketch after the list):
    1) Determine the current GPU load
    2) Get the framerate (tricky)
    3) Compare the current framerate to the max framerate
    4) If the difference is within an acceptable scope, exit
    5) Else, approximate whether to shift a gear up or down
    6) If shifting up, do it and exit
    7) Else, check the shift-down barrier (ticks; you defaulted to 5 seconds); if reached, shift down and exit; else exit
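
    In code, roughly (one possible reading of the steps above; every helper -- gpu_load(), current_fps(), shift_up(), shift_down() -- is hypothetical, and step 2 remains the tricky part):

    Code:
    int gpu_load(void);                /* hypothetical: load in percent */
    int current_fps(void);             /* hypothetical: measured framerate */
    void shift_up(void);               /* hypothetical: next clock gear up */
    void shift_down(void);             /* hypothetical: next clock gear down */

    #define FPS_TOLERANCE      3       /* step 4: "acceptable scope" */
    #define DOWN_BARRIER_TICKS 50      /* step 7: ~5 s worth of ticks */

    static int down_ticks;

    void dynpm_fps_tick(int max_fps)
    {
        int load = gpu_load();         /* 1) current GPU load */
        int fps  = current_fps();      /* 2) framerate (tricky) */
        int diff = max_fps - fps;      /* 3) compare to the max framerate */

        if (diff >= -FPS_TOLERANCE && diff <= FPS_TOLERANCE)
            return;                    /* 4) within scope: exit */

        if (diff > 0) {                /* running below the target */
            down_ticks = 0;
            if (load > 90)             /* 5) GPU saturated: gear up needed */
                shift_up();            /* 6) shift up and exit */
            return;
        }

        /* Above the target: 7) shift down only once the barrier of
         * consecutive ticks has been reached. */
        if (++down_ticks >= DOWN_BARRIER_TICKS) {
            shift_down();
            down_ticks = 0;
        }
    }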
    Last edited by crazycheese; 07-21-2011 at 10:34 AM.

  9. #69
    Join Date
    May 2011
    Posts
    58

    Default

    Quote Originally Posted by crazycheese View Post
    I seriously think that clocking should not be bound to the load, but to the max framerate.
    I suspect it wouldn't work with fast-paced games, but with games like Fallout 2 it might.

    It would certainly be possible to interpolate some more clock frequencies between the clock modes and then select an exact match depending on the framerate. Any idea how to get the framerate?
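
    Something like this, perhaps (just a sketch assuming the framerate scales roughly linearly with the engine clock; all names are illustrative):

    Code:
    /* Pick an engine clock between the low and high power states based
     * on how far the measured framerate is from the target. Assumes a
     * linear clock/framerate relationship, which is a simplification. */
    int pick_engine_clock(int cur_clk, int clk_low, int clk_high,
                          int fps, int target_fps)
    {
        if (fps <= 0)
            return clk_high;           /* no data yet: play it safe */

        long clk = (long)cur_clk * target_fps / fps;

        /* Clamp to the hardware's validated clock range. */
        if (clk < clk_low)  clk = clk_low;
        if (clk > clk_high) clk = clk_high;
        return (int)clk;
    }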
