
Thread: AMD Fusion On Gallium3D Leaves A Lot To Be Desired

  1. #11
    Join Date
    Jul 2010
    Posts
    504

    Default

    I see. What I am wondering about is why graphics card drivers require this amount of manpower. Is it the hardware or the API? How is it possible to write driver code that could be more than ten times faster? Or is it that the hardware doesn't map to the exposed API (OpenGL) and requires complex translation? Sorry if the questions sound silly, I've never worked with bare hardware.
    Last edited by log0; 04-16-2012 at 09:56 AM.

  2. #12
    Join Date
    Apr 2008
    Location
    Zagreb, Croatia
    Posts
    115

    Default

    The open source driver seems to be at roughly the same level of performance as the open source driver for Intel's SB graphics. I guess one should really use Catalyst. If your card is supported, though.

  3. #13
    Join Date
    Jun 2009
    Posts
    2,932

    Default

    Quote Originally Posted by log0 View Post
    I see. What I am wondering about is why graphics card drivers require this amount of manpower. Is it the hardware or the API? How is it possible to write driver code that could be more than ten times faster? Or is it that the hardware doesn't map to the exposed API (OpenGL) and requires complex translation? Sorry if the questions sound silly, I've never worked with bare hardware.
    N.B. I'm not a driver developer, just an interested observer

    In this case, the open source driver was forced into a low-power mode, while the proprietary driver was going on at full blast. Also, it's possible that not all functionality (such as tiling) was enabled on the open source driver. When there's an order-of-magnitude difference, then either something is wrong, or the driver is too new and there's lots of work needed still.

    The problem with OpenGL drivers (and GPU drivers in general) is that they drive amazingly complex hardware and take incredible amounts of code (especially for full OpenGL support). It's much more complex than a network card driver or a mouse driver. With most chips, the Gallium3D drivers for radeons are around 60-70% of the proprietary driver's performance, which is as close as you can get with "regular" effort.

    Then things get complicated. A GPU driver runs on the CPU and often has to do many things before it can prepare a frame for rendering. If it is not optimised, the time adds up: lots of little delays all over the stack, which need to be optimised one by one, hundreds of them. This is very time-intensive and takes a lot of manpower. If you are running something at 100 frames per second, then this quickly adds up and makes a huge difference; at 100 fps the whole frame budget is only 10 ms, so even half a millisecond of extra driver work per frame eats 5% of it. That's why the developers first focus on getting a driver working correctly, and only then try to optimise it.

    With some work, plus Tom's VLIW packetiser, the new shader compiler and the Hyper-Z support, things should come to more than 80% of the proprietary performance, perhaps even more (rough guess). That's really good, and the additional work after that becomes too complex, with very little gain.

  4. #14
    Join Date
    Sep 2010
    Posts
    229

    Default

    Quote Originally Posted by log0 View Post
    I see. What I am wondering about is why graphics card drivers require this amount of manpower.
    I think AMD & Nvidia easily have > 100 people working on their (closed source) drivers, so "5 extra developers" isn't a big amount of manpower...

  5. #15
    Join Date
    Nov 2008
    Location
    Madison, WI, USA
    Posts
    877

    Default

    Quote Originally Posted by pingufunkybeat View Post
    In this case, the open source driver was forced into a low-power mode, while the proprietary driver was going on at full blast. Also, it's possible that not all functionality (such as tiling) was enabled on the open source driver. When there's an order-of-magnitude difference, then either something is wrong, or the driver is too new and there's lots of work needed still.
    From my A6-3500 (via ssh; the machine is currently idle, sitting at a MythTV front end screen):

    me@mybox:/sys/class/drm/card0/device# cat /sys/class/drm/card0/device/power_method
    profile

    me@mybox:/sys/class/drm/card0/device# cat /sys/class/drm/card0/device/power_profile
    default

    me@mybox:/sys/kernel/debug/dri/0# cat /sys/kernel/debug/dri/0/radeon_pm_info
    default engine clock: 200000 kHz
    current engine clock: 11880 kHz
    default memory clock: 667000 kHz


    So an order of magnitude difference between Catalyst and r600g is to be expected if Michael left the power management in its default state. If he had forced the APU under Gallium3D into the high performance profile (or maybe the dynpm method), things would probably have been different.

    I'm not positive about how the default clocking on the APUs works, but I'm seeing some variation in the GPU clock on my machine. It goes as low as 7 MHz and as high as 30 MHz when idling, and I'm not sure how conservative the reclocking (which seems to be enabled by the EFI/BIOS by default) actually is. So forcing the APU to high-performance mode might help things.
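
    For anyone who wants to try it, forcing the high profile should look something like this with the stock radeon sysfs interface (run as root; card0 is assumed to be the APU, as on my box):

    me@mybox:~# echo high > /sys/class/drm/card0/device/power_profile
    me@mybox:~# cat /sys/kernel/debug/dri/0/radeon_pm_info

    The second command just re-reads the current clocks to check that the engine clock actually went up.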

  6. #16
    Join Date
    Jan 2009
    Posts
    626

    Default

    Quote Originally Posted by log0 View Post
    I see. What I am wondering about is why graphics card drivers require this amount of manpower. Is it the hardware or the API? How is it possible to write driver code that could be more than ten times faster? Or is it that the hardware doesn't map to the exposed API (OpenGL) and requires complex translation? Sorry if the questions sound silly, I've never worked with bare hardware.
    Yes, the translation is really complex as far as OpenGL is concerned. Implementing a performant shader compiler is also not easy. Then, there are hardware optimizations which you can use, like texture tiling, hierarchical Z-Stencil buffers, colorbuffer compression, etc.

    We need a driver which:
    1) doesn't starve the GPU by doing too much CPU work
    2) doesn't synchronize the CPU with the GPU, so that the two can operate asynchronously (see the sketch right after this list)
    3) takes advantage of every hardware feature which improves performance
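
    As a concrete illustration of point 2 (a minimal sketch, not code from any real driver or app; it assumes a current GL context and a bound framebuffer of the given size, and readback_sync is just an illustrative name):

    #include <GL/gl.h>
    #include <stdlib.h>

    /* A synchronous readback is the classic way the CPU ends up waiting on
     * the GPU: reading into client memory forces the driver to flush and
     * wait for every draw that touches the framebuffer before returning. */
    void readback_sync(int width, int height)
    {
        unsigned char *pixels = malloc((size_t)width * height * 4);

        glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels);

        free(pixels);
    }

    Binding a pixel buffer object to GL_PIXEL_PACK_BUFFER instead lets the same glReadPixels call return immediately; the data can then be mapped once the GPU has caught up, and the CPU and GPU keep running asynchronously.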

    FYI, I was told by some NVIDIA guy face-to-face a few years ago that their Vista GPU drivers had 20 million lines of code. The entire Linux kernel has only 14.3M.

  7. #17
    Join Date
    Dec 2007
    Posts
    2,395

    Default

    FWIW, we have hundreds of developers working on the closed source AMD driver, and the closed driver was ~40 million LOC the last time I checked, which was a while back.

  8. #18
    Join Date
    May 2007
    Posts
    231

    Default

    Quote Originally Posted by pingufunkybeat View Post
    N.B. I'm not a driver developer, just an interested observer

    In this case, the open source driver was forced into a low-power mode, while the proprietary driver was going on at full blast. Also, it's possible that not all functionality (such as tiling) was enabled on the open source driver. When there's an order-of-magnitude difference, then either something is wrong, or the driver is too new and there's lots of work needed still.

    The problem with OpenGL drivers (and GPU drivers in general) is that they drive amazingly complex hardware and take incredible amounts of code (especially for full OpenGL support). It's much more complex than a network card driver or a mouse driver. With most chips, the Gallium3D drivers for radeons are around 60-70% of the proprietary driver's performance, which is as close as you can get with "regular" effort.

    Then things get complicated. A GPU driver runs on the CPU and often has to do many things before it can prepare a frame for rendering. If it is not optimised, the time adds up: lots of little delays all over the stack, which need to be optimised one by one, hundreds of them. This is very time-intensive and takes a lot of manpower. If you are running something at 100 frames per second, then this quickly adds up and makes a huge difference. That's why the developers first focus on getting a driver working correctly, and only then try to optimise it.

    With some work, plus Tom's VLIW packetiser, the new shader compiler and the Hyper-Z support, things should come to more than 80% of the proprietary performance, perhaps even more (rough guess). That's really good, and the additional work after that becomes too complex, with very little gain.
    I don't understand why people believe shader optimization is a big issue. On all the benchmarks in this article, a better shader most likely wouldn't make a measurable difference. Marek has far better points to explain the gap.

    Oh, and if you want to convince yourself that the shader is not a big issue: take a big shader from Doom 3, write a sample GL program that uses that shader to draw a quad covering the biggest FBO possible on your generation, draw it a thousand times, then hand-optimize the shader and hack r600g to use your hand-optimized version. Compare; the last time I did such things, the difference wasn't that big.
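
    A rough sketch of that measurement loop, in case anyone wants to reproduce it. It assumes a current GL context, a compiled program object prog containing the shader under test, and a fullscreen-quad VAO already bound; time_shader and the other names are illustrative, not glisse's actual test code:

    #define GL_GLEXT_PROTOTYPES
    #include <GL/gl.h>
    #include <GL/glext.h>
    #include <time.h>

    /* Draw the fullscreen quad `iterations` times with `prog` bound and
     * return the elapsed wall-clock seconds.  glFinish() before and after
     * keeps earlier GPU work and the measured draws from overlapping. */
    double time_shader(GLuint prog, int iterations)
    {
        struct timespec t0, t1;

        glUseProgram(prog);
        glFinish();
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iterations; i++)
            glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
        glFinish();
        clock_gettime(CLOCK_MONOTONIC, &t1);

        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    Run it once against the stock compiler output and once with the hand-optimized shader hacked into r600g; if the two times are close, the shader compiler is not where the frames are going.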

  9. #19
    Join Date
    Nov 2009
    Location
    Italy
    Posts
    970

    Default

    glisse, it's because nouveau is faster than radeon: considering nouveau isn't backed by NVIDIA and there isn't any documentation, that's quite strange, and people started searching for a culprit.
    ## VGA ##
    AMD: X1950XTX, HD3870, HD5870
    Intel: GMA45, HD3000 (Core i5 2500K)

  10. #20
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    5,187

    Default

    @glisse

    Do you mean TGSI or r600 asm?

    My GSoC shader (TGSI) was 20% faster when hand-optimized, compared to Mesa's GLSL compiler output. But that's only at the TGSI level; I believe it would be much faster if properly compiled (maybe the wrong word) down to r600 asm, instead of the simple replacement that I understand is the current status.
