Thread: X.Org SoC: Gallium3D H.264, OpenGL 3.2, GNU/Hurd

  1. #21
    Join Date
    May 2009
    Posts
    80

    Default

    VP6 is the most widespread, so support that too.

  2. #22
    Join Date
    Jan 2009
    Posts
    1,445

    Default I wasn't aware of that.

    Quote Originally Posted by bridgman View Post
    I really believe that the "missing link" so far has been someone grafting libavcodec onto the driver stack so that processing can be incrementally moved from CPU to GPU.

    Also, I forgot to mention that the other benefit of a shader-based implementation is that there are a lot of cards in use today which have a fair amount of shader power but which do not have dedicated decoder HW (ATI 5xx, for example).
    I knew that a while back attempts were made to use shaders to handle specific parts of the decode process, but I don't recall hearing anything further from the developer, and I don't recall which codec was being worked on. One problem, IIRC, was that the shaders were fairly slow at it, though I don't remember what hardware it was being tested on. Presumably, as you say, a more modern card would be better able to handle the load. A question I always had was about power efficiency: offloading for high-bitrate material is a necessity, certainly, but for a lower-bitrate target I wonder whether a CPU wouldn't be more efficient, especially with a simpler codec like Theora.
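
    For what it's worth, the split being described would look roughly like the sketch below. The cpu_* and gpu_* helpers are hypothetical placeholders marking where the hand-off would happen, not any real driver API:

        /* Rough sketch of a hybrid decode loop: bitstream parsing and entropy
         * decoding stay on the CPU, while IDCT and motion compensation are
         * handed to GPU shaders.  All cpu_* / gpu_* functions are hypothetical
         * placeholders, not a real API. */
        #include <stdbool.h>
        #include <stddef.h>

        struct frame;                         /* decoded picture in GPU memory */

        struct mb_batch {                     /* one frame's worth of macroblock data */
            void        *coefficients;
            void        *motion_vectors;
            unsigned int num_macroblocks;
        };

        /* CPU side: serial, branchy work that shaders handle poorly. */
        bool cpu_parse_and_entropy_decode(const unsigned char *bits, size_t len,
                                          struct mb_batch *out);

        /* GPU side: data-parallel stages that map well onto shaders. */
        void gpu_idct(struct mb_batch *mbs);
        void gpu_motion_compensate(const struct mb_batch *mbs,
                                   const struct frame *past, const struct frame *future,
                                   struct frame *target);

        void decode_one_frame(const unsigned char *bits, size_t len,
                              const struct frame *past, const struct frame *future,
                              struct frame *target)
        {
            struct mb_batch mbs;
            if (!cpu_parse_and_entropy_decode(bits, len, &mbs))
                return;                               /* bad input: drop the frame */
            gpu_idct(&mbs);                           /* first stage moved off the CPU */
            gpu_motion_compensate(&mbs, past, future, target);
        }

    The point being that each gpu_* stage can be moved over one at a time, which is what "incrementally" buys you.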

  3. #23
    Join Date
    Oct 2007
    Location
    Toronto-ish
    Posts
    7,516

    Default

    The work over Gallium3D was with MPEG2, using the XvMC API, mostly by Younes Manton on Nouveau:

    http://bitblitter.blogspot.com/

    Cooper then got a good chunk of that code running on the r300g ATI driver before getting dragged off to other projects.

    Somewhere in there a video API was defined and at least partially implemented, not exactly sure who did what there.

    I don't think we have any good power efficiency numbers yet re: whether CPU or GPU shaders do the offloadable work more efficiently. First priority was offloading enough work to the GPU so that the remainder could be handled by a single CPU thread, since the MT version of the CPU codecs wasn't very mature, and without the ability to use multiple CPU cores anything near 100% of a single core meant frame dropping and other yukkies.

    Since then, multithread decoders seem to have become more stable (at least more people seem to be using them), so the pull for GPU decoding has dropped somewhat. I don't know the status of the MT codecs right now, ie whether they are easily accessible to all users or whether they still need a skilled user to build and tweak 'em.
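
    For anyone curious what that XvMC path looks like from the player's side, a rough sketch follows. It assumes the port and surface type have already been negotiated, shows an intra-only frame, and skips all error checking:

        /* Hedged sketch of the XvMC client flow: the player does the bitstream
         * parsing and entropy decode, then hands macroblocks and motion vectors
         * to the driver, which runs motion compensation and displays the result.
         * dpy, port, surface_type, win and the filled macroblock/block arrays
         * are assumed to come from elsewhere. */
        #include <X11/Xlib.h>
        #include <X11/extensions/XvMClib.h>

        void present_mpeg2_frame(Display *dpy, XvPortID port, int surface_type,
                                 Drawable win, int width, int height,
                                 XvMCMacroBlockArray *mbs, XvMCBlockArray *blocks,
                                 unsigned int num_mbs)
        {
            XvMCContext ctx;
            XvMCSurface target;

            XvMCCreateContext(dpy, port, surface_type, width, height,
                              XVMC_DIRECT, &ctx);
            XvMCCreateSurface(dpy, &ctx, &target);

            /* Hand the already entropy-decoded macroblocks to the driver; this
             * is where the CPU -> GPU hand-off happens for an intra-only frame. */
            XvMCRenderSurface(dpy, &ctx, XVMC_FRAME_PICTURE, &target,
                              NULL /* past */, NULL /* future */, 0,
                              num_mbs, 0, mbs, blocks);

            XvMCSyncSurface(dpy, &target);            /* wait for the GPU */
            XvMCPutSurface(dpy, &target, win, 0, 0, width, height,
                           0, 0, width, height, XVMC_FRAME_PICTURE);

            XvMCDestroySurface(dpy, &target);
            XvMCDestroyContext(dpy, &ctx);
        }

    The driver (Nouveau's or r300g's Gallium3D state tracker) is what turns the RenderSurface call into shader work; the player never touches the GPU directly.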

  4. #24
    Join Date
    Oct 2009
    Posts
    2,131

    Default

    Quote Originally Posted by bridgman View Post
    The work over Gallium3D was with MPEG2, using the XvMC API, mostly by Younes Manton on Nouveau:

    http://bitblitter.blogspot.com/

    Cooper then got a good chunk of that code running on the r300g ATI driver before getting dragged off to other projects.

    Somewhere in there a video API was defined and at least partially implemented, not exactly sure who did what there.

    I don't think we have any good power efficiency numbers yet re: whether CPU or GPU shaders do the offloadable work more efficiently. First priority was offloading enough work to the GPU so that the remainder could be handled by a single CPU thread, since the MT version of the CPU codecs wasn't very mature, and without the ability to use multiple CPU cores anything near 100% of a single core meant frame dropping and other yukkies.

    Since then, multithread decoders seem to have become more stable (at least more people seem to be using them), so the pull for GPU decoding has dropped somewhat. I don't know the status of the MT codecs right now, ie whether they are easily accessible to all users or whether they still need a skilled user to build and tweak 'em.
    The status of MT is that ffmpeg-mt blows. It's better than a single thread, but it pretty much requires a 4-core chip to play a typical 10 GB movie file. The CoreAVC hacks (i.e. wine + CoreAVC + mplayer) work OK; they seem to degrade output quality as the CPU limit is reached rather than dropping frames and going out of sync. It still pretty well maxes out a 2-core CPU and gets the heat up so much that you end up with the vacuum cleaner effect (unless you have a ***HUGE*** heat sink and a big slow fan). Vacuum cleaner effect + movie == very bad.
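
    For reference, asking libavcodec for threaded decoding is basically two fields on the codec context (field and function names as in current FFmpeg; the out-of-tree ffmpeg-mt builds exposed roughly the same thread_count knob):

        /* Minimal sketch: opening a multithreaded decoder with libavcodec. */
        #include <libavcodec/avcodec.h>

        AVCodecContext *open_threaded_decoder(enum AVCodecID id, int threads)
        {
            const AVCodec *codec = avcodec_find_decoder(id);
            if (!codec)
                return NULL;

            AVCodecContext *ctx = avcodec_alloc_context3(codec);
            if (!ctx)
                return NULL;

            ctx->thread_count = threads;                      /* 0 lets FFmpeg pick */
            ctx->thread_type  = FF_THREAD_FRAME | FF_THREAD_SLICE;

            if (avcodec_open2(ctx, codec, NULL) < 0) {
                avcodec_free_context(&ctx);
                return NULL;
            }
            return ctx;
        }

    e.g. open_threaded_decoder(AV_CODEC_ID_H264, 4) on a quad-core, or 0 to let it auto-detect.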

  5. #25
    Join Date
    Oct 2007
    Location
    Toronto-ish
    Posts
    7,516

    Default

    OK, I guess it might be interesting to see if anyone is actively profiling that code to see where the CPU time is going, and how much of that time is going to "shader-friendly" tasks like MC and filtering.
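
    Even something as crude as the sketch below would show whether MC and filtering dominate. The stage_* functions are hypothetical stand-ins for whatever decoder is being measured; a real profile would use perf or oprofile instead:

        /* Crude per-stage timing to see where a frame's CPU time goes.  The
         * stage_* functions are placeholders for the decoder under test. */
        #include <stdio.h>
        #include <time.h>

        void stage_entropy_decode(void);       /* CPU-bound, hard to offload */
        void stage_motion_compensation(void);  /* shader-friendly            */
        void stage_loop_filter(void);          /* shader-friendly            */

        static double now_ms(void)
        {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
        }

        void profile_one_frame(void)
        {
            double t0 = now_ms();
            stage_entropy_decode();
            double t1 = now_ms();
            stage_motion_compensation();
            double t2 = now_ms();
            stage_loop_filter();
            double t3 = now_ms();

            printf("entropy %.2f ms, MC %.2f ms, filter %.2f ms\n",
                   t1 - t0, t2 - t1, t3 - t2);
        }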
