
Thread: Will H.264 VA-API / VDPAU Finally Come To Gallium3D?

  1. #31
    Join Date
    Oct 2007
    Location
    Under the bridge
    Posts
    2,142

    Default

    Quote Originally Posted by yesterday View Post
    How the hell is MPEG-2 "free and open"? It is patented and licensed by MPEG-LA.

    MPEG-1 is basically MPEG-2 with lower resolution. It doesn't need hardware acceleration anyway.
    By the time WebM/H.264 acceleration is complete, these codecs won't need acceleration either. The future looks bright, indeed!

  2. #32
    Join Date
    Nov 2008
    Location
    Madison, WI, USA
    Posts
    870

    Default

    Quote Originally Posted by popper View Post
    Did you put your current alpha/beta prototype code on GitHub, or do you intend to soon? And when do you expect your thesis will be done?

    There are also all those other bits and pieces of OpenCL/CUDA code mentioned on the x264dev logs etc. It's not clear whether they cover other stuff besides your "I've managed to convert Subpixel Prediction (sixtap + bilinear), IDCT, Dequantization, and the VP8 Loop filter" routines yet, as no one's bothered to collect them up and list the GitHub locations on a web page somewhere.
    I've been pushing to github since I started work in late October:
    http://github.com/awatry/libvpx.opencl

    The bound copy of my thesis is due in 3 weeks; the final draft is due 3/31 or 4/1, I don't remember which.

    I've only had Nvidia hardware to test on since my Radeon 4770 doesn't support the byte_addressable_store extension (5000-series and up only), but it runs on my GF9400m and a GTX 480 in current Ubuntu just fine. It also works fine on AMD Stream CPU-based OpenCL. I've gotten it working in Mac OS using CPU CL, but there's a bug in the Mac GPU-based acceleration that kills it every time and I haven't had time to track it down yet.

    Like I said, I'm hoping to keep working on this after graduation, either as a hobby, or professionally if someone's willing to pay. I've gotten the OpenCL initialization framework in place, have all of the memory management taken care of, and have most of the major parts of the decoding available as CL kernels.

    The next step is increasing the parallelism: I'm currently capping out at 336 threads max, and the common case is only a few dozen threads, not enough to even approach performance parity with the CPU-only paths. I've figured out a few ways to do that, especially in the loop filter (which accounts for 50% or so of the CPU-only execution time on a few of the 1080p videos I've profiled). The sub-pixel prediction/motion compensation and dequantization/IDCT will take a bit more work to thread effectively, but I think it can be done.

  3. #33
    Join Date
    Nov 2008
    Location
    Madison, WI, USA
    Posts
    870

    Default

    Quote Originally Posted by pingufunkybeat View Post
    Now we need Clover
    Why do you think I've been pushing the people posting GSoC proposals in the forums here to try to finish off the Clover state tracker?

    I'm sick of using the binary Nvidia drivers on my desktop/laptop, and I'd love to be able to switch back to the OSS drivers.

  4. #34
    Join Date
    Jan 2009
    Posts
    515

    Default

    If anyone is interested, or would pick up this GSoC project, I do have some very early VA-API state tracker code. I just have more important things to do, so I haven't touched it for a while. But whoever does the GSoC project can have it if he or she wants it.

  5. #35
    Join Date
    Jan 2007
    Posts
    459

    Default

    Quote Originally Posted by tball View Post
    If anyone is interested, or would pick up this GSoC project, I do have some very early VA-API state tracker code. I just have more important things to do, so I haven't touched it for a while. But whoever does the GSoC project can have it if he or she wants it.
    If you're going to offer this or any other code, it's always a good idea to put the direct GitHub URL somewhere in your post, and to put the code on GitHub if you haven't already done so.

    Then someone might make reference to it and encourage uptake, and of course there's always an off-site backup if you lose your local hard drive with all that work on it.

  6. #36
    Join Date
    Nov 2008
    Location
    Madison, WI, USA
    Posts
    870

    Default

    Quote Originally Posted by popper View Post
    ... and of course there's always an off-site backup if you lose your local hard drive with all that work on it
    This... I feel a bit more comfortable knowing that I have a minimum of 7 identical copies of my thesis code spread across at least 5 physical locations.

  7. #37
    Join Date
    Jan 2007
    Posts
    459

    Default

    Quote Originally Posted by Veerappan View Post
    This... I feel a bit more comfortable knowing that I have a minimum of 7 identical copies of my thesis code spread across at least 5 physical locations.
    LOL, I thought you might.

    By the way, although it's of no direct use for the gfx code side: on one of Jason Garrett-Glaser's latest FFmpeg VP8 optimization patches, I noticed Diego Elio Pettenò (Flameeyes) mention that the pahole utility from acmel's dwarves is designed to find the cache-line boundaries in structures. I don't know if it's any good for the CPU side, but it's worth mentioning anyway, just in case.

    http://ffmpeg.org/pipermail/ffmpeg-d...ch/109377.html

  8. #38
    Join Date
    Nov 2008
    Location
    Madison, WI, USA
    Posts
    870

    Default

    Quote Originally Posted by popper View Post
    LOL, I thought you might.
    I had a co-worker lose a drive last summer that wasn't backed up, and realized that other than a periodic external drive backup of my Mac and Windows partitions, I didn't have much of a system in place.

    So now my desktop is running hardware RAID 1 with git checkouts in both Linux and Windows partitions, and my laptop has git checkouts of my stuff on all 3 of its operating systems (Win7, Mac, Linux). Both laptop and desktop are periodically backed up to external drives (separate drives for each system). Eventually, I'll probably store those drives in my desk at work, but for now they're on a shelf.

    I've got a co-located server in another state, the github master repository, and a checkout on my work computer. My HTPC has a copy as well (also RAID 1), just to provide another machine to test on.

    I know it's excessive, but I really don't want to try to use the "hard drive ate my homework" excuse. I knew people in undergrad who used that one, and it sounded lame even then.

    As far as the cache-line software goes, it could come in handy for profiling the CPU decoder. The reference VP8 decoder does force alignment to certain boundaries on many of its structures, but I haven't seen any work on cache line boundary detection (it may have happened, I just haven't seen it).
