AMD/ATI: where is the FLOSS video decoding?
Dieter
09-03-2008, 11:37 AM
So ATI will allow decoding of Blu-ray using a "Catalyst" driver.
But only available to OEMs. Only for Linux. And I assume that
"Catalyst" is binary only? (Wikipedia doesn't say)
Binary-only drivers are 100% USELESS!
We need FLOSS drivers that decode video. Xv and XvMC at a minimum.
Longer term add H.264 and friends.
When are the UVD docs coming out?
http://xbitlabs.com/news/multimedia/display/20080902113646_ATI_to_Enable_High_Definition_Video_Playback_on_Linux_Based_Computers.html
deanjo
09-03-2008, 11:49 AM
> So ATI will allow decoding of Blu-ray using a "Catalyst" driver.
> But only available to OEMs. Only for Linux. And I assume that
> "Catalyst" is binary only? (Wikipedia doesn't say)
> Binary-only drivers are 100% USELESS!
> We need FLOSS drivers that decode video. Xv and XvMC at a minimum.
> Longer term add H.264 and friends.
> When are the UVD docs coming out?
> http://xbitlabs.com/news/multimedia/display/20080902113646_ATI_to_Enable_High_Definition_Video_Playback_on_Linux_Based_Computers.html
DMCA would not allow FOSS Bluray Decryption. Much like DVD decryption.
Dieter
09-03-2008, 12:13 PM
> DMCA would not allow FOSS Bluray Decryption. Much like DVD decryption.
I'm not looking for Bluray decryption. Just decoding of unencrypted
video.
MU_Engineer
09-05-2008, 10:40 AM
> DMCA would not allow FOSS Bluray Decryption. Much like DVD decryption.
> I'm not looking for Bluray decryption. Just decoding of unencrypted
> video.
The DRM decryption was integrated with the original UVD in the R600 so tightly that AMD/ATi says that they cannot make any of it public, lest people use it to decrypt the video DRM in violation of the DMCA. The UVD2 in RV670 and R700 cards (HD 3000 and HD 4000) has the DRM stuff slightly more isolated from the decoding and AMD/ATi *may* be able to publish enough specs for Xorg devs to use it while not giving away enough to allow for bypassing the DRM.
It all goes back to the fact that the studios take the tack that if you provide a mechanism for somebody to do something illegal, that in itself is forbidden as well. That's an idiotic way to think, as then we'd have to ban cars that can go over 15 mph (as they could break some speed limit somewhere), knives (as they could stab somebody), and matches (as they could be used for arson). The MPAA cartel has enough control that if you tee them off over their DRM stance, they will not allow you to have any future decoding capabilities. There are also NDAs and other legal agreements not to knowingly disclose information for people to use to crack the DRM. Naturally, AMD does not want to run afoul of that. So they have to be very careful in what they do and do not give specs for, and they will probably further separate the DRM from the decoding in newer cards so they can publish the decoding specs without any hassle from the MPAA. That, and you can expect to see some of the massive shader power modern cards have put to use in decoding when things like DRI2 come out. That's the Xorg developers' can of worms right there, not AMD's or any other hardware manufacturer's.
Here's an idea for AMD: just don't put any kind of decrypting technology in your cards. Stay away from the MPAA and their DRM by targeting your cards to viewers of unencrypted video. You won't have to pay them any licensing fees, you won't have to sign any NDAs, you'll have one less component to manufacture, and you'll be able to open the docs up. Seriously, plenty of people just want to view home videos and hi-def content that was purposefully left unencrypted by the authors.
Oh, and BTW: Intel is opening up their video decode acceleration (http://www.phoronix.com/scan.php?page=news_item&px=NjcwNQ) right now (unencrypted, obviously)! AMD should be able to do the same.
- Intel's Linux and Windows driver development teams are working together in sharing their video code between the two platforms. Intel video playback on Linux should improve as a result, but first they're waiting on permission to release some of the Intel 965 video code that's more structured on the Windows side than their current Assembly-based implementation.
bridgman
09-05-2008, 07:37 PM
> Here's an idea for AMD: just don't put any kind of decrypting technology in your cards. Stay away from the MPAA and their DRM by targeting your cards to viewers of unencrypted video. You won't have to pay them any licensing fees, you won't have to sign any NDAs, you'll have one less component to manufacture, and you'll be able to open the docs up. Seriously, plenty of people just want to view home videos and hi-def content that was purposefully left unencrypted by the authors.
The problem with that is we would lose perhaps 80% of our market, since OEMs would no longer purchase our products. No BluRay playback = essentially no OEM customers. We would not have the sales to continue funding state-of-the-art GPUs and would have to more or less drop out of the GPU business.
> Oh, and BTW: Intel is opening up their video decode acceleration (http://www.phoronix.com/scan.php?page=news_item&px=NjcwNQ) right now (unencrypted, obviously)!
Intel already has some decode acceleration (XvMC is the API for MPEG2 decode acceleration)-- the quoted article just talks about "some better video acceleration code" from Windows.
> AMD should be able to do the same.
We have already released the info required to accelerate the MC part of decode acceleration, and I expect we should be able to release the IDCT portion as well. There is another level of decode acceleration which nobody has opened up yet (the dedicated H.264/VC1 decode hardware) and that's what all the fuss is about.
Hi, thanks for responding!
> The problem with that is we would lose perhaps 80% of our market, since OEMs would no longer purchase our products. No BluRay playback = essentially no OEM customers. We would not have the sales to continue funding state-of-the-art GPUs and would have to more or less drop out of the GPU business.
Perhaps I don't think like an OEM, but why can't OEMs simply tell their customers: our computers work great for unencrypted videos, but they don't do encrypted ones. That will show people why they shouldn't buy DRM-ed junk. (Rhetorical question...)
And what about graphics cards for mini-notebooks? I mean, there's not even any room for DVD/Blu-ray drives in those devices, so why would an OEM require DVD/Blu-ray decryption technology for them?
Also, according to Wikipedia (http://en.wikipedia.org/wiki/Comparison_of_ATI_Graphics_Processing_Units), UVD was only introduced in r600 and later. Can AMD open up the video decoding capabilities (MPEG, H.264/VC1, etc) for r100-r500?
bridgman
09-05-2008, 08:55 PM
> And what about graphics cards for mini-notebooks? I mean, there's not even any room for DVD/Blu-ray drives in those devices, so why would an OEM require DVD/Blu-ray decryption technology for them?
Good point. If we did a specialized product for mini-notes it might be possible to skip the DRM stuff. Not sure what the extra cost would be (it's less work to leave it in than take it out) but worth looking into.
> Also, according to Wikipedia (http://en.wikipedia.org/wiki/Comparison_of_ATI_Graphics_Processing_Units), UVD was only introduced in r600 and later. Can AMD open up the video decoding capabilities (MPEG, H.264/VC1, etc) for r100-r500?
There are two main parts to decode acceleration - IDCT and MC.
For MC, we have provided the info required on 5xx and earlier already -- MC acceleration is done on the 3D engine with some special rounding modes -- it's just waiting for someone to want XvMC enough to go implement it or for us to have some free time.
For IDCT, the IDCT hardware is actually there right through to the end of the 6xx family, ie it overlaps with UVD and only goes away with the arrival of UVD2 in the 780 and HD4xxx parts. I am going to look at opening up IDCT hardware after we get 6xx/7xx 3D engine and some basic power management info released (I'm still thinking both of those are higher priority).
The IDCT hardware was designed for MPEG2 decode but at first glance it looks like it should work for H.264 and VC-1 as well (VC-1 and one of the H.264 levels have a variety of different block sizes). MC should work fine for any of the standards.
The XvMC API can support MC-only or IDCT+MC decoding today, so there should be enough info out now to write an XvMC driver for R3xx through R5xx.
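To make the IDCT/MC split above a bit more concrete, here is a minimal software sketch of the two stages for one 8x8 block. It is only an illustration, not how the hardware or any existing driver does it: the function names are made up for this example, the IDCT is the naive textbook formula, and the motion compensation handles whole-pixel motion vectors only (real MPEG2 also needs half-pixel interpolation with specific rounding, which is what the special rounding modes are for).

```python
import math

N = 8  # MPEG-2 style 8x8 blocks

def idct_8x8(coeffs):
    """Naive 2-D inverse DCT of an 8x8 coefficient block (illustrative, slow)."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for v in range(N):
                for u in range(N):
                    s += (c(u) * c(v) * coeffs[v][u]
                          * math.cos((2 * x + 1) * u * math.pi / 16)
                          * math.cos((2 * y + 1) * v * math.pi / 16))
            out[y][x] = s / 4.0
    return out

def motion_compensate(ref, bx, by, mvx, mvy, residual):
    """Rebuild the block at (bx, by): fetch the prediction from the reference
    frame at an integer motion-vector offset, add the IDCT residual, clamp."""
    block = []
    for y in range(N):
        row = []
        for x in range(N):
            pred = ref[by + mvy + y][bx + mvx + x]
            val = int(round(pred + residual[y][x]))
            row.append(max(0, min(255, val)))
        block.append(row)
    return block

# Hypothetical usage: a DC-only coefficient block and a synthetic 16x16 reference.
coeffs = [[0.0] * N for _ in range(N)]
coeffs[0][0] = 80.0
residual = idct_8x8(coeffs)
ref = [[x + 16 * y for x in range(16)] for y in range(16)]
block = motion_compensate(ref, 4, 4, 1, -2, residual)
```

In an "MC-only" setup the CPU would run something like idct_8x8 and hand the residual to the GPU; in "IDCT+MC" both steps move to the card.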
I'd really like h.264 decode. I like the special DRM free part idea if it means someone would write a nice open driver for the h.264 decoder. OTOH, can decode be done using the shaders? We should have all the specs we need already. Just need to find someone to do the coding. Or, maybe, we're back to the missing memory manager before we can do anything else. Oh well, stick with the plan: release the 3D and Power docs.
Rant begins here:
Seems every thread asking for stable, fast, feature filled drivers (open or closed) ends with "need memory manager/xorg infrastructure". Allocate the programming resources to Xorg (if you haven't already).
bridgman
09-05-2008, 09:20 PM
> Rant begins here:
> Seems every thread asking for stable, fast, feature filled drivers (open or closed) ends with "need memory manager/xorg infrastructure". Allocate the programming resources to Xorg (if you haven't already).
Rant no more. Dave Airlie (Red Hat) is working on the memory manager and our devs are picking up some of the work that Dave would otherwise probably have to do himself. The memory manager work is in the "modesetting-gem" branch of DRM :
http://cgit.freedesktop.org/mesa/drm/log/?h=modesetting-gem
Redeeman
09-05-2008, 10:24 PM
dont forget that you do _NOT_ need actual decryption in your hardware to still be able to play bluray.
what you meant to say is: "to be able to cater to the moronity of hollywood and some really ugly bluray players too lazy to decrypt themselves, WE have taken it upon ourselves to bend over, but dont worry, you, as a customer, get to bend even FURTHER.."
an elcheapo elcheap cheap celeron 1.6ghz can do 65MiB/s AES 256bit decryption on 1 core using an ordinary implementation, which i know for a fact can be optimized more.
this is the slowest kind of cpu you will find ANYONE having, and the slowest on the market (save for atom, but hah, you are NEVER EVER gonna sell amd graphics for those systems!!!).
going further, i seem to recall (but dont hold me to this) that bluray encryption is actually only 128bit - im no specialist in that area, but i would imagine this would make it faster.. BUT, for the sake of argument, ill ignore that.
i believe the highest bitrate you are ever going to find on a bluray disc or hddvd disc is somewhere around 50 megabits, actually i think smaller, because most of the ASICs for decoding only specify up to 41/47 mbits.. the highest-bitrate 1080p h264 file i personally have peaks at 48mbit..
so.. at this point we gotta decrypt 6.25MB/s of data (50 mbit / 8) at the peak times, on our cpu.. thats around 10% of ONE core on the slowest system you get today, and thats at peak values, and it ignores blatant things such as further optimized decryption and 128bit vs 256bit.
Then you could have your hardware ONLY do the actual decoding, and not care about this horribly small UNIMPORTANT matter
then applications such as powerdvd and such shit, can focus on bending over in hollywood, while you can deliver what PEOPLE actually want..
and before you start mentioning "trusted path" and crap.. thats bullshit :P we can play the format, we can pirate it. we can ALL. besides, you could just tell the morons in hollywood that its "ALL SECURE", and they'd believe you, they always do with the drm morons..
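The back-of-the-envelope estimate in this post can be checked with a few lines. This is just the arithmetic from the post, taking its two figures at face value (a 50 Mbit/s peak stream and 65 MiB/s of AES throughput on one core); the function name is made up for the example, and neither figure is independently measured here.

```python
def decrypt_core_share(video_mbit_per_s, aes_mib_per_s):
    """Return (stream rate in MiB/s, fraction of one core spent decrypting it)."""
    bytes_per_s = video_mbit_per_s * 1_000_000 / 8   # megabits/s -> bytes/s
    mib_per_s = bytes_per_s / (1024 * 1024)          # bytes/s -> MiB/s
    return mib_per_s, mib_per_s / aes_mib_per_s

# Figures quoted in the post above (assumptions, not measurements).
rate, share = decrypt_core_share(50, 65)
print(f"{rate:.2f} MiB/s to decrypt, {share:.1%} of one core")
# prints "5.96 MiB/s to decrypt, 9.2% of one core"
```

So the "around 10% of one core" figure holds up under those assumptions, whether the stream rate is counted as 6.25 MB/s or about 5.96 MiB/s.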
Dieter
09-05-2008, 11:44 PM
> Good point. If we did a specialized product for mini-notes it
> might be possible to skip the DRM stuff. Not sure what the
> extra cost would be (it's less work to leave it in than take
> it out) but worth looking into.
Not just for "mini-notes", for everyone.
Very simple: separate out the DRM crap. Power for the DRM crap
circuitry is a separate pin on the die. When mounting the die
into the package, either connect or do not connect the power for
the DRM crap circuitry. Then attach the lid. Chips with no
power to DRM crap circuitry cannot do the DRM crap. And don't
waste power on the DRM crap circuitry. So the non-DRM chips
are greener. Everything except the DRM crap circuitry gets
documented.
Cost: the dies are all the same so you don't have to make another
expensive mask set. Any die with a flaw in the DRM crap circuitry
gets put in the FLOSS bin rather than thrown away, so yield goes up
slightly. The cost of connecting vs not connecting power inside
the package should be minimal. Avoiding a separate mask set is the
big win.
----------------
> I am going to look at opening up IDCT hardware after we get 6xx/7xx
> 3D engine and some basic power management info released (I'm still
> thinking both of those are higher priority).
Priority is the other way around, here's why:
What percentage of the population watches some form of video (TV,
movies, gootube, ...)? 99.9% ? (very large market, hint hint)
What percentage of the population would benefit from power management?
95% ? (everyone not running folding?)
What percentage of the population designs stuff using 3D CAD and
could benefit from 3D acceleration? 0.1% ?
----------------
If the UVD/UVD2 in the R[67]00 are a big problem, then get video
decoding working on the Rage family and R[1-5]00 first, and
work on R[67]00 later.
And make sure the folks designing the R800 and beyond isolate
the DRM crap.
MU_Engineer
09-05-2008, 11:55 PM
> Good point. If we did a specialized product for mini-notes it
> might be possible to skip the DRM stuff. Not sure what the
> extra cost would be (it's less work to leave it in than take
> it out) but worth looking into.
> Not just for "mini-notes", for everyone.
> Very simple: separate out the DRM crap. Power for the DRM crap
> circuitry is a separate pin on the die. When mounting the die
> into the package, either connect or do not connect the power for
> the DRM crap circuitry. Then attach the lid. Chips with no
> power to DRM crap circuitry cannot do the DRM crap. And don't
> waste power on the DRM crap circuitry. So the non-DRM chips
> are greener. Everything except the DRM crap circuitry gets
> documented.
> Cost: the dies are all the same so you don't have to make another
> expensive mask set. Any die with a flaw in the DRM crap circuitry
> gets put in the FLOSS bin rather than thrown away, so yield goes up
> slightly. The cost of connecting vs not connecting power inside
> the package should be minimal. Avoiding a separate mask set is the
> big win.
..except for the board partners. Do you think that very many will want to offer a second otherwise-identical SKU with the DRM connections (GPUs are BGA, not PGA) not attached? My guess is "heck no" as they'd be left with a pile of them left over, since OEMs and retailers don't want to sell a product that Joe Six Pack Windows User accidentally buys and it won't play the "Pretty Woman" Blu-Ray disc he has, so he gripes and gripes about $RETAILER and $BOARD_MAKER being crappy.
The separated DRM and decode sections would be enough. An OSS driver could just ignore that part of the chip with no consequences. That little handful of transistors making up the DRM decode section wouldn't draw all that much power anyway.
> Good point. If we did a specialized product for mini-notes it might be possible to skip the DRM stuff. Not sure what the extra cost would be (it's less work to leave it in than take it out) but worth looking into.
Less costs more? I would think that not having to pay for the DRM hardware/licensing fees every time a board is manufactured will cancel any engineering costs. Anyway, mini-notebooks would make a good excuse to create a DRM-less design which can then act as a template for full-sized laptops and desktops.
> ..except for the board partners. Do you think that very many will want to offer a second otherwise-identical SKU with the DRM connections (GPUs are BGA, not PGA) not attached? My guess is "heck no" as they'd be left with a pile of them left over, since OEMs and retailers don't want to sell a product that Joe Six Pack Windows User accidentally buys and it won't play the "Pretty Woman" Blu-Ray disc he has, so he gripes and gripes about $RETAILER and $BOARD_MAKER being crappy.
All they'd have to do is to warn customers about the pitfalls of encrypted disks. And just to prove that the hardware is capable of playing unencrypted video well, they can include a demo DVD with the motherboard or point people to a website to download the video. Then Joe Six Pack will rightfully blame the MPAA and avoid buying their DRM junk next time he's looking for a video. The sooner "encrypted content" becomes loathed by customers, the better.
MU_Engineer
09-06-2008, 10:37 AM
> All they'd have to do is to warn customers about the pitfalls of encrypted disks. And just to prove that the hardware is capable of playing unencrypted video well, they can include a demo DVD with the motherboard or point people to a website to download the video. Then Joe Six Pack will rightfully blame the MPAA and avoid buying their DRM junk next time he's looking for a video. The sooner "encrypted content" becomes loathed by customers, the better.
You misunderstand the average customer, who routinely blames their computer's hardware manufacturer for software problems (e.g. "This Dell sucks as it made Outlook lose my e-mails!"). They will not blame the MPAA for anything as they don't even KNOW about DRM. They are used to DVDs "just working" and think of Blu-Ray and other DRMed stuff as just "high-def DVDs" and expect them to work well. If they stick a Blu-Ray disk into their shiny new HP with an AMD GPU that does not support DRM, what do they do? They don't blame the MPAA (they don't know who the MPAA is), they don't blame AMD (they don't even know who AMD is); they blame HP! HP, Dell, and every other OEM know this and they do NOT want to risk losing sales in a very cut-throat market.
Unfortunately, about the only way DRM will go away is if some device that does something other than play back a physical disk becomes popular and people actually see what DRM is designed to prevent them from doing and get teed off. Portable MP3 players, particularly the iPod, showed people what music DRM was and how it was a PITA, and what happened? There is a big push *away* from DRM. I guess we'll need for people to want to rip Blu-Ray disks to a portable player before video DRM starts to go away.
Dieter
09-06-2008, 07:37 PM
MU_Engineer> ..except for the board partners. Do you think that very
MU_Engineer> many will want to offer a second otherwise-identical SKU
MU_Engineer> with the DRM connections (GPUs are BGA, not PGA) not attached?
There are tons of nearly but not quite identical boards (and
various other products) out there, so obviously most companies
don't mind. I know the head of one company that would be very
interested in making boards with documented GPU chips. Getting
boards made is NOT a problem.
Bridgman> Good point. If we did a specialized product for mini-notes it
Bridgman> might be possible to skip the DRM stuff. Not sure what the extra
Bridgman> cost would be (it's less work to leave it in than take it out)
Bridgman> but worth looking into.
MU_Engineer> The separated DRM and decode sections would be enough.
Perhaps. I was just responding to Bridgman's interest,
pointing out a way to do it without the cost of a 2nd
mask set.
MU_Engineer> That little handful of transistors making up the DRM decode
MU_Engineer> section wouldn't draw all that much power anyway.
I don't work at ATI so I don't know how many transistors the
DRM crap uses. But unless Hollywood is willing to pay my electric
bill, I want those useless transistors powered off.
Stan> And just to prove that the hardware is capable of playing
Stan> unencrypted video well, they can include a demo DVD with the
Stan> motherboard or point people to a website to download the video.
Both. HD video is too big to download with a POTS modem, which
many people still use. The cost of including a URL is basically
zero, so you might as well include it.
MU_Engineer> I guess we'll need for people to want to rip Blu-Ray disks
MU_Engineer> to a portable player before video DRM starts to go away.
I'd think that would be happening by now?
> There are two main parts to decode acceleration - IDCT and MC.
> For MC, we have provided the info required on 5xx and earlier already -- MC acceleration is done on the 3D engine
Is the 3D engine the optimal component for 5xx MC acceleration (ie. does Windows use 3D for the same purpose), or is it the second best choice on Linux because the real hardware designed for 5xx MC acceleration won't be documented?
> I am going to look at opening up IDCT hardware
Excellent, thanks!
> The IDCT hardware was designed for MPEG2 decode but at first glance it looks like it should work for H.264 and VC-1 as well (VC-1 and one of the H.264 levels have a variety of different block sizes). MC should work fine for any of the standards.
Accelerating Ogg Theora would be cool, if at all possible :)
> The XvMC API can support MC-only or IDCT+MC decoding today, so there should be enough info out now to write an XvMC driver for R3xx through R5xx.
Wonderful, thanks a lot for summarizing what is possible with the current amount of documentation!
bridgman
09-06-2008, 11:42 PM
> Is the 3D engine the optimal component for 5xx MC acceleration (ie. does Windows use 3D for the same purpose), or is it the second best choice on Linux because the real hardware designed for 5xx MC acceleration won't be documented?
We use the 3D engine for MC with MPEG2 on Windows. Not sure whether we use it for H.264/VC-1 in Windows or if the players do everything in software, will try to find out. Anyways, the 3D engine is the only MC mechanism in a 5xx part (except for the RV550 aka HD2300 which was a hybrid -- a 5xx GPU with UVD).
> Accelerating Ogg Theora would be cool, if at all possible :)
At first glance it looks like it should be possible. Theora uses 8x8 blocks (16x16 macroblocks) so the same hardware should work. I took a quick skim through the Theora spec and didn't see anything specific about rounding details during motion comp processing but I think the MC-mode rounding in the 3D engine is pretty generic and should work.
Someone still has to *write* the decoder, of course.
http://theora.org/doc/Theora
W3ird_N3rd
09-16-2008, 08:49 PM
Just to get an idea: imagine a video acceleration API (like VAAPI) would be supported by ffmpeg to accelerate h.264. And imagine the open R3xx-R5xx drivers would support MC and IDCT in VAAPI (or anything ffmpeg would support).
Now say an h.264 movie eats exactly 100% CPU (single core) with the loop filter disabled. How much CPU % would still be used with the loop filter still disabled and MC and IDCT handled by the graphics card? Would it be worth it?
Because for me, although UVD would be very cool, I would already be very happy if I can play everything without my CPU being so close to 100% or even framedropping. I don't mind my CPU still having to do some work.
bridgman
09-17-2008, 10:39 AM
It's still not clear to me why the in-loop deblocking filter can't be implemented in shaders. It's just a filter kernel and modern GPUs are pretty good at that kind of work.
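As a rough illustration of why a deblocking step looks shader-friendly, here is a heavily simplified sketch. It is NOT the H.264 in-loop filter (which picks filter strength per edge from coding modes and quantizer-dependent thresholds); the function name, the fixed threshold, and the 3-tap smoothing are all assumptions chosen for this example. It only shows the general shape: each output pixel depends on a handful of neighbouring input pixels across a block edge.

```python
def deblock_rows(frame, block=8, threshold=12):
    """Smooth across vertical block edges when the step across the edge is
    small enough to look like a coding artifact rather than real image detail."""
    out = [row[:] for row in frame]
    width = len(frame[0])
    for row_in, row_out in zip(frame, out):
        for edge in range(block, width, block):
            p1, p0 = row_in[edge - 2], row_in[edge - 1]  # pixels left of the edge
            q0, q1 = row_in[edge], row_in[edge + 1]      # pixels right of the edge
            if abs(p0 - q0) < threshold:
                # Small 3-tap low-pass on the two pixels nearest the edge.
                row_out[edge - 1] = (p1 + 2 * p0 + q0 + 2) // 4
                row_out[edge] = (p0 + 2 * q0 + q1 + 2) // 4
    return out

# Hypothetical usage: one row with a mild blocking step, one with a real edge.
frame = [[100] * 8 + [108] * 8,
         [100] * 8 + [200] * 8]
smoothed = deblock_rows(frame)
```

Every output pixel reads only from the unfiltered input, so the loop body could in principle run once per pixel in parallel. The real in-loop filter has ordering dependencies between edges that this sketch leaves out, and those may be part of the difficulty.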