radeonhd r6xx-7xx EXA performance patch


Obscene_CNN
05-20-2009, 10:48 AM
Hi everyone

There have been discussions about how much coding efficiency improves speed in video drivers. One opinion is that, because the CPU only stuffs video card commands into a buffer that the video card then DMAs into its command buffer as needed, coding efficiency doesn't matter much. The other opinion is that every little bit of speed helps.

With a little slack time at work I decided to hot rod radeonhd's r6xx/7xx EXA acceleration code, do some tests and create a patch. I don't expect this patch to be accepted because it's sloppy and it makes the code unmaintainable. The reason for doing it was to see what the potential for speed improvement actually was.

Benchmarking with the command "x11perf -v1.3 -all" between the latest git and my patch showed gains greater than 10% in some tests. Notable are the gains in the move window, put image, and circle drawing tests. There were a few slowdowns too, but overall it's a win.

People with fast video cards, slow processors, or both will see the most gains.

My patch should apply cleanly to radeonhd-1.2.5 all the way through a git pull of today's date.

latest patch http://lists.opensuse.org/radeonhd/2009-05/msg00288.html

Performance of a previous version http://lists.opensuse.org/radeonhd/2009-04/msg00208.html

monraaf
05-21-2009, 12:11 AM
My patch should apply cleanly to radeonhd-1.2.5 all the way through a git pull of today's date.

latest patch http://lists.opensuse.org/radeonhd/2009-05/msg00288.html

Is there someplace I can download the patch? copy/pasting it from that message results in a malformed patch...

Obscene_CNN
05-21-2009, 03:28 AM
Is there someplace I can download the patch? copy/pasting it from that message results in a malformed patch...

Oops my bad.

Sorry.

http://pastebin.com/f304c7654

EDIT:
Note that when downloading from this link, the patch will download as a DOS text file with a CR/LF at the end of every line.

We can fix it with the command

sed 's/.$//' patch > fixed.patch
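
Note that this sed expression removes the last character of every line whatever it is, so running it on a file that already has Unix line endings would eat real content. A slightly safer sketch, assuming a POSIX shell with tr available, deletes only carriage returns. The function name strip_cr is made up for illustration.

```shell
# strip_cr: hypothetical helper, not part of radeonhd or patch.
# Copies stdin to stdout and deletes carriage return characters only,
# so a file that already has Unix line endings passes through unchanged.
strip_cr() {
    tr -d '\r'
}

# Usage: strip_cr < patch > fixed.patch
```

Running it a second time on the fixed file is harmless, which is not true of the sed version.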

ssam
05-21-2009, 06:24 AM
Is it feasible to clean the patch up into something maintainable that might get accepted, or is it intrinsically messy? A 10% speedup would be nice.

Obscene_CNN
05-21-2009, 10:16 AM
I could clean it up a little; however, it's inherently messy. As long as new chips are still being added to this section of code, it is not reasonable to expect this to be incorporated. There may be some things that could go in, but I doubt they would make much improvement in speed.

Obscene_CNN
05-21-2009, 12:38 PM
By the way,

I would appreciate any before and after benchmark data from people trying my patch. Also please include the speed of the processor and the GPU chip. Thanks

monraaf
05-21-2009, 01:05 PM
I did some quick testing with GtkPerf, and I couldn't find a significant performance increase.



Test rounds: 100    Test: All

Driver                           Minimum total time (s)
--------------------------------------------------------
radeon (latest git)              5.37
radeonhd (latest git)            5.36
radeonhd (latest git + patch)    5.32


CPU: AMD Athlon(tm) Dual Core Processor 4850e @ 2800MHz
GPU: ATI Radeon HD 3200 Graphics (IGP)

Obscene_CNN
05-21-2009, 01:27 PM
Try changing the number of test rounds gtkperf does from 100 to 1000. Your CPU is quite a bit faster than mine so I would expect less of an improvement. (I have only a 2GHz laptop with a Radeon HD 3100.)

x11perf is a better benchmark if you have several hours to spare ;)

Thanks for sharing your results

RealNC
05-21-2009, 01:32 PM
If an improvement will only appear after accumulating times for several hours, then I'd say such an improvement is rather irrelevant :P

Obscene_CNN
05-21-2009, 01:36 PM
It's not the improvement accumulating over several hours. It's the fact that "x11perf -v1.3 -all" takes about 2.5 hours to run regardless of how fast the machine and video card are.

RealNC
05-21-2009, 02:23 PM
How's that possible?

monraaf
05-21-2009, 02:36 PM
try changing the number of test rounds gtkperf does from 100 to 1000. Your CPU is quite a bit faster than mine so I would expect less of an improvement. (I have only a 2GHz laptop with a radeonhd 3100)


Test rounds: 1000    Test: All

Driver                           Total time (s)
--------------------------------------------------------
radeon (latest git)              108.77
radeonhd (latest git + patch)    109.14


Most of the time, however, is spent in GtkTextView - Add text (54 seconds), and the more test rounds there are, the slower the widget becomes.

x11perf probably is a better way to benchmark graphics performance, but I don't have several hours to spare :)

Obscene_CNN
05-21-2009, 02:40 PM
How's that possible?

x11perf runs a benchmark for a set period of time and counts the number of times an operation is performed.
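
In other words, the run time is fixed and the operation count is what varies. A rough sketch of that style of measurement, assuming a POSIX shell and a date command that supports +%s; count_ops is a hypothetical helper and is not how x11perf itself is implemented:

```shell
# count_ops: hypothetical illustration of a fixed-duration benchmark loop.
# Runs the given command repeatedly for about $1 seconds (whole-second
# resolution) and prints how many times it completed.
count_ops() {
    duration=$1
    shift
    count=0
    end=$(( $(date +%s) + duration ))
    while [ "$(date +%s)" -lt "$end" ]; do
        "$@" >/dev/null 2>&1
        count=$(( count + 1 ))
    done
    echo "$count"
}

# Usage: count_ops 5 some_command
```

A faster machine gives a bigger count, not a shorter run, which is why the whole suite takes about the same time everywhere.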

Obscene_CNN
05-21-2009, 02:45 PM
I don't particularly like gtkperf. I can underclock my 2GHz CPU to 500MHz and get 20% better scrolling times in it.

Strange... thanks for testing my patch and sharing the results though.

conholster
05-21-2009, 04:35 PM
Driver                          Rounds   gtkperf total time
radeonhd + patch                100      6.47
radeonhd + patch                1000     102.10
tormod's latest git radeonhd    100      7.04
tormod's latest git radeonhd    1000     111.something

I was a little excited and forgot to save the results before installing and restarting X :D

I can re-run with tormod's during the weekend.

Kubuntu 9.04, kernel 2.6.30rc6 x86_64, amd athlon x2 5600+ 2.8GHz, radeon hd3850

bridgman
05-21-2009, 04:41 PM
There was a bit of discussion about performance results on IRC a little while ago; one thing to remember is that the patch only affects performance if the proper drm is installed and if EXA hardware acceleration is being used. If the driver is running with shadowfb acceleration then the patch won't make any difference...

Obscene_CNN
05-21-2009, 04:58 PM
conholster,

Thanks for trying my patch out and sharing your results. That is a lot more like what I expected. Almost a 10% increase.

monraaf
05-21-2009, 05:40 PM
If the driver is running with shadowfb acceleration then the patch won't make any difference...

Well, I'm sorry that my test results aren't what one would like them to be but if I was running with shadowfb acceleration I wouldn't get an Xv adapter, now would I?


$ xvinfo
X-Video Extension version 2.2
screen #0
Adaptor #0: "RadeonHD Textured Video"

Obscene_CNN
05-21-2009, 06:20 PM
No offense monraaf,

We didn't mean to offend you with our speculation on IRC and I didn't think it would end up here. Sorry. It just seemed strange that your results for 1000 test rounds were so close. Another factor behind the IRC speculation was the fact that I got better times on my slower laptop (91 seconds without the patch vs 85 seconds with it). I have done extensive and careful benchmarking on this patch and it just seems strange. A possible error on your part in running the test was one thing that had to be taken into consideration, however.

Seeing that someone got numbers close to what you had initially, with the same speed processor albeit with a different card, adds credibility to your results. With so few results to go on, your results are as valid as anybody's.

bridgman
05-21-2009, 06:33 PM
Yeah, definitely no offence intended and the apology should be mine if that came off the wrong way. Just thought it was better if we talked straight rather than being sneaky and asking for a log :D

I think we were all expecting a bit more difference. The ratio between CPU and GPU performance should definitely affect results (i.e. with a 3200 GPU you would see less speedup than with a 3850, since the 3200 is more likely to be GPU limited), but I didn't think there was enough parallelism in the current driver code to totally eliminate the effect of the patch.

tball
05-22-2009, 05:44 AM
This may be a stupid question, but does radeonhd support tear-free Xv out of the box now? Or is there something I have to enable in xorg.conf? I am asking because I use radeon right now, and am rather satisfied with its performance.

If I can switch to radeonhd without a performance loss with Xv, I am willing to test your patch. :)

Obscene_CNN
05-22-2009, 08:04 AM
Radeon and radeonhd share most of the same code. Radeonhd supports tear-free video using Xv just the same as radeon. My patch does a little to improve the performance of Xv, but it largely won't make much difference except saving on CPU cycles.

tball
05-22-2009, 09:57 AM
Just tried radeonhd master and r6xx-r7xx-branch.

1. It doesn't use EXA but ShadowFB on my computer.
2. There is some static noise around text, when doing VT switch.
3. xvinfo:

$ xvinfo
X-Video Extension version 2.2
screen #0
no adaptors present

4. Radeonhd doesn't seem to support any power saving options. I know those options in radeon aren't complete or fancy right now, but at least they exist.

I will jump back to radeon. But couldn't you just port the patch to radeon? :D

bridgman
05-22-2009, 10:10 AM
Yangman added power saving options to radeonhd as well : http://lists.opensuse.org/radeonhd/2009-04/msg00163.html

EDIT - from IRC discussion this might be 5xx only, not sure. Will try to confirm.

tball
05-22-2009, 10:27 AM
Yangman added power saving options to radeonhd as well : http://lists.opensuse.org/radeonhd/2009-04/msg00163.html

EDIT - from IRC discussion this might be 5xx only, not sure. Will try to confirm.

Thx bridgman.

I didn't find anything related to power saving options in the man pages for radeonhd, and the power saving options for radeon didn't work with radeonhd.

I have a r600 based card, if you wanted to know :)

Obscene_CNN
05-22-2009, 10:34 AM
Power management in radeonhd should work for all cards it supports as it uses the AtomBIOS calls if I remember correctly.

tball,

My patch should apply against the master branch of radeonhd. It sounds as though you are having a problem with the DRI though. Without DRI working the code in my patch won't even be run. I would recommend getting radeonhd to work using EXA without my patch first. I think the same DRI modules that work with radeon work with radeonhd but I'm not sure.

I haven't looked at the radeon code but porting my patch probably won't be a trivial task.

tball
05-22-2009, 10:39 AM
I already have a working dri / drm installation with r6xx-r7xx-branch. Radeon works very well with that version and I think it's odd that radeonhd won't use it. Are there any radeonhd-specific options I forgot in my xorg.conf? Should I manually specify somewhere that I want to use dri/drm -> exa with radeonhd?

Obscene_CNN
05-22-2009, 10:46 AM
tball,

In your device section you should have the following options set for radeonhd to use DRI and EXA.



Option "AccelMethod" "EXA"
Option "DRI" "On"

tball
05-22-2009, 12:14 PM
Works. But how do you patch radeonhd git? It says 24 out of 24 hunks failed.

$ patch -p1 < patch
patching file src/r600_exa.c
Hunk #1 FAILED at 39.
Hunk #2 FAILED at 100.
Hunk #3 FAILED at 116.
Hunk #4 FAILED at 595.
Hunk #5 FAILED at 938.
Hunk #6 FAILED at 960.
Hunk #7 FAILED at 1115.
Hunk #8 FAILED at 1269.
Hunk #9 FAILED at 1914.
Hunk #10 FAILED at 1924.
Hunk #11 FAILED at 1954.
Hunk #12 FAILED at 2075.
Hunk #13 FAILED at 2147.
Hunk #14 FAILED at 2162.
Hunk #15 FAILED at 2243.
Hunk #16 FAILED at 2319.
Hunk #17 FAILED at 2358.
Hunk #18 FAILED at 2462.
Hunk #19 FAILED at 2606.
Hunk #20 FAILED at 2909.
Hunk #21 FAILED at 2930.
21 out of 21 hunks FAILED -- saving rejects to file src/r600_exa.c.rej
patching file src/r600_state.h
Hunk #1 FAILED at 204.
1 out of 1 hunk FAILED -- saving rejects to file src/r600_state.h.rej
patching file src/r600_textured_videofuncs.c
Hunk #1 FAILED at 48.
Hunk #2 FAILED at 66.
Hunk #3 FAILED at 86.
Hunk #4 FAILED at 97.
Hunk #5 FAILED at 224.
Hunk #6 FAILED at 245.
Hunk #7 FAILED at 275.
Hunk #8 FAILED at 295.
Hunk #9 FAILED at 321.
Hunk #10 FAILED at 398.
Hunk #11 FAILED at 445.
Hunk #12 FAILED at 469.
Hunk #13 FAILED at 480.
Hunk #14 FAILED at 491.
Hunk #15 FAILED at 540.
Hunk #16 FAILED at 623.
Hunk #17 FAILED at 672.
17 out of 17 hunks FAILED -- saving rejects to file src/r600_textured_videofuncs.c.rej
patching file src/r6xx_accel.c
Hunk #1 FAILED at 64.
Hunk #2 FAILED at 76.
Hunk #3 FAILED at 107.
Hunk #4 FAILED at 154.
Hunk #5 FAILED at 175.
Hunk #6 FAILED at 225.
Hunk #7 FAILED at 293.
Hunk #8 FAILED at 339.
Hunk #9 FAILED at 382.
Hunk #10 FAILED at 441.
Hunk #11 FAILED at 550.
Hunk #12 FAILED at 559.
Hunk #13 FAILED at 574.
Hunk #14 FAILED at 614.
Hunk #15 FAILED at 663.
Hunk #16 FAILED at 683.
Hunk #17 FAILED at 765.
Hunk #18 FAILED at 837.
Hunk #19 FAILED at 925.
Hunk #20 FAILED at 1114.
Hunk #21 FAILED at 1161.
Hunk #22 FAILED at 1184.
Hunk #23 FAILED at 1417.
patch unexpectedly ends in middle of line
Hunk #24 FAILED at 1461.
24 out of 24 hunks FAILED -- saving rejects to file src/r6xx_accel.c.rej

Obscene_CNN
05-22-2009, 12:26 PM
tball


Are you in the xf86-video-radeonhd directory when you issue the patch command?

tball
05-22-2009, 12:30 PM
Yes. But never mind. It worked with the diff/patch tool in KDE.



System:
2.6.29-ARCH Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz
HD 3650 mobility

Test rounds: 1000
Driver: RadeonHD-git
Total time: 66,67

Test rounds: 1000
Driver: RadeonHD-git with patch
Total time: 62,07

Test rounds: 1000
Driver: Radeon-git
Total time: 65,43

BUT
Test rounds: 1000
Driver: Fglrx
Total time: 185,38


How can fglrx be so much slower? The lack of EXA support?

Obscene_CNN
05-22-2009, 01:00 PM
Ah, it's a DOS CR/LF text file that gets downloaded.....

We can fix it.
Try the commands

sed 's/.$//' patch > fixed.patch

patch -p1 < fixed.patch

Obscene_CNN
05-22-2009, 01:01 PM
Oops, too late. Thanks for testing.

Edit: looks like a 7% improvement.
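
For anyone wanting to check the figure, the improvement is the time saved divided by the unpatched time. A small sketch, assuming a POSIX shell with awk; pct_faster is a hypothetical helper, and the totals are tball's 1000-round results rewritten with a decimal point:

```shell
# pct_faster: hypothetical helper. Prints how much shorter the patched
# time ($2) is, as a percentage of the unpatched time ($1).
# LC_ALL=C keeps awk on decimal points regardless of the user's locale.
pct_faster() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_faster 66.67 62.07   # prints 6.9
```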

misiu_mp
05-22-2009, 01:21 PM
There is a tool called dos2unix. Easy and quick.

forum1793
05-23-2009, 09:33 AM
How can fglrx be so much slower. The lack of EXA support?

Is it possible that fglrx.ko (from /lib...) is not loaded but xorg uses the fglrx.drv located at /usr...? This might be a VESA fallback if fglrx.ko is not loaded but xorg.conf still requires it. Does lsmod | grep fglrx show it's loaded?

tball
05-23-2009, 10:04 AM
Is it possible that fglrx.ko (from /lib...) is not loaded but xorg uses the fglrx.drv located at /usr...? This might be a VESA fallback if fglrx.ko is not loaded but xorg.conf still requires it. Does lsmod | grep fglrx show it's loaded?

I don't think vesa runs kwin composite very well ;)

fglrx.ko is loaded alright and I didn't use composite when running gtkperf.

Obscene_CNN
05-26-2009, 01:24 AM
I have updated the patch for a little more speed. :D

Download it from here: http://pastebin.com/f781ff0f and save it as a file.

Then run dos2unix on it.

It should then apply to current git or radeonhd-1.2.5.

Obscene_CNN
05-27-2009, 04:08 AM
Yet more speed :D

Download the latest patch here

http://pastebin.com/ff1e8c3c

Note that when downloading from this link, the patch will download as a DOS text file with a CR/LF at the end of every line.

We can fix it with the command

sed 's/.$//' patch > fixed.patch

or run dos2unix on the patch file.

Once again, it should apply to current git or radeonhd-1.2.5.

zika
05-27-2009, 05:02 AM
Can you explain how the patch is applied? Sorry for the dumb question, but I do not want to mess up what need not be messed with ... :)

Obscene_CNN
05-27-2009, 09:09 AM
zika,

okay.

First make sure you can build and install the sources as shown here

http://www.x.org/wiki/radeonhd

Next make sure you have acceleration working.

Next, download the patch from this location and save it as the filename 'patch' in your xf86-video-radeonhd directory.

http://pastebin.com/ff1e8c3c

Now you have 2 options.

From the xf86-video-radeonhd directory type the command
dos2unix -n patch fixed.patch

Or

From the xf86-video-radeonhd directory type the command
sed 's/.$//' patch > fixed.patch


Finally, from the xf86-video-radeonhd directory issue the command
patch -p1 < fixed.patch


I hope this helps

Edit:
Note you can remove the patch with this command if it causes trouble
patch -p1 -R < fixed.patch
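
Before applying, it can also be worth checking that every hunk would apply, so that a failed attempt does not leave .rej files behind. A sketch assuming GNU patch, whose --dry-run option tests a patch without changing any files; try_patch is a hypothetical wrapper name:

```shell
# try_patch: hypothetical wrapper around GNU patch.
# First asks patch to test the file with --dry-run, which modifies nothing,
# and only applies it for real if that test succeeds.
try_patch() {
    patchfile=$1
    if patch -p1 --dry-run < "$patchfile" >/dev/null; then
        patch -p1 < "$patchfile"
    else
        echo "patch does not apply cleanly; nothing was changed" >&2
        return 1
    fi
}

# Usage, from the xf86-video-radeonhd directory: try_patch fixed.patch
```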

zika
05-27-2009, 11:52 AM
Thank you. The problem is that I've installed radeon{,hd} and drm another way, through PPAs and, lately, with easy-drm-modules-installer, so I do not have an xf86-video-radeonhd directory ... I will investigate the situation more. I'm up to date with tormod, pendretti, xorg-crack and radeon-rewrite (with both radeon{,hd}) ...
Thank you again ...

conholster
05-28-2009, 01:28 AM
gtkperf doesn't seem very consistent. I ran the test a few times at x1000 with your latest patch. These are the results:
1 Total time: 103.18
2 Total time: 104.41
3 Total time: 105.26
4 Total time: 99.90

On the fourth run I moved the mouse cursor between the screens (dualhead)

Same with x100: if I move the mouse cursor to the other monitor there's a 1 second difference between the results.
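
To put a number on that spread, the four x1000 runs above can be summarised as a mean and a min-to-max range. A sketch assuming a POSIX shell with awk; run_stats is a hypothetical helper name:

```shell
# run_stats: hypothetical helper. Reads one time per line on stdin and
# prints the mean, the min-to-max range, and the range as a percentage
# of the mean.
run_stats() {
    LC_ALL=C awk '
        NR == 1 { min = $1; max = $1 }
        { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
        END {
            if (NR == 0) exit 1
            mean = sum / NR
            printf "mean=%.1f range=%.2f (%.1f%% of mean)\n",
                   mean, max - min, (max - min) / mean * 100
        }'
}

printf '%s\n' 103.18 104.41 105.26 99.90 | run_stats
# prints: mean=103.2 range=5.36 (5.2% of mean)
```

With identical runs differing by about 5%, a patch gain of a few percent is hard to tell apart from noise unless each driver is measured several times.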

Kano
05-28-2009, 02:58 AM
You should set your cpu to performance mode, otherwise you bench speedstep/powernow...

conholster
05-28-2009, 04:06 AM
I've got Cool'n'Quiet disabled in the BIOS.

cat /proc/cpuinfo says 2800.242 MHz

Obscene_CNN
05-28-2009, 11:34 AM
As I have stated before I'm not a fan of gtkperf.

I get my most consistent results when running gkrellm alongside it, for some strange reason.

Also, to see what frequency your CPU is running at at the moment, you need to

cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
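
A related sketch for checking the governor that Kano mentioned, assuming a Linux kernel with the cpufreq sysfs interface; show_cpufreq is a hypothetical helper. It reads scaling_cur_freq rather than cpuinfo_cur_freq because the latter is often readable only by root:

```shell
# show_cpufreq: hypothetical helper. Prints the cpufreq governor and the
# current frequency (in kHz) for each CPU that exposes a cpufreq directory.
# The optional argument overrides the sysfs root, which is handy for testing.
show_cpufreq() {
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        gov=$(cat "$dir/scaling_governor" 2>/dev/null)
        freq=$(cat "$dir/scaling_cur_freq" 2>/dev/null)
        echo "$cpu governor=${gov:-unknown} freq_khz=${freq:-unknown}"
    done
}

show_cpufreq
```

If the governor is something other than performance, the clock may change during a benchmark run even when the reported frequency looks right at the moment you check it.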

kernelOfTruth
06-05-2009, 05:37 PM
Yangman added power saving options to radeonhd as well : http://lists.opensuse.org/radeonhd/2009-04/msg00163.html

EDIT - from IRC discussion this might be 5xx only, not sure. Will try to confirm.

that's great news !

thanks bridgman & kudos to Yang Zhao / Yangman :)

@Obscene_CNN:

thanks for making the (already fast) radeonhd even faster ! :)

bridgman
06-05-2009, 06:17 PM
I think Matthias just posted a patch to the radeonhd list which turns on clock gating for 5xx/rs6xx parts, which should give some additional power savings. I stress *should* because nobody has had a chance to actually verify power reduction yet ;)