XPress 200M troubles with git mesa/xf86-video-ati


MùPùF
05-10-2009, 08:55 AM
Hi,

I've been running the OSS driver for almost a year now, and it gets faster with every release. Great!

1st Bug :
Yesterday, I recompiled Mesa and got a great performance boost, enough to actually be able to play. So I tried to stress the driver to see what happens. The stability is excellent; I haven't been able to crash it while playing.
But when I try to change the resolution, my screen switches to the new resolution properly, yet the game content is gone and the computer seems to be stuck.

Nothing seems to be wrong in my Xorg log. These are the lines added to the log when I launch a game:

init memmap
init common
init crtc1
init pll1
freq: 78750000
best_freq: 78760000
best_feedback_div: 33
best_frac_feedback_div: 0
best_ref_div: 2
best_post_div: 3
restore memmap
(II) RADEON(0): RADEONRestoreMemMapRegisters() :
(II) RADEON(0): MC_FB_LOCATION : 0x7fff7800 0x7fff7800
(II) RADEON(0): MC_AGP_LOCATION : 0x81ff8000
restore common
restore crtc1
restore pll1
finished PLL1
set RMX
set TVDAC
enable TVDAC
disable LVDS
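
The PLL values in this log can be cross-checked with the usual divider relation, output = reference * feedback_div / (ref_div * post_div). The sketch below is only an illustration: the 14.32 MHz reference clock is an assumption (the log does not print it), chosen because it reproduces the logged best_freq, and the function name is hypothetical.

```python
def pll_output_hz(ref_hz, feedback_div, ref_div, post_div):
    """Output frequency of a simple integer PLL (fractional feedback divider = 0)."""
    return ref_hz * feedback_div // (ref_div * post_div)


# Assumed reference clock; NOT taken from the log.
ASSUMED_REF_HZ = 14_320_000

# Divider values copied from the log lines above.
best_freq = pll_output_hz(ASSUMED_REF_HZ, feedback_div=33, ref_div=2, post_div=3)

requested = 78_750_000  # "freq" line in the log
error_ppm = (best_freq - requested) / requested * 1e6

print(best_freq)         # compare with the "best_freq" line in the log
print(round(error_ppm))  # deviation from the requested clock, in ppm
```

Under that assumption the computed value equals the logged best_freq (78760000), a little over 100 ppm away from the requested clock, so the logged dividers are at least self-consistent.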


So, any idea what could be going wrong?

2nd Bug :
According to the radeon feature matrix on the freedesktop wiki, textured video should be faster than overlay video.
In fact, it depends on the video size and on the screen size.
I should add that I'm using a 26" screen running at 1920*1200.

For example, when playing an HD video (720p) fullscreen, the video plays fine but I get some tearing.
When watching a small video fullscreen, the framerate drops dramatically, to 3 or 4 fps.

The video overlay works great in both cases, but it looks awful on my screen (it looked great on my previous 19" screen). I get a lot of aliasing when playing at the native video size. In full screen it gets smoother, but it is still not as good as textured video.

So, here is my guess. It seems like the video overlay is a texture the size of the screen that is resized to fit the video player's window. That could explain the poor video quality, since the video card would not be resizing the picture with a cubic algorithm. Am I wrong?

I really hope there is room for improvement; I don't want to replace my computer now :s

Once again, I am ready to help and try whatever you would like me to try. A friend and I may help this summer with the implementation of VDPAU in Gallium3D. He proposed it for GSoC, but the proposal was refused.

bridgman
05-10-2009, 10:50 AM
A lot of games can't handle having the resolution changed under them. Are you changing res via game controls or... ?

The "textured video is faster than overlay" info is not correct, even though it's in a wiki :D

"Better" might be more appropriate; you have more filtering options and can use it with a compositor, but slow chips may not have enough horsepower to render the video and do the compositing. Are you running with Compiz on or off ? At minimum you might want to try "Unredirect Fullscreen Windows". I don't remember the option for disabling bicubic filtering on the Xv scaleup but that might also be worth a try.

The video overlay is a hardware buffer the size of the video stream's native resolution, scaled up in hardware then fed into the display stream going to your screen (bypassing the frame buffer). Filtering depends on the chip, not sure if the filtering level goes down as the screen res goes up but I imagine something has to give somewhere.

MùPùF
05-10-2009, 11:01 AM
A lot of games can't handle having the resolution changed under them. Are you changing res via game controls or... ?

Yes, I'm changing the resolution via the game controls.


The "textured video is faster than overlay" info is not correct, even though it's in a wiki :D

"Better" might be more appropriate; you have more filtering options and can use it with a compositor, but slow chips may not have enough horsepower to render the video and do the compositing.

The video overlay is a hardware buffer the size of the video stream's native resolution, scaled up in hardware then fed into the display stream going to your screen (bypassing the frame buffer). Filtering depends on the chip, not sure if the filtering level goes down as the screen res goes up but I imagine something has to give somewhere.
OK, are there any Xorg/DRI options that would give me better filtering with the "video overlay", or lower the filtering with "textured video"?
Otherwise, is textured video really that greedy, and would a replacement like VDPAU give me a smooth video experience, or is there no chance of that?

bridgman
05-10-2009, 11:17 AM
APIs like VDPAU add GPU support for decode, but they all end up using shaders for the final (Xv) stages, so VDPAU would help if you were CPU-limited. Then again, there's no decoding hardware in your GPU anyways other than the old IDCT engine so VDPAU / VA-API / XvBA aren't really options anyways.

MùPùF
05-10-2009, 11:30 AM
APIs like VDPAU add GPU support for decode, but they all end up using shaders for the final (Xv) stages, so VDPAU would help if you were CPU-limited. Then again, there's no decoding hardware in your GPU anyways other than the old IDCT engine so VDPAU / VA-API / XvBA aren't really options anyways.

Argh, so there's no way to change the filtering of "overlay video" or "textured video" through options in xorg.conf?

Thanks a lot for the explanation.

bridgman
05-10-2009, 11:48 AM
Possibly not via the conf, but according to the radeon man page you can control bicubic via xvattr :

.SH TEXTURED VIDEO ATTRIBUTES
The driver supports the following X11 Xv attributes for Textured Video.
You can use the "xvattr" tool to query/set those attributes at runtime.

.TP
.BI "XV_VSYNC"
XV_VSYNC is used to control whether textured adapter synchronizes
the screen update to the monitor vertical refresh to eliminate tearing.
It has two values: 'off'(0) and 'on'(1). The default is
.B 'on'(1).

.TP
.BI "XV_BICUBIC"
XV_BICUBIC is used to control whether textured adapter should apply
a bicubic filter to smooth the output. It has three values: 'off'(0), 'on'(1)
and 'auto'(2). 'off' means never apply the filter, 'on' means always apply
the filter and 'auto' means apply the filter only if the X and Y
sizes are scaled to more than double to avoid blurred output. Bicubic
filtering is not currently compatible with other Xv attributes like hue,
contrast, and brightness, and must be disabled to use those attributes.
The default is
.B 'auto'(2).
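
One reading of the XV_BICUBIC description above, written as a small sketch. The function name and the exact comparison (strictly more than 2x on both axes) are assumptions based on the man page wording, not on the driver source, and the 320x240 clip is a made-up example size.

```python
# Attribute values as listed in the man page excerpt.
BICUBIC_OFF, BICUBIC_ON, BICUBIC_AUTO = 0, 1, 2


def bicubic_applies(mode, src_w, src_h, dst_w, dst_h):
    """Return True if the bicubic filter would be used under this reading."""
    if mode == BICUBIC_OFF:
        return False
    if mode == BICUBIC_ON:
        return True
    if mode == BICUBIC_AUTO:
        # Assumed rule: both X and Y are scaled to more than double.
        return dst_w > 2 * src_w and dst_h > 2 * src_h
    raise ValueError(f"unknown XV_BICUBIC value: {mode}")


# 720p clip scaled up to a 1920-pixel-wide screen (1.5x on each axis).
hd_clip = bicubic_applies(BICUBIC_AUTO, 1280, 720, 1920, 1080)

# Hypothetical small clip scaled to the same size (6x and 4.5x).
small_clip = bicubic_applies(BICUBIC_AUTO, 320, 240, 1920, 1080)

print(hd_clip, small_clip)
```

Read this way, 'auto' would skip the filter for a 720p clip shown fullscreen on a 1920-wide display and apply it to a small clip stretched to the same size.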

MùPùF
05-10-2009, 01:18 PM
Thanks a lot! It helped a bit, but it is still not enough.

I had hoped the GPU would have more horsepower :s
I'll cope with it as best I can and wait until r700 gets proper 3D support.

bridgman
05-10-2009, 01:49 PM
It wouldn't hurt to check CPU utilization and clock speed. The Xpress 200 uses system memory, so the CPU clock can affect how fast the GPU can run.

IIRC the Xpress 200 is a 2-pipe chip running at a few hundred MHz, so in an absolutely perfect world it can write maybe 600M pixels/sec. At your screen res you're writing over 120M/sec just for a basic Xv pass; if you add a couple of texture reads that's probably getting close to maxing out the available bandwidth since the CPU is still pushing a lot of memory around as well.

Anyways, make sure you're not maxed out on CPU and that the CPU is running at full speed. It wouldn't hurt to try turning off the XV_VSYNC option as well; that also slows down rendering by waiting for the screen refresh and redraw operations to be in a proper relationship.
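
The back-of-the-envelope numbers above can be written out as a short calculation. This is only a sketch: the 300 MHz clock stands in for "a few hundred MHz", one pixel per pipe per clock is the idealized peak, and 60 full-screen passes per second is an assumed rate. The function names are hypothetical.

```python
def peak_fill_rate(pipes, clock_hz):
    """Ideal pixels written per second: one pixel per pipe per clock."""
    return pipes * clock_hz


def screen_write_rate(width, height, passes_per_second):
    """Pixels written per second when the whole screen is redrawn each pass."""
    return width * height * passes_per_second


# Assumed values: 2 pipes at 300 MHz, 1920x1200 screen, 60 passes per second.
peak = peak_fill_rate(pipes=2, clock_hz=300_000_000)
xv_pass = screen_write_rate(1920, 1200, passes_per_second=60)

# Fraction of the ideal peak used by one basic fullscreen Xv pass.
share = xv_pass / peak

print(peak, xv_pass, round(share * 100, 1))
```

With these assumptions a single fullscreen pass already takes roughly a quarter of the ideal peak, before any extra texture reads or the CPU's own memory traffic are counted.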

MùPùF
05-10-2009, 02:38 PM
I checked the clock speed; it was already at its maximum.
But CPU utilisation is above 50% and is sometimes capped at 50% (so, one core at full load).

Maybe Mesa/Gallium3D can help with this part, right?

bridgman
05-10-2009, 05:18 PM
Mesa (OpenGL) on its own probably won't help, but decode acceleration written over Gallium3D could reduce CPU utilization at the expense of higher GPU load. If your GPU does turn out to be the bottleneck then putting more load on the GPU may not work for you. Try using xvattr to turn off the XV_VSYNC attribute; that may get your frame rate up at the expense of more tearing.

The best solution for your performance problem might be for us to trade displays -- I'll send you my 15" display in return for your 24" screen :D

MùPùF
05-10-2009, 06:14 PM
I tried it out. I'm not sure I got a higher frame rate, but I definitely got so much tearing that in some movies I haven't seen a single frame that wasn't cut into 3 pieces :D
There's something strange: the HD (720p) video I use as a benchmark was actually perfect ???

Ah ah, wrong! It is a 26" screen, not a 24" :p
Do they actually pay you at AMD? If so, you should consider buying one; it is so huge you can actually have 3 files side by side! :D
It could also be a good way to stress your graphics card a bit ;)

Thanks a lot for your help and sharing your knowledge, I wish I could help you back. Maybe one day ;)
Bye ! Have a nice week !

agd5f
05-11-2009, 02:42 AM
Regarding the mode change bug, if the game uses the xvidmode extension to change the mode, that is the problem. That extension is not multi-head aware, so it ends up reprogramming the default crtc and output, which don't necessarily correspond to the crtc(s) and output(s) that are active. Better to change the mode beforehand using xrandr.

agd5f
05-11-2009, 02:45 AM
Fixes for diagonal tearing and some performance improvements for planar video are now available in xf86-video-ati git master. You might want to try that.

MùPùF
05-11-2009, 03:03 AM
I do not really have dual-head. Only one screen is active, the LVDS is off (xrandr --output LVDS off ; xrandr --output VGA-0 auto). Games that are crashing when changing resolution are based upon the Quake 3 engine. What do you want me to try with xrandr ?

I was already running the latest version of xf86-video-ati git master. The only thing that has changed since is Alex Deucher's commit (RV770: add missing pci id) 40 minutes ago.

agd5f
05-11-2009, 10:00 AM
I do not really have dual-head. Only one screen is active, the LVDS is off (xrandr --output LVDS off ; xrandr --output VGA-0 auto). Games that are crashing when changing resolution are based upon the Quake 3 engine. What do you want me to try with xrandr ?


xvidmode is an old extension that does not understand that LVDS is off and you are only using VGA. It just blindly changes the mode on one output and one crtc regardless of the current topology. My suggestion is to change the mode to whatever mode you want to run with xrandr first before starting the game, e.g.,
xrandr --output VGA-0 --mode 800x600


I was already running the latest version of xf86-video-ati git master. The only thing that has changed since is Alex Deucher's commit (RV770: add missing pci id) 40 minutes ago.

OK, you should have the latest bits then.

MùPùF
05-11-2009, 07:31 PM
My suggestion is to change the mode to whatever mode you want to run with xrandr first before starting the game, e.g.,
xrandr --output VGA-0 --mode 800x600


It doesn't change anything, as the game resets the resolution to its own settings (and that works). But when I try to change the resolution in game, it crashes. :(

MùPùF
05-16-2009, 06:56 PM
OK, some more silly questions that are hopefully not that silly :

First of all, I did some more testing of in-game resolution changes. agd5f said it was a multi-head problem, so I tried without multi-head at all (just using LVDS, with the VGA cable unplugged) and it crashes in just the same way. So something is wrong in the driver.
Where should I report the bug?

The second one is still about gaming. Two weeks ago, I became able to play 3D games at more than 3 fps in 640*480. I can now play Tremulous at a resolution of 1600*1000 and 35 fps. It is quite a big improvement! But at the same time, some games are awfully slow! For example, in Extreme Tux Racer I can't get more than 2 fps.
Another strange behavior concerns Nexuiz. Everything is smooth and fast, but as soon as I spot a player or a weapon, the framerate drops from 50 to 3 fps.
I assume this problem is caused by missing vertex/pixel shader support. My question is: is it a hardware or a software problem?
I would also like to add that Nexuiz takes almost no CPU. Would it be possible to get the missing shader rendered in software? :confused:

Thanks in advance

bridgman
05-16-2009, 07:21 PM
OK, assuming the problem is in the driver and not in the game, you can find the bugzilla link on the home page of x.org, at http://www.x.org (yeah, I'm in "teach a man to fish" mode today ;)).

Really bad slowdowns are usually caused by software fallbacks, i.e. doing something in software rather than on the GPU. You can use driconf to "disable low impact fallbacks" (presumably low impact refers to visual impact, not performance impact), although on the latest mesa that option is apparently on by default. Anyways, if you haven't already tried driconf it should be your next step.

Mesa will execute vertex shaders in SW as far as I know, it's just not fast. Is the CPU still low during the really slow times ?

MùPùF
05-17-2009, 05:35 AM
OK, assuming the problem is in the driver and not in the game, you can find the bugzilla link on the home page of x.org, at http://www.x.org (yeah, I'm in "teach a man to fish" mode today ;)).
OK, it doesn't depend on the game. I tried 4 different games and the problem is the same in each. Moreover, there was no problem a few months back. I'm about to file a bug report.
EDIT : Here is the bug with some more information (after a more intensive test) : https://bugs.freedesktop.org/show_bug.cgi?id=21778


Really bad slowdowns are usually caused by software fallbacks, ie doing something in software rather than on the GPU. You can use driconf to "disable low impact fallbacks" (presumably low impact refers to visual impact, not performance impact), although on the latest mesa that option is apparently on by default. Anyways, if you haven't already tried driconf it should be your next step.

Mesa will execute vertex shaders in SW as far as I know, it's just not fast. Is the CPU still low during the really slow times ?
Hmm, 'Disable Low-Impact fallback' was off. I tried turning it on and the game was even slower (as you expected). The CPU load is always very low (almost 0) in Nexuiz, while in some other games such as Tremulous a complete core is used and it runs pretty fast.