View Full Version : random X freeze with latest ati driver


bridgman
07-09-2009, 08:39 AM
Note that the Windows driver runs a series of diagnostics at startup and automatically downclocks the AGP bus and turns off other options like FastWrite as needed to obtain reliable operation. The marketing folks called it SmartGart : http://ati.amd.com/products/catalyst/SmartGart-FAQ.pdf

If your system runs reliably with Windows that means it should be possible to run it reliably with Linux as well, but the Windows driver stack automatically steps down AGP settings as needed and you'll need to do that manually with Linux drivers. When you see one Linux driver run reliably and another not, that usually means the *defaults* for AGP settings are different between those drivers.

tormod
07-09-2009, 09:30 AM
Note that the Windows driver runs a series of diagnostics at startup and automatically downclocks the AGP bus and turns off other options like FastWrite as needed to obtain reliable operation.
Thanks for the info. I always wondered how the Windows drivers got this right. I have even seen bug reports where some Windows utility tells the user it is using 4x but Linux needs 2x to be stable, so I guess the utility wasn't reporting the actual, corrected rate but the default or "wanted" rate.

Exershio, please make sure you file a bug report if AGPMode fixes your issue, so that we can make a driver quirk for this hardware combination.

agd5f
07-09-2009, 09:56 AM
Thanks for the info. I always wondered how the Windows drivers got this right. I have even seen bug reports where some Windows utility tells the user it is using 4x but Linux needs 2x to be stable, so I guess the utility wasn't reporting the actual, corrected rate but the default or "wanted" rate.


I'm not sure what the Windows utility shows; however, there's more to it than just the AGP mode. You can test AGP reads and writes separately and use just one or the other depending on which is stable. You can also use combinations of the AGP GART and the integrated GART for different things depending on which parts of AGP are stable. In addition, I'm sure there are tons of chipset-specific AGP quirks.

mlau
07-09-2009, 02:19 PM
I too have been getting random hard lockups with xf86-video-ati git sources since around April 2009. Reducing the AGP rate didn't help. Interestingly, it seems to occur only with GTK apps: quickly switching between tabs with lots of pictures in Firefox, or selecting various messages in rapid succession in claws-mail, is enough to kill the system fairly quickly. So far I've never had a hang when using only KDE/Qt apps.

The system is a Mobility 9700/64MB on an Intel 855PM chipset, with up-to-date git versions of more or less all X components.

mars
07-09-2009, 03:27 PM
Whenever I run a 3d intensive application, it'll render properly for a few seconds, then my screen goes black, my monitor loses signal, and my computer completely freezes. It takes no input from keyboard/mouse, audio starts stuttering and sounds like a buzzsaw, and I'm forced to shutdown with the power button.

Whenever I'm playing a game that uses hardware acceleration but is not very intensive (such as StepMania), the game runs fine for 3-5 minutes and then the same problem I stated previously occurs. Screen goes black, unresponsive, etc.

Whenever I'm just running a simple KDE desktop without any desktop effects, I experience no problems. The problem only happens when I'm running an OpenGL application.

The same problem occurs here with the new Catalyst 9.6 driver.
uname -a
Linux 2.6.29-sabayon #1 SMP Thu Jun 11 13:33:04 UTC 2009 x86_64 AMD Phenom(tm) II X4 955 Processor AuthenticAMD GNU/Linux
lspci on http://www.sabayonlinux.org/pastie/1249
xorg.log on http://www.sabayonlinux.org/pastie/1247
xorg.conf on http://www.sabayonlinux.org/pastie/1248

Hope this is of use to the developers.
I will be happy to give more information or test things.

bridgman
07-09-2009, 05:10 PM
Can you please move this to the proprietary driver section?

Please note that the 9.6 driver does not support 2.6.29 or higher kernels. There are some unofficial patches available; have you already applied those?

Exershio
07-09-2009, 06:33 PM
Well, I just got done installing Neverwinter Nights, and running the game on completely high settings (while it drags my frame rate down to 10 hehe) does NOT crash my computer at all anymore! This is with my AGPMode set to 1x.

I don't see why it doesn't work with 4x. While my motherboard only supports up to AGP 4x, the Catalyst Control Center in Windows XP says my card is running at 4x, and there's no crashing there. Unless it just SAYS it's running at 4x and it's really running at 1x/2x?

I'm not sure, but I'm really happy that I can finally run OpenGL applications without any problems :D Thank you to everyone that helped.

As for filing a bug report on this issue, where should I do that?

edit: I can now confirm that I do not experience the crash with AGP 2x as well. It seems to only crash when I have AGP at 4x.

agd5f
07-09-2009, 06:46 PM
Well, I just got done installing Neverwinter Nights, and running the game on completely high settings (while it drags my frame rate down to 10 hehe) does NOT crash my computer at all anymore! This is with my AGPMode set to 1x.

I don't see why it doesn't work with 4x. While my motherboard only supports up to AGP 4x, the Catalyst Control Center in Windows XP says my card is running at 4x, and there's no crashing there. Unless it just SAYS it's running at 4x and it's really running at 1x/2x?


It's hard to say what the exact problem is. The AGP chipset driver in Windows may have some special workaround that the Linux AGP chipset driver doesn't have, etc. In reality, AGP modes don't have that much effect on performance. I doubt you could tell much difference for most things.

I'm not sure, but I'm really happy that I can finally run OpenGL applications without any problems :D Thank you to everyone that helped.

As for filing a bug report on this issue, where should I do that?

https://bugs.freedesktop.org

mars
07-09-2009, 11:34 PM
Can you please move this to the proprietary driver section?

Please note that the 9.6 driver does not support 2.6.29 or higher kernels. There are some unofficial patches available; have you already applied those?

I am aware of that, but the symptoms are so similar.
That is the main reason I posted it here.

As far as I know, the patches are included in the distro by its developers.

bridgman
07-10-2009, 12:05 AM
Sure, but it's a *completely* different driver with absolutely no code in common, and screen black/unresponsive is a very generic problem. The chances of the two issues having a common root cause are (holds fingers together) pretty small. I wasn't aware that Sabayon was including patched fglrx drivers in its packages but I guess that's possible.

mars
07-10-2009, 01:06 AM
...The AGP chipset driver in windows may have some special workaround that the linux AGP chipset driver doesn't have, etc. In reality, AGP modes don't really have that much effect on performance.
https://bugs.freedesktop.org

Yes, of course; nevertheless it does have some influence.
See http://en.opensuse.org/ATI/Troubleshooting#ATI_AGP_Cards
It states: "If fglrx is still not working, set the AGP aperture Size in the BIOS to the size of the physical card memory."
I know this concerns the aperture size, not the AGP mode.
I also recall from the very early days of the NVIDIA Linux driver (back in the Riva TNT era) that AGP modes had a considerable effect on driver performance.

@Bridgman, I will get back to you later in the proprietary drivers section about the hiccups with Catalyst 9.6 on Sabayon Linux.

Kind regards.

bitnick
07-11-2009, 02:22 PM
I just stumbled upon this bug _in a repeatable way_, after updating my old computer, switching from gnome to xfce4 and updating a lot of X-related packages.

Now running (gentoo packages):
x11-base/xorg-server-1.5.3-r6
x11-base/xorg-x11-7.2
x11-drivers/xf86-video-ati-6.12.1-r1
x11-libs/gtk+-2.14.7-r2
xfce-base/xfce4-4.6.1

Default xorg.conf, no changes except setting Option "AccelMethod" "EXA".

Graphics card is R420 JI [Radeon X800PRO].
PCI bridge: nVidia Corporation nForce3 250Gb AGP Host to PCI Bridge (rev a2)

I created a new account on the computer, and while setting up Thunderbird for the new user I noticed it always hung on the second page of the account wizard, where you enter your name and email address. (Same symptoms as before: dead keyboard, frozen screen except that the mouse cursor moves around jerkily (its position updates only a few times per second), SysRq working, nothing in the logs... The only difference was that this time it was repeatable.)

So I tried different combinations of settings in xorg.conf to get rid of it, and, lo and behold, Option "AGPMode" "4" did the trick (I thought this was the default?). At least I no longer get the hang repeatably; perhaps it will still hang sometimes. But this setting absolutely had some effect on this bug for me.

I also tried "AGPFastWrite" "off", "GARTSize" "32", "MigrationHeuristics" "greedy", "AccelDFS" "false", "DynamicClocks" "on"/"off" with no effect (didn't test all combinations though).

Unfortunately, I'm quite sure I tried setting the AGPMode option the last time I had the error, without effect. But perhaps it will give someone a clue.
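
For anyone repeating this kind of trial and error, here is a minimal Python sketch that prints an xorg.conf Device section for each AGP mode to try. It is only an illustration: the function name device_section and the identifier "Card0" are made up for this example, and only the "AGPMode" and "AGPFastWrite" option names come from the posts above. Merge the options into your existing Device section rather than pasting the output blindly.

```python
def device_section(agp_mode, fast_write=None, identifier="Card0"):
    """Return an xorg.conf Device section that pins the radeon AGP mode.

    identifier is a hypothetical name; use the one already in your xorg.conf.
    """
    if agp_mode not in (1, 2, 4, 8):
        raise ValueError("AGP mode must be 1, 2, 4 or 8")
    lines = [
        'Section "Device"',
        f'    Identifier "{identifier}"',
        '    Driver     "radeon"',
        f'    Option     "AGPMode" "{agp_mode}"',
    ]
    # Only emit the FastWrite option when the caller asked for a setting.
    if fast_write is not None:
        value = "on" if fast_write else "off"
        lines.append(f'    Option     "AGPFastWrite" "{value}"')
    lines.append("EndSection")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print one candidate section per mode, lowest (most conservative) first.
    for mode in (1, 2, 4):
        print(device_section(mode, fast_write=False))
        print()
```

Testing one mode at a time, starting from the lowest, makes it easier to say in a bug report exactly which setting is stable.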

agd5f
07-11-2009, 02:47 PM
Now running (gentoo packages):
x11-base/xorg-server-1.5.3-r6
x11-base/xorg-x11-7.2
x11-drivers/xf86-video-ati-6.12.1-r1
x11-libs/gtk+-2.14.7-r2
xfce-base/xfce4-4.6.1

So I tried different combinations of settings in xorg.conf to get rid of it, and, lo and behold, Option "AGPMode" "4" did the trick (I thought this was the default?). At least I no longer get the hang repeatably; perhaps it will still hang sometimes. But this setting absolutely had some effect on this bug for me.


AGP is made of fail. The default in older versions of the xf86-video-ati driver was the lowest supported AGP mode (1x or 4x); newer versions leave it at whatever mode the BIOS set up, as that was more reliable overall. However, there are always a few combinations that work better with particular modes. Please file a bug (https://bugs.freedesktop.org) and attach your xorg log, and I'll add a quirk to the driver for your card/AGP bridge combination so it will work out of the box. I would also suggest trying xf86-video-ati from git master or the stable 6.12-branch, as there have been a number of EXA-related fixes since 6.12.1.
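
To illustrate what a quirk means here, the idea can be sketched in Python as a lookup table keyed by the card/AGP bridge combination. This is not the driver's actual code: the names AgpQuirk, QUIRKS and pick_agp_mode are invented for the example, the PCI IDs are placeholders rather than real hardware, and the precedence (explicit user setting, then quirk, then the BIOS mode) is an assumption.

```python
from typing import NamedTuple, Optional, Tuple

PciId = Tuple[int, int]  # (vendor ID, device ID)


class AgpQuirk(NamedTuple):
    bridge_id: PciId  # AGP host bridge
    card_id: PciId    # graphics card
    agp_mode: int     # mode known to be stable for this pair


# Placeholder entries only; these IDs do not describe real hardware.
QUIRKS = [
    AgpQuirk(bridge_id=(0x1111, 0x2222), card_id=(0x3333, 0x4444), agp_mode=4),
]


def pick_agp_mode(bridge_id: PciId, card_id: PciId, bios_mode: int,
                  user_mode: Optional[int] = None) -> int:
    """Choose an AGP mode: user setting first, then a quirk, then the BIOS mode."""
    if user_mode is not None:
        return user_mode  # assumed: an explicit AGPMode option always wins
    for quirk in QUIRKS:
        if quirk.bridge_id == bridge_id and quirk.card_id == card_id:
            return quirk.agp_mode
    return bios_mode  # no quirk known: keep whatever the BIOS configured


if __name__ == "__main__":
    print(pick_agp_mode((0x1111, 0x2222), (0x3333, 0x4444), bios_mode=8))
```

This is why the bug report needs the xorg log and the PCI IDs: without them there is nothing to key a new table entry on.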

bitnick
07-11-2009, 03:44 PM
Bug 22726 (https://bugs.freedesktop.org/show_bug.cgi?id=22726) submitted.

agd5f
07-13-2009, 02:24 PM
Bug 22726 (https://bugs.freedesktop.org/show_bug.cgi?id=22726) submitted.

Could you please attach your xorg log and the output of lspci -vn?

drees
07-14-2009, 06:25 PM
Looks like my bug https://bugs.freedesktop.org/show_bug.cgi?id=20348 is very similar to the original poster's.

In my case, Firefox or OpenOffice typically triggers it. Screen freezes up with the mouse moving jerkily a couple times a second with X using 100% cpu.

The only thing that "works" for me is to disable DRI. I was thinking of trying different AGP/PCI settings, since then I might be able to retain some acceleration, but if the GPU is onboard, it should be PCI by default, right? The Xorg.log doesn't seem to indicate either way.

Exershio
07-14-2009, 07:12 PM
Hmm, I think I figured this all out. In Windows XP, the ATI CCC always told me my card was running at AGP 4x, yet the memory frequency was always 195 MHz. Under Linux at AGP 1x, my memory frequency is 195 MHz...

When I changed it to 2x in Linux, all of a sudden my memory frequency was 390 MHz. I think the CCC in Windows was lying to me and was really running my card at 1x the entire time, despite the fact it "said" it was running at 4x. So when I ran my card at 4x under Linux, it crashed. 1x and 2x do not crash, however.

Just figured I'd let you know about that if it's any help.

agd5f
07-14-2009, 07:55 PM
Hmm, I think I figured this all out. In Windows XP, the ATI CCC always told me my card was running at AGP 4x, yet the memory frequency was always 195 MHz. Under Linux at AGP 1x, my memory frequency is 195 MHz...

When I changed it to 2x in Linux, all of a sudden my memory frequency was 390 MHz. I think the CCC in Windows was lying to me and was really running my card at 1x the entire time, despite the fact it "said" it was running at 4x. So when I ran my card at 4x under Linux, it crashed. 1x and 2x do not crash, however.

Just figured I'd let you know about that if it's any help.

What do you mean by memory frequency? AGP mode doesn't affect the memory clocks on the card.

Exershio
07-14-2009, 11:19 PM
What do you mean by memory frequency? AGP mode doesn't affect the memory clocks on the card.

Are you sure about that? When I had AGP 1x set, my memory clock was 195 MHz. When I bumped it up to 2x, it was all of a sudden reported as 390 MHz (according to rovclock).

agd5f
07-14-2009, 11:56 PM
Are you sure about that? When I had AGP 1x set, my memory clock was 195 MHz. When I bumped it up to 2x, it was all of a sudden reported as 390 MHz (according to rovclock).

Yes, I'm sure. I wouldn't trust rovclock too much. The last time I looked at it, it was pretty buggy. The actual default clocks are printed in your xorg log, e.g.,

(II) RADEON(0): ref_freq: 2700, min_out_pll: 64800, max_out_pll: 120000, min_in_pll: 600, max_in_pll: 1600, xclk: 40000, sclk: 600.000000, mclk: 500.000000

sclk is the engine clock and mclk is the memory clock.
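
If you want to pull those two values out of a log automatically, a small Python sketch along these lines should do. It assumes the line looks like the example above, with "sclk:" and "mclk:" fields on a RADEON line; the function name parse_default_clocks is just for illustration, and other driver versions may format the line differently.

```python
import re

# Matches "sclk: 600.000000" and "mclk: 500.000000"; \b keeps "xclk" out.
CLOCK_RE = re.compile(r"\b(sclk|mclk):\s*([0-9]+(?:\.[0-9]+)?)")


def parse_default_clocks(log_text):
    """Return {'sclk': MHz, 'mclk': MHz} from the first RADEON line that
    lists both clocks, or None if no such line is found."""
    for line in log_text.splitlines():
        if "RADEON" not in line:
            continue
        found = {name: float(value) for name, value in CLOCK_RE.findall(line)}
        if "sclk" in found and "mclk" in found:
            return found
    return None


if __name__ == "__main__":
    sample = (
        "(II) RADEON(0): ref_freq: 2700, min_out_pll: 64800, "
        "max_out_pll: 120000, min_in_pll: 600, max_in_pll: 1600, "
        "xclk: 40000, sclk: 600.000000, mclk: 500.000000"
    )
    print(parse_default_clocks(sample))
```

To run it against a real log, read the file contents (often /var/log/Xorg.0.log, though the path depends on your setup) and pass the text to parse_default_clocks.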

Exershio
07-15-2009, 12:01 AM
Ah alright, I see.

(II) RADEON(0): ref_freq: 2700, min_out_pll: 20000, max_out_pll: 40000, min_in_pll: 40, max_in_pll: 3000, xclk: 19575, sclk: 250.000000, mclk: 195.750000

Off topic: and yeah, I noticed rovclock wasn't working too well. I tried overclocking my core from 250 MHz to 450 MHz (perfectly stable in Windows XP), but it made no difference at all to performance, as if it wasn't doing anything.