Ati-Driver 8.6/8.7 freeze on 32/64 bit


Dragonlord
07-23-2008, 02:07 PM
I have now tried various combinations: Gentoo 64-bit, Ubuntu 64-bit and Ubuntu 32-bit, with drivers 8.6 and 8.7. The result is always the same: the system freezes roughly 15 seconds into running a 3D application like glxgears or the UT2k4 demo ( which happens to be on that disk ). So it looks like a persistent bug that occurs across architectures and distributions.

Can some ATI guy comment on this? Why do your drivers keep locking up in the most vital thing: 3D?

To the guys who did the benchmark ( where it obviously seems to run ): what main board are you using? How much RAM are you using? From what I gathered from the review, you use a 32-bit Ubuntu 8.04. I used the exact same to be sure it is not an architecture or distribution problem.

EDIT: Are there certain BIOS options which are known to cause trouble? I've set mostly all of them to auto and optimal values. What else could I test to narrow down the problem?

Everything else works except for the lockup. Chances are the drivers are not "safe" enough for certain situations, but figuring this out would also be helpful to other people.

bridgman
07-23-2008, 05:12 PM
Can some ATI guy comment on this? Why do your drivers keep locking up in the most vital thing: 3D?

Quick answer is "they don't lock up on our test systems", i.e. we'll need to find out which of the differences between our systems and your system is causing the lockup to appear. It may be a bad card but let's look for other things first.

I'll take a quick skim through your other posts and see if you have all the system info there, but could I ask you also to post a bug on the bugzilla system (http://ati.cchtml.com) ?

EDIT - OK, found most of what I was looking for :

- Asus M3N-HT Deluxe ( nForce 780a SLI )
- AMD Athlon 64 X2 6400+ ( Windsor )
- 2x 1GB Kingston HyperX DDR2-1066 CL5
- Sapphire Radeon HD 4870 ( 512MB )

Nothing jumps out at me in terms of potential problems; it seems like a pretty solid configuration although I don't know if we have any test coverage on that specific chipset.

Dragonlord
07-24-2008, 10:27 AM
Unfortunately I've got no account there. Maybe this adds to it:

# uname -a
Linux dragonworld 2.6.25-gentoo-r7 #1 SMP PREEMPT Wed Jul 23 23:06:30 CEST 2008 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 6400+ AuthenticAMD GNU/Linux

media-libs/mesa-6.5.2-r1
x11-base/xorg-server-1.3.0.0-r6
x11-drivers/ati-drivers-8.501

That's the Gentoo build I tested. The Ubuntu build is the same as the editors here used.

EDIT:
There's something fishy in the logs but I don't know if this has anything to do with the problem.

First what looks good:
(II) Primary Device is: PCI 04:00:0
(II) ATI Proprietary Linux Driver Version Identifier:8.50.3
(II) ATI Proprietary Linux Driver Release Identifier: UNSUPPORTED-8.501
(II) ATI Proprietary Linux Driver Build Date: Jun 2 2008 22:47:36
(--) Chipset Supported AMD Graphics Processor (0x9440) found
(II) AMD Video driver is running on a device belonging to a group targeted for this release
(II) AMD Video driver is signed
Suggests this should all work ( hey, we are supported... well... really? )

But what is not correct:
(--) fglrx(0): VideoRAM: 262144 kByte, Type: GDDR5
O'rly? Nope guys, it's 512 not 256MB GPU RAM!

Maybe the entire beast (http://rptd.ch/misc/Xorg.0.log) contains more fishy things I don't spot right now.
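
A quick way to look for "fishy things" in a log like this is to pull out only the lines the X server itself marks as errors, `(EE)`, or warnings, `(WW)`. The following is a minimal sketch, not something anyone in the thread ran: the function name `xorg_log_problems` and the default path `/var/log/Xorg.0.log` are assumptions for illustration.

```shell
#!/bin/sh
# Print only the error (EE) and warning (WW) lines of an X server log,
# prefixed with their line numbers. The function name and the default
# log path are assumptions; pass the real log file as the first argument.
xorg_log_problems() {
    log="${1:-/var/log/Xorg.0.log}"
    # Anchor at line start (optionally after a "[ 12.345] " timestamp) so the
    # marker legend in the log header is not reported as a problem.
    grep -n -E '^(\[[ 0-9.]+\] )?\((EE|WW)\)' "$log"
}

# Usage (uncomment and adjust the path):
# xorg_log_problems /var/log/Xorg.0.log
```

An empty result only means the server did not flag anything; it does not rule out a driver or hardware fault.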

smlbstcbr
07-24-2008, 12:45 PM
Unfortunately I've got no account there. Maybe this adds to it:



That's the Gentoo build I tested. The Ubuntu build is the same as the editors here used.

EDIT:
There's something fishy in the logs but I don't know if this has anything to do with the problem.

First what looks good:

Suggests this should all work ( hey, we are supported... well... really? )

But what is not correct:

O'rly? Nope guys, it's 512 not 256MB GPU RAM!

Maybe the entire beast (http://rptd.ch/misc/Xorg.0.log) contains more fishy things I don't spot right now.

Are you sure you are not using shared memory, like in Windows? What I mean is that half the memory reported in Windows belongs to your system memory and the other half is your video card.

Dragonlord
07-24-2008, 01:22 PM
Are you sure you are not using shared memory, like in Windows? What I mean is that half the memory reported in Windows belongs to your system memory and the other half is your video card.
The Radeon HD 4870 is a dedicated graphics card, not an onboard graphics chip. ;)

legume
07-25-2008, 09:07 AM
But what is not correct:

O'rly? Nope guys, it's 512 not 256MB

That's "normal"; both of my working 512 cards say 256 there, while amdcccle reports the full 512.

smlbstcbr
07-25-2008, 01:56 PM
The Radeon HD 4870 is a dedicated graphics card, not an onboard graphics chip. ;)

Actually, it happens with dedicated graphics boards as well. I own an X300 graphics board; it has 128 MB of onboard RAM but borrows another 128 MB from my system. CCCLE shows the onboard memory (I suppose), but the X log shows the 256 (I suppose). A bit messy, but the drivers are very functional, at least with my Gentoo ;)

Dragonlord
07-25-2008, 08:04 PM
Well this card has for sure 512MB dedicated memory and it for sure is bugged in the 3D mode :(

smlbstcbr
07-25-2008, 10:06 PM
Well this card has for sure 512MB dedicated memory and it for sure is bugged in the 3D mode :(

:( Bad luck, I suppose. Have you reported this issue to the ATI Linux team through their feedback form? Hopefully they will resolve this issue in fglrx.

bridgman
07-25-2008, 10:46 PM
The 512MB vs 256MB thing is almost definitely a red herring, I don't think that is related to your crashing.

Nothing jumps out at me from the log other than the VideoOverlay option being on (it should only be on for pre-R5xx parts). You want TexturedVideo on, not VideoOverlay, but that wouldn't cause a crash. There's an "aticonfig --initial" (or something like that) command that resets the amdpcsdb contents to default; did you run that after installation ?

Looking for some way to determine if this is a driver/system issue or a card / power supply / heat issue. The usual way is to run some games under Windows and see if it runs reliably there; is that an option ?

Dragonlord
07-26-2008, 10:56 AM
The 512MB vs 256MB thing is almost definitely a red herring, I don't think that is related to your crashing.
Not for me though. As a game developer I'm looking to push the 512MB, since I've already got 256MB on my nVidia dev station ;)

Nothing jumps out at me from the log other than the VideoOverlay option being on (it should only be on for pre-R5xx parts). You want TexturedVideo on, not VideoOverlay, but that wouldn't cause a crash. There's an "aticonfig --initial" (or something like that) command that resets the amdpcsdb contents to default; did you run that after installation ?
I forgot to reset the textured video option after testing around. I did call aticonfig once at the beginning of my odyssey, but after that ( and in the newest fresh install ) I just altered the xorg.conf. You said it resets the amdpcsdb or something. What exactly does this file do? Maybe I can take a look at this bugger.

Looking for some way to determine if this is a driver/system issue or a card / power supply / heat issue. The usual way is to run some games under Windows and see if it runs reliably there; is that an option ?
The card is a Radeon HD 4870. The case is a Thermaltake Aguila with back and front case fans. The power supply is a Thermaltake Toughpower 750W. lm_sensors reports splendid temperatures, so everything is cooler than cool ( in a double sense :D ). I doubt it's a heat or power supply problem.

That said, I've got a vanilla XP on that machine, but I have not tested it yet since I first need a USB extender to attach the dev-disk with all the stuff sitting on it to this machine ( I don't want to use the current engine alpha build from SVN for testing... it could skew the test ). But I'll have some tests done soon.

bridgman
07-26-2008, 11:29 AM
Not for me though. As a game developer I'm looking to push the 512MB, since I've already got 256MB on my nVidia dev station ;)

Sorry, I meant "the message doesn't mean you don't have 512MB". I think that message shows the size of the aperture (maxed at 256MB) not the size of the video memory. The aperture is only relevant for software rendering; the graphics engine doesn't go through the aperture and can access the entire VRAM space.
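
For reference, the figure in the log line `VideoRAM: 262144 kByte` works out to exactly 256 MB, which is consistent with a 256MB cap as described above, although the arithmetic alone cannot say what the number measures. A small sketch of the conversion; the helper name `kbyte_to_mb` is made up for illustration.

```shell
#!/bin/sh
# Convert a size in kByte (as printed in the X log) to megabytes.
# kbyte_to_mb is a hypothetical helper, not part of any driver tool.
kbyte_to_mb() {
    echo $(( $1 / 1024 ))
}

kbyte_to_mb 262144   # value from the log line; prints 256
kbyte_to_mb 524288   # what a full 512 MB would look like in kByte; prints 512
```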

Dragonlord
07-27-2008, 06:41 PM
Looks like you did a good job on the drivers... they crash Windows too :D . I could not get a game running yet: once it crashed during installation, and the other time I was just scrolling through a file list when the system froze and rebooted itself. I installed the drivers from the included disk, which is 5.01 or something like this.

bridgman
07-27-2008, 07:15 PM
OK, probably 8.501. This is sounding more like a hardware problem if you're also seeing problems under Windows. The only thing I can think of that we haven't asked about yet is the power connectors on the video card; sometimes they don't seat properly so you don't get enough power to the card...

Dragonlord
07-27-2008, 08:20 PM
Well this particular Thermaltake case is a bit... particular. They have an HD carrier right behind the front case fan. This guy is a little blocker in that the power connectors on the card push against it. I had to carefully fit the card in there while heavily bending the cables at the plug to make it fit. Damn, I hate fighting for millimeters. How "fragile" are those connectors on the 4870? Because if they are rather fragile I could try some metalwork on the carrier and carve out a hole to give the plugs some space.

alexcorscadden
07-28-2008, 01:12 PM
We've also seen total system lockups on a system here with an HD3870. I'm waiting to hear back as to whether we can reproduce the problem on other AMD cards on the same system (Nvidia hardware seems to run well on this system). When I moved the card to my box, I wasn't able to reproduce the problem.

The system that crashed has an ICH8 southbridge and a Core 2 Quad while my box is an ICH7 with a Core 2 Duo. As in Dragonlord's report, the machine that crashes also has issues running in Windows (UT3 causes crashes, as does our own software stack). This is very worrying for us as we're trying very hard to ensure a good experience for all Linux users but it's very hard to do when you can't trust the underlying drivers.

I can get exact specs if that would be helpful.

Dragonlord
07-28-2008, 02:38 PM
I had thought of using UT3 as a test platform myself, but since it is a very crash-prone game ( even with the newest patch it's a walk on a thin line before you get a hard lockup... and this happens on an nVidia system ) I do not consider it too representative. Something like Crysis ( heavy on hardware but not too crashy as far as I know ) would be better, but I do not have a disk in house here.

Dragonlord
07-28-2008, 05:07 PM
Okay, now I'm really starting to get mad at you ( ATI ). I got rid of the HD carrier and moved the HD to the top slot ( as I found a hidden carrier there ). I straightened the power cables, pulled out the card and blew out the connectors ( as if this ever worked, but still I do it :P ), then seated it back in and refitted the connectors. Linux as well as Windows still hard-locks after trying 3D or doing some 2D work. The only thing I can agree with is that not calling aticonfig --initial does indeed cause a hard lock right on X startup, but even after calling it the system still fails.

Starting to piss me off. Trouble with Linux I could understand, but messing up Windows too is getting on my nerves.

bridgman
07-28-2008, 11:21 PM
This is starting to sound like a bad card.

Dragonlord
07-29-2008, 10:50 AM
Is there any way to confirm this? After all, I got the card through an online shop ( as shops in my vicinity didn't have it yet :O ) and replacements are not what they are best known for :/

bridgman
07-29-2008, 12:42 PM
Have you run other high end 3d graphics cards in the same system, or is it a new build ?

Dragonlord
07-29-2008, 01:27 PM
It's a fresh build.

Dragonlord
08-18-2008, 12:15 PM
@bridgman:
I finally got a reply from the replacement center today ( god damn this took ages!!! ). They confirm the card I obtained has a defect. It's being replaced now ( sent back to ATI... geez... this is going to take ages again!!! ). So the experiences in this topic are void so far, until I get the new card.

bugmenot
08-19-2008, 06:49 PM
Here is another card that freezes under Linux and Windows Vista; with WinXP it is perfect.

4850 + mainboard asus P5Q

THE PROBLEM IS THE *FGLRX* DRIVER!!!

Dragonlord
08-19-2008, 09:16 PM
I don't know if this has any relevance, but while booting up, glance at the top side of the card and see how many of the 4 red LEDs are lit. I had only 3 of them lit, which already made me suspicious that something could be wrong, since from my experience with electronics ( from the company I do IT for ) LEDs are usually meant to be all lit... or else something is wrong. Let's see... still waiting for the replacement card :(

Dragonlord
09-02-2008, 04:32 PM
Finally got the replacement card. I plugged it in and these are the results.

Windows:
When starting a game, as soon as the first 3D is rendered it shows a couple of frames and then freezes completely. After some seconds it exits to the desktop in 4-bit colors, with an error message about ati2dvag having stopped working normally, reboot, bla bla. WTF?! At least, in contrast to last time, it did not reboot automagically, but still unworkable!

Linux:
glxgears now runs for maybe 30 seconds before freezing completely. Hard reboot required. Unworkable!

Jesus... what the fuck is going on here? This is now the second card, and it runs neither on Windows nor on Linux and keeps locking up the machine. I'm out of ideas. This machine has now been dead for weeks.

About the error in Windows ( the only lead we have so far ): the internet says "deadloop in driver". This would mean you ATI guys messed up heavily in your drivers. For Windows I used the newest driver from the website as of today. For Linux I used 822 ( Gentoo numbering, not sure what it refers to ). I'm really in dire need of a solution since this machine has now been de facto dead for weeks.

forum1793
09-02-2008, 08:25 PM
You could have gotten a 2nd bad board. Customer reviews at newegg show this happening a little too frequently across the whole gamut of products.

Do you have, or can you borrow, a different graphics card to try out? This would isolate the other system components. If the problem also occurs with another card, such as an nvidia, then you know the problem is in the mboard or power supply.


You said bios had optimised settings. I've had problems with these in past on a machine years ago (iwill mboard) and machine chose wrong memory settings. Look on net for your specific memory and see what others are using. See if that matches your bios. I didn't read through all previous posts but maybe you ran memtest?

I don't know how you could verify powersupply is not problem unless you had a full set of other equipment or if you had a separate power supply you could try with this equipment. (although it is pain to connect those little mboard connecters at the front/bottom.)

Dragonlord
09-03-2008, 06:12 AM
Currently I have no board of the same scale. Maybe I can ask the guy at my local corner shop if he has some time and a spare card to haul into the beast to see if it works. I'm rather positive it's not a HW problem, though, since with the new card I could so far only crash the system using 3D, in contrast to the defunct old card which also blew up during 2D.

The memtest would be an idea, though. The problem seems to be unpredictable.

As for the mobo, it's mostly set to Auto all the way. I'm not sure if this can cause trouble, especially with this thing called "AI" which my motherboard line loves to brag about. But I'll see what the net knows about this.

Dragonlord
09-03-2008, 08:57 AM
- Mem-Check: Everything OK
- BIOS Updated: Still the same problem ( although now Windows auto-rebooted... like with the broken card )

:(

Kano
09-03-2008, 12:35 PM
I guess you have got a good ati resistant board ;)

Dragonlord
09-03-2008, 03:49 PM
Okay, sometimes miracles happen in life... just that I don't know what happened. I try to sum up the status quo as well as what I gather so far. Please bridgman have a look at this as maybe ( maybe ) this points to a flaw in the design of the 4870.

Status Quo:
Old Card: Tried reseating it, blowing the connectors, placing it in alternate pci-e slots, didn't work. Tech support said it is faulty and replaced it

New Card:
Now it gets strange. After I updated the BIOS I was back to the situation of the old card, hence Windows auto-reboots and Linux hangs. I reseated the card and also blew out the connectors, to no avail.

Then I tried the card in the second ( instead of the primary ) pci-e slot. The miracle happened.

Windows: messed with 2D stuff, played two games ( one less graphic hungry one graphic hungry ). I had no single crash and it ran at full speed ( I assume, not benched things yet ).

Linux: messed with 2D stuff, played video, played OpenArena and UT2004-Demo, messed with Blender3D, compiled and run my game-engine/game.
Video Playback: Worked rather flawlessly. Watched some videos, had no tearing, no deinterlace troubles, ran smoothly. Only hiccup is when going back from full-screen. Closing MPlayer fixes this. ( ATI-TODO: Fix the full-screen bug )
Playing OA/UT2K4: No problems. Played OA in window mode and UT2K4 in full screen mode. No crashes, no delays, no slowdowns, smooth as it should be. And the fans kept purring silently like kitties :D .
Own Engine/Game: Worked too... sort of... after dealing with ATI specific OpenGL troubles. It also seems to work with in-progress ( hence not fully stable ) programs/games, though. Surprised to see this flawless record after the past troubles.
Blender3D: This is a problem. Going full-screen it tends to tear, and using other windows is futile, like in the MPlayer full screen case. Good news: it can be helped with a little hack. Just start Blender3D with "blender -p 0 0 X Y" where X and Y are your screen dimensions minus, let's say, 50. I used "blender -p 0 0 1600 1000" and it starts without tearing. You can go full-screen after this point without any troubles.
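
The Blender workaround above can be sketched as a small helper that builds the command line from a screen size. This is only an illustration: the function name `blender_window_cmd` and the example resolution are assumptions, and the 50 pixel margin is the rough figure from the post.

```shell
#!/bin/sh
# Build the "blender -p 0 0 X Y" command from a screen size and a margin.
# blender_window_cmd is a hypothetical helper; the default margin of 50
# pixels is the approximate value suggested in the post.
blender_window_cmd() {
    width=$1
    height=$2
    margin=${3:-50}
    echo "blender -p 0 0 $((width - margin)) $((height - margin))"
}

# Example for an assumed 1680x1050 screen:
blender_window_cmd 1680 1050   # prints: blender -p 0 0 1630 1000
```

The function only prints the command so it can be checked before running it.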

Only negative point: Starting KDM Login Manager caused lockup. I resorted to startxfce4 and used an XFCE4 session. Had not a single crash nor other problems and shutting down also worked without a lockup.

Analysis:
Now comes the hard part... the "why". I have no idea why it works. I have 3 pci-e 16x slots in the machine here. The ASUS board though has a little particularity which might be a problem or it might be a design flaw in the 4870. This is where bridgman comes into play. I'll write down what's in my manual, maybe this helps to detect the problem ( 1st and 3rd slots are blue, 2nd is black... i assume this is the universal one and the one the card is working in right now ):
3 PCI-E slots x16 primary + universals ( max 8x ).
Universal slot changes operation frequency depending on used PCI-E card.

Slot occupation:
occupation | slot 1 | slot 2 | slot 3 |
-----------+--------+--------+--------+
1 card | 16x | 1x | 16x |
2 cards | 16x | 1x | 16x |
3 cards | 8x | 8x | 8x |
The manual recommends plugging a single card into the primary ( blue ) slot. There it shows the crashing symptoms. In the second ( black ) slot it works. One interpretation is that the primary slot is defective, which I doubt, though, as then the card should not have worked at all ( I think ). The second interpretation is that the card itself ( by design ) has trouble with 16x mode. Could it be that the board switches to a lower frequency mode in the middle slot, one which is not too high for this card? Grasping at straws maybe, but something has to trigger this. If somebody can shed some light on this or would like to test for himself what could be wrong, he's welcome.

bridgman
09-03-2008, 04:43 PM
I guess my first question would be "what happens with the card in the third slot ?". If that works OK then a bad slot on the mobo sounds most likely.

The frequencies should be the same whether you are running an x1 or x16 bus, but the driver would not be able to feed work to the card as quickly so utilization and power consumption would probably be a bit less -- but if that is the issue then problems should return in slot 3.
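
One way to see which link width the card actually negotiated in a given slot is to compare the `LnkCap` (advertised) and `LnkSta` (negotiated) lines in the output of `lspci -vv`. A minimal sketch, under assumptions: the helper name `pcie_link_info` is made up, the bus address `04:00.0` is taken from the earlier log excerpt and may differ once the card sits in another slot, and the capability lines are usually only visible when lspci runs as root.

```shell
#!/bin/sh
# Filter `lspci -vv` output (read from stdin) down to the PCIe link lines:
# LnkCap = what the device can do, LnkSta = what was actually negotiated.
# pcie_link_info is a hypothetical helper name.
pcie_link_info() {
    grep -E 'LnkCap:|LnkSta:'
}

# Usage (as root; the bus address is an assumption, find the real one
# with `lspci | grep -i vga`):
# lspci -vv -s 04:00.0 | pcie_link_info
```

If `LnkSta` reports a narrower width in the slot that works than in the primary slot, that would at least confirm the two slots really run the card differently.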

Dragonlord
09-03-2008, 09:25 PM
Unfortunately the card doesn't fit there, so this is a no-go. I'm also not so fond of jeopardizing a working setup. I just wanted to find out whether others have trouble with their slots too and whether downgraded slots help.