Thread: Crash hunting in Radeon KMS

  1. #1
    Join Date
    Nov 2009
    Posts
    45

    Crash hunting in Radeon KMS

    Hi,

    Some of my fellow Archers and I are plagued by crashes when running the latest mesa, libdrm, xf86-video, and kernel from Git with radeon KMS. UMS is fine. The video driver crashes (not a kernel "oops"): the monitor reports "no signal" and the machine stops responding to the keyboard and SSH.

    We think that we've found a way to reproduce it in KDE SC 4.4:
    http://bugzilla.kernel.org/show_bug.cgi?id=15276#c19

    Could you please take a look at the bug report and see if you can reproduce the crash? Please state your kernel/mesa/libdrm version and card.

    Please help us find the bug responsible for the crashes.

    Cheers

  2. #2
    Join Date
    Nov 2009
    Location
    Italy
    Posts
    938

    So I'm not the only one!
    I have a lot of crashes too (latest mesa, libdrm, xf86-video-ati, drm-radeon-testing).

  3. #3
    Join Date
    Nov 2009
    Location
    Italy
    Posts
    938

    HD3870 here.

  4. #4
    Join Date
    Nov 2009
    Posts
    45

    Does anyone else experience crashes in KMS where there is no video signal, the same audio plays over and over (as if from a small buffer), there are no blinking LEDs (not a kernel oops), and the machine is unresponsive?

    Please, the more information we can gather in the bug report the better. It has already been assigned to the devs.

  5. #5
    Join Date
    May 2009
    Location
    Exodus hair
    Posts
    76

    Quote Originally Posted by Neuro
    Does anyone else experience crashes in KMS where there is no video signal, the same audio plays over and over (as if from a small buffer), there are no blinking LEDs (not a kernel oops), and the machine is unresponsive?

    Please, the more information we can gather in the bug report the better. It has already been assigned to the devs.
    Yes, I have a thread on UbuntuForum about that freeze... ATI Radeon HD3650 512M... I did not find anything useful in the logs I looked at...

  6. #6
    Join Date
    Jul 2007
    Posts
    446

    Have you tried connecting a serial console or a net console?

    Quote Originally Posted by Neuro
    Please, the more information we can gather in the bug report the better. It has already been assigned to the devs.
    Assuming that you have a second machine available to host the console, of course. But that way you'd be able to read any dmesg information, even when the local console stops responding.
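The net console suggestion above could be sketched on the receiving side roughly as follows. This is only a minimal sketch, assuming the crashing machine has the kernel's netconsole facility configured to send its messages as UDP datagrams to the second machine; the port number 6666 and the helper names `open_listener` and `read_messages` are assumptions chosen for illustration, not anything taken from this thread.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on which forwarded kernel messages can arrive.

    The default port is an assumption; use whatever port the sending
    machine was configured with.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_messages(sock, count, timeout=None):
    """Read up to `count` datagrams and return them as decoded strings.

    Stops early and returns what was collected if `timeout` seconds pass
    without a new datagram. With timeout=None it blocks until `count`
    datagrams have arrived.
    """
    sock.settimeout(timeout)
    messages = []
    try:
        while len(messages) < count:
            data, _sender = sock.recvfrom(65535)
            # Kernel log text is normally ASCII; replace anything odd
            # instead of failing in the middle of a crash log.
            messages.append(data.decode("utf-8", errors="replace"))
    except socket.timeout:
        pass
    return messages


if __name__ == "__main__":
    listener = open_listener()
    # Append every message to a file so the tail of the log survives
    # even if the sending machine locks up hard.
    with open("netconsole.log", "a", encoding="utf-8") as log:
        while True:
            for message in read_messages(listener, 1):
                log.write(message)
                log.flush()
```

Run on the second machine, this appends whatever arrives to `netconsole.log` (a hypothetical file name), so the last kernel messages before a lockup are kept even when the local console and SSH on the crashing machine are dead.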

  7. #7
    Join Date
    Oct 2007
    Posts
    178

    Cross-posted from the Archlinux BBS:

    A question to you guys with mysteriously crashing systems: are you all running a recent build of KDE/KWin?

  8. #8

    Quote Originally Posted by korpenkraxar
    Cross-posted from the Archlinux BBS:

    A question to you guys with mysteriously crashing systems: are you all running a recent build of KDE/KWin?
    It seems that most of us are running KDE SC 4.4.

  9. #9
    Join Date
    Oct 2008
    Posts
    11

    Same here; it looks like I am not the only one (Mobility X1400). I tried upgrading to 2.6.33-rc7 but got the same crashes (by the way, I couldn't reproduce them with the test case above). I get a system lockup if radeon is compiled as a module, and a kernel panic (blinking LEDs) if the driver is built into the kernel.

    The only good thing is that one of these crashes didn't take down the whole system, and I managed to get a backtrace:

    Code:
    Feb 12 11:53:36 codemobile kernel: [50278.131689] ------------[ cut here ]------------
    Feb 12 11:53:36 codemobile kernel: [50278.131716] WARNING: at /usr/src/linux-2.6.32-gentoo-r4/lib/kref.c:43 kref_get+0x20/0x30()
    Feb 12 11:53:36 codemobile kernel: [50278.131721] Hardware name: MM061
    Feb 12 11:53:36 codemobile kernel: [50278.131725] Modules linked in: rfcomm sco bnep l2cap aufs squashfs i8k vboxnetadp vboxnetflt vboxdrv loop radeon btusb ttm b44 drm_kms_helper b43 mac80211 led_class bluetooth ssb cfbcopyarea cfbimgblt intel_agp rtc_cmos cfbfillrect
    Feb 12 11:53:36 codemobile kernel: [50278.131788] Pid: 5, comm: events/0 Not tainted 2.6.32-gentoo-r4-bfs313 #4
    Feb 12 11:53:36 codemobile kernel: [50278.131792] Call Trace:
    Feb 12 11:53:36 codemobile kernel: [50278.131813]  [<ffffffff81040323>] ? warn_slowpath_common+0x73/0xb0
    Feb 12 11:53:36 codemobile kernel: [50278.131820]  [<ffffffff8121ec20>] ? kref_get+0x20/0x30
    Feb 12 11:53:36 codemobile kernel: [50278.131843]  [<ffffffffa00c9633>] ? ttm_bo_delayed_delete+0x73/0x1b0 [ttm]
    Feb 12 11:53:36 codemobile kernel: [50278.131852]  [<ffffffffa00c9770>] ? ttm_bo_delayed_workqueue+0x0/0x30 [ttm]
    Feb 12 11:53:36 codemobile kernel: [50278.131878]  [<ffffffffa00c9782>] ? ttm_bo_delayed_workqueue+0x12/0x30 [ttm]
    Feb 12 11:53:36 codemobile kernel: [50278.131885]  [<ffffffff81057be5>] ? worker_thread+0x195/0x310
    Feb 12 11:53:36 codemobile kernel: [50278.131904]  [<ffffffff8105c620>] ? autoremove_wake_function+0x0/0x30
    Feb 12 11:53:36 codemobile kernel: [50278.131910]  [<ffffffff81057a50>] ? worker_thread+0x0/0x310
    Feb 12 11:53:36 codemobile kernel: [50278.131916]  [<ffffffff8105c26e>] ? kthread+0x8e/0xa0
    Feb 12 11:53:36 codemobile kernel: [50278.131935]  [<ffffffff8103981c>] ? schedule_tail+0x3c/0xe0
    Feb 12 11:53:36 codemobile kernel: [50278.131942]  [<ffffffff8100ceea>] ? child_rip+0xa/0x20
    Feb 12 11:53:36 codemobile kernel: [50278.131948]  [<ffffffff8105c1e0>] ? kthread+0x0/0xa0
    Feb 12 11:53:36 codemobile kernel: [50278.131964]  [<ffffffff8100cee0>] ? child_rip+0x0/0x20
    Feb 12 11:53:36 codemobile kernel: [50278.131968] ---[ end trace b06f625ae300867a ]---
    Feb 12 11:53:55 codemobile kernel: [50297.131806] CPU 0
    Feb 12 11:53:55 codemobile kernel: [50297.131829] Modules linked in: rfcomm sco bnep l2cap aufs squashfs i8k vboxnetadp vboxnetflt vboxdrv loop radeon btusb ttm b44 drm_kms_helper b$
    Feb 12 11:53:55 codemobile kernel: [50297.131829] Modules linked in: rfcomm sco bnep l2cap aufs squashfs i8k vboxnetadp vboxnetflt vboxdrv loop radeon btusb ttm b44 drm_kms_helper b$
    Feb 12 11:53:55 codemobile kernel: [50297.131926] Pid: 5, comm: events/0 Tainted: G        W  2.6.32-gentoo-r4-bfs313 #4 MM061
    Feb 12 11:53:55 codemobile kernel: [50297.131953] RIP: 0010:[<ffffffffa00c8412>]  [<ffffffffa00c8412>] ttm_bo_release_list+0xc2/0xd0 [ttm]
    Feb 12 11:53:55 codemobile kernel: [50297.131990] RSP: 0000:ffff88007f883d90  EFLAGS: 00010202
    Feb 12 11:53:55 codemobile kernel: [50297.132014] RAX: 0000000000000002 RBX: ffff88003a159e00 RCX: ffff880069970e00
    Feb 12 11:53:55 codemobile kernel: [50297.132030] RDX: 0000000000000001 RSI: ffffffffa00c8350 RDI: ffff88003a159e44
    Feb 12 11:53:55 codemobile kernel: [50297.132055] RBP: ffff88003a159e44 R08: ffff88007f882000 R09: 0000000000000000
    Feb 12 11:53:55 codemobile kernel: [50297.132080] R10: 000000000000c940 R11: 00000000ffffffff R12: ffff88007c963400
    Feb 12 11:53:55 codemobile kernel: [50297.132095] R13: ffff88003a159e44 R14: ffff88003a159e00 R15: ffff88003a159e44
    Feb 12 11:53:55 codemobile kernel: [50297.132121] FS:  0000000000000000(0000) GS:ffff880001a00000(0000) knlGS:0000000000000000
    Feb 12 11:53:55 codemobile kernel: [50297.132147] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    Feb 12 11:53:55 codemobile kernel: [50297.132161] CR2: 00007f9a32c5b000 CR3: 0000000069a6c000 CR4: 00000000000006f0
    Feb 12 11:53:55 codemobile kernel: [50297.132186] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Feb 12 11:53:55 codemobile kernel: [50297.132211] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Feb 12 11:53:55 codemobile kernel: [50297.132237] Process events/0 (pid: 5, threadinfo ffff88007f882000, task ffff88007f841ae0)
    Feb 12 11:53:55 codemobile kernel: [50297.132274]  ffff88003a159e44 ffffffffa00c8350 ffff88007c963868 ffffffff8121ebc3
    Feb 12 11:53:55 codemobile kernel: [50297.132286] <0> ffff88003a159e44 ffff88003a159eb8 ffff88007d611c88 ffffffffa00c966d
    Feb 12 11:53:55 codemobile kernel: [50297.132316] <0> ffff88007f841cf8 ffff880000000000 000000007f882000 ffff88007c963400
    Feb 12 11:53:55 codemobile kernel: [50297.132383]  [<ffffffffa00c8350>] ? ttm_bo_release_list+0x0/0xd0 [ttm]
    Feb 12 11:53:55 codemobile kernel: [50297.132414]  [<ffffffff8121ebc3>] ? kref_put+0x33/0x70
    Feb 12 11:53:55 codemobile kernel: [50297.132442]  [<ffffffffa00c966d>] ? ttm_bo_delayed_delete+0xad/0x1b0 [ttm]
    Feb 12 11:53:55 codemobile kernel: [50297.132471]  [<ffffffffa00c9770>] ? ttm_bo_delayed_workqueue+0x0/0x30 [ttm]
    Feb 12 11:53:55 codemobile kernel: [50297.132501]  [<ffffffffa00c9782>] ? ttm_bo_delayed_workqueue+0x12/0x30 [ttm]
    Feb 12 11:53:55 codemobile kernel: [50297.132530]  [<ffffffff81057be5>] ? worker_thread+0x195/0x310
    Feb 12 11:53:55 codemobile kernel: [50297.132557]  [<ffffffff8105c620>] ? autoremove_wake_function+0x0/0x30
    Feb 12 11:53:55 codemobile kernel: [50297.132573]  [<ffffffff81057a50>] ? worker_thread+0x0/0x310
    Feb 12 11:53:55 codemobile kernel: [50297.132599]  [<ffffffff8105c26e>] ? kthread+0x8e/0xa0
    Feb 12 11:53:55 codemobile kernel: [50297.132627]  [<ffffffff8103981c>] ? schedule_tail+0x3c/0xe0
    Feb 12 11:53:55 codemobile kernel: [50297.132655]  [<ffffffff8100ceea>] ? child_rip+0xa/0x20
    Feb 12 11:53:55 codemobile kernel: [50297.132671]  [<ffffffff8105c1e0>] ? kthread+0x0/0xa0
    Feb 12 11:53:55 codemobile kernel: [50297.132696]  [<ffffffff8100cee0>] ? child_rip+0x0/0x20
    Feb 12 11:53:55 codemobile kernel: [50297.132889]  RSP <ffff88007f883d90>
    Feb 12 11:53:55 codemobile kernel: [50297.132914] ---[ end trace b06f625ae300867b ]---
    Feb 12 11:55:22 codemobile kernel: [50384.032617] ------------[ cut here ]------------
    Feb 12 11:55:22 codemobile kernel: [50384.032657] WARNING: at /usr/src/linux-2.6.32-gentoo-r4/lib/kref.c:43 kref_get+0x20/0x30()
    Feb 12 11:55:22 codemobile kernel: [50384.032682] Hardware name: MM061
    Feb 12 11:55:22 codemobile kernel: [50384.032706] Modules linked in: rfcomm sco bnep l2cap aufs squashfs i8k vboxnetadp vboxnetflt vboxdrv loop radeon btusb ttm b44 drm_kms_helper b$
    Feb 12 11:55:22 codemobile kernel: [50384.032807] Pid: 2263, comm: X Tainted: G      D W  2.6.32-gentoo-r4-bfs313 #4
    Feb 12 11:55:22 codemobile kernel: [50384.032832] Call Trace:
    Feb 12 11:55:22 codemobile kernel: [50384.032849]  [<ffffffff81040323>] ? warn_slowpath_common+0x73/0xb0
    Feb 12 11:55:22 codemobile kernel: [50384.032877]  [<ffffffff8121ec20>] ? kref_get+0x20/0x30
    Feb 12 11:55:22 codemobile kernel: [50384.032908]  [<ffffffffa00c823d>] ? ttm_bo_unreserve+0x8d/0x100 [ttm]
    Feb 12 11:55:22 codemobile kernel: [50384.032961]  [<ffffffffa0109063>] ? radeon_object_list_unreserve+0x33/0x50 [radeon]
    Feb 12 11:55:22 codemobile kernel: [50384.033002]  [<ffffffffa0117c69>] ? radeon_cs_parser_fini+0x109/0x110 [radeon]
    Feb 12 11:55:22 codemobile kernel: [50384.033053]  [<ffffffffa011842a>] ? radeon_cs_ioctl+0x11a/0x1e0 [radeon]
    Feb 12 11:55:22 codemobile kernel: [50384.033072]  [<ffffffff812b580a>] ? drm_ioctl+0x18a/0x3b0
    Feb 12 11:55:22 codemobile kernel: [50384.033122]  [<ffffffffa0118310>] ? radeon_cs_ioctl+0x0/0x1e0 [radeon]
    Feb 12 11:55:22 codemobile kernel: [50384.033152]  [<ffffffff810d7662>] ? do_sync_read+0xe2/0x120
    Feb 12 11:55:22 codemobile kernel: [50384.033179]  [<ffffffff810e6ac2>] ? vfs_ioctl+0x82/0xb0
    Feb 12 11:55:22 codemobile kernel: [50384.033195]  [<ffffffff810e6c18>] ? do_vfs_ioctl+0x88/0x570
    Feb 12 11:55:22 codemobile kernel: [50384.033222]  [<ffffffff81044973>] ? do_setitimer+0x1c3/0x240
    Feb 12 11:55:22 codemobile kernel: [50384.033248]  [<ffffffff810e7149>] ? sys_ioctl+0x49/0x80
    Feb 12 11:55:22 codemobile kernel: [50384.033264]  [<ffffffff8100bfab>] ? system_call_fastpath+0x16/0x1b
    Feb 12 11:55:22 codemobile kernel: [50384.033289] ---[ end trace b06f625ae300867c ]---
    Feb 12 11:56:37 codemobile kernel: [50458.938300] SysRq : Keyboard mode set to system default
    Feb 12 11:56:39 codemobile kernel: [50461.262824] SysRq : Terminate All Tasks
    I don't know how to debug this properly because it is very hard to reproduce on my laptop (sometimes I work 20+ hours and nothing happens, and other times I get the freeze/panic in 2 hours or less).
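When comparing traces like the one above across several reports, it can help to reduce each log to the functions and modules that appear in it. The following is a rough sketch, assuming the frame format shown in this trace (`[<address>] ? function+0xoffset/0xsize [module]`); the names `FRAME_RE`, `parse_frames` and `count_by_module` are made up for illustration, and frames without a module suffix are simply labelled "kernel".

```python
import re
from collections import Counter

# Matches frames such as
#   [<ffffffffa00c9633>] ? ttm_bo_delayed_delete+0x73/0x1b0 [ttm]
# The "?" marker and the trailing "[module]" are both optional.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?:\?\s+)?"
    r"(?P<func>[A-Za-z0-9_.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(log_text):
    """Return a list of (function, module) pairs, one per stack frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Frames with no "[module]" suffix belong to the core kernel.
            frames.append((match.group("func"), match.group("module") or "kernel"))
    return frames


def count_by_module(frames):
    """Count how many frames fall into each module."""
    return Counter(module for _func, module in frames)


if __name__ == "__main__":
    import sys

    # Usage: python frames.py < kernel.log
    frames = parse_frames(sys.stdin.read())
    for module, count in count_by_module(frames).most_common():
        print(f"{module}: {count}")
    for func, module in frames:
        print(f"  {func} [{module}]")
```

Feeding it the log above would list which frames come from `ttm`, which from `radeon`, and which from the core kernel, which makes it easier to see whether two reports go through the same code path.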

  10. #10
    Join Date
    Nov 2009
    Posts
    45

    codestation, nice one with the trace.

    Are you getting that under KDE 4.4? Are you losing the monitor signal? It's a bit awkward because I stopped getting kernel logs in syslog after I moved to kernel 2.6.33- in February.

    Cheers,
    Michal
