The GPU lockup with RV670 and openarena is nothing new - it seems to have been a feature for almost a year (I haven't used the RV670 for most of that time).

On noticing the new GPU reset code in drm-next-3.9-wip I decided to provoke it on my AGP box and got -

Feb 3 20:46:56 nf7 kernel: radeon 0000:02:00.0: GPU lockup CP stall for more than 10000msec
Feb 3 20:46:56 nf7 kernel: radeon 0000:02:00.0: GPU lockup (waiting for 0x00000000000111c8 last fence id 0x00000000000111c6)
Feb 3 20:46:56 nf7 kernel: radeon 0000:02:00.0: f51df600 unpin not necessary
Feb 3 20:46:56 nf7 kernel: radeon 0000:02:00.0: Saved 217 dwords of commands on ring 0.
Feb 3 20:46:56 nf7 kernel: BUG: unable to handle kernel paging request at f87aec0c
Feb 3 20:46:56 nf7 kernel: IP: [<fa033a3e>] radeon_fence_process+0x7e/0x160 [radeon]
Feb 3 20:46:56 nf7 kernel: *pde = 3702f067 *pte = 00000000
Feb 3 20:46:56 nf7 kernel: Oops: 0000 [#1] PREEMPT
Feb 3 20:46:56 nf7 kernel: Modules linked in: radeon fbcon font bitblit ttm softcursor drm_kms_helper drm fb fbdev i2c_algo_bit i2c_core cfbcopyarea cfbimgblt cfbfillrect ehci_pci ehci_hcd nvidia_agp agpgart ohci_hcd usbhid usbcore usb_common snd_intel8x0 snd_ac97_codec ac97_bus forcedeth
Feb 3 20:46:56 nf7 kernel: Pid: 2511, comm: openarena.i386 Not tainted 3.8.0-rc3-gc7fb5ff #1 /NF7-S/NF7 (nVidia-nForce2)
Feb 3 20:46:56 nf7 kernel: EIP: 0060:[<fa033a3e>] EFLAGS: 00210246 CPU: 0
Feb 3 20:46:56 nf7 kernel: EIP is at radeon_fence_process+0x7e/0x160 [radeon]
Feb 3 20:46:56 nf7 kernel: EAX: f87aec0c EBX: f73b2848 ECX: f73b2848 EDX: 00000000
Feb 3 20:46:56 nf7 kernel: ESI: 00000000 EDI: f73b2000 EBP: e4765d8c ESP: e4765d58
Feb 3 20:46:56 nf7 kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Feb 3 20:46:56 nf7 kernel: CR0: 8005003b CR2: f87aec0c CR3: 1e75f000 CR4: 000007d0
Feb 3 20:46:56 nf7 kernel: DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Feb 3 20:46:56 nf7 kernel: DR6: ffff0ff0 DR7: 00000400
Feb 3 20:46:56 nf7 kernel: Process openarena.i386 (pid: 2511, ti=e4764000 task=f70ad2b0 task.ti=e4764000)
Feb 3 20:46:56 nf7 kernel: Stack:
Feb 3 20:46:56 nf7 kernel: c13d4e58 00765d7c 00000018 f73b2848 f73b2888 00000872 00000003 00000872
Feb 3 20:46:56 nf7 kernel: 00000000 0000000c 00000003 f73b2a5c e4765e0c e4765da4 fa034850 f73b2000
Feb 3 20:46:56 nf7 kernel: f73b2000 f73b2a5c e4765e0c e4765dc4 fa049bb8 f73b28e8 00000000 c12b06cd
Feb 3 20:46:56 nf7 kernel: Call Trace:
Feb 3 20:46:56 nf7 kernel: [<c13d4e58>] ? sub_preempt_count+0x8/0x80
Feb 3 20:46:56 nf7 kernel: [<fa034850>] radeon_fence_count_emitted+0x20/0x90 [radeon]
Feb 3 20:46:56 nf7 kernel: [<fa049bb8>] radeon_ring_backup+0x38/0x100 [radeon]
Feb 3 20:46:56 nf7 kernel: [<c12b06cd>] ? _dev_info+0x2d/0x30
Feb 3 20:46:56 nf7 kernel: [<fa020b22>] radeon_gpu_reset+0x62/0x1d0 [radeon]
Feb 3 20:46:56 nf7 kernel: [<f8be99c6>] ? ttm_bo_unreserve+0x26/0x50 [ttm]
Feb 3 20:46:56 nf7 kernel: [<fa04821d>] radeon_gem_handle_lockup.part.2+0xd/0x20 [radeon]
Feb 3 20:46:56 nf7 kernel: [<fa048b13>] radeon_gem_wait_idle_ioctl+0xb3/0xd0 [radeon]
Feb 3 20:46:56 nf7 kernel: [<fa048a60>] ? radeon_gem_busy_ioctl+0xf0/0xf0 [radeon]
Feb 3 20:46:56 nf7 kernel: [<f8af9e82>] drm_ioctl+0x402/0x460 [drm]
Feb 3 20:46:56 nf7 kernel: [<fa048a60>] ? radeon_gem_busy_ioctl+0xf0/0xf0 [radeon]
Feb 3 20:46:56 nf7 kernel: [<c104f469>] ? ktime_add_safe+0x9/0x60
Feb 3 20:46:56 nf7 kernel: [<c104fe80>] ? hrtimer_forward+0xa0/0x190
Feb 3 20:46:56 nf7 kernel: [<f8af9a80>] ? drm_copy_field+0x80/0x80 [drm]
Feb 3 20:46:56 nf7 kernel: [<c10eb65a>] do_vfs_ioctl+0x7a/0x590
Feb 3 20:46:56 nf7 kernel: [<c101ef0b>] ? lapic_next_event+0x1b/0x20
Feb 3 20:46:56 nf7 kernel: [<c106978d>] ? clockevents_program_event+0x9d/0x150
Feb 3 20:46:56 nf7 kernel: [<c106ab18>] ? tick_program_event+0x28/0x30
Feb 3 20:46:56 nf7 kernel: [<c1050612>] ? hrtimer_interrupt+0x182/0x2f0
Feb 3 20:46:56 nf7 kernel: [<c13d4e58>] ? sub_preempt_count+0x8/0x80
Feb 3 20:46:56 nf7 kernel: [<c1048cf9>] ? __rcu_read_unlock+0x9/0x60
Feb 3 20:46:56 nf7 kernel: [<c10f4bc7>] ? fget_light+0x77/0xd0
Feb 3 20:46:56 nf7 kernel: [<c10ebbac>] sys_ioctl+0x3c/0x70
Feb 3 20:46:56 nf7 kernel: [<c13d7ffa>] sysenter_do_call+0x12/0x22
Feb 3 20:46:56 nf7 kernel: Code: 8b 4d d4 8d 84 51 f0 00 00 00 8b 74 c7 08 89 75 e0 8b 74 c7 0c 8b 45 d8 83 c0 08 80 bf ec 0d 00 00 00 0f 84 bf 00 00 00 8b 40 0c <8b> 00 39 45 e8 8b 4d ec 89 c3 ba 01 00 00 00 0f 47 ce 39 f1 77
Feb 3 20:46:56 nf7 kernel: EIP: [<fa033a3e>] radeon_fence_process+0x7e/0x160 [radeon] SS:ESP 0068:e4765d58
Feb 3 20:46:56 nf7 kernel: CR2: 00000000f87aec0c
Feb 3 20:46:56 nf7 kernel: ---[ end trace 457588a7cbc40235 ]---
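For what it's worth, the trace reads like the reset path (radeon_gpu_reset -> radeon_ring_backup -> radeon_fence_process) still dereferencing a CPU mapping of the fence page after the lockup handling has already torn that mapping down. The following is only a minimal user-space C sketch of that suspected pattern - struct, function names and the fence value (taken from the log above) are made up for illustration and are not the actual driver code:

/* Illustrative sketch only -- not radeon driver code.  It models the
 * pattern suggested by the backtrace: the reset path invalidates a CPU
 * mapping, then fence processing still reads through the stale pointer. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct fence_driver {
    uint32_t *cpu_addr;     /* CPU view of the fence location (GART/VRAM) */
    uint64_t  last_fence;   /* last fence value seen by the CPU */
};

/* Hypothetical stand-in for the reset path unmapping the fence page. */
static void reset_teardown(struct fence_driver *drv)
{
    free(drv->cpu_addr);
    drv->cpu_addr = NULL;   /* the crash corresponds to NOT invalidating this */
}

/* Hypothetical stand-in for fence processing during ring backup. */
static void process_fences(struct fence_driver *drv)
{
    if (!drv->cpu_addr) {
        fprintf(stderr, "fence mapping gone, skipping\n");
        return;
    }
    drv->last_fence = *drv->cpu_addr;   /* would fault if the mapping were stale */
    printf("last fence id: 0x%llx\n", (unsigned long long)drv->last_fence);
}

int main(void)
{
    struct fence_driver drv = { .cpu_addr = calloc(1, sizeof(uint32_t)) };

    *drv.cpu_addr = 0x111c6;    /* value from the log, purely illustrative */
    process_fences(&drv);       /* fine: mapping still valid */

    reset_teardown(&drv);       /* reset path invalidates the mapping */
    process_fences(&drv);       /* without the NULL check this is the oops */
    return 0;
}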
(In reply to comment #0)
> The GPU lockup with RV670 and openarena is nothing new - it seems to have
> been a feature for almost a year (I haven't used the RV670 for most of that
> time).
>
> On noticing the new GPU reset code in drm-next-3.9-wip I decided to provoke
> it on my AGP box and got -

Hmm, I just hit the same thing running drm-fixes, so it is not specific to -wip; maybe it's because I am now using llvm. In the past (without llvm) and with other kernels this GPU lockup was normally handled quite well - the game often just continued for a while, until it hit another one.
(In reply to comment #1)
> (In reply to comment #0)
> > The GPU lockup with RV670 and openarena is nothing new - it seems to have
> > been a feature for almost a year (I haven't used the RV670 for most of
> > that time).
> >
> > On noticing the new GPU reset code in drm-next-3.9-wip I decided to
> > provoke it on my AGP box and got -
>
> Hmm, I just hit the same thing running drm-fixes, so it is not specific to
> -wip; maybe it's because I am now using llvm. In the past (without llvm) and
> with other kernels this GPU lockup was normally handled quite well - the
> game often just continued for a while, until it hit another one.

It has nothing to do with llvm; it seems to be a feature of more recent kernels.
I've observed this too, and it feels like it got worse during the 3.13 development process. This is the last one in the logs:

Jan 6 14:54:08 localhost kernel: radeon 0000:01:00.0: GPU lockup CP stall for more than 10034msec
Jan 6 14:54:08 localhost kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000003dde last fence id 0x0000000000003ddd on ring 0)
Jan 6 14:54:08 localhost kernel: [drm:rv770_stop_dpm] *ERROR* Could not force DPM to low.
Jan 6 14:54:08 localhost kernel: [drm] Disabling audio 0 support
Jan 6 14:54:08 localhost kernel: BUG: unable to handle kernel paging request at ffffc90402080ffc
Jan 6 14:54:08 localhost kernel: IP: [<ffffffff813d8d9e>] radeon_ring_backup+0xbe/0x140
Jan 6 14:54:08 localhost kernel: PGD 11b028067 PUD 0
Jan 6 14:54:08 localhost kernel: Oops: 0000 [#1] PREEMPT SMP
Jan 6 14:54:08 localhost kernel: Modules linked in: nfs lockd sunrpc snd_hda_codec_hdmi snd_hda_codec_realtek ath9k snd_hda_intel ath9k_common snd_hda_codec ath9k_hw snd_hwdep ath snd_pcm mac80211 snd_timer acer_wmi broadcom cfg80211 snd i2c_piix4 rfkill tg3 k10temp wmi soundcore sr_mod cdrom snd_page_alloc acpi_cpufreq ohci_pci ohci_hcd
Jan 6 14:54:08 localhost kernel: CPU: 1 PID: 2836 Comm: kwin Not tainted 3.13.0-rc7-00012-gf0a679a #183
Jan 6 14:54:08 localhost kernel: Hardware name: Packard Bell EasyNote TK81/SJV52_DN, BIOS V2.14 07/27/2011
Jan 6 14:54:08 localhost kernel: task: ffff8800a81a7800 ti: ffff8800a8346000 task.ti: ffff8800a8346000
Jan 6 14:54:08 localhost kernel: RIP: 0010:[<ffffffff813d8d9e>] [<ffffffff813d8d9e>] radeon_ring_backup+0xbe/0x140
Jan 6 14:54:08 localhost kernel: RSP: 0018:ffff8800a8347ce8 EFLAGS: 00010246
Jan 6 14:54:08 localhost kernel: RAX: 0000000000000000 RBX: ffff88011a6d0f20 RCX: 0000000000000000
Jan 6 14:54:08 localhost kernel: RDX: 00000000000efc04 RSI: ffffc90402080ffc RDI: ffffea000015ffc0
Jan 6 14:54:08 localhost kernel: RBP: 00000000ffffffff R08: ffff880005700000 R09: 00000000fffffffa
Jan 6 14:54:08 localhost kernel: R10: 0000000000000008 R11: 0000000000000100 R12: ffff88011a6d0ef8
Jan 6 14:54:08 localhost kernel: R13: 000000000003bf01 R14: ffff8800a8347d50 R15: 0000000000000000
Jan 6 14:54:08 localhost kernel: FS: 00007f1f6a8717c0(0000) GS:ffff88011fc80000(0000) knlGS:0000000000000000
Jan 6 14:54:08 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 6 14:54:08 localhost kernel: CR2: ffffc90402080ffc CR3: 00000000a8314000 CR4: 00000000000007e0
Jan 6 14:54:08 localhost kernel: Stack:
Jan 6 14:54:08 localhost kernel: ffffffff813c5875 ffff88011a6d0000 ffff88011a6d0f20 ffff8800a8347d50
Jan 6 14:54:08 localhost kernel: 0000000000000000 ffff88011a6d0018 ffffffff813aad3e ffff88011a6d0700
Jan 6 14:54:08 localhost kernel: 00000001a8347df8 ffff88011a6d0f20 0000000000000000 ffff88006d74e048
Jan 6 14:54:08 localhost kernel: Call Trace:
Jan 6 14:54:08 localhost kernel: [<ffffffff813c5875>] ? radeon_gart_table_vram_unpin+0x85/0x120
Jan 6 14:54:08 localhost kernel: [<ffffffff813aad3e>] ? radeon_gpu_reset+0xae/0x250
Jan 6 14:54:08 localhost kernel: [<ffffffff813c5233>] ? radeon_bo_wait+0xf3/0x150
Jan 6 14:54:08 localhost kernel: [<ffffffff813d6dc5>] ? radeon_gem_handle_lockup.part.6+0x5/0x10
Jan 6 14:54:08 localhost kernel: [<ffffffff813841a5>] ? drm_ioctl+0x485/0x580
Jan 6 14:54:08 localhost kernel: [<ffffffff810a51b5>] ? do_futex+0x105/0xc70
Jan 6 14:54:08 localhost kernel: [<ffffffff813a8975>] ? radeon_drm_ioctl+0x55/0xa0
Jan 6 14:54:08 localhost kernel: [<ffffffff8115d6b7>] ? do_vfs_ioctl+0x2c7/0x490
Jan 6 14:54:08 localhost kernel: [<ffffffff810a5d9c>] ? SyS_futex+0x7c/0x170
Jan 6 14:54:08 localhost kernel: [<ffffffff811671df>] ? fget_light+0x8f/0xf0
Jan 6 14:54:08 localhost kernel: [<ffffffff8115d920>] ? SyS_ioctl+0xa0/0xc0
Jan 6 14:54:08 localhost kernel: [<ffffffff81638862>] ? system_call_fastpath+0x16/0x1b
Jan 6 14:54:08 localhost kernel: Code: 49 89 06 74 78 41 8d 55 ff 49 89 c0 31 c9 48 8d 14 95 04 00 00 00 eb 08 0f 1f 44 00 00 4d 8b 06 48 8b 73 08 8d 45 01 48 8d 34 ae <8b> 36 41 89 34 08 23 43 64 48 83 c1 04 48 39 d1 89 c5 75 de 4c
Jan 6 14:54:08 localhost kernel: RIP [<ffffffff813d8d9e>] radeon_ring_backup+0xbe/0x140
Jan 6 14:54:08 localhost kernel: RSP <ffff8800a8347ce8>
Jan 6 14:54:08 localhost kernel: CR2: ffffc90402080ffc
Jan 6 14:54:08 localhost kernel: ---[ end trace 3e2cca537a43e686 ]---

Hardware is an HD 5470 (ChipID = 0x68e0). Google points me to several bug reports from Red Hat/Fedora, so this bug does not seem to be uncommon.
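In this trace the fault is inside radeon_ring_backup itself, with radeon_gart_table_vram_unpin and radeon_gpu_reset on the stack, which again looks like the ring contents are being copied through a CPU mapping that the reset/teardown path may already have released. Here is a small self-contained C sketch of the ordering issue, under that assumption - the names (struct ring, ring_backup) are invented for illustration and are not the driver's:

/* Illustrative sketch only, not driver code: back the ring up while its CPU
 * mapping is still valid, and degrade gracefully once it has been torn down. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct ring {
    uint32_t *mapping;      /* CPU mapping of the ring buffer */
    unsigned  count_dw;     /* number of dwords in the ring */
};

/* Copy dwords out of the ring; safe only while the mapping is still alive. */
static uint32_t *ring_backup(const struct ring *ring, unsigned *size)
{
    uint32_t *data;

    if (!ring->mapping) {           /* defensive check the sketch adds */
        *size = 0;
        return NULL;
    }
    data = malloc(ring->count_dw * sizeof(uint32_t));
    if (!data) {
        *size = 0;
        return NULL;
    }
    memcpy(data, ring->mapping, ring->count_dw * sizeof(uint32_t));
    *size = ring->count_dw;
    return data;
}

int main(void)
{
    struct ring ring = { .mapping = calloc(256, sizeof(uint32_t)), .count_dw = 256 };
    unsigned size;

    /* Safe ordering: back up first, then tear the mapping down. */
    uint32_t *saved = ring_backup(&ring, &size);
    printf("saved %u dwords\n", size);

    free(ring.mapping);             /* models the reset/teardown path */
    ring.mapping = NULL;

    /* Backing up after teardown now returns nothing instead of faulting. */
    free(ring_backup(&ring, &size));
    printf("after teardown: %u dwords\n", size);

    free(saved);
    return 0;
}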
Old bug - no hardware to test with, so closing. Johannes, if you can still reproduce this with current kernels etc., please re-open.