With vanilla 2.6.33 I'm getting fence errors in dmesg on an RS785 (Radeon HD 4200), and DRI ends up disabled. The error is below and the full dmesg is attached.

[ 6.179288] [drm] Initialized drm 1.1.0 20060810
[ 6.647815] [drm] radeon defaulting to kernel modesetting.
[ 6.656630] [drm] radeon kernel modesetting enabled.
[ 6.665470] radeon 0000:01:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[ 6.674214] radeon 0000:01:05.0: setting latency timer to 64
[ 6.675384] [drm] radeon: Initializing kernel modesetting.
[ 6.692803] [drm] register mmio base: 0xFE9F0000
[ 6.701275] [drm] register mmio size: 65536
[ 6.709638] HDA Intel 0000:00:14.2: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 6.718524] ATOM BIOS: 113
[ 6.726772] [drm] Clocks initialized !
[ 6.736928] [drm] Detected VRAM RAM=128M, BAR=128M
[ 6.737990] [drm] RAM width 32bits DDR
[ 6.739095] [TTM] Zone kernel: Available graphics memory: 2029616 kiB.
[ 6.740164] [drm] radeon: 128M of VRAM memory ready
[ 6.743646] [drm] radeon: 512M of GTT memory ready.
[ 6.744714] [drm] radeon: irq initialized.
[ 6.745728] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 6.747053] [drm] Loading RS780 Microcode
[ 6.748054] platform radeon_cp.0: firmware: requesting radeon/RS780_pfp.bin
[ 6.767394] hda-codec: No codec parser is available
[ 6.788131] alloc irq_desc for 20 on node 0
[ 6.788133] alloc kstat_irqs on node 0
[ 6.788139] EMU10K1_Audigy 0000:03:05.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
[ 6.880352] platform radeon_cp.0: firmware: requesting radeon/RS780_me.bin
[ 6.964294] platform radeon_cp.0: firmware: requesting radeon/R600_rlc.bin
[ 7.059944] [drm] ring test succeeded in 1 usecs
[ 7.061030] [drm] radeon: ib pool ready.
[ 9.582117] [drm:radeon_fence_wait] *ERROR* fence(ffff88011e2403c0:0x00000001) 510ms timeout going to reset GPU
[ 9.583167] radeon 0000:01:05.0: GPU softreset
[ 9.584211] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
[ 9.585249] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
[ 9.586278] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20002040
[ 9.792461] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 9.793472] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
[ 9.794533] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
[ 9.795584] radeon 0000:01:05.0: R_000E60_SRBM_SOFT_RESET=0x00000C02
[ 9.796726] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0x00003030
[ 9.797704] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
[ 9.798674] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20000040
[ 9.801112] [drm:radeon_fence_wait] *ERROR* fence(ffff88011e2403c0:0x00000001) 739ms timeout
[ 9.802083] [drm:radeon_fence_wait] *ERROR* last signaled fence(0x00000001)
[ 10.008268] [drm:r600_ib_test] *ERROR* radeon: ib test failed (sracth(0x8504)=0xCAFEDEAD)
[ 10.009301] radeon 0000:01:05.0: IB test failed (-22).
[ 10.010248] [drm] Enabling audio support
[ 10.010428] [drm] Radeon Display Connectors
[ 10.012283] [drm] Connector 0:
[ 10.013201] [drm] VGA
[ 10.014107] [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
[ 10.015027] [drm] Encoders:
[ 10.015932] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 10.016849] [drm] Connector 1:
[ 10.017756] [drm] DVI-D
[ 10.018655] [drm] HPD1
[ 10.019549] [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
[ 10.020451] [drm] Encoders:
[ 10.021340] [drm] DFP3: INTERNAL_KLDSCP_LVTMA
[ 10.206070] [drm] fb mappable at 0xF0141000
[ 10.206943] [drm] vram apper at 0xF0000000
[ 10.207820] [drm] size 5242880
[ 10.208682] [drm] fb depth is 24
[ 10.209530] [drm] pitch is 5120
[ 10.210447] fb: conflicting fb hw usage radeondrmfb vs VESA VGA - removing generic driver
[ 10.211318] Console: switching to colour dummy device 80x25
[ 10.211436] Console: switching to colour frame buffer device 160x64
[ 10.217744] fb0: radeondrmfb frame buffer device
[ 10.217768] registered panic notifier
[ 10.217789] [drm] Initialized radeon 2.0.0 20080528 for 0000:01:05.0 on minor 0
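When sifting through a long dmesg like the one attached, it can help to pull out just the entries the DRM layer marks with *ERROR*. Below is a minimal Python sketch, assuming the log text may have lost its line breaks when pasted and that timestamps look like "[ 9.582117]". The helper names split_dmesg and drm_errors are hypothetical and not part of any existing tool.

```python
import re

# A kernel log timestamp as it appears in this report, e.g. "[ 9.582117]".
TIMESTAMP = re.compile(r"\[\s*\d+\.\d{6}\]")


def split_dmesg(text):
    """Split run-together dmesg text into (seconds, message) pairs."""
    starts = [m.start() for m in TIMESTAMP.finditer(text)]
    entries = []
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        chunk = text[begin:end].strip()
        # The first "]" in the chunk closes the timestamp.
        stamp, _, message = chunk.partition("]")
        entries.append((float(stamp.lstrip("[ ")), message.strip()))
    return entries


def drm_errors(entries):
    """Keep only entries whose message contains the *ERROR* marker."""
    return [(t, msg) for t, msg in entries if "*ERROR*" in msg]


# Three entries copied from the log above, joined on one line.
sample = (
    "[ 7.061030] [drm] radeon: ib pool ready. "
    "[ 9.582117] [drm:radeon_fence_wait] *ERROR* "
    "fence(ffff88011e2403c0:0x00000001) 510ms timeout going to reset GPU "
    "[ 9.583167] radeon 0000:01:05.0: GPU softreset"
)

for seconds, message in drm_errors(split_dmesg(sample)):
    print(f"{seconds:.6f} {message}")
```

Splitting on the timestamp rather than on newlines means the same helper works on a pasted single-line log and on a normal one-entry-per-line dmesg.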
Created attachment 33759: full dmesg output
Does this happen all the time?
yes - could this be sideport related?
(In reply to comment #3)
> yes - could this be sideport related?

Not likely.
I tried to bisect this and found that 2.6.32 also has this issue (I only got this system a few weeks ago). 2.6.31 shows "ring test failed", so I guess RS785 support had not been added yet at that point. So this chip seems never to have worked with KMS.
I also tested the kernel from Fedora 13a to see whether the problem is in my config, but it shows the fence errors as well. Other things I tried without success: switching from sideport to UMA, and limiting memory from 4G to 2G. As this bug happens on a released kernel and also sometimes crashes X, I changed the severity to critical.
Tried with nosmp, mem=2G (out of 4G) and NO_HZ, NO_PREEMPT: no change. Below is the log with glisse's drm-radeon-next tree (grr, the slow chip clock is the default again):

[ 7.940041] [drm] Initialized drm 1.1.0 20060810
[ 8.479532] [drm] radeon defaulting to kernel modesetting.
[ 8.482583] [drm] radeon kernel modesetting enabled.
[ 8.492371] radeon 0000:01:05.0: PCI INT A -> Link[LNKC] -> GSI 10 (level, low) -> IRQ 10
[ 8.495381] radeon 0000:01:05.0: setting latency timer to 64
[ 8.496445] [drm] radeon: Initializing kernel modesetting.
[ 8.499493] [drm] register mmio base: 0xFE9F0000
[ 8.502432] [drm] register mmio size: 65536
[ 8.505857] ATOM BIOS: 113
[ 8.508694] [drm] Clocks initialized !
[ 8.511485] [drm] 3 Power State(s)
[ 8.514250] [drm] State 0 Default (default)
[ 8.517008] [drm] 1 Clock Mode(s)
[ 8.519743] [drm] 0 engine: 300000
[ 8.522456] [drm] State 1 Performance
[ 8.525133] [drm] 1 Clock Mode(s)
[ 8.527734] [drm] 0 engine: 200000
[ 8.530262] [drm] State 2 Default
[ 8.532758] [drm] 1 Clock Mode(s)
[ 8.535239] [drm] 0 engine: 500000
[ 8.537706] [drm] radeon: power management initialized
[ 8.540196] radeon 0000:01:05.0: VRAM: 128M 0xC0000000 - 0xC7FFFFFF (128M used)
[ 8.542724] radeon 0000:01:05.0: GTT: 512M 0xA0000000 - 0xBFFFFFFF
[ 8.545807] [drm] Detected VRAM RAM=128M, BAR=128M
[ 8.546508] [drm] RAM width 32bits DDR
[ 8.550749] [TTM] Zone kernel: Available graphics memory: 1029488 kiB.
[ 8.551439] [drm] radeon: 128M of VRAM memory ready
[ 8.552113] [drm] radeon: 512M of GTT memory ready.
[ 8.552785] [drm] radeon: irq initialized.
[ 8.553447] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 8.554599] [drm] Loading RS780 Microcode
[ 8.555252] platform radeon_cp.0: firmware: requesting radeon/RS780_pfp.bin
[ 8.615296] platform radeon_cp.0: firmware: requesting radeon/RS780_me.bin
[ 8.643581] platform radeon_cp.0: firmware: requesting radeon/R600_rlc.bin
[ 8.688448] [drm] ring test succeeded in 1 usecs
[ 8.689138] [drm] radeon: ib pool ready.
[ 14.190109] radeon 0000:01:05.0: GPU lockup CP stall for more than 1000msec
[ 14.190738] ------------[ cut here ]------------
[ 14.191396] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:234 radeon_fence_wait+0x35d/0x3c0 [radeon]()
[ 14.192044] Hardware name: System Product Name
[ 14.192671] GPU lockup (waiting for 0x00000001 last fence id 0x00000000)
[ 14.193314] Modules linked in: snd_hda_intel(+) radeon(+) snd_emu10k1 snd_rawmidi ttm snd_hda_codec snd_ac97_codec ac97_bus drm_kms_helper snd_pcm snd_seq_device drm snd_util_mem amd64_edac_mod emu10k1_gp snd_timer i2c_algo_bit snd_hwdep firewire_ohci snd kobil_sct edac_core firewire_core shpchp gameport crc_itu_t asus_atk0110 pcspkr soundcore snd_page_alloc button usbserial k10temp edac_mce_amd i2c_piix4 pci_hotplug sr_mod sg cdrom sd_mod ahci fan processor pata_atiixp libata scsi_mod thermal thermal_sys
[ 14.196212] Pid: 691, comm: work_for_cpu Not tainted 2.6.33 #3
[ 14.196888] Call Trace:
[ 14.197575] [<ffffffff810466a8>] warn_slowpath_common+0x78/0xb0
[ 14.198260] [<ffffffff8104673c>] warn_slowpath_fmt+0x3c/0x40
[ 14.198937] [<ffffffffa033a5fd>] radeon_fence_wait+0x35d/0x3c0 [radeon]
[ 14.199616] [<ffffffff81064070>] ? autoremove_wake_function+0x0/0x40
[ 14.200299] [<ffffffffa0375569>] r600_ib_test+0x189/0x300 [radeon]
[ 14.200961] [<ffffffffa037d6e0>] r600_init+0x2e0/0x360 [radeon]
[ 14.201627] [<ffffffffa03293ad>] radeon_device_init+0x29d/0x370 [radeon]
[ 14.202297] [<ffffffffa032a1ee>] radeon_driver_load_kms+0x9e/0x1d0 [radeon]
[ 14.202945] [<ffffffffa020140e>] drm_get_dev+0x34e/0x560 [drm]
[ 14.203593] [<ffffffff8103c86d>] ? default_wake_function+0xd/0x10
[ 14.204227] [<ffffffff8105f7f0>] ? do_work_for_cpu+0x0/0x30
[ 14.204851] [<ffffffffa0397012>] radeon_pci_probe+0x10/0x270 [radeon]
[ 14.205479] [<ffffffff81225d72>] local_pci_probe+0x12/0x20
[ 14.206100] [<ffffffff8105f803>] do_work_for_cpu+0x13/0x30
[ 14.206704] [<ffffffff81063b7e>] kthread+0x8e/0xa0
[ 14.207314] [<ffffffff81003b94>] kernel_thread_helper+0x4/0x10
[ 14.207902] [<ffffffff81063af0>] ? kthread+0x0/0xa0
[ 14.208503] [<ffffffff81003b90>] ? kernel_thread_helper+0x0/0x10
[ 14.209100] ---[ end trace 48fab13bc7a5b259 ]---
[ 14.209681] [drm] Disabling audio support
[ 14.209708] radeon 0000:01:05.0: GPU softreset
[ 14.210855] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
[ 14.211443] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
[ 14.212028] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x20002040
[ 14.339731] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 14.340318] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
[ 14.355896] radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00000001
[ 14.372490] radeon 0000:01:05.0: R_008010_GRBM_STATUS=0xA0003030
[ 14.373084] radeon 0000:01:05.0: R_008014_GRBM_STATUS2=0x00000003
[ 14.373665] radeon 0000:01:05.0: R_000E50_SRBM_STATUS=0x2000B040
[ 14.375260] radeon 0000:01:05.0: GPU reset succeed
[ 14.392243] [drm] Clocks initialized !
[ 14.519942] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 14.647642] radeon 0000:01:05.0: Wait for MC idle timedout !
[ 14.811733] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8508)=0xCAFEDEAD)
[ 14.812345] [drm:r600_resume] *ERROR* r600 startup failed on resume
[ 14.812949] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 14.813336] IP: [<ffffffffa02668a4>] drm_helper_resume_force_mode+0x34/0x240 [drm_kms_helper]
[ 14.813336] PGD 37d25067 PUD 37dd7067 PMD 0
[ 14.813336] Oops: 0000 [#1] SMP
[ 14.813336] last sysfs file: /sys/module/snd_hda_intel/initstate
[ 14.813336] CPU 0
[ 14.813336] Pid: 691, comm: work_for_cpu Tainted: G W 2.6.33 #3 M4A785TD-V EVO/System Product Name
[ 14.813336] RIP: 0010:[<ffffffffa02668a4>] [<ffffffffa02668a4>] drm_helper_resume_force_mode+0x34/0x240 [drm_kms_helper]
[ 14.813336] RSP: 0018:ffff88007e7cdc40 EFLAGS: 00010293
[ 14.813336] RAX: 0000000000000020 RBX: fffffffffffffff8 RCX: ffffc90011861740
[ 14.813336] RDX: 0000000000001740 RSI: 00000000411a0015 RDI: ffff88007e7eb800
[ 14.813336] RBP: ffff88007e7cdc70 R08: 0000000000001724 R09: 0000000000000000
[ 14.813336] R10: 000000000000028d R11: 0000000000000000 R12: ffff88007e7ebca0
[ 14.813336] R13: ffff88007e7eb800 R14: ffff88007e7ebcb8 R15: ffff88007ed8e930
[ 14.813336] FS: 00007f42af12d790(0000) GS:ffff880001c00000(0000) knlGS:0000000000000000
[ 14.813336] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 14.813336] CR2: 0000000000000000 CR3: 0000000037b64000 CR4: 00000000000006f0
[ 14.813336] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 14.813336] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 14.813336] Process work_for_cpu (pid: 691, threadinfo ffff88007e7cc000, task ffff88007e8f2540)
[ 14.813336] Stack:
[ 14.813336] 0000000000000007 ffff88007ed8e000 0000000000000000 ffff88007f4fc340
[ 14.813336] <0> 0000000000000000 ffff88007ed8e930 ffff88007e7cdca0 ffffffffa0328dee
[ 14.813336] <0> ffff88007e7cdca0 ffff88007e02fbc0 ffff88007ed8e000 ffff88007e7cdcf0
[ 14.813336] Call Trace:
[ 14.813336] [<ffffffffa0328dee>] radeon_gpu_reset+0xae/0xb0 [radeon]
[ 14.813336] [<ffffffffa033a62f>] radeon_fence_wait+0x38f/0x3c0 [radeon]
[ 14.813336] [<ffffffff81064070>] ? autoremove_wake_function+0x0/0x40
[ 14.813336] [<ffffffffa0375569>] r600_ib_test+0x189/0x300 [radeon]
[ 14.813336] [<ffffffffa037d6e0>] r600_init+0x2e0/0x360 [radeon]
[ 14.813336] [<ffffffffa03293ad>] radeon_device_init+0x29d/0x370 [radeon]
[ 14.813336] [<ffffffffa032a1ee>] radeon_driver_load_kms+0x9e/0x1d0 [radeon]
[ 14.813336] [<ffffffffa020140e>] drm_get_dev+0x34e/0x560 [drm]
[ 14.813336] [<ffffffff8103c86d>] ? default_wake_function+0xd/0x10
[ 14.813336] [<ffffffff8105f7f0>] ? do_work_for_cpu+0x0/0x30
[ 14.813336] [<ffffffffa0397012>] radeon_pci_probe+0x10/0x270 [radeon]
[ 14.813336] [<ffffffff81225d72>] local_pci_probe+0x12/0x20
[ 14.813336] [<ffffffff8105f803>] do_work_for_cpu+0x13/0x30
[ 14.813336] [<ffffffff81063b7e>] kthread+0x8e/0xa0
[ 14.813336] [<ffffffff81003b94>] kernel_thread_helper+0x4/0x10
[ 14.813336] [<ffffffff81063af0>] ? kthread+0x0/0xa0
[ 14.813336] [<ffffffff81003b90>] ? kernel_thread_helper+0x0/0x10
[ 14.813336] Code: 8d b7 b8 04 00 00 41 55 49 89 fd 41 54 4c 8d a7 a0 04 00 00 53 48 83 ec 08 48 8b 9f b8 04 00 00 48 83 eb 08 eb 05 90 48 8d 58 f8 <48> 8b 43 08 48 8d 53 08 49 39 d6 0f 18 08 0f 84 b8 01 00 00 80
[ 14.813336] RIP [<ffffffffa02668a4>] drm_helper_resume_force_mode+0x34/0x240 [drm_kms_helper]
[ 14.813336] RSP <ffff88007e7cdc40>
[ 14.813336] CR2: 0000000000000000
[ 14.840412] ---[ end trace 48fab13bc7a5b25a ]---
OK, it turned out that the oopses were PM related. When booted with radeon.dynpm=0 radeon.dynclks=0 everything works fine! Unfortunately, I cannot test the GPU reset patches on their own, as they do not apply to 2.6.33. Jérôme, could you please supply a version that applies to 2.6.33? Thanks!
Created attachment 34022: backported patch

This bug report looks like a soliloquy. Anyway, I backported "drm/radeon/kms: fence cleanup + more reliable GPU lockup detection V4" to 2.6.33 myself, and it fixes this problem. Can this be forwarded to upstream->stable?
Is it possible that bug 32662 is related to this one?