Created attachment 108324
gdm journal

When the desktop environment starts up, two things can happen:
1. Before the login prompt, I get a corrupt desktop
2. After a successful login, I get a corrupt desktop

In both cases, the corruption looks almost identical, so I'm assuming these are just different symptoms of the same underlying problem, whether it's a configuration issue or a bug.
Created attachment 108325
Picture of what the screens look like
From another shell, when stopping and then restarting gdm:
------------------------------
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: Saved 87 dwords of commands on ring 0.
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GPU softreset: 0x00000008
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS = 0xA0003828
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE0 = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE1 = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS = 0x200000C0
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS2 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008678_CP_STALLED_STAT2 = 0x00010100
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_00867C_CP_BUSY_STAT = 0x00020180
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008680_CP_STAT = 0x80038042
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_SOFT_RESET=0x00004001
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_SOFT_RESET=0x00000100
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS = 0x00003828
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE0 = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE1 = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS = 0x200000C0
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS2 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008680_CP_STAT = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GPU reset succeeded, trying to resume
Oct 22 18:04:40 eanna kernel: [drm] PCIE gen 2 link speeds already enabled
Oct 22 18:04:40 eanna kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: WB enabled
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff88100e3e7c00
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff88100e3e7c0c
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90016b1c418
Oct 22 18:04:40 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Oct 22 18:04:40 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
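The register dump above is printed twice, once before and once after the soft reset, and it can be hard to see by eye which values actually changed. The following is a minimal Python sketch for comparing the two dumps. The function names and the regular expression are illustrative assumptions, not part of any radeon tooling, and the parser assumes the layout seen in this report: status registers, then the `*_SOFT_RESET` lines, then the status registers again.

```python
import re

# Matches "NAME = 0xVALUE" or "NAME=0xVALUE" at the end of a radeon kernel log line.
# The pattern is an assumption based only on the lines quoted in this report.
REG_RE = re.compile(r"radeon \S+ (\w+)\s*=\s*(0x[0-9A-Fa-f]+)\s*$")


def split_register_dumps(lines):
    """Return (before, after) dicts of register values around the soft reset."""
    before, after = {}, {}
    seen_reset = False
    for line in lines:
        match = REG_RE.search(line)
        if not match:
            continue  # not a register line (e.g. "GPU softreset: ...")
        name, value = match.group(1), int(match.group(2), 16)
        if name.endswith("_SOFT_RESET"):
            # The reset mask lines separate the two dumps.
            seen_reset = True
            continue
        (after if seen_reset else before)[name] = value
    return before, after


def changed_registers(before, after):
    """Return {name: (old, new)} for registers whose value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# A shortened excerpt of the dump quoted above.
SAMPLE = """\
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS = 0xA0003828
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS = 0x200000C0
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008680_CP_STAT = 0x80038042
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_SOFT_RESET=0x00004001
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_SOFT_RESET=0x00000100
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS = 0x00003828
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS = 0x200000C0
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: R_008680_CP_STAT = 0x00000000
""".splitlines()

if __name__ == "__main__":
    before, after = split_register_dumps(SAMPLE)
    for name, (old, new) in changed_registers(before, after).items():
        print(f"{name}: 0x{old:08X} -> 0x{new:08X}")
```

Run against the full dump, this would show that the CP status registers and the top bits of GRBM_STATUS clear after the reset, while SRBM_STATUS and the DMA status register keep their values.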
Please attach your dmesg output. It looks like the GPU hangs and fails to recover, which is what causes the corruption.
Does it matter whether the dmesg output is from a boot where the problem occurred, or can it be any dmesg output as long as it's recent?
Is the problem random or does it always occur? Preferably a dmesg from a problematic situation, but either way is fine if it's hard to reproduce.
The problem is intermittent, but consistently so. In other words, I can't seem to predict what will produce the error, but it happens every second, third or fourth boot. You'll see from the attached dmesg that I do have some btrfs + bcache stuff happening, which may be contributing or may just be noise.
Created attachment 108709
dmesg log
This is still happening. A full power cycle seems to rectify it. A normal reboot seems to rectify it. If the system locks up for any reason, I get this behavior.
Can you try a newer kernel?
I'm on Arch so I've seen evergreen kernels. When I opened the bug I was at 3.16.4, I'm now at 3.18.6.

[2014-11-02 19:36] [PACMAN] upgraded linux (3.16.4-1 -> 3.17.2-1)
[2014-11-16 14:01] [PACMAN] upgraded linux (3.17.2-1 -> 3.17.3-1)
[2014-11-26 08:03] [PACMAN] upgraded linux (3.17.3-1 -> 3.17.4-1)
[2014-12-12 08:26] [PACMAN] upgraded linux (3.17.4-1 -> 3.17.6-1)
[2015-01-20 07:59] [ALPM] upgraded linux (3.17.6-1 -> 3.18.2-2)
[2015-01-30 17:22] [ALPM] upgraded linux (3.18.2-2 -> 3.18.4-1)
[2015-02-17 20:01] [ALPM] upgraded linux (3.18.4-1 -> 3.18.6-1)

There do seem to have been incremental improvements in this behavior.
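For matching failures against kernel versions, the upgrade history can be pulled out of the pacman log programmatically. This is a rough Python sketch: the regular expression is an assumption derived only from the log lines quoted above, and `kernel_upgrades` is a hypothetical helper name.

```python
import re

# Assumed line format, taken from the excerpt above:
#   [YYYY-MM-DD HH:MM] [PACMAN|ALPM] upgraded linux (OLD -> NEW)
UPGRADE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}\] \[(?:PACMAN|ALPM)\] "
    r"upgraded linux \((\S+) -> (\S+)\)"
)


def kernel_upgrades(log_text):
    """Return a list of (date, old_version, new_version) tuples."""
    return [match.groups() for match in UPGRADE_RE.finditer(log_text)]


# Two lines from the report, plus one hypothetical linux-headers line
# to show that other packages are skipped.
SAMPLE_LOG = """\
[2014-11-02 19:36] [PACMAN] upgraded linux (3.16.4-1 -> 3.17.2-1)
[2015-01-20 07:59] [ALPM] upgraded linux (3.17.6-1 -> 3.18.2-2)
[2015-01-20 07:59] [ALPM] upgraded linux-headers (3.17.6-1 -> 3.18.2-2)
"""

if __name__ == "__main__":
    for date, old, new in kernel_upgrades(SAMPLE_LOG):
        print(f"{date}: {old} -> {new}")
```

Combined with the dates of the failed boots, a list like this would make it easier to say which kernel was running when each failure occurred.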
Not sure if this is significant, but it's a log entry similar to those found when the desktop gets corrupted. A completely cold boot produced the following journal snippet (no, I couldn't get more):

Apr 18 10:22:54 eanna kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000).
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: WB enabled
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880f4aec2c00
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880f4aec2c0c
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90016b1c418
Apr 18 10:22:54 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Apr 18 10:22:54 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: ring 0 stalled for more than 10386msec
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU lockup (current fence id 0x0000000000000001 last fence id 0x0000000000000004 on ring 0)
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: Saved 119 dwords of commands on ring 0.
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU softreset: 0x00000008
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS = 0xA0003828
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE0 = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE1 = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS = 0x200000C0
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS2 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_008678_CP_STALLED_STAT2 = 0x00010002
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_00867C_CP_BUSY_STAT = 0x00020180
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_008680_CP_STAT = 0x80038243
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_SOFT_RESET=0x00004001
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: SRBM_SOFT_RESET=0x00000100
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS = 0x00003828
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE0 = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_STATUS_SE1 = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS = 0x200000C0
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: SRBM_STATUS2 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_00867C_CP_BUSY_STAT = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_008680_CP_STAT = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU reset succeeded, trying to resume
Apr 18 10:23:05 eanna kernel: [drm] PCIE gen 2 link speeds already enabled
Apr 18 10:23:05 eanna kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000).
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: WB enabled
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880f4aec2c00
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880f4aec2c0c
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90016b1c418
Apr 18 10:23:05 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Apr 18 10:23:05 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume

After a power cycle, the system boots fine and gdm starts.
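Since the failure only shows up on some boots, it may help to count the failure signatures in a saved kernel log rather than read it line by line. The sketch below is only an illustration: the signature strings are copied from the journal excerpts in this report, while the labels and the `count_signatures` helper are made-up names. As far as the radeon ring test is understood here, 0xCAFEDEAD is the value the driver seeds the scratch register with before asking the ring to overwrite it, so seeing it in the error suggests the command processor never executed the test packet.

```python
import re
from collections import Counter

# Signatures copied from the journal excerpts in this report.
# The dictionary keys are arbitrary labels chosen for this sketch.
SIGNATURES = {
    "ring_test_failed": re.compile(r"ring \d+ test failed"),
    "resume_failed": re.compile(r"startup failed on resume"),
    "gpu_lockup": re.compile(r"GPU lockup"),
    "ring_stalled": re.compile(r"ring \d+ stalled for more than \d+msec"),
}


def count_signatures(lines):
    """Count how many log lines match each known failure signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


# A shortened excerpt of the journal quoted above.
SAMPLE = """\
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: WB enabled
Apr 18 10:22:54 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Apr 18 10:22:54 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: ring 0 stalled for more than 10386msec
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU lockup (current fence id 0x0000000000000001 last fence id 0x0000000000000004 on ring 0)
Apr 18 10:23:05 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Apr 18 10:23:05 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
""".splitlines()

if __name__ == "__main__":
    for label, count in sorted(count_signatures(SAMPLE).items()):
        print(f"{label}: {count}")
```

The same function could be fed the lines of a saved `journalctl -k` or dmesg capture from each boot to record which boots hit the ring test failure.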
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/issues/111.