Bug 85385 - corrupt desktop - at and after login
Summary: corrupt desktop - at and after login
Status: RESOLVED MOVED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: unspecified
Hardware: x86-64 (AMD64) All
Importance: medium critical
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
Reported: 2014-10-23 21:21 UTC by G. Richard Bellamy
Modified: 2019-11-19 07:48 UTC
0 users

Attachments
gdm journal (73.82 KB, text/plain)
2014-10-23 21:21 UTC, G. Richard Bellamy
Picture of what the screens look like (1.46 MB, image/jpeg)
2014-10-23 21:24 UTC, G. Richard Bellamy
dmesg log (79.80 KB, text/plain)
2014-10-30 23:47 UTC, G. Richard Bellamy

Description G. Richard Bellamy 2014-10-23 21:21:26 UTC
Created attachment 108324
gdm journal

When the desktop environment starts up, two things can happen:
1. Before the login prompt, I get a corrupt desktop
2. After a successful login, I get a corrupt desktop

In both cases the corruption looks almost identical, so I'm assuming these
are two symptoms of the same underlying problem, whether that turns out to
be a configuration issue or a bug.
Comment 1 G. Richard Bellamy 2014-10-23 21:24:48 UTC
Created attachment 108325
Picture of what the screens look like
Comment 2 G. Richard Bellamy 2014-10-23 21:30:13 UTC
From another shell, when stopping and then restarting gdm:
------------------------------
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: Saved 87 dwords of commands on ring 0.
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GPU softreset: 0x00000008
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS               = 0xA0003828
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE0           = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE1           = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS               = 0x200000C0
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS2              = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010100
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_00867C_CP_BUSY_STAT     = 0x00020180
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008680_CP_STAT          = 0x80038042
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_SOFT_RESET=0x00004001
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: SRBM_SOFT_RESET=0x00000100
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS               = 0x00003828
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE0           = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE1           = 0x00000007
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS               = 0x200000C0
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS2              = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008680_CP_STAT          = 0x00000000
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GPU reset succeeded, trying to resume
Oct 22 18:04:40 eanna kernel: [drm] PCIE gen 2 link speeds already enabled
Oct 22 18:04:40 eanna kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025D000).
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: WB enabled
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff88100e3e7c00
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff88100e3e7c0c
Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90016b1c418
Oct 22 18:04:40 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Oct 22 18:04:40 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
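
The log above contains two register dumps, one before and one after the soft reset. As a rough aid for comparing them, here is a minimal Python sketch that splits a log into dumps and reports which registers changed. The function names, the regular expression, and the rule "a new dump starts when a register name repeats" are assumptions made for illustration; they are not part of the radeon driver or any existing tool. The sample lines are copied from the log above.

```python
import re

# Matches lines ending in "NAME = 0x..." or "NAME=0x...", for example
# "...:   GRBM_STATUS               = 0xA0003828"
REG_RE = re.compile(r"\s([A-Za-z0-9_]+)\s*=\s*(0x[0-9A-Fa-f]+)\s*$")


def split_register_dumps(lines):
    """Return a list of dicts, one per register dump found in the log.

    Assumption: a new dump starts whenever a register name repeats.
    """
    dumps = [{}]
    for line in lines:
        match = REG_RE.search(line)
        if not match:
            continue
        name, value = match.group(1), int(match.group(2), 16)
        if name in dumps[-1]:
            dumps.append({})
        dumps[-1][name] = value
    return [dump for dump in dumps if dump]


def changed_registers(before, after):
    """Map register name -> (old, new) for registers present in both dumps."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# A few lines copied from the log in this comment.
SAMPLE = [
    "Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS               = 0xA0003828",
    "Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008680_CP_STAT          = 0x80038042",
    "Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0: GRBM_SOFT_RESET=0x00004001",
    "Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS               = 0x00003828",
    "Oct 22 18:04:40 eanna kernel: radeon 0000:42:00.0:   R_008680_CP_STAT          = 0x00000000",
]

if __name__ == "__main__":
    before, after = split_register_dumps(SAMPLE)[:2]
    for name, (old, new) in changed_registers(before, after).items():
        print(f"{name}: {old:#010x} -> {new:#010x}")
```

On the full log, this kind of comparison shows that the CP status registers and the top bits of GRBM_STATUS are cleared by the reset, while the ring test still fails afterwards.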
Comment 3 Alex Deucher 2014-10-26 18:57:37 UTC
Please attach your dmesg output. It looks like the GPU hangs and fails to recover, which is what causes the corruption.
Comment 4 G. Richard Bellamy 2014-10-26 22:53:43 UTC
Does it matter whether the dmesg output is from a boot where the problem occurred, or can it be any dmesg output as long as it's recent?
Comment 5 Alex Deucher 2014-10-27 01:12:49 UTC
Is the problem random or does it always occur? Preferably attach a dmesg from a boot where the problem occurred, but either way is fine if it's hard to reproduce.
Comment 6 G. Richard Bellamy 2014-10-30 23:47:16 UTC
The problem is intermittent, but consistently so - in other words, I can't seem to predict what will produce the error, but it happens every second, third or fourth boot.

You'll see from the attached dmesg that I do have some btrfs + bcache stuff happening, which may be contributing or may just be noise.
Comment 7 G. Richard Bellamy 2014-10-30 23:47:41 UTC
Created attachment 108709
dmesg log
Comment 8 G. Richard Bellamy 2015-03-06 18:51:31 UTC
This is still happening.

A full power cycle seems to rectify it. A normal reboot seems to rectify it.

If the system locks up for any reason, I get this behavior.
Comment 9 Alex Deucher 2015-03-06 22:08:30 UTC
Can you try a newer kernel?
Comment 10 G. Richard Bellamy 2015-03-07 01:56:08 UTC
I'm on Arch so I've seen evergreen kernels.

When I opened the bug I was at 3.16.4, I'm now at 3.18.6.

[2014-11-02 19:36] [PACMAN] upgraded linux (3.16.4-1 -> 3.17.2-1)
[2014-11-16 14:01] [PACMAN] upgraded linux (3.17.2-1 -> 3.17.3-1)
[2014-11-26 08:03] [PACMAN] upgraded linux (3.17.3-1 -> 3.17.4-1)
[2014-12-12 08:26] [PACMAN] upgraded linux (3.17.4-1 -> 3.17.6-1)
[2015-01-20 07:59] [ALPM] upgraded linux (3.17.6-1 -> 3.18.2-2)
[2015-01-30 17:22] [ALPM] upgraded linux (3.18.2-2 -> 3.18.4-1)
[2015-02-17 20:01] [ALPM] upgraded linux (3.18.4-1 -> 3.18.6-1)
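
For anyone who wants to pull the same kernel upgrade history from their own system, here is a minimal Python sketch that filters pacman log lines like the ones above. The function name and the regular expression are illustrative assumptions; the log path in the usage note is the usual Arch default and may differ on your system.

```python
import re

# Matches e.g. "[2014-11-02 19:36] [PACMAN] upgraded linux (3.16.4-1 -> 3.17.2-1)".
# The literal "linux (" keeps packages such as linux-headers out of the result.
UPGRADE_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[(?:PACMAN|ALPM)\] "
    r"upgraded linux \((?P<old>\S+) -> (?P<new>\S+)\)$"
)


def kernel_upgrades(lines):
    """Return (timestamp, old_version, new_version) tuples for kernel upgrades."""
    upgrades = []
    for line in lines:
        match = UPGRADE_RE.match(line.strip())
        if match:
            upgrades.append((match["when"], match["old"], match["new"]))
    return upgrades


# Two lines copied from the list above.
SAMPLE = [
    "[2014-11-02 19:36] [PACMAN] upgraded linux (3.16.4-1 -> 3.17.2-1)",
    "[2015-02-17 20:01] [ALPM] upgraded linux (3.18.4-1 -> 3.18.6-1)",
]

if __name__ == "__main__":
    for when, old, new in kernel_upgrades(SAMPLE):
        print(f"{when}: {old} -> {new}")
```

To run it against a real log, replace SAMPLE with the lines of the file, for example `open("/var/log/pacman.log")`, assuming the log lives at that path.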

There do seem to have been incremental improvements in this behavior.
Comment 11 G. Richard Bellamy 2015-04-18 17:26:34 UTC
Not sure if this is significant, but it's a similar log entry to those found when the desktop gets corrupted.

A completely cold boot produced the following journal snippet (no, I couldn't get more):
Apr 18 10:22:54 eanna kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000).
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: WB enabled
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880f4aec2c00
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880f4aec2c0c
Apr 18 10:22:54 eanna kernel: radeon 0000:42:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90016b1c418
Apr 18 10:22:54 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Apr 18 10:22:54 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: ring 0 stalled for more than 10386msec
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU lockup (current fence id 0x0000000000000001 last fence id 0x0000000000000004 on ring 0)
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: Saved 119 dwords of commands on ring 0.
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU softreset: 0x00000008
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS               = 0xA0003828
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE0           = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE1           = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS               = 0x200000C0
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS2              = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010002
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_00867C_CP_BUSY_STAT     = 0x00020180
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_008680_CP_STAT          = 0x80038243
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GRBM_SOFT_RESET=0x00004001
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: SRBM_SOFT_RESET=0x00000100
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS               = 0x00003828
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE0           = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   GRBM_STATUS_SE1           = 0x00000007
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS               = 0x200000C0
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   SRBM_STATUS2              = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_008680_CP_STAT          = 0x00000000
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU reset succeeded, trying to resume
Apr 18 10:23:05 eanna kernel: [drm] PCIE gen 2 link speeds already enabled
Apr 18 10:23:05 eanna kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000000025E000).
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: WB enabled
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880f4aec2c00
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880f4aec2c0c
Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: fence driver on ring 5 use gpu addr 0x000000000005c418 and cpu addr 0xffffc90016b1c418
Apr 18 10:23:05 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)
Apr 18 10:23:05 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume

After a power cycle, the system boots fine and gdm starts.
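
To check quickly whether a given boot hit the same failure pattern as the journal snippet above, here is a minimal Python sketch that counts the relevant kernel messages. The marker strings are copied from the log in this comment; the dictionary keys and the function name are made up for illustration and do not correspond to any existing tool.

```python
# Substrings copied from the kernel messages quoted in this report.
MARKERS = {
    "lockups": "GPU lockup",
    "resets_reported_ok": "GPU reset succeeded",
    "ring_test_failures": "ring 0 test failed",
    "resume_failures": "startup failed on resume",
}


def summarize(lines):
    """Count how many lines contain each marker string."""
    counts = {key: 0 for key in MARKERS}
    for line in lines:
        for key, needle in MARKERS.items():
            if needle in line:
                counts[key] += 1
    return counts


# A few lines copied from the journal snippet in this comment.
SAMPLE = [
    "Apr 18 10:22:54 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)",
    "Apr 18 10:22:54 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume",
    "Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: ring 0 stalled for more than 10386msec",
    "Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU lockup (current fence id 0x0000000000000001 last fence id 0x0000000000000004 on ring 0)",
    "Apr 18 10:23:05 eanna kernel: radeon 0000:42:00.0: GPU reset succeeded, trying to resume",
    "Apr 18 10:23:05 eanna kernel: [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xCAFEDEAD)",
    "Apr 18 10:23:05 eanna kernel: [drm:evergreen_resume] *ERROR* evergreen startup failed on resume",
]

if __name__ == "__main__":
    for key, count in summarize(SAMPLE).items():
        print(f"{key}: {count}")
```

A boot where `resume_failures` is non-zero even though a reset was reported as successful matches the pattern seen here. The same function could be fed kernel messages for the current boot, for example by piping `journalctl -k -b` into a script that passes `sys.stdin` to `summarize`.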
Comment 12 Martin Peres 2019-11-19 07:48:06 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new issue on our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-ati/issues/111

