Created attachment 84133 [details]
dmesg of my system for information
As soon as the up-to-date firmware package is installed, my system hangs on suspend and again on resume, with the screen turned off and the system not reacting to anything. I tested this on various kernels, the earliest being 3.7.10 (the current openSUSE kernel) and the latest being 3.11-rc5.
I tried to get more information, but there are no logs, the screen is turned off and even netconsole did not show more. Do you have suggestions about how to debug this or things that I can try to narrow it down?
My GPU is a:
01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Redwood [Radeon HD 5670] (prog-if 00 [VGA controller])
Subsystem: PC Partner Limited Device e151
Flags: bus master, fast devsel, latency 0, IRQ 48
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at f4420000 (64-bit, non-prefetchable) [size=128K]
I/O ports at e000 [size=256]
Expansion ROM at f4400000 [disabled] [size=128K]
Capabilities:  Power Management version 3
Capabilities:  Express Legacy Endpoint, MSI 00
Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities:  Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
Capabilities:  Advanced Error Reporting
Kernel driver in use: radeon
Attaching dmesg of a running system just for info.
It's not the firmware per se. It's probably the new dpm code. Do you still get hangs when dpm is disabled?
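For anyone reproducing this, it helps to confirm whether dpm was actually requested at boot before comparing results. The following is only a minimal sketch: it assumes dpm is controlled through a `radeon.dpm=` token on the kernel command line and that the last occurrence is the effective one. The helper name `radeon_dpm_setting` is made up for illustration.

```python
from typing import Optional


def radeon_dpm_setting(cmdline: str) -> Optional[str]:
    """Return the value of radeon.dpm on a kernel command line, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == "radeon.dpm":
            # Assumption: if the option is given more than once, the last one applies.
            value = val
    return value


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("radeon.dpm =", radeon_dpm_setting(f.read()))
    except OSError as exc:
        print("could not read /proc/cmdline:", exc)
```

A result of `None` means the option was not passed at all, in which case the driver default applies.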
(In reply to comment #1)
> It's not the firmware per se. It's probably the new dpm code. Do you still
> get hangs when dpm is disabled?
Indeed. If I turn off dpm, I can suspend successfully and more or less successfully resume. But a few seconds after resume my X server dies. May be related or not. Attaching a dmesg taken after that.
It's strange, though, that with older kernels I still get hangs on suspend and resume even though they do not support dpm at all.
So how can I proceed?
Created attachment 84144 [details]
dmesg after X server crash after suspend/resume without dpm
Sounds like this may be a problem independent of dpm. Have you ever had successful suspend and resume? When you say X crash, do you mean X hangs? system hangs? segfault?
(In reply to comment #4)
> Sounds like this may be a problem independent of dpm. Have you ever had
> successful suspend and resume?
I've always (meaning > 5 years) had successful suspend and resume on this machine. Until I updated the firmware files to test UVD and then dpm. With the original firmware contained in openSUSE's kernel-firmware-20130114git-1.2.1 package, suspend/resume works just fine. I started having problems immediately after updating the radeon firmware files from your FTP site. In the meantime, openSUSE shipped an update to the kernel-firmware package with which I see the same problems.
> When you say X crash, do you mean X
> hangs? system hangs? segfault?
I mean the X server terminated unexpectedly and I got thrown back to the login screen. Other than this message and the part about GPU lockup in the dmesg dump I posted, I could not find any messages.
It seems like I have two independent problems which may explain the confusing results I got:
* with DPM enabled I get hard locks on suspend and sometimes on resume. After suspend I have to turn off power manually but it seems like it successfully writes the suspend image to disk. On resume it sometimes locks with disabled output, sometimes it works.
* Regardless of DPM enabled or not I get X server crashes within minutes after a suspend/resume cycle. This happens with kernel 3.11.1 with current firmware. When I downgrade my kernel-firmware package to 20130114git-1.2.1, this problem vanishes. But the firmware may be the original cause. With the old firmware I for example do not have direct rendering or acceleration.
At least I found a logfile giving more information about the X server crash. Attaching.
Is there anything else I can do to debug these problems?
Created attachment 86476 [details]
Xorg.0.log after X server crash after resume
Created attachment 86477 [details]
dmesg after X server crashed
The only thing that has changed in the ucode is the addition of new ucode for UVD and SMC. If you use the newer firmware package but remove the UVD and/or SMC ucode images, you should get the same behavior as with the old firmware package. Since dpm is not enabled by default, I think the problem is probably with UVD.
Indeed! After removing CYPRESS_uvd.bin the X server crashes vanish. Only the hard locks with dpm remain.
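To narrow this down in the same way, it can be useful to first list which UVD and SMC images are installed before moving any of them aside. This is a rough sketch, not an official tool: it assumes the images live in `/lib/firmware/radeon` and follow the `*_uvd.bin` / `*_smc.bin` naming seen in this report. The function name `find_uvd_smc_blobs` is made up for illustration, and the script only lists files; it does not remove anything.

```python
from pathlib import Path
from typing import List


def find_uvd_smc_blobs(firmware_dir: str) -> List[str]:
    """Return the sorted names of *_uvd.bin and *_smc.bin files in firmware_dir."""
    root = Path(firmware_dir)
    if not root.is_dir():
        return []
    suffixes = ("_uvd.bin", "_smc.bin")
    names = [
        p.name
        for p in root.iterdir()
        if p.is_file() and p.name.lower().endswith(suffixes)
    ]
    return sorted(names)


if __name__ == "__main__":
    # Assumed default location of the radeon firmware; adjust for your distribution.
    for name in find_uvd_smc_blobs("/lib/firmware/radeon"):
        print(name)
```

Moving one listed file at a time out of the directory and retesting suspend/resume shows which image triggers the crash.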
(In reply to comment #6)
> * Regardless of DPM enabled or not I get X server crashes within minutes
> after a suspend/resume cycle. This happens with kernel 3.11.1 with current
> firmware. When I downgrade my kernel-firmware package to 20130114git-1.2.1,
> this problem vanishes. But the firmware may be the original cause. With the
> old firmware I for example do not have direct rendering or acceleration.
I have the same problem with kernel 3.11.6 on Gentoo Linux (stable) with a Radeon HD 4650. Hibernation itself works, but shortly after resuming the machine Xorg crashes with a bus error. Additionally dmesg contains messages about GPU lockups before or after hibernating. Removing the uvd/smc blobs stops Xorg from crashing, but disables direct rendering (including XVideo, …). Note that Xorg does *not* crash after resuming from suspend to RAM.
Downgrading to 3.10 (with the same firmware package) does not solve the problem, so I’m back to 3.4.67 for now, which works (both hibernate/direct rendering) for me.
Good news: somewhere between kernel 3.12.1 and 3.13-rc2 the hard lockup on suspend got fixed! I've suspended several times now without a single lockup. On resume, though, I still get lockups about half of the time. This is without UVD firmware but with DPM active.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/375.