Bug 107781

Summary: amdgpu hangs on resume on Lenovo A475
Product: DRI Reporter: Alex Findler <walthervondervogelweide>
Component: DRM/AMDgpu Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: medium CC: walthervondervogelweide
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
inxi output none
lspci output none
dmesg output after hang none

Description Alex Findler 2018-09-02 13:11:08 UTC
Created attachment 141410 [details]
inxi output

The system doesn't wake up correctly from suspend to RAM. Sometimes it works, but often the screen comes back black, or frozen on the last view before suspend. It happens more often when docked. SSH is still possible in most cases; only the graphical interface is dead.

dmesg shows this:

[32756.191289] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, last signaled seq=168244, last emitted seq=168246
[32756.191301] [drm] IP block:sdma_v3_0 is hung!
[32756.191305] [drm] GPU recovery disabled.

It doesn't matter whether suspend was called via lid switch, button, GUI or shell script, or if the lid signal is disabled or not.

OS: Fedora 28, kernel 4.17.19-200.fc28.x86_64
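
For anyone triaging a similar hang, the ring timeout lines quoted above can be filtered out of the kernel log with a small shell helper. This is only a sketch: the function name find_ring_timeouts and the grep pattern are illustrative assumptions based on the single message shown in this report, and other kernel versions may word the message differently.

```shell
# Hypothetical helper (not part of the original report):
# reads kernel log text on stdin and prints only the amdgpu ring timeout lines.
find_ring_timeouts() {
    # Pattern is modelled on the message quoted in the description, e.g.
    # "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, ..."
    grep -E 'amdgpu_job_timedout.*ring [^ ]+ timeout'
}

# Typical use after a failed resume (reading dmesg may require root):
#   dmesg | find_ring_timeouts
```

The helper exits non-zero when no timeout line is present, so it can also be used in a script condition.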
Comment 1 Alex Findler 2018-09-02 13:12:20 UTC
Created attachment 141411 [details]
lspci output
Comment 2 Alex Findler 2018-09-02 13:13:00 UTC
Created attachment 141412 [details]
dmesg output after hang
Comment 3 Michel Dänzer 2018-09-03 09:23:30 UTC
Did it work with older kernels?

Does iommu=soft on the kernel command line avoid the problem?
Comment 4 Alex Findler 2018-09-03 10:34:30 UTC
I've tried with all kernels that Fedora offered during the last two months, including 4.16 varieties. 

These are the kernel options I tried:
radeon.cik_support=0 amdgpu.cik_support=1 scsi_mod.scan=sync amdgpu.gpu_recovery=1 amdgpu.dc=1

I have now added the option iommu=soft, and it seems to solve the problem! I have tried suspend via script (echo mem > /sys/power/state), via hardware button, and via plasmashell option, and all come back, including wake-up by lid signal. I'll test this for a couple of days now and see if it works persistently, then report back. 

Thanks for your quick reaction, it is much appreciated.
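
Before drawing conclusions from a suspend test, it can help to confirm that the option really reached the running kernel. The following is a minimal sketch; the function name has_kernel_arg is hypothetical, and the grubby command in the comment is an assumption about how the option might be added on Fedora, since the comment above does not say how it was set.

```shell
# Hypothetical helper (not from the original thread):
# has_kernel_arg ARG [FILE] succeeds if ARG appears as a whole
# space-separated word in the kernel command line.
# FILE defaults to /proc/cmdline and is only a parameter so the
# function can be exercised against a sample file.
has_kernel_arg() {
    arg=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then an exact, fixed-string, whole-line match.
    tr ' ' '\n' < "$file" | grep -Fxq -- "$arg"
}

# Example use:
#   has_kernel_arg iommu=soft && echo "iommu=soft is active"
#
# Assumption: on Fedora the option can be added persistently with
#   sudo grubby --update-kernel=ALL --args="iommu=soft"
# The suspend-via-script test mentioned above is then (as root):
#   echo mem > /sys/power/state
```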
Comment 5 Michel Dänzer 2018-09-03 15:45:43 UTC
Any chance you can try if the problem also occurs with an upstream 4.17 or 4.18 kernel?
Comment 6 Alex Findler 2018-09-03 18:46:16 UTC
I was afraid you'd ask that. Last time I built a kernel myself I was on Slackware 8. But I cloned the linux git repo yesterday in anticipation. I'll give it a try.

https://fedoraproject.org/wiki/Building_a_custom_kernel

I'll use these instructions. Which remote and baseline should I choose?
Comment 7 Michel Dänzer 2018-09-04 10:35:09 UTC
(In reply to Alex Findler from comment #6)
> I'll use these instructions. Which remote and baseline should I choose?

The linux-4.17.y or linux-4.18.y branch from linux-stable.git .
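
As a rough sketch of how the branch name follows from a kernel version: stable branches are named after the first two version fields plus ".y". The function name stable_branch_for is hypothetical, and the clone URL in the comment is an assumption about where linux-stable.git is hosted on kernel.org.

```shell
# Hypothetical helper (not from the original thread):
# maps a kernel version such as 4.18.5 to its stable branch name.
stable_branch_for() {
    # Keep only the first two dot-separated fields: 4.18.5 -> 4.18
    series=$(printf '%s\n' "$1" | cut -d. -f1,2)
    printf 'linux-%s.y\n' "$series"
}

# Example use (assumed URL for the stable tree):
#   git clone --depth 1 --branch "$(stable_branch_for 4.18.5)" \
#       https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
```

A shallow clone of a single branch keeps the download small when the goal is only to build one test kernel.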
Comment 8 Alex Findler 2018-09-04 20:35:07 UTC
med@sorceress:~ 0> uname -av
Linux sorceress 4.18.5med+ #1 SMP Tue Sep 4 17:42:26 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux

So far, so good. I booted with these kernel options: 

resume=/dev/mapper/fedora_linux-swap rd.lvm.lv=fedora_linux/root rd.luks.uuid=luks-deadfb82-77b2-4047-89d9-ea0ba6f38d03 rd.lvm.lv=fedora_linux/swap rhgb quiet uvcvideo.quirks=0x100 scsi_mod.scan=sync

And suspend appears to work. I'll keep testing this for a couple of days. If you want me to try something specific, I'd be glad to do so.
Comment 9 Alex Findler 2018-09-06 15:36:24 UTC
I've suspended by closing the lid, pushing the power button, using the Plasma option, docked, undocked, docked while suspended, and it all works quite well with 4.18.5.

Thank you very much for this quick resolution, thanks for the good work.
