My laptop does not resume properly after a suspend. The problem seems to be with the amdgpu kernel module; a failed resume is often accompanied by a long stream of errors in dmesg. Here's a sampling:

[ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b020504
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 1494.980561] VM fault (0x00, vmid 0) at page 0, read from '' (0x00000000) (0)
[ 1494.995478] systemd-journald[498]: /dev/kmsg buffer overrun, some messages lost.
[ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b0a4004
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 1494.980561] VM fault (0x00, vmid 0) at page 0, read from '' (0x00000000) (0)
[ 1494.995486] systemd-journald[498]: /dev/kmsg buffer overrun, some messages lost.

Any ideas? I'm running kernel 4.2.4-1-default under openSUSE Tumbleweed.
Here's some hwinfo data:

09: PCI 01.0: 0300 VGA compatible controller (VGA)
  [Created at pci.366]
  Unique ID: vSkL.bMI5Iw7ysWD
  SysFS ID: /devices/pci0000:00/0000:00:01.0
  SysFS BusID: 0000:00:01.0
  Hardware Class: graphics card
  Model: "ATI Carrizo"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x9874 "Carrizo"
  SubVendor: pci 0x103c "Hewlett-Packard Company"
  SubDevice: pci 0x80af
  Revision: 0xc5
  Driver: "amdgpu"
  Driver Modules: "drm"
  Memory Range: 0xe0000000-0xefffffff (ro,non-prefetchable)
  Memory Range: 0xf0000000-0xf07fffff (ro,non-prefetchable)
  I/O Ports: 0xf000-0xf0ff (rw)
  Memory Range: 0xff700000-0xff73ffff (rw,non-prefetchable)
  Memory Range: 0xff740000-0xff75ffff (ro,non-prefetchable,disabled)
  IRQ: 47 (129945 events)
  Module Alias: "pci:v00001002d00009874sv0000103Csd000080AFbc03sc00i00"
  Driver Info #0:
    Driver Status: amdgpu is active
    Driver Activation Cmd: "modprobe amdgpu"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
Can you try kernel 4.3?
Please attach your xorg log and dmesg output.
Created attachment 119452: dmesg output from 4.3.0-1.g7b374a4-default
Created attachment 119453: Xorg.0.log from 4.3.0-1.g7b374a4-default
I've attached dmesg and Xorg.0.log files for 4.3.0-1.g7b374a4. It appears that the GPU faults have gone away, but the visual symptom is still the same; the screen is blank after a resume. You'll also note that there are a *lot* of "xhci_hcd 0000:00:10.0: WARN Successful completion on short TX" messages in the dmesg output. I suspect they're unrelated, but they don't appear under 4.2.3-1.4.
Does booting with acpi_osi=Linux on the kernel command line help? A lot of new laptops use D3cold to support Windows 10, which Linux in general doesn't support at the moment.
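Since a misspelled boot parameter is silently ignored, it can be worth confirming that the option actually reached the running kernel. The kernel's ACPI parameter is spelled acpi_osi. Below is a minimal sketch of such a check; the helper names (parse_cmdline, has_param, running_cmdline) are made up for illustration, and the naive whitespace split is an assumption that does not handle quoted values containing spaces.

```python
from pathlib import Path


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}.

    Bare flags such as "quiet" map to None. This is a naive whitespace
    split and does not handle quoted values that contain spaces.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def has_param(cmdline: str, name: str, value: str) -> bool:
    """Return True if `name=value` appears on the given command line."""
    return parse_cmdline(cmdline).get(name) == value


def running_cmdline() -> str:
    """Read the command line the current kernel was booted with (Linux only)."""
    return Path("/proc/cmdline").read_text()
```

After rebooting with the option set, has_param(running_cmdline(), "acpi_osi", "Linux") should return True; if it returns False, the parameter was mistyped or never made it into the boot entry.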
acpi_osi=Linux doesn't seem to help. I have found that it does recover sometimes, albeit rarely, and more often when running GNOME with Wayland rather than X11, and sometimes only after Ctrl-Alt-Backspace. I haven't done all that much testing, though, so this may all simply be coincidence. Any other debugging I could do?
Created attachment 119962: suspend1.dmesg from 4.3.0-6.g6b3b033-default
Created attachment 119963: suspend2.dmesg from 4.3.0-6.g6b3b033-default
Created attachment 119964: suspend3.dmesg from 4.3.0-6.g6b3b033-default
Created attachment 119965: Xorg.0.log from 4.3.0-6.g6b3b033-default
Over the past couple of Tumbleweed kernel upgrades (most recently kernel-default-4.3.0-6.1.g6b3b033), I've noticed that resumes succeed sometimes, and that if a resume fails, another one or two suspend/resume cycles will result in a successful resume. FYI, I'm also using ucode-amd-20151109git-35.1 and kernel-firmware-20151109git-35.1.

I have attached Xorg.0.log and the following three "dmesg -c" outputs:

suspend1.dmesg - before any suspends
suspend2.dmesg - after two suspends, the first of which failed
suspend3.dmesg - after one suspend that succeeded
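For anyone comparing saved dmesg captures like these, a rough sketch of how the "GPU fault detected" lines could be pulled out and counted per file follows. The regular expression is based only on the fault lines quoted earlier in this report, and the function names (gpu_faults, summarize) are hypothetical; other kernel versions may format the message differently.

```python
import re

# Pattern modeled on lines such as:
#   amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b020504
FAULT_RE = re.compile(
    r"amdgpu (\S+): GPU fault detected: (\d+) (0x[0-9a-fA-F]+)"
)


def gpu_faults(text: str) -> list:
    """Return (pci_device, number, hex_value) for each fault line in text."""
    return FAULT_RE.findall(text)


def summarize(paths: list) -> dict:
    """Map each dmesg capture file to the number of GPU fault lines in it."""
    counts = {}
    for path in paths:
        # errors="replace" keeps stray non-UTF-8 bytes from aborting the scan.
        with open(path, encoding="utf-8", errors="replace") as handle:
            counts[path] = len(gpu_faults(handle.read()))
    return counts
```

Calling summarize(["suspend1.dmesg", "suspend2.dmesg", "suspend3.dmesg"]) would give a per-file fault count, which makes it easier to see whether a failed resume coincides with GPU faults at all.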
David, can you try a newer kernel? I get the same errors with Fedora kernel 4.5.4 on

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT / Amethyst XT [Radeon R9 380X / R9 M295X] [1002:6938] (rev f1)

but the errors are non-fatal, i.e. there is graphics corruption but no hangs. Restarting X removes the graphics corruption.
(In reply to Vedran Miletić from comment #13)
> David, can you try newer kernel? I get the same errors with Fedora kernel
> 4.5.4 on
>
> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
> [AMD/ATI] Tonga XT / Amethyst XT [Radeon R9 380X / R9 M295X] [1002:6938]
> (rev f1)
>
> but the errors are non-fatal, i.e. there is graphics corruption but no
> hangs. Restarting X removes graphics corruption.

Sorry, Vedran, but I ended up buying another laptop (this time from a company that specializes in Linux laptops), so I no longer have a testbed for this.