| Summary: | amdgpu does not resume properly from suspend | ||
|---|---|---|---|
| Product: | DRI | Reporter: | David Walker <David> |
| Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | normal | ||
| Priority: | medium | CC: | tiwai, vedran |
| Version: | XOrg git | ||
| Hardware: | x86-64 (AMD64) | ||
| OS: | Linux (All) | ||
Can you try kernel 4.3? Please attach your xorg log and dmesg output.

Created attachment 119452
dmesg output from 4.3.0-1.g7b374a4-default

Created attachment 119453
Xorg.0.log from 4.3.0-1.g7b374a4-default
I've attached dmesg and Xorg.0.log files for 4.3.0-1.g7b374a4. It appears that the GPU faults have gone away, but the visual symptom is still the same: the screen is blank after a resume. You'll also note that there are a *lot* of "xhci_hcd 0000:00:10.0: WARN Successful completion on short TX" messages in the dmesg output. I suspect they're unrelated, but they don't appear under 4.2.3-1.4.

Does booting with acpi_osi=Linux on the kernel command line help? A lot of new laptops use d3cold to support Windows 10, which Linux in general doesn't support at the moment.

acpi_osi=Linux doesn't seem to help. I have found that it does recover sometimes, albeit rarely, and more often when running GNOME with Wayland rather than X11, and sometimes only after Control-Alt-Backspace. I haven't done all that much testing, though, so this may all simply be coincidence. Is there any other debugging I could do?

Created attachment 119962
suspend1.dmesg from 4.3.0-6.g6b3b033-default

Created attachment 119963
suspend2.dmesg from 4.3.0-6.g6b3b033-default

Created attachment 119964
suspend3.dmesg from 4.3.0-6.g6b3b033-default

Created attachment 119965
Xorg.0.log from 4.3.0-6.g6b3b033-default
Over the past couple of Tumbleweed kernel upgrades (most recently kernel-default-4.3.0-6.1.g6b3b033), I've noticed that resumes sometimes succeed, and that if a resume fails, another one or two suspend/resume cycles will result in a successful resume. FYI, I'm also using ucode-amd-20151109git-35.1 and kernel-firmware-20151109git-35.1. I have attached Xorg.0.log and the following three "dmesg -c" outputs:

suspend1.dmesg - before any suspends
suspend2.dmesg - after two suspends, the first of which failed
suspend3.dmesg - after one suspend that succeeded

David, can you try newer kernel? I get the same errors with Fedora kernel 4.5.4 on

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Tonga XT / Amethyst XT [Radeon R9 380X / R9 M295X] [1002:6938] (rev f1)

but the errors are non-fatal, i.e. there is graphics corruption but no hangs. Restarting X removes graphics corruption.

(In reply to Vedran Miletić from comment #13)
> David, can you try newer kernel? I get the same errors with Fedora kernel
> 4.5.4 on
>
> 01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc.
> [AMD/ATI] Tonga XT / Amethyst XT [Radeon R9 380X / R9 M295X] [1002:6938]
> (rev f1)
>
> but the errors are non-fatal, i.e. there is graphics corruption but no
> hangs. Restarting X removes graphics corruption.

Sorry, Vedran, but I ended up buying another laptop (this time from a company that specializes in Linux laptops), so I no longer have a testbed for this.
My laptop does not resume properly after a suspend. The problem seems to be with the amdgpu kernel module; the failure is often accompanied by a long stream of errors in dmesg. Here's a sampling:

[ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b020504
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 1494.980561] VM fault (0x00, vmid 0) at page 0, read from '' (0x00000000) (0)
[ 1494.995478] systemd-journald[498]: /dev/kmsg buffer overrun, some messages lost.
[ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b0a4004
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
[ 1494.980561] VM fault (0x00, vmid 0) at page 0, read from '' (0x00000000) (0)
[ 1494.995486] systemd-journald[498]: /dev/kmsg buffer overrun, some messages lost.

Any ideas? I'm running the 4.2.4-1-default kernel under openSUSE Tumbleweed.
Here's some hwinfo data:

09: PCI 01.0: 0300 VGA compatible controller (VGA)
  [Created at pci.366]
  Unique ID: vSkL.bMI5Iw7ysWD
  SysFS ID: /devices/pci0000:00/0000:00:01.0
  SysFS BusID: 0000:00:01.0
  Hardware Class: graphics card
  Model: "ATI Carrizo"
  Vendor: pci 0x1002 "ATI Technologies Inc"
  Device: pci 0x9874 "Carrizo"
  SubVendor: pci 0x103c "Hewlett-Packard Company"
  SubDevice: pci 0x80af
  Revision: 0xc5
  Driver: "amdgpu"
  Driver Modules: "drm"
  Memory Range: 0xe0000000-0xefffffff (ro,non-prefetchable)
  Memory Range: 0xf0000000-0xf07fffff (ro,non-prefetchable)
  I/O Ports: 0xf000-0xf0ff (rw)
  Memory Range: 0xff700000-0xff73ffff (rw,non-prefetchable)
  Memory Range: 0xff740000-0xff75ffff (ro,non-prefetchable,disabled)
  IRQ: 47 (129945 events)
  Module Alias: "pci:v00001002d00009874sv0000103Csd000080AFbc03sc00i00"
  Driver Info #0:
    Driver Status: amdgpu is active
    Driver Activation Cmd: "modprobe amdgpu"
  Config Status: cfg=new, avail=yes, need=no, active=unknown
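
For anyone comparing dmesg captures across suspend/resume cycles, a minimal Python sketch for tallying the "GPU fault detected" lines quoted in the description might look like the following. The function name, the regex, and the group names `num` and `code` are illustrative assumptions, not part of any amdgpu tooling, and the sketch does not try to interpret what the two numbers mean. The sample text is copied from the sampling in this report.

```python
import re
from collections import Counter

# Matches lines of the form seen in this report, for example:
# [ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b020504
# The group names are hypothetical labels; the numbers are not interpreted here.
FAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+amdgpu\s+(?P<dev>[0-9a-f:.]+):\s+"
    r"GPU fault detected:\s+(?P<num>\d+)\s+(?P<code>0x[0-9a-fA-F]+)"
)


def summarize_gpu_faults(text):
    """Count 'GPU fault detected' entries per (PCI device, hex code) pair."""
    counts = Counter()
    for match in FAULT_RE.finditer(text):
        counts[(match.group("dev"), match.group("code"))] += 1
    return counts


# Sample lines taken from the dmesg excerpt in the bug description.
SAMPLE = """\
[ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b020504
[ 1494.980561] amdgpu 0000:00:01.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 1494.995478] systemd-journald[498]: /dev/kmsg buffer overrun, some messages lost.
[ 1494.980561] amdgpu 0000:00:01.0: GPU fault detected: 146 0x0b0a4004
"""

if __name__ == "__main__":
    for (dev, code), n in sorted(summarize_gpu_faults(SAMPLE).items()):
        print(dev, code, n)
```

Running the same function over each saved capture (for example the contents of suspend1.dmesg, suspend2.dmesg, and suspend3.dmesg read in as text) would give a quick way to see whether the faults appear only after failed resumes. Because the regex is applied with `finditer` over the whole text, it also works on log excerpts that have lost their line breaks.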