Intel Hades Canyon (Kaby Lake-G CPU with Vega-M graphics) appears to panic in IGT igt@gem_exec_suspend@basic-s4-devices.

Panic traces available:
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4443/fi-kbl-8809g/pstore0-1530866512_Panic_1.log
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4444/fi-kbl-8809g/pstore0-1530875939_Panic_1.log
The system hung after resume and owatch panicked, judging by the 60s gap in timestamps. I'd postulate it's related to the warnings we see when it does "successfully" resume.
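For anyone who wants to check a pstore or dmesg log for this kind of timestamp gap, here is a minimal sketch. It assumes the usual "[ seconds.micros]" dmesg prefix (optionally preceded by a "<N>" log level); the function name find_gaps and the 60-second default threshold are illustrative choices, not part of IGT or the CI tooling.

```python
import re

# Leading dmesg timestamp, e.g. "[ 201.720189]" or "<3> [76.484258]".
TS_RE = re.compile(r"^\s*(?:<\d+>\s*)?\[\s*(\d+\.\d+)\]")


def find_gaps(lines, threshold=60.0):
    """Return (previous_ts, next_ts) pairs that are at least
    `threshold` seconds apart. Lines without a timestamp are skipped."""
    gaps = []
    prev = None
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue
        ts = float(m.group(1))
        if prev is not None and ts - prev >= threshold:
            gaps.append((prev, ts))
        prev = ts
    return gaps
```

Feeding it the lines of a saved log (for example `find_gaps(open("panic.log"))`, where the file name is a placeholder) lists the points where the kernel went silent for at least a minute.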
Trace caught. Moving bug to DRM/AMDgpu side.

[ 201.720189] atkbd serio0: Use 'setkeycodes 7c <keycode>' to make it known.
[ 201.781782] done (allocated 209842 pages)
[ 202.536562] DMAR: DRHD: handling fault status reg 3
[ 202.536585] DMAR: [DMA Read] Request device [01:00.0] fault addr 527000 [fault reason 06] PTE Read access is not set
[ 203.554655] [drm:amdgpu_vce_ring_test_ib [amdgpu]] *ERROR* amdgpu: IB test timed out.
[ 203.554684] [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 15 (-110).
[ 203.554687] [drm:process_one_work] *ERROR* ib ring test failed (-110).
[ 207.647706] usb usb1: root hub lost power or was reset
[ 207.647709] usb usb2: root hub lost power or was reset
[ 207.648779] usb usb3: root hub lost power or was reset
[ 207.648783] usb usb4: root hub lost power or was reset
[ 207.977164] [drm:uvd_v6_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 12 test failed (0xCAFEDEAD)
[ 207.977177] [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <uvd_v6_0> failed -22
[ 207.977189] [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
[ 207.977192] dpm_run_callback(): pci_pm_restore+0x0/0xa0 returns -22
[ 207.977198] PM: Device 0000:01:00.0 failed to restore async: error -22
[ 208.256034] atkbd serio0: Unknown key released (translated set 2, code 0x7c on isa0060/serio0).
[ 208.256036] atkbd serio0: Use 'setkeycodes 7c <keycode>' to make it known.
[ 208.459780] atkbd serio0: Unknown key released (translated set 2, code 0x7c on isa0060/serio0).
[ 208.459782] atkbd serio0: Use 'setkeycodes 7c <keycode>' to make it known.
[ 208.664209] atkbd serio0: Unknown key released (translated set 2, code 0x7c on isa0060/serio0).
[ 208.664211] atkbd serio0: Use 'setkeycodes 7c <keycode>' to make it known.
[ 208.671947] Setting dangerous option reset - tainting kernel

Full trace: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4467/fi-kbl-8809g/igt@gem_exec_suspend@basic-s4-devices.html
History: https://intel-gfx-ci.01.org/tree/drm-tip/fi-kbl-8809g.html
Hardware: Intel Hades Canyon NUC8i7HVK (Kaby Lake CPU with "Vega-M" graphics)
Kconfig: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4467/kernel.config.bz2
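To separate the DRM errors in a log like the one above from the surrounding atkbd and USB noise, a rough filter along these lines may help. It is only a sketch: the helper name drm_errors is made up for illustration, and the regex assumes the "[drm:function [module]] *ERROR* message" layout seen in this trace.

```python
import re

# "[ 203.554655] [drm:func [module]] *ERROR* message"; the "[module]"
# part is optional (e.g. "[drm:process_one_work]").
ERR_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+\[drm:(\w+)(?:\s+\[(\w+)\])?\]\s+\*ERROR\*\s+(.*)"
)


def drm_errors(lines):
    """Return (timestamp, function, module, message) for each DRM
    *ERROR* line; module is None when the log line does not name one."""
    found = []
    for line in lines:
        m = ERR_RE.search(line)
        if m:
            found.append((float(m.group(1)), m.group(2), m.group(3), m.group(4)))
    return found
```

Grouping the result by the module field then shows whether the failures come from amdgpu or from i915, which is the distinction that decides which component the bug belongs to.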
Bug assessment:

The IGT test submits batch buffers both before and after hibernating the system and ensures they complete successfully.

This bug has not been updated in over a year. The system is a KBL NUC with an AMD GPU. From the earlier comments, it seems there was a panic on the amdgpu side, and the component was assigned to DRM/AMDgpu.

At this point, from what I can tell, the test itself is passing except for the following trace in dmesg:

<3> [76.484258] [drm:intel_dp_aux_xfer [i915]] *ERROR* dp aux hw did not signal timeout!

https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5237/fi-kbl-7500u/igt@gem_exec_suspend@basic-s4-devices.html

This trace comes from the i915 DisplayPort code and is already being tracked as part of bug 105128.

As a result of these findings I am:
a. Setting the component of this bug to DRM/Intel
b. Marking this bug as a duplicate of bug 105128
*** This bug has been marked as a duplicate of bug 105128 ***