Created attachment 144006 [details] dmesg with amdgpu.dc=1 drm.debug=7 in boot command

We have an Acer Squirtle_SR laptop equipped with an AMD A9-9420e RADEON R5, 5 COMPUTE CORES 2C+3G and an [AMD/ATI] Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445] [1002:6900]. We tested it with Linux kernel 5.1.0-rc5+. The kernel includes the patch [1] mentioned in bug 110360, comment 9 [2].

The screen stays black after the system resumes from suspend. The following error keeps showing up in dmesg:

[ 177.401716] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=290, emitted seq=294
[ 177.401848] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 569 thread Xorg:cs0 pid 571
[ 177.401855] [drm] IP block:gfx_v8_0 is hung!
[ 177.401932] [drm] GPU recovery disabled.

01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445] [1002:6900] (rev c3)
    Subsystem: Acer Incorporated [ALI] Topaz XT [Radeon R7 M260/M265 / M340/M360 / M440/M445] [1025:1217]
    Flags: bus master, fast devsel, latency 0, IRQ 40
    Memory at c0000000 (64-bit, prefetchable) [size=256M]
    Memory at d0000000 (64-bit, prefetchable) [size=2M]
    I/O ports at 3000 [size=256]
    Memory at d1400000 (32-bit, non-prefetchable) [size=256K]
    Expansion ROM at d1440000 [disabled] [size=128K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Legacy Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [150] Advanced Error Reporting
    Capabilities: [270] #19
    Capabilities: [2b0] Address Translation Service (ATS)
    Capabilities: [2c0] Page Request Interface (PRI)
    Capabilities: [2d0] Process Address Space ID (PASID)
    Kernel driver in use: amdgpu
    Kernel modules: amdgpu

[1] https://patchwork.kernel.org/patch/10889269/
[2] https://bugzilla.freedesktop.org/show_bug.cgi?id=110360#c9
Created attachment 144007 [details] dmesg with amdgpu.dc=1 drm.debug=7 amdgpu.runpm=0 in boot command

I also tried with amdgpu.runpm=0 in the boot command. However, it still produces the same error:

[ 78.078762] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=290, emitted seq=294
[ 78.078897] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 572 thread Xorg:cs0 pid 588
[ 78.078908] [drm] IP block:gfx_v8_0 is hung!
[ 78.079079] [drm] GPU recovery disabled.
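When comparing runs with different boot parameters, it can help to confirm that the running kernel really saw them. The following is a minimal sketch in Python, not something taken from the attachments: the function names `parse_cmdline` and `read_cmdline` are made up for illustration, and the parser assumes no quoted values containing spaces.

```python
def parse_cmdline(text):
    """Split a kernel command line into a dict of parameter -> value.

    Parameters without '=' (for example 'quiet') map to None.
    Assumption: no quoted values containing spaces.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read and parse the command line of the running kernel."""
    with open(path) as f:
        return parse_cmdline(f.read())


if __name__ == "__main__":
    params = read_cmdline()
    # Report the parameters used in this bug report; 'not set' if absent.
    for name in ("amdgpu.dc", "amdgpu.runpm", "drm.debug"):
        print(name, "=", params.get(name, "not set"))
```

Running it on the affected machine prints the values the kernel was booted with, so a typo in the bootloader entry is easy to spot before reading through dmesg.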
Created attachment 144008 [details] lspci -nnv on Acer Squirtle_SR
Created attachment 144030 [details] dmesg with amdgpu.dc=1 drm.debug=7 in boot command on Acer TravelMate B114-21

We have another laptop, an Acer TravelMate B114-21, which hits the same issue. It is equipped with an AMD A4-9120C RADEON R4, 5 COMPUTE CORES 2C+3G.

[ 60.011965] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=206, emitted seq=208
[ 60.012215] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1388 thread gnome-shel:cs0 pid 1409
[ 60.012226] [drm] IP block:gfx_v8_0 is hung!
[ 60.012320] [drm] GPU recovery disabled.

00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Stoney [Radeon R2/R3/R4/R5 Graphics] [1002:98e4] (rev eb) (prog-if 00 [VGA controller])
    Subsystem: Acer Incorporated [ALI] Stoney [Radeon R2/R3/R4/R5 Graphics] [1025:132a]
    Flags: bus master, fast devsel, latency 0, IRQ 36
    Memory at e8000000 (64-bit, prefetchable) [size=128M]
    Memory at f0000000 (64-bit, prefetchable) [size=8M]
    I/O ports at f000 [size=256]
    Memory at fea00000 (32-bit, non-prefetchable) [size=256K]
    Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: [48] Vendor Specific Information: Len=08 <?>
    Capabilities: [50] Power Management version 3
    Capabilities: [58] Express Root Complex Integrated Endpoint, MSI 00
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Capabilities: [270] #19
    Capabilities: [2b0] Address Translation Service (ATS)
    Capabilities: [2c0] Page Request Interface (PRI)
    Capabilities: [2d0] Process Address Space ID (PASID)
    Kernel driver in use: amdgpu
    Kernel modules: amdgpu

I also tried with amdgpu.runpm=0 in the boot command, but the issue can still be reproduced.
Created attachment 144031 [details] lspci -nnv on Acer TravelMate B114-21
Created attachment 144042 [details] journal log on Acer TravelMate B114-21

Got more information after waiting longer for the resume on the Acer TravelMate B114-21.

Apr 19 15:06:38 endless kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2841, emitted seq=2845
Apr 19 15:06:38 endless kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 695 thread Xorg:cs0 pid 698
Apr 19 15:06:38 endless kernel: [drm] IP block:gfx_v8_0 is hung!
Apr 19 15:06:38 endless kernel: [drm] GPU recovery disabled.
Apr 19 15:06:40 endless kernel: INFO: task Xorg:695 blocked for more than 604 seconds.
Apr 19 15:06:40 endless kernel: Tainted: G W 5.1.0-rc5+ #1
Apr 19 15:06:40 endless kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 19 15:06:40 endless kernel: Xorg D 0 695 683 0x00400004
Apr 19 15:06:40 endless kernel: Call Trace:
Apr 19 15:06:40 endless kernel: __schedule+0x2d4/0x840
Apr 19 15:06:40 endless kernel: schedule+0x2c/0x70
Apr 19 15:06:40 endless kernel: schedule_timeout+0x258/0x360
Apr 19 15:06:40 endless kernel: ? amdgpu_atom_execute_table_locked+0x136/0x210 [amdgpu]
Apr 19 15:06:40 endless kernel: dma_fence_default_wait+0x20a/0x280
Apr 19 15:06:40 endless kernel: ? dma_fence_release+0xa0/0xa0
Apr 19 15:06:40 endless kernel: dma_fence_wait_timeout+0xe7/0x110
Apr 19 15:06:40 endless kernel: amdgpu_fence_wait_empty+0x61/0xc0 [amdgpu]
Apr 19 15:06:40 endless kernel: amdgpu_pm_compute_clocks+0x70/0x590 [amdgpu]
Apr 19 15:06:40 endless kernel: dm_pp_apply_display_requirements+0x19a/0x1b0 [amdgpu]
Apr 19 15:06:40 endless kernel: dce11_pplib_apply_display_requirements+0x1f4/0x210 [amdgpu]
Apr 19 15:06:40 endless kernel: dce11_update_clocks+0xa0/0x100 [amdgpu]
Apr 19 15:06:40 endless kernel: dce110_prepare_bandwidth+0x3e/0x50 [amdgpu]
Apr 19 15:06:40 endless kernel: dc_commit_state+0x22d/0x5a0 [amdgpu]
Apr 19 15:06:40 endless kernel: ? drm_calc_timestamping_constants+0x106/0x150 [drm]
Apr 19 15:06:40 endless kernel: amdgpu_dm_atomic_commit_tail+0x1fb/0x1930 [amdgpu]
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x40/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x40/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x40/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x40/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x40/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x40/0x70
Apr 19 15:06:40 endless kernel: ? __switch_to_xtra+0x3b8/0x5b0
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? ttm_bo_mem_compat+0x28/0x60 [ttm]
Apr 19 15:06:40 endless kernel: ? ttm_bo_validate+0x3d/0x130 [ttm]
Apr 19 15:06:40 endless kernel: ? __switch_to+0x48b/0x4f0
Apr 19 15:06:40 endless kernel: ? __switch_to_asm+0x34/0x70
Apr 19 15:06:40 endless kernel: ? __schedule+0x2dc/0x840
Apr 19 15:06:40 endless kernel: ? amdgpu_bo_pin_restricted+0x1a2/0x270 [amdgpu]
Apr 19 15:06:40 endless kernel: ? _cond_resched+0x19/0x30
Apr 19 15:06:40 endless kernel: ? wait_for_completion_timeout+0x38/0x140
Apr 19 15:06:40 endless kernel: ? _cond_resched+0x19/0x30
Apr 19 15:06:40 endless kernel: ? wait_for_completion_interruptible+0x35/0x1a0
Apr 19 15:06:40 endless kernel: commit_tail+0x42/0x70 [drm_kms_helper]
Apr 19 15:06:40 endless kernel: ? commit_tail+0x42/0x70 [drm_kms_helper]
Apr 19 15:06:40 endless kernel: drm_atomic_helper_commit+0x113/0x120 [drm_kms_helper]
Apr 19 15:06:40 endless kernel: amdgpu_dm_atomic_commit+0x9b/0xe0 [amdgpu]
Apr 19 15:06:40 endless kernel: drm_atomic_commit+0x4a/0x50 [drm]
Apr 19 15:06:40 endless kernel: drm_atomic_helper_set_config+0x87/0x90 [drm_kms_helper]
Apr 19 15:06:40 endless kernel: drm_mode_setcrtc+0x1bb/0x740 [drm]
Apr 19 15:06:40 endless kernel: ? drm_is_current_master+0x1f/0x40 [drm]
Apr 19 15:06:40 endless kernel: ? drm_mode_getcrtc+0x1a0/0x1a0 [drm]
Apr 19 15:06:40 endless kernel: drm_ioctl_kernel+0xb0/0x100 [drm]
Apr 19 15:06:40 endless kernel: drm_ioctl+0x233/0x410 [drm]
Apr 19 15:06:40 endless kernel: ? drm_mode_getcrtc+0x1a0/0x1a0 [drm]
Apr 19 15:06:40 endless kernel: amdgpu_drm_ioctl+0x4f/0x80 [amdgpu]
Apr 19 15:06:40 endless kernel: do_vfs_ioctl+0xa9/0x640
Apr 19 15:06:40 endless kernel: ? tomoyo_file_ioctl+0x19/0x20
Apr 19 15:06:40 endless kernel: ksys_ioctl+0x67/0x90
Apr 19 15:06:40 endless kernel: __x64_sys_ioctl+0x1a/0x20
Apr 19 15:06:40 endless kernel: do_syscall_64+0x5a/0x110
Apr 19 15:06:40 endless kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Apr 19 15:06:40 endless kernel: RIP: 0033:0x7f36f7126777
Apr 19 15:06:40 endless kernel: Code: Bad RIP value.
Apr 19 15:06:40 endless kernel: RSP: 002b:00007ffeb62a80d8 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
Apr 19 15:06:40 endless kernel: RAX: ffffffffffffffda RBX: 00007ffeb62a8110 RCX: 00007f36f7126777
Apr 19 15:06:40 endless kernel: RDX: 00007ffeb62a8110 RSI: 00000000c06864a2 RDI: 000000000000000d
Apr 19 15:06:40 endless kernel: RBP: 00007ffeb62a8110 R08: 0000000000000000 R09: 00005652f3eb9510
Apr 19 15:06:40 endless kernel: R10: 00007ffeb62a81d0 R11: 0000000000003246 R12: 00000000c06864a2
Apr 19 15:06:40 endless kernel: R13: 000000000000000d R14: 0000000000000000 R15: 00005652f3eb9510
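For comparing the timeout lines across the machines in this report, a small parsing sketch in Python may be handy. It is only an illustration: the regular expression is derived from the `ring gfx timeout` lines quoted here, and the name `parse_ring_timeout` is hypothetical. The difference between the emitted and signaled sequence numbers is read here as the number of fences still outstanding on the ring, which is an interpretation and not something the driver prints.

```python
import re

# Matches the amdgpu "ring <name> timeout" line as quoted in this report.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Return (ring, signaled, emitted, outstanding), or None if no match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    signaled = int(m.group("signaled"))
    emitted = int(m.group("emitted"))
    # outstanding = fences emitted but not yet signaled (interpretation)
    return m.group("ring"), signaled, emitted, emitted - signaled


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    for line in sys.stdin:
        parsed = parse_ring_timeout(line)
        if parsed is not None:
            print(parsed)
```

Piping `dmesg` or `journalctl -k` output through it lists every timeout with its ring name and sequence numbers, which makes it easier to see whether the hang always leaves the same number of jobs pending.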
Vega56
Ryzen 2700x
Kernel 5.0.3
Mesa latest master git
libdrm latest master git
llvm 8

I have the same problem when I use DXVK with the free version of Assassin's Creed.

[ 3137.670744] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=191619, emitted seq=191621
[ 3137.670765] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process ACU.exe pid 8085 thread ACU.exe:cs0 pid 8118
[ 3137.670767] amdgpu 0000:1f:00.0: GPU reset begin!
[ 3147.900752] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:47:crtc-0] hw_done or flip_done timed out
I am having very similar issues and see similar errors in the logs. The most recent error was:

kernel: amdgpu 0000:06:00.0: [gfxhub] no-retry page fault (src_id:0 ring:24 vmid:1 pasid:32768, for process Xorg pid 1301 thread Xorg:cs0 pid 1362)
kernel: amdgpu 0000:06:00.0: in page starting at address 0x0000800108a18000 from 27
kernel: amdgpu 0000:06:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00101031

The laptop is then unusable and requires a hard reboot.

Linux Mint 19.1
Kernel 5.1.0
AMD Ryzen PRO 2700U with Vega 10 graphics

Trying to load Cities: Skylines is a guaranteed crash.
This is probably related to bug 102322, yes?
Created attachment 144900 [details] Thinkpad E585 log file with amdgpu errors

I'm running into an issue that I think is related to this. I have attached a journal file containing the traces from the last boot where it occurred. For some reason, it doesn't happen every time I try to resume from suspend, but when it does I have no choice but to hard reboot.

This is a Thinkpad E585, uname -a:
"Linux thonkpad 5.2.3-arch1-1-ARCH #1 SMP PREEMPT Fri Jul 26 08:13:47 UTC 2019 x86_64 GNU/Linux"
The patch is on its way:
https://bugs.freedesktop.org/show_bug.cgi?id=110258#c12
(In reply to Eugene Bright from comment #10)
> The patch is on its way
> https://bugs.freedesktop.org/show_bug.cgi?id=110258#c12

I applied the patch on top of Linux stable 5.2.8. It fixed this issue. Thank you so much!
*** This bug has been marked as a duplicate of bug 110258 ***
Hello. Please explain: why did everything work fine with the FX-8320 CPU, but after upgrading to a Ryzen 5 1600 I see OS freezes and this bug? Is the PCIe generation a cause? Planned obsolescence? Or is it a coincidence with an amdgpu driver update?

Part of my log:

[49266.138534] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=5660155, emitted seq=5660157
[49266.138578] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Civ6Sub pid 1778 thread Civ6Sub:cs0 pid 1781
[49266.138580] [drm] GPU recovery disabled.
[49275.866518] INFO: task Xorg:sh1:1789 blocked for more than 122 seconds.
[49275.866521] Tainted: G R O 5.2.10 #2

Radeon 7970
mesa utils (8.4.0-1)
Linux 5.2.10
amdgpu Version: 18.1.99+git20190207-1