Summary: | Annoying GPU stucks are continued on Vega 20 with Kernel 5.4 + mesa 9.2.0 RC4 + llvm 9.0.0 [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
---|---
Product: | DRI
Reporter: | mikhail.v.gavrilov
Component: | DRM/AMDgpu
Assignee: | Default DRI bug account <dri-devel>
Status: | RESOLVED MOVED
QA Contact: |
Severity: | not set
Priority: | not set
CC: | polynomial-c
Version: | XOrg git
Hardware: | Other
OS: | All
Whiteboard: |
i915 platform: |
i915 features: |
Attachments: |

Description
mikhail.v.gavrilov
2019-09-24 18:53:35 UTC
Created attachment 145495
./umr -O halt_waves -wa
Created attachment 145496
./umr -R gfx[.]
Created attachment 145497
./umr -O many,bits -r *.*.mmGRBM_STATUS*
Created attachment 145498
./umr -O many,bits -r *.*.mmCP_EOP_*
Created attachment 145499
./umr -O many,bits -r *.*.mmCP_PFP_HEADER_DUMP
Created attachment 145500
./umr -O many,bits -r *.*.mmCP_ME_HEADER_DUMP
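
For anyone who wants to gather the same set of dumps after a hang, the six commands above could be scripted roughly as follows. This is only a sketch: the `./umr` path is taken from the commands as posted, while the output file names, the `umr-dumps` directory, and the helper names `build_commands` and `collect` are made up for illustration. It also assumes `umr` is run with sufficient privileges (likely root).

```python
import subprocess
from pathlib import Path

# Arguments copied from the attachment list above.
# The output file names (dict keys) are hypothetical.
UMR_DUMPS = {
    "halt_waves.txt": ["-O", "halt_waves", "-wa"],
    "gfx_ring.txt": ["-R", "gfx[.]"],
    "grbm_status.txt": ["-O", "many,bits", "-r", "*.*.mmGRBM_STATUS*"],
    "cp_eop.txt": ["-O", "many,bits", "-r", "*.*.mmCP_EOP_*"],
    "cp_pfp_header_dump.txt": ["-O", "many,bits", "-r", "*.*.mmCP_PFP_HEADER_DUMP"],
    "cp_me_header_dump.txt": ["-O", "many,bits", "-r", "*.*.mmCP_ME_HEADER_DUMP"],
}


def build_commands(umr="./umr"):
    """Return {output file name: argv list} for every dump."""
    return {name: [umr, *args] for name, args in UMR_DUMPS.items()}


def collect(out_dir="umr-dumps", umr="./umr"):
    """Run each command and save its stdout and stderr to a file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, argv in build_commands(umr).items():
        # argv is passed as a list, so no shell is involved and the
        # '*' patterns reach umr unexpanded.
        result = subprocess.run(argv, capture_output=True, text=True)
        (out / name).write_text(result.stdout + result.stderr)
```

Calling `collect()` from the directory that contains the `umr` binary would write one file per command into `umr-dumps/`.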
Can confirm, I have the same issue with a Vega 64 while gaming (both native and Wine + DXVK). Surprisingly, the dmesg stack mentions the Slack Electron app, which was indeed running in the background.

dmesg stack:
[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=589680, emitted seq=589681
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=5916, emitted seq=5917
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process slack pid 2028 thread slack:cs0 pid 2032
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process slack pid 2028 thread slack:cs0 pid 2032
amdgpu 0000:0d:00.0: GPU reset begin!
amdgpu 0000:0d:00.0: GPU reset begin!
[drm] Bailing on TDR for s_job:8f401, as another already in progress
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:47:crtc-0] flip_done timed out
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
[drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:45:plane-5] flip_done timed out
------------[ cut here ]------------
WARNING: CPU: 9 PID: 937 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:5813 amdgpu_dm_atomic_commit_tail.cold+0x82/0xed [amdgpu]
Modules linked in: cmac rfcomm fuse bridge stp llc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev ... aesni_intel libahci libata aes_x86_64 glue_helper crypto_simd cryptd xhci_pci scsi_mod xhci_hcd
CPU: 9 PID: 937 Comm: Xorg Not tainted 5.3.0-arch1-1-ARCH #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X370 Gaming K4, BIOS P5.50 08/04/2019
RIP: 0010:amdgpu_dm_atomic_commit_tail.cold+0x82/0xed [amdgpu]
Code: c7 c7 58 4d 0a c1 e8 57 22 f1 db 0f 0b 41 83 7c 24 08 00 0f 85 a0 ff f1 ff e9 bb ff f1 ff 48 c7 c7 58 4d 0a c1 e8 38 22 f1 db <0f> 0b e9 3a f5 f ...
RSP: 0018:ffffa20100cc78a0 EFLAGS: 00010046
RAX: 0000000000000024 RBX: ffff92f34e662000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000086 RDI: 00000000ffffffff
RBP: ffffa20100cc7bc0 R08: 00000000000004dc R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000286
R13: ffff92f24c449800 R14: ffff92f3769a0000 R15: ffff92f22460af00
FS: 00007f45b11eedc0(0000) GS:ffff92f37e840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f5b665c8000 CR3: 00000007c99e6000 CR4: 00000000003406e0
Call Trace:
? commit_tail+0x3c/0x70 [drm_kms_helper]
commit_tail+0x3c/0x70 [drm_kms_helper]
drm_atomic_helper_commit+0x108/0x110 [drm_kms_helper]
drm_atomic_helper_legacy_gamma_set+0x11b/0x170 [drm_kms_helper]
drm_mode_gamma_set_ioctl+0x1a9/0x210 [drm]
? drm_color_lut_check+0xb0/0xb0 [drm]
drm_ioctl_kernel+0xb8/0x100 [drm]
drm_ioctl+0x23d/0x3d0 [drm]
? drm_color_lut_check+0xb0/0xb0 [drm]
amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
do_vfs_ioctl+0x43d/0x6c0
? syscall_trace_enter+0x1f2/0x2e0
ksys_ioctl+0x5e/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5f/0x1c0
? prepare_exit_to_usermode+0x85/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f45b242721b

System info:
System: Host: house-of-maker Kernel: 5.3.0-arch1-1-ARCH x86_64 bits: 64 compiler: gcc v: 9.1.0 Desktop: KDE Plasma 5.16.5
Machine: Type: Desktop Mobo: ASRock model: X370 Gaming K4 serial: <root required> UEFI: American Megatrends v: P5.50 date: 08/04/2019
CPU: Topology: 8-Core model: AMD Ryzen 7 1700X bits: 64 type: MT MCP arch: Zen rev: 1 L2 cache: 4096 KiB flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 108622 Speed: 2513 MHz min/max: 2200/3400 MHz Core speeds (MHz): 1: 2440 2: 2570 3: 1725 4: 2371 5: 1712 6: 1740 7: 1711 8: 2396 9: 1711 10: 2367 11: 1862 12: 1711 13: 1754 14: 2398 15: 2682 16: 2368
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] vendor: Sapphire Limited driver: amdgpu v: kernel bus ID: 0d:00.0 Display: x11 server: X.Org 1.20.5 driver: modesetting unloaded: fbdev,vesa resolution: 2560x1440~60Hz OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.33.0 5.3.0-arch1-1-ARCH LLVM 8.0.1) v: 4.5 Mesa 19.1.7 direct render: Yes

Just ran into this with the Vega 64. No games open. Only KDE, suckless terminal, firefox, and remote-viewer. Thankfully, I'm not aware of any negative effects. I'm not even sure I need to reboot, and I only saw this while looking at journalctl for another reason. Currently running kernel 5.3.0, mesa 19.2.0, and llvm 8.0.1. I'm going to upgrade to 5.3.5, 19.2.1, and 9.0.0 soon, but haven't done so yet.

Oct 11 00:13:53 newKvm kernel: [drm] amdgpu_dm_irq_schedule_work FAILED src 11
(yeah, nothing else with this message almost 2 days before this problem)
Oct 12 19:34:58 newKvm kernel: kworker/u65:4 D 0 2652517 2 0x80004080
Oct 12 19:34:58 newKvm kernel: Workqueue: events_unbound commit_work [drm_kms_helper]
Oct 12 19:34:58 newKvm kernel: Call Trace:
Oct 12 19:34:58 newKvm kernel: ? __schedule+0x27f/0x6d0
Oct 12 19:34:58 newKvm kernel: schedule+0x43/0xd0
Oct 12 19:34:58 newKvm kernel: schedule_timeout+0x1cf/0x3d0
Oct 12 19:34:58 newKvm kernel: ? collect_expired_timers+0xb0/0xb0
Oct 12 19:34:58 newKvm kernel: wait_for_common+0xeb/0x190
Oct 12 19:34:58 newKvm kernel: ? wake_up_q+0x60/0x60
Oct 12 19:34:58 newKvm kernel: drm_atomic_helper_wait_for_flip_done+0x5f/0xb0 [drm_kms_helper]
Oct 12 19:34:58 newKvm kernel: amdgpu_dm_atomic_commit_tail+0x1898/0x1d00 [amdgpu]
Oct 12 19:34:58 newKvm kernel: ? commit_tail+0x3c/0x70 [drm_kms_helper]
Oct 12 19:34:58 newKvm kernel: commit_tail+0x3c/0x70 [drm_kms_helper]
Oct 12 19:34:58 newKvm kernel: process_one_work+0x1d1/0x3a0
Oct 12 19:34:58 newKvm kernel: worker_thread+0x4a/0x3d0
Oct 12 19:34:58 newKvm kernel: kthread+0xfb/0x130
Oct 12 19:34:58 newKvm kernel: ? process_one_work+0x3a0/0x3a0
Oct 12 19:34:58 newKvm kernel: ? kthread_park+0x80/0x80
Oct 12 19:34:58 newKvm kernel: ret_from_fork+0x35/0x40

-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/917.
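
When comparing reports like the ones in this thread, it can help to pull the `ring ... timeout` lines out of a dmesg or journalctl dump mechanically. The following is a minimal sketch, not an official tool: the regular expression is written against the message format quoted in the comments above, the `ring_timeouts` helper and `SAMPLE` text are illustrative, and reading `emitted - signaled` as the number of submissions still outstanding on that ring is an assumption about what the two sequence numbers mean.

```python
import re

# Matches lines such as:
#   ... ring sdma0 timeout, signaled seq=589680, emitted seq=589681
# Lines like "ring gfx timeout, but soft recovered" carry no sequence
# numbers and are intentionally not matched.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def ring_timeouts(log_text):
    """Return (ring, signaled, emitted, outstanding) for each timeout line."""
    found = []
    for m in RING_TIMEOUT.finditer(log_text):
        signaled = int(m["signaled"])
        emitted = int(m["emitted"])
        # Assumption: the gap is the count of emitted-but-unsignaled jobs.
        found.append((m["ring"], signaled, emitted, emitted - signaled))
    return found


# Three lines copied from the dmesg output quoted in this thread.
SAMPLE = """\
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=589680, emitted seq=589681
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=5916, emitted seq=5917
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, but soft recovered
"""

for ring, signaled, emitted, outstanding in ring_timeouts(SAMPLE):
    print(f"{ring}: signaled={signaled} emitted={emitted} outstanding={outstanding}")
```

For the sample lines, both SDMA rings show a gap of one between the signaled and emitted sequence numbers, while the soft-recovered gfx timeout is skipped because it reports no sequence numbers.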