| Summary: | AMD GPU Error, random lockup, Ryzen 2500U Vega 8 GPU | | |
|---|---|---|---|
| Product: | DRI | Reporter: | JerryD <jvdelisle> |
| Component: | DRM/AMDgpu | Assignee: | Default DRI bug account <dri-devel> |
| Status: | RESOLVED INVALID | QA Contact: | |
| Severity: | normal | | |
| Priority: | medium | | |
| Version: | unspecified | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| i915 platform: | | i915 features: | |
| Attachments: | | | |

Description
JerryD
2018-05-26 18:13:01 UTC
Please post some information about your hardware and the kernel version you're running. Full dmesg is also good, shows what the driver says when it's loading.

Created attachment 139819 [details]
Full dmesg text
dmesg output
(In reply to Ernst Sjöstrand from comment #1)
> Please post some information about your hardware and the kernel version
> you're running. Full dmesg is also good, shows what the driver says when
> it's loading.

I have attached the dmesg text. I also tried to use ssh from a remote machine to 'see' what is going on, but the ssh session also locks up completely.

I can reproduce this hang when running glxgears with vblank_mode=0. The time it takes is random, something like 10 to 30 minutes.

    Extended renderer info (GLX_MESA_query_renderer):
        Vendor: X.Org (0x1002)
        Device: AMD RAVEN (DRM 3.23.0 / 4.16.12-300.fc28.x86_64, LLVM 6.0.0) (0x15dd)
        Version: 18.0.2
        Accelerated: yes
        Video memory: 223MB
        Unified memory: no
        Preferred profile: core (0x1)
        Max core profile version: 4.5
        Max compat profile version: 3.0
        Max GLES1 profile version: 1.1
        Max GLES[23] profile version: 3.1
    Memory info (GL_ATI_meminfo):
        VBO free memory - total: 223 MB, largest block: 223 MB
        VBO free aux. memory - total: 3067 MB, largest block: 3067 MB
        Texture free memory - total: 223 MB, largest block: 223 MB
        Texture free aux. memory - total: 3067 MB, largest block: 3067 MB
        Renderbuffer free memory - total: 223 MB, largest block: 223 MB
        Renderbuffer free aux. memory - total: 3067 MB, largest block: 3067 MB
    Memory info (GL_NVX_gpu_memory_info):
        Dedicated video memory: 223 MB
        Total available memory: 3291 MB
        Currently available dedicated video memory: 223 MB
    OpenGL vendor string: X.Org
    OpenGL renderer string: AMD RAVEN (DRM 3.23.0 / 4.16.12-300.fc28.x86_64, LLVM 6.0.0)
    OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.0.2
    OpenGL core profile shading language version string: 4.50

I have the same issue.

AMD Ryzen 1800x
Sapphire Vega 56
amdgpu git kernel 4.17.0

This is what dmesg says from time to time, apart from the hangs.

    [Sa Jun 9 17:34:54 2018] [drm:generic_reg_wait] *ERROR* REG_WAIT timeout 10us * 3500 tries - dce_mi_free_dmif line:563
    [Sa Jun 9 17:34:54 2018] WARNING: CPU: 14 PID: 175 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:195 generic_reg_wait+0xe2/0x160
    [Sa Jun 9 17:34:54 2018] Modules linked in: vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O)
    [Sa Jun 9 17:34:54 2018] CPU: 14 PID: 175 Comm: kworker/14:1 Tainted: G O 4.17.0 #1
    [Sa Jun 9 17:34:54 2018] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 4011 04/19/2018
    [Sa Jun 9 17:34:54 2018] Workqueue: events dm_irq_work_func
    [Sa Jun 9 17:34:54 2018] RIP: 0010:generic_reg_wait+0xe2/0x160
    [Sa Jun 9 17:34:54 2018] RSP: 0018:ffffba1941e9fa88 EFLAGS: 00010297
    [Sa Jun 9 17:34:54 2018] RAX: 0000000000000000 RBX: 0000000000000dad RCX: 0000000000000000
    [Sa Jun 9 17:34:54 2018] RDX: 0000000000000000 RSI: ffff985a9ef953b8 RDI: ffff985a9ef953b8
    [Sa Jun 9 17:34:54 2018] RBP: 000000000000000a R08: 0000000000000416 R09: 0000000000000002
    [Sa Jun 9 17:34:54 2018] R10: 0000000000000002 R11: 0000000000000001 R12: ffff985a8c8b5280
    [Sa Jun 9 17:34:54 2018] R13: 00000000000035af R14: 0000000000000010 R15: 0000000000000001
    [Sa Jun 9 17:34:54 2018] FS: 0000000000000000(0000) GS:ffff985a9ef80000(0000) knlGS:0000000000000000
    [Sa Jun 9 17:34:54 2018] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [Sa Jun 9 17:34:54 2018] CR2: 000055e803d42c78 CR3: 00000003c621c000 CR4: 00000000003406e0
    [Sa Jun 9 17:34:54 2018] Call Trace:
    [Sa Jun 9 17:34:54 2018] dce_mi_free_dmif+0x11c/0x1a0
    [Sa Jun 9 17:34:54 2018] dce110_reset_hw_ctx_wrap+0x13b/0x1c0
    [Sa Jun 9 17:34:54 2018] dce110_apply_ctx_to_hw+0x51/0x8c0
    [Sa Jun 9 17:34:54 2018] ? amdgpu_pm_compute_clocks+0xa2/0x570
    [Sa Jun 9 17:34:54 2018] dc_commit_state+0x333/0x5f0
    [Sa Jun 9 17:34:54 2018] ? set_freesync_on_streams.part.6+0x48/0x240
    [Sa Jun 9 17:34:54 2018] ? mod_freesync_set_user_enable+0x116/0x140
    [Sa Jun 9 17:34:54 2018] amdgpu_dm_atomic_commit_tail+0x359/0xd10
    [Sa Jun 9 17:34:54 2018] ? amdgpu_bo_pin_restricted+0x227/0x2e0
    [Sa Jun 9 17:34:54 2018] ? _cond_resched+0x10/0x40
    [Sa Jun 9 17:34:54 2018] ? wait_for_completion_timeout+0x2f/0x130
    [Sa Jun 9 17:34:54 2018] ? _cond_resched+0x10/0x40
    [Sa Jun 9 17:34:54 2018] ? wait_for_completion_interruptible+0x2c/0x160
    [Sa Jun 9 17:34:54 2018] ? dm_plane_helper_prepare_fb+0xea/0x290
    [Sa Jun 9 17:34:54 2018] commit_tail+0x38/0x70
    [Sa Jun 9 17:34:54 2018] drm_atomic_helper_commit+0x11c/0x130
    [Sa Jun 9 17:34:54 2018] dm_restore_drm_connector_state+0x100/0x190
    [Sa Jun 9 17:34:54 2018] handle_hpd_irq+0x81/0xa0
    [Sa Jun 9 17:34:54 2018] dm_irq_work_func+0x49/0x60
    [Sa Jun 9 17:34:54 2018] process_one_work+0x1cc/0x3c0
    [Sa Jun 9 17:34:54 2018] worker_thread+0x26/0x3f0
    [Sa Jun 9 17:34:54 2018] ? trace_event_raw_event_workqueue_execute_start+0xc0/0xc0
    [Sa Jun 9 17:34:54 2018] kthread+0x10e/0x130
    [Sa Jun 9 17:34:54 2018] ? kthread_create_worker_on_cpu+0x70/0x70
    [Sa Jun 9 17:34:54 2018] ret_from_fork+0x22/0x40
    [Sa Jun 9 17:34:54 2018] Code: 24 58 48 8b 4c 24 50 89 ee 8b 54 24 48 48 c7 c7 48 1d 4b 9a 44 89 4c 24 08 e8 6b 70 eb ff 41 83 7c 24 20 01 44 8b 4c 24 08 74 02 <0f> 0b 48 83 c4 10 44 89 c8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f
    [Sa Jun 9 17:34:54 2018] ---[ end trace b03679a92b01c897 ]---

I can't give logs about the hangs because I can't get into the machine via ssh.

I used grubby to add 'idle=nomwait' to my kernel boot command line and the problem seems resolved. The mwait instruction is known to possibly hang threads on some earlier released Ryzen chips, as documented in the AMD Errata.

(In reply to JerryD from comment #6)
> I used grubby to add 'idle=nomwait' to my kernel boot command line and the
> problem seems resolved. The mwait instruction is known to possibly hang
> threads on some earlier released Ryzen chips, as documented in the AMD Errata.

Thanks for the follow-up, resolving accordingly.
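For anyone applying the same workaround, it may help to confirm after a reboot that the parameter actually reached the running kernel. The following is only a minimal sketch, not something posted in this report: the helper names `has_kernel_param` and `read_cmdline` are hypothetical, and it assumes a Linux system where `/proc/cmdline` holds the boot command line.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, param: str) -> bool:
    """Return True if `param` appears as a whole token on the command line.

    Hypothetical helper for illustration; tokens are split on whitespace,
    so 'xidle=nomwait' does not count as a match for 'idle=nomwait'.
    """
    return param in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's boot command line (assumes Linux procfs)."""
    return Path(path).read_text().strip()


# Example with a made-up command line string (not taken from the reporter's machine).
sample = "ro quiet idle=nomwait"
print("idle=nomwait present:", has_kernel_param(sample, "idle=nomwait"))
```

On a live system, `has_kernel_param(read_cmdline(), "idle=nomwait")` would perform the same check against the booted kernel. Matching whole tokens rather than substrings avoids false positives from similarly named parameters.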