To reproduce, play Serious Sam Fusion (Vulkan rendering). X should freeze after a few minutes; at the start of SS:TFE, venturing out into the desert or killing a gnaar should be enough to trigger it.

It looks like a fence + buffer object 'interaction'.

This appears to affect all Linux 4.9.x and amd-staging-4.9; other kernels are as yet untested.

Mesa 17.0.2 or git, libdrm 2.4.75, llvm 4.0~svn294803. Hardware is RX 470.

Sample backtrace:

[ 861.398271] INFO: task Xorg:4985 blocked for more than 120 seconds.
[ 861.398292] Tainted: G C O 4.9.17+dc+ #1
[ 861.398301] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.398306] Xorg D 0 4985 4966 0x00000004
[ 861.398317] ffff88040d4037c8 ffff880409bc9480 ffff88041c9ca1c0 ffff88041ed96440
[ 861.398339] ffff88040d403340 ffffc9000acc3a70 ffffffff8155de4e 0000000000000002
[ 861.398356] ffff88040d403340 ffff880405ab1500 ffff880405ab1500 0000000000000001
[ 861.398377] Call Trace:
[ 861.398391] [<ffffffff8155de4e>] ? __schedule+0x228/0x3bd
[ 861.398400] [<ffffffff8155e067>] schedule+0x84/0x98
[ 861.398409] [<ffffffff813194d5>] amd_sched_entity_push_job+0x69/0x86
[ 861.398417] [<ffffffff81059623>] ? __wake_up_sync+0xd/0xd
[ 861.398425] [<ffffffff81319c85>] amdgpu_job_submit+0x71/0x7f
[ 861.398434] [<ffffffff812dff16>] amdgpu_vm_bo_split_mapping+0x3df/0x47a
[ 861.398443] [<ffffffff812df742>] ? amdgpu_vm_adjust_mc_addr+0x1f/0x1f
[ 861.398461] [<ffffffff812e0f37>] amdgpu_vm_bo_update+0x17d/0x222
[ 861.398472] [<ffffffff812d5d41>] amdgpu_gem_va_ioctl+0x362/0x3d9
[ 861.398486] [<ffffffff812d4fed>] ? ttm_bo_unreserve+0x40/0x43
[ 861.398495] [<ffffffff8129859c>] ? drm_gem_handle_create+0x34/0x39
[ 861.398503] [<ffffffff81298ccc>] drm_ioctl+0x26c/0x38b
[ 861.398510] [<ffffffff81298ccc>] ? drm_ioctl+0x26c/0x38b
[ 861.398516] [<ffffffff812d59df>] ? amdgpu_gem_metadata_ioctl+0xe8/0xe8
[ 861.398525] [<ffffffff8104b554>] ? preempt_latency_start+0x21/0x5d
[ 861.398532] [<ffffffff8104b5f2>] ? preempt_count_add+0x62/0x65
[ 861.398540] [<ffffffff81560846>] ? _raw_spin_unlock_irqrestore+0x13/0x25
[ 861.398549] [<ffffffff812c1d17>] amdgpu_drm_ioctl+0x4a/0x7a
[ 861.398557] [<ffffffff810dbdbd>] vfs_ioctl+0x13/0x2f
[ 861.398564] [<ffffffff810dc2d0>] do_vfs_ioctl+0x47f/0x524
[ 861.398573] [<ffffffff810e48a3>] ? __fget+0x66/0x72
[ 861.398581] [<ffffffff810dc3b3>] SyS_ioctl+0x3e/0x5c
[ 861.398588] [<ffffffff81560be4>] entry_SYSCALL_64_fastpath+0x17/0x98
[ 861.398654] INFO: task Sam2017:6271 blocked for more than 120 seconds.
[ 861.398660] Tainted: G C O 4.9.17+dc+ #1
[ 861.398664] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.398669] Sam2017 D 0 6271 6270 0x00000000
[ 861.398762] ffff8803902caa08 ffff88040d0d1800 ffffffff81c0c500 ffff88041ec16440
[ 861.398778] ffff8803902ca580 ffffc90002cdba78 ffffffff8155de4e 0000000000000002
[ 861.398796] ffff8803902ca580 7fffffffffffffff 0000000000000246 ffff88033dd3e100
[ 861.398812] Call Trace:
[ 861.398821] [<ffffffff8155de4e>] ? __schedule+0x228/0x3bd
[ 861.398829] [<ffffffff8155e067>] schedule+0x84/0x98
[ 861.398835] [<ffffffff8155fdc5>] schedule_timeout+0x2f/0xf5
[ 861.398843] [<ffffffff8104b554>] ? preempt_latency_start+0x21/0x5d
[ 861.398850] [<ffffffff8104b5f2>] ? preempt_count_add+0x62/0x65
[ 861.398857] [<ffffffff813ae57a>] fence_default_wait+0x124/0x1c1
[ 861.398863] [<ffffffff813ae57a>] ? fence_default_wait+0x124/0x1c1
[ 861.398869] [<ffffffff813adf22>] ? fence_release+0x2b/0x2b
[ 861.398875] [<ffffffff813adee1>] fence_wait_timeout+0x2e/0x30
[ 861.398881] [<ffffffff812e4005>] amdgpu_ctx_add_fence+0x66/0x13b
[ 861.398887] [<ffffffff812d81b4>] amdgpu_cs_ioctl+0x1132/0x116a
[ 861.398897] [<ffffffff81298ccc>] drm_ioctl+0x26c/0x38b
[ 861.398903] [<ffffffff812d7082>] ? amdgpu_cs_find_mapping+0x7d/0x7d
[ 861.398910] [<ffffffff8104b554>] ? preempt_latency_start+0x21/0x5d
[ 861.398917] [<ffffffff8104b5f2>] ? preempt_count_add+0x62/0x65
[ 861.398923] [<ffffffff81560846>] ? _raw_spin_unlock_irqrestore+0x13/0x25
[ 861.398931] [<ffffffff812c1d17>] amdgpu_drm_ioctl+0x4a/0x7a
[ 861.398938] [<ffffffff810dbdbd>] vfs_ioctl+0x13/0x2f
[ 861.398944] [<ffffffff810dc2d0>] do_vfs_ioctl+0x47f/0x524
[ 861.398951] [<ffffffff810e48a3>] ? __fget+0x66/0x72
[ 861.398958] [<ffffffff810dc3b3>] SyS_ioctl+0x3e/0x5c
[ 861.398965] [<ffffffff81560be4>] entry_SYSCALL_64_fastpath+0x17/0x98
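
For anyone comparing traces from other kernels, here is a rough sketch of how the hung-task reports above could be reduced to one list of reliable frames per blocked task. It is not part of any existing tool: the names `summarize_hung_tasks`, `HUNG_RE`, `FRAME_RE` and `SAMPLE` are made up for illustration, and it assumes the 4.9-era log layout shown in this report, with one log entry per line and unreliable frames marked with "?".

```python
import re

# Header line printed by the kernel's hung-task detector.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than \d+ seconds"
)
# A stack frame such as "[<ffffffff8155e067>] schedule+0x84/0x98".
# The optional "?" marks a frame the unwinder could not confirm.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def summarize_hung_tasks(log_text):
    """Return one dict per blocked task with its reliable frames, top first."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Skip frames seen before any header, and frames marked "?".
        if frame and current is not None and not frame["unreliable"]:
            current["frames"].append(frame["sym"])
    return tasks

# Excerpt of the backtrace from this report, used only as sample input.
SAMPLE = """\
[ 861.398271] INFO: task Xorg:4985 blocked for more than 120 seconds.
[ 861.398377] Call Trace:
[ 861.398391] [<ffffffff8155de4e>] ? __schedule+0x228/0x3bd
[ 861.398400] [<ffffffff8155e067>] schedule+0x84/0x98
[ 861.398409] [<ffffffff813194d5>] amd_sched_entity_push_job+0x69/0x86
[ 861.398654] INFO: task Sam2017:6271 blocked for more than 120 seconds.
[ 861.398812] Call Trace:
[ 861.398829] [<ffffffff8155e067>] schedule+0x84/0x98
[ 861.398835] [<ffffffff8155fdc5>] schedule_timeout+0x2f/0xf5
[ 861.398857] [<ffffffff813ae57a>] fence_default_wait+0x124/0x1c1
"""

if __name__ == "__main__":
    for task in summarize_hung_tasks(SAMPLE):
        print(task["comm"], task["pid"], " <- ".join(task["frames"]))
```

Run against a saved dmesg dump, this should make it quicker to see whether a given kernel stalls in the same two places (Xorg in amd_sched_entity_push_job, the game in the fence wait).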
I'm aware (via IRC) that 4.11-rc3 should be fine. However, as I'm using amd-staging-4.9 for its HDMI audio support, this isn't an option.
Probably a radv (or LLVM) issue.
Happening with 4.11-rc4; essentially the same backtrace. I'll try bumping llvm next (probably to 4.0~svn294803), although I expect little difference.
… okay, it's looking like the Steam overlay has a lot to do with this problem. (Tested with current Mesa git, but the same LLVM as before.)
Hi Darren, Can you still reproduce the hang? I regularly test Serious Sam Fusion on Polaris/Vega, and it never hung for me.
Closing. I tried to reproduce the issue again yesterday; maybe I'm unlucky and not good enough at playing games, but it worked perfectly fine (tested on Vega 56). Feel free to re-open. Thanks!