R9285 Tonga, probably since

36626ff radeonsi: split si_emit_shader_pointer

the Xonotic big-key-bench timedemo may provoke a display/GPU lockup.

On head it seemed easy to provoke: one, two or three runs. Bisecting was harder as it took more runs, hence the possibility that I am on a false good. I am currently on the commit before the one above and haven't locked up so far, but I don't know yet whether that's luck.

The lock seems to be at the same(ish) place in the demo, frame 6512, though once it was 6514.

If I wait before doing SysRq I get a timeout trace as below. I tried older kernels with the same result; the "dirty" here is just a CPU fix I need that's in rc-5.

INFO: task gallium_drv:0:985 blocked for more than 120 seconds.
      Not tainted 4.14.0-rc3-g96687ec-dirty #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
gallium_drv:0   D    0   985    962 0x00000000
Call Trace:
 __schedule+0x2ce/0x890
 ? _raw_write_unlock+0x11/0x30
 schedule+0x3b/0x90
 amd_sched_entity_push_job+0x9f/0xf0 [amdgpu]
 ? remove_wait_queue+0x80/0x80
 amdgpu_job_submit+0x9a/0xc0 [amdgpu]
 amdgpu_vm_bo_update_mapping+0x2de/0x3a0 [amdgpu]
 ? amdgpu_vm_free_mapping.isra.20+0x30/0x30 [amdgpu]
 amdgpu_vm_bo_update+0x2e8/0x6a0 [amdgpu]
 amdgpu_gem_va_ioctl+0x476/0x480 [amdgpu]
 ? amdgpu_gem_metadata_ioctl+0x1d0/0x1d0 [amdgpu]
 drm_ioctl_kernel+0x6f/0xc0 [drm]
 drm_ioctl+0x2f9/0x3c0 [drm]
 ? futex_wake+0x7c/0x140
 ? amdgpu_gem_metadata_ioctl+0x1d0/0x1d0 [amdgpu]
 ? do_futex+0x289/0xb20
 ? put_prev_entity+0xf8/0x5a0
 ? preempt_count_add+0x99/0xb0
 ? _raw_write_unlock_irqrestore+0x13/0x30
 ? _raw_spin_unlock_irqrestore+0x9/0x10
 amdgpu_drm_ioctl+0x54/0x90 [amdgpu]
 do_vfs_ioctl+0x98/0x5b0
 ? __fget+0x6e/0xa0
 SyS_ioctl+0x47/0x80
 entry_SYSCALL_64_fastpath+0x17/0x98
RIP: 0033:0x7fee63cb8717
RSP: 002b:00007fee567415a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fee2c00bbe0 RCX: 00007fee63cb8717
RDX: 00007fee567415f0 RSI: 00000000c0286448 RDI: 000000000000000e
RBP: 0000000000000000 R08: 0000000150400000 R09: 000000000000000e
R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 000000000370e4a0 R15: 0000000008881590
Lots of "normal" runs indicate that the commit before the "bad" one is OK: I have not locked up over many runs of

vblank_mode=0 ./xonotic-linux64-glx -benchmarkruns 20 -benchmark demos/the-big-keybench.dem

with CPU and GPU set to high for testing. Neither vblank_mode=0 nor the perf settings are required to provoke the lock; they just make it much faster to get multiple runs in. Xonotic settings are ultra with aniso and AA at highest, 1920x1080 fullscreen.

Unfortunately, I also tried a more abnormal test, with a 2160p framebuffer plus panning, and with that setup I can still lock up. The locks are again always in one place, but a different one from before: frame 9411. Over time I'll try to go back further.
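
For anyone wanting to repeat the multi-run test above while bisecting, it could be scripted roughly as follows. This is only a sketch: the binary path, demo path and vblank_mode=0 come from the command above, while the helper names (build_command, build_env, run_batches) and the timeout handling are my own assumptions. A hard GPU lockup may freeze the whole machine, in which case no timeout in userspace will fire.

```python
import os
import subprocess

# Paths taken from the command quoted above; adjust for your install.
BINARY = "./xonotic-linux64-glx"
DEMO = "demos/the-big-keybench.dem"


def build_command(runs=20):
    """Return the argv list for one batch of timedemo runs."""
    return [BINARY, "-benchmarkruns", str(runs), "-benchmark", DEMO]


def build_env(base=None):
    """Copy the environment and set vblank_mode=0 (vsync off in Mesa)."""
    env = dict(os.environ if base is None else base)
    env["vblank_mode"] = "0"
    return env


def run_batches(batches, runs=20, timeout=None, runner=subprocess.run):
    """Run several batches of the timedemo.

    Returns the index of the first batch that times out or exits
    non-zero, or None if every batch completes normally.
    `runner` is a hypothetical hook so the loop can be tried without
    launching the game.
    """
    for i in range(batches):
        try:
            result = runner(build_command(runs), env=build_env(), timeout=timeout)
        except subprocess.TimeoutExpired:
            return i
        if result.returncode != 0:
            return i
    return None
```

Calling run_batches(5, timeout=3600) from the Xonotic directory would attempt five batches of 20 runs each and report the first batch that did not finish cleanly. The one-hour timeout is an arbitrary placeholder, not a measured value.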
I have no clue about any fixing commit. Whether I could provoke this turned out to be very random, and currently I can't, so I am closing this as the bisect was clearly wrong. I guess the LLVM version or getting the GPU into some "state" was involved, but whatever it was, this bug report is not correct.