Created attachment 116682 [details]
dmesg

==Regression==
--------------------------
Regression: No.
Ubuntu: 14.04

==kernel==
--------------------------
drm-intel-next-queued: git-8c6cda

==Test cases==
Beignet: git://anongit.freedesktop.org/git/beignet (master git-e64445f)

==Bug detailed description==
-----------------------------
OpenCL/utests may hang on BSW sporadically (~20%). And the fail tests are not specific. The issue doesn't exist on other platforms (IVB/HSW/BDW).

Please see the attached dmesg.

(gdb) bt
==================
#0  0x00007f6fb5bb1337 in ioctl () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f6fb4cd6e74 in drmIoctl (fd=6, request=request@entry=1074553951, arg=arg@entry=0x7ffdca8d7840) at xf86drm.c:164
#2  0x00007f6fb4ee68f7 in drm_intel_gem_bo_map (bo=0x1505f50, write_enable=1) at intel_bufmgr_gem.c:1325
#3  0x00007f6fb5880446 in cl_mem_map (mem=0x14dc4a0, write=write@entry=1) at /home/OpenCL/beignet/src/cl_mem.c:1908
#4  0x00007f6fb586f223 in clMapBufferIntel (mem=<optimized out>, errcode_ret=0x7ffdca8d790c) at /home/OpenCL/beignet/src/cl_api.c:3215
#5  0x00007f6fb65af04e in test_copy_buf (sz=1024, cb=512, dst_off=0, src_off=<optimized out>) at /home/OpenCL/beignet/utests/enqueue_copy_buf.cpp:24
#6  enqueue_copy_buf () at /home/OpenCL/beignet/utests/enqueue_copy_buf.cpp:61
#7  0x00007f6fb65af4bd in __ANON__enqueue_copy_buf__ () at /home/OpenCL/beignet/utests/enqueue_copy_buf.cpp:66
#8  0x00007f6fb63b92df in UTest::runAllNoIssue () at /home/OpenCL/beignet/utests/utest.cpp:169
#9  0x0000000000401786 in main (argc=1, argv=0x7ffdca8d8308) at /home/OpenCL/beignet/utests/utest_run.cpp:104

==Reproduce steps==
----------------------------
1. utests/utest_run
This blocks our OpenCL testing.
The issue is that a test case hangs. Running "utests/utest_run" reproduces it. Note: the issue cannot be reproduced when the subcases are run one by one (utests/utest_run -c "subcase").
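Because the hang is sporadic and only shows up when the whole suite runs, a small wrapper that repeats the run under a timeout can help put a number on the failure rate. The following is a minimal sketch, not part of Beignet: the helper names `run_once` and `hang_rate` are hypothetical, and it assumes that a run exceeding the chosen timeout counts as a hang.

```python
import subprocess


def run_once(cmd, timeout_s):
    """Run cmd once; return 'ok', 'fail' (non-zero exit) or 'hang' (timed out)."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # on expiry the child is killed and TimeoutExpired is raised
        )
    except subprocess.TimeoutExpired:
        return "hang"
    return "ok" if proc.returncode == 0 else "fail"


def hang_rate(cmd, runs, timeout_s):
    """Repeat cmd `runs` times; return (fraction of runs that hung, per-run results)."""
    results = [run_once(cmd, timeout_s) for _ in range(runs)]
    return results.count("hang") / runs, results
```

For example, `hang_rate(["utests/utest_run"], runs=20, timeout_s=600)` would run the full suite 20 times; the 600-second timeout is an arbitrary assumption and should be set well above the normal duration of a complete run.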
So no GPU hang?

Does the problem happen with i915.enable_execlists=0 too?
(In reply to Ville Syrjala from comment #3)
> So no GPU hang?
>
> Does the problem happen with i915.enable_execlists=0 too?

With i915.execlist=0, the issue still exists.
For OpenCL testing, we need to disable i915 hang check because OCL kernel may cost 6 seconds or even more.
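When comparing runs with different module options, it can help to record what the loaded driver actually reports rather than what was passed on the kernel command line. A minimal sketch, assuming the parameters are exposed under `/sys/module/i915/parameters` (which parameters exist depends on the kernel version; the helper name `read_module_params` is hypothetical):

```python
from pathlib import Path


def read_module_params(names, base="/sys/module/i915/parameters"):
    """Return {name: value} for each parameter; None if it cannot be read."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Parameter missing on this kernel, module not loaded, or no permission.
            values[name] = None
    return values
```

Calling `read_module_params(["enable_execlists", "enable_hangcheck"])` before each test run and attaching the result to the report makes it unambiguous which configuration a given hang was observed under.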
(In reply to meng from comment #4)

When the case hang, gdb attach that, then it could finish. So it's not GPU hang.
(In reply to meng from comment #4)
> (In reply to Ville Syrjala from comment #3)
> > So no GPU hang?
> >
> > Does the problem happen with i915.enable_execlists=0 too?
>
> With i915.execlist=0, the issue still exists.
> For OpenCL testing, we need to disable i915 hang check because OCL kernel
> may cost 6 seconds or even more.

6 seconds of monopolizing the GPU sounds like a DoS worthy of being banned ;-) So not even the grace period given to looping kernels is enough to prevent hangcheck firing?

I would strongly suggest you file a bug with the bare minimum required to reproduce (that is, an igt).
(In reply to meng from comment #5)
> (In reply to meng from comment #4)
> When the case hang, gdb attach that, then it could finish. So it's not GPU
> hang.

No, that would be a "missed interrupt" which is normally detected by hangcheck.
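For context on frame #1 of the backtrace: libdrm's `drmIoctl` wrapper retries the ioctl for as long as it fails with EINTR or EAGAIN. Whether that userspace loop or a transparent kernel-side syscall restart is what lets the wait complete once gdb attaches is not established in this thread. The sketch below only illustrates the retry pattern in Python; `retry_ioctl` and its callable argument are hypothetical stand-ins, not libdrm API.

```python
import errno


def retry_ioctl(do_ioctl):
    """Call do_ioctl() until it stops failing with EINTR or EAGAIN."""
    while True:
        try:
            return do_ioctl()
        except OSError as exc:
            if exc.errno not in (errno.EINTR, errno.EAGAIN):
                raise  # a real error: propagate to the caller
            # Interrupted or asked to try again: reissue the call.
```

Under this pattern, a waiter that missed its wake-up can make progress on the next attempt if the restarted call re-checks completion state, which would be consistent with the "missed interrupt" reading above.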
Bug scrub: Assigned to Jani
Assigned to Mengmeng

Hi Mengmeng,
Is it still reproducible?
Timeout, closing. Please reopen if the problem persists on latest kernels.