Bug 65534

Summary: Piglit glx_glx-multithread-shader-compile randomly aborts or core dumps
Product: Mesa
Component: Mesa core
Version: 9.1
Status: VERIFIED FIXED
Severity: major
Priority: high
Hardware: All
OS: Linux (All)
Reporter: lu hua <huax.lu>
Assignee: mesa-dev
QA Contact:
CC: eero.t.tamminen, xunx.fang
i915 platform:
i915 features:
Attachments: output

Description lu hua 2013-06-08 06:59:01 UTC
Created attachment 80506

System Environment:
Arch:           x86_64
Platform:       Haswell
Libdrm:		(master)libdrm-2.4.45-7-ga0178c00c70f4b47e09ed7564fc2ccde611231a0
Mesa:		(master)f8df73f41c5f4e461dc7de8cd3a7b32b04dfbf2e
Xserver:	(master)xorg-server-
Cairo:		(master)17dc312221c294b120bd159e01f5f566c6ec4a2d
Libva:		(staging)35f2a712e93d5eeadeafab72aa2a41cfb54512bc
Kernel:	(drm-intel-nightly) e93a016e97c27dc9325e7a25409228ae54fc14a4

Bug detailed description:
The test randomly core dumps on Haswell with the Mesa 9.1 branch. It works well on the Mesa master branch.
The failure occurs in roughly 1 out of 5 runs.

(gdb) bt
#0  0x0000003cc98362a5 in raise () from /lib64/libc.so.6
#1  0x0000003cc9837bbb in abort () from /lib64/libc.so.6
#2  0x0000003cc9875ffe in __libc_message () from /lib64/libc.so.6
#3  0x0000003cc987c606 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ffff5e68e54 in poll_for_reply (c=0x7ffff0001ca0, request=4, reply=0x7ffff5433c68, error=<optimized out>) at xcb_in.c:360
#5  0x00007ffff5e69346 in wait_for_reply (c=0x7ffff0001ca0, request=4, e=0x7ffff5433cf8) at xcb_in.c:398
#6  0x00007ffff5e6952b in xcb_wait_for_reply (c=0x7ffff0001ca0, request=4, e=0x7ffff5433cf8) at xcb_in.c:429
#7  0x00007ffff74e2dbd in _XReply (dpy=0x7ffff00008c0, rep=0x7ffff5433d60, extra=0, discard=0) at xcb_io.c:602
#8  0x00007ffff74d337b in XOpenDisplay (display=<optimized out>) at OpenDis.c:539
#9  0x00007ffff7d3d5ea in piglit_get_glx_display () at /GFX/Test/Piglit/piglit/tests/util/piglit-glx-util.c:43
#10 0x0000000000400dca in thread_func (arg=0x0) at /GFX/Test/Piglit/piglit/tests/glx/glx-multithread-shader-compile.c:59
#11 0x0000003db0407d90 in start_thread () from /lib64/libpthread.so.0
#12 0x0000003cc98eeddd in clone () from /lib64/libc.so.6

Reproduce steps:
1. xinit
2. ./bin/glx-multithread-shader-compile -auto -fbo
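
Because the failure is intermittent (roughly 1 in 5 runs per the description above), a single run says little. The following is a minimal sketch of a loop that repeats the reproduce step and counts failing runs. The helper name count_failures and the run count of 20 are illustrative assumptions, not part of the original report. It relies only on the shell reporting a non-zero exit status when a process fails or is killed by a signal such as SIGABRT.

```shell
#!/bin/sh
# count_failures RUNS CMD [ARGS...]   (hypothetical helper, not from the report)
# Runs CMD the given number of times and prints how many runs exited non-zero.
# A process killed by a signal (for example SIGABRT from abort()) is reported
# by the shell with a status above 128, so aborts and core dumps are counted.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example, run from the piglit directory inside the X session started by xinit:
#   count_failures 20 ./bin/glx-multithread-shader-compile -auto -fbo
```

Comparing the count on the Mesa 9.1 branch and on master gives a rough before/after measure, though a small number of runs can easily miss a race that triggers this rarely.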
Comment 1 Gordon Jin 2013-12-06 02:12:15 UTC
Ian, do you want to fix it?
Comment 2 Eric Anholt 2014-02-05 18:01:14 UTC
Renaming the bug so it can be the dupe target of the other QA report about the test case.  There is general memory corruption due to lack of locking, which manifests in various ways including aborts and core dumps.
Comment 3 Eric Anholt 2014-02-05 18:01:26 UTC
*** Bug 66948 has been marked as a duplicate of this bug. ***
Comment 4 Eero Tamminen 2014-10-23 16:11:00 UTC
(In reply to Eric Anholt from comment #2)
> There is general memory corruption due to lack of locking,
> which manifests in various ways including aborts and core dumps.

Would Valgrind's thread and memory debuggers help in finding the issues, or are they already known?

(New GCC & LLVM tools and mutrace might also be interesting in tracking down issues: http://0pointer.de/blog/projects/mutrace.html)
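
As a rough sketch of the suggestion above: Valgrind ships Helgrind and DRD as thread-error detectors and Memcheck as its memory-error detector, each selected with the --tool option. The wrapper name run_under_valgrind and the DRY_RUN variable are hypothetical conveniences, and nothing in this thread confirms that these tools pinpoint this particular corruption.

```shell
#!/bin/sh
# run_under_valgrind TOOL CMD [ARGS...]   (hypothetical helper)
# Runs CMD under the chosen Valgrind tool: helgrind, drd, or memcheck.
# With DRY_RUN=1 the command line is printed instead of executed, which is
# handy for checking what would run on a machine without Valgrind installed.
run_under_valgrind() {
    tool=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo valgrind --tool="$tool" "$@"
    else
        valgrind --tool="$tool" "$@"
    fi
}

# Example, run from the piglit directory inside an X session:
#   run_under_valgrind helgrind ./bin/glx-multithread-shader-compile -auto -fbo
#   run_under_valgrind memcheck ./bin/glx-multithread-shader-compile -auto -fbo
```

Running under Valgrind slows the test considerably and changes thread timing, so a race seen in 1 of 5 native runs may show up more or less often there.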
Comment 5 Jason Ekstrand 2015-02-05 23:08:04 UTC
Is this still a problem?  Matt did a bunch of work to fix multi-threaded shader compiles so that we could do a single-process multi-threaded shader-db.  It's working pretty well these days.
Comment 6 Kaveh 2015-02-05 23:11:18 UTC
Please re-test with the latest Mesa.
Comment 7 lu hua 2015-02-06 03:35:02 UTC
It works well on the latest Mesa master branch.
[root@x-hsw24 piglit]# bin/glx-multithread-shader-compile -auto -fbo
PIGLIT: {"result": "pass" }
Comment 8 lu hua 2015-02-06 03:35:37 UTC
