Bug 59445 - [SNB/IVB/HSW Bisected]Oglc draw-buffers2(advanced.blending.none) segfault
Status: VERIFIED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: All Linux (All)
Importance: high major
Assigned To: Carl Worth
Reported: 2013-01-16 02:29 UTC by lu hua
Modified: 2013-04-22 07:39 UTC (History)

Attachments
Proposed fix (1.80 KB, patch)
2013-02-20 00:25 UTC, Carl Worth
A revised patch (1.27 KB, patch)
2013-02-20 17:39 UTC, Carl Worth

Description lu hua 2013-01-16 02:29:34 UTC
System Environment:
--------------------------
Arch:           x86_64
Platform:       Ivybridge
Libdrm:		(master)libdrm-2.4.40-13-g9e6f96a579fc2ed241e9a31a35a5995129ee8f7a
Mesa:		(master)1c9833ba70466906d3d2ad3aee6a10642a39abdd
Xserver:	(master)xorg-server-1.13.99.901-2-g6703a7c7cf1a349c137e247a0c8eb462ff7b07be
Xf86_video_intel:(master)2.20.17-86-g6abd442279fd32d1ce9b33a72eabbeb922316151
Cairo:		(master)768b81b78eabbebb1bb443355441cac567739035
Libva:		(staging)21649988d6b532cc96f633db017d1e4369f640e9
Libva_intel_driver:(staging)788e99361208127763fdf1e146e63fca03a09f67
Kernel:	(drm-intel-nightly) 5462cff312bb68c38e1996b7c4a1fda23b6bdb96

Bug detailed description:
-------------------------
The test segfaults on Sandybridge, Ivybridge and Haswell with the Mesa master branch.
It does not happen on the Mesa 9.0 branch.

Bisect shows: 258453716f001eab1288d99765213221d0599ca4 is the first bad commit.
commit 258453716f001eab1288d99765213221d0599ca4
Author:     Carl Worth <cworth@cworth.org>
AuthorDate: Fri Jan 11 07:15:18 2013 -0800
Commit:     Carl Worth <cworth@cworth.org>
CommitDate: Mon Jan 14 15:35:37 2013 -0800

    i965: Avoid blending with destination alpha when RB format has no alpha bits

    The hardware does not support a render target without an alpha channel.
    So when the user creates a render buffer with no alpha channel, there actually
    is storage available for alpha internally. It requires special care to
    avoid these unwanted alpha bits from causing any problems.

    Specifically, when blending, and when the blend factors would read the
    destination alpha values, this commit coerces the blend factors to instead be
    either 0 or 1 as appropriate.

    A similar fix was made for pre-gen6 hardware in commit eadd9b8e and this
    commit shares the fixup function written by Ian then.
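
The coercion described in the commit message above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the commit: the `SKETCH_GL_*` names and `fix_factor_for_no_alpha` are hypothetical, and the enum values are copied from the standard OpenGL headers only so the sketch compiles without them. It assumes a render buffer without alpha bits must behave as if destination alpha were always 1.0.

```c
/* Local stand-ins for the OpenGL blend-factor enums so the sketch
 * compiles without GL headers; the values match <GL/gl.h>. */
enum {
    SKETCH_GL_ZERO                = 0,
    SKETCH_GL_ONE                 = 1,
    SKETCH_GL_SRC_ALPHA           = 0x0302,
    SKETCH_GL_DST_ALPHA           = 0x0304,
    SKETCH_GL_ONE_MINUS_DST_ALPHA = 0x0305,
    SKETCH_GL_SRC_ALPHA_SATURATE  = 0x0308
};

/* Hypothetical helper (assumption, not the driver's function):
 * rewrite a blend factor for a render buffer whose format has no
 * alpha bits, treating destination alpha as a constant 1.0. */
static int fix_factor_for_no_alpha(int factor)
{
    switch (factor) {
    case SKETCH_GL_DST_ALPHA:            /* A_d == 1 */
        return SKETCH_GL_ONE;
    case SKETCH_GL_ONE_MINUS_DST_ALPHA:  /* 1 - A_d == 0 */
    case SKETCH_GL_SRC_ALPHA_SATURATE:   /* RGB factor min(A_s, 1 - A_d) == 0 */
        return SKETCH_GL_ZERO;
    default:                             /* does not read destination alpha */
        return factor;
    }
}
```

Factors that never read destination alpha, such as `SKETCH_GL_SRC_ALPHA`, pass through unchanged, so the fixup only matters when the render buffer format lacks alpha.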

output:
Intel OpenGL Conformance Test
Version ENG (Jan 15 2013 15:59:23)

CLI options echo:
oglconform -z -suite all -v 2 -test draw-buffers2 advanced.blending.none

Window will be recreated 12 times.
  Window 0 will run 1 testcases on config with id 147.
  Window 1 will run 1 testcases on config with id 141.
  Window 2 will run 1 testcases on config with id 115.
  Window 3 will run 1 testcases on config with id 109.
  Window 4 will run 1 testcases on config with id 148.
  Window 5 will run 1 testcases on config with id 116.
  Window 6 will run 1 testcases on config with id 139.
  Window 7 will run 1 testcases on config with id 107.
  Window 8 will run 1 testcases on config with id 142.
  Window 9 will run 1 testcases on config with id 110.
  Window 10 will run 1 testcases on config with id 140.
  Window 11 will run 1 testcases on config with id 108.
Total of 12 testcases will be executed.

Setup Report.
    Verbose level = 2.
    Path inactive.

Visual Report for ID 147 (32 bits).
ID      |ACCELERA|DB      |REND_T  |SURF_T  |C_BUF_T |BUF_S   |RED_S   |
     147|       1|       1|      gl|  wipbpx|    rgba|      32|       8|

GREEN_S |BLUE_S  |ALPHA_S |DEPTH_S |STENC_S |ACCUM_S |SPL_BUF |SAMPLES |
       8|       8|       8|      24|       8|       0|       0|       0|

SRGB    |TEX_RGB |TEX_RGBA|CAVEAT  |SWAP    |M_PBUF_W|M_PBUF_H|M_PBUF_P
       0|       0|       0|    none|   undef|       0|       0|       0

OpenGL Report.
    Vendor - 'Intel Open Source Technology Center'
    Renderer - 'Mesa DRI Intel(R) Ivybridge Desktop '
    Version - '3.0 Mesa 9.1-devel (git-2584537)' (3.0)
    GLSL Version - '1.30'
    Context Flags - None

>> Draw buffers 2 (draw-buffers2)  test:
--> 2.1.6 - advanced.blending.none subcase:
Segmentation fault (core dumped)

(gdb) bt
#0  gen6_upload_blend_state (brw=0x3289870) at gen6_cc.c:128
#1  0x00007ffff6715462 in brw_upload_state (brw=0x3289870) at brw_state_upload.c:500
#2  0x00007ffff66d8c27 in brw_try_draw_prims (max_index=<optimized out>, min_index=<optimized out>, ib=0x32d9e04, nr_prims=53044960, prim=0x32d9dec, arrays=<optimized out>,
    ctx=0x3289870) at brw_draw.c:500
#3  brw_draw_prims (ctx=0x3289870, prim=0x32d9dec, nr_prims=53044960, ib=0x32d9e04, index_bounds_valid=<optimized out>, min_index=0, max_index=3, tfb_vertcount=0x0)
    at brw_draw.c:587
#4  0x00007ffff6239835 in vbo_exec_vtx_flush (exec=0x32d9558, keepUnmapped=1 '\001') at ../../../src/mesa/vbo/vbo_exec_draw.c:400
#5  0x00007ffff622aafc in vbo_exec_FlushVertices_internal (exec=0x32d9558, unmap=<optimized out>) at ../../../src/mesa/vbo/vbo_exec_api.c:551
#6  0x00007ffff62372bc in vbo_exec_FlushVertices (ctx=0x3289870, flags=<optimized out>) at ../../../src/mesa/vbo/vbo_exec_api.c:1245
#7  0x00007ffff61e35de in use_shader_program (ctx=0x3289870, type=35633, shProg=0x0) at ../../../src/mesa/main/shaderapi.c:886
#8  0x00007ffff61e4aa9 in _mesa_use_program (ctx=0x3289870, shProg=0x0) at ../../../src/mesa/main/shaderapi.c:921
#9  0x00000000008a4cc3 in oglConf::gl::ProgramGLSL::~ProgramGLSL() ()
#10 0x00000000005159a2 in boost::detail::sp_counted_impl_p<oglConf::gl::ProgramGLSL>::dispose() ()
#11 0x0000000000541b09 in draw_buffers2::buffersDraw(std::vector<draw_buffers2::BufferState, std::allocator<draw_buffers2::BufferState> > const&, bool, draw_buffers2::DrawMethod) ()
#12 0x0000000000544a69 in draw_buffers2::blendMaskTestGeneralFlow ()
#13 0x000000000054601a in draw_buffers2::blendMaskNoneTest(bool) ()
#14 0x000000000053f2c2 in DrawBuffers2Exec(testParameters*) ()
#15 0x000000000179e0a9 in callFunctionHandleExceptionsInner(long (*)(testParameters*), testParameters*, char*) ()
#16 0x000000000179e1ff in callFunctionHandleExceptions(long (*)(testParameters*), testParameters*) ()
#17 0x000000000179ce91 in DriverExec(long (*)(testParameters*), testParameters*) ()
#18 0x00000000016d39b5 in Driver(std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > > const&, std::vector<driverRec*, std::allocator<driverRec*> > const&, std::vector<boost::shared_ptr<PrePostTestAction>, std::allocator<boost::shared_ptr<PrePostTestAction> > > const&, std::vector<boost::shared_ptr<PrePostTestcaseAction>, std::allocator<boost::shared_ptr<PrePostTestcaseAction> > > const&) ()
#19 0x00000000016d4468 in (anonymous namespace)::MyMessagePump::idle() ()
#20 0x00000000016a5470 in MessagePump::process_messages() ()
#21 0x00000000016d29dc in ExecutionManager::execute_schedules() ()
#22 0x0000000001672667 in tkShellExecute(std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > > const&, std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > > const&) ()
#23 0x0000000000428495 in main ()

Reproduce steps:
----------------
1. xinit
2. ./oglconform -z -suite all -v 2 -test draw-buffers2 advanced.blending.none
Comment 1 Gordon Jin 2013-02-18 00:58:59 UTC
Carl?
Comment 2 fangxun 2013-02-19 09:20:34 UTC
It also fails on mesa 9.1 branch.
Comment 3 Carl Worth 2013-02-20 00:25:56 UTC
Created attachment 75146
Proposed fix

I just sent this out to the list for comments.

Please feel free to test and comment.

-Carl
Comment 4 fangxun 2013-02-20 06:09:31 UTC
I tested the patch on Ivybridge; the test result is unstable. It segfaults once in 6 runs.
Comment 5 Carl Worth 2013-02-20 17:39:53 UTC
Created attachment 75193
A revised patch

> I tested the patch on Ivybridge; the test result is unstable. It
> segfaults once in 6 runs.

I have no theory that would explain that behavior.

But, independently, I have an alternate patch which uses a slightly
simpler condition, (and the same code as brw_cc.c).

Please test this patch instead of the earlier one, and let me know if
it works better.

If not, then I'll get set up with oglconform so that I can test
directly.

-Carl
Comment 6 fangxun 2013-02-21 04:15:31 UTC
I tested your patch on Ivybridge; it still segfaults once in 6 runs.
Comment 7 Carl Worth 2013-02-21 20:05:03 UTC
I got oglconform working on my system, and verified the behavior you
reported here, (consistent segfaults on master, and less frequent
segfaults with my patch applied).

But I also tested mesa from commit 6d4d4b00ddfbd3257ecd129fec5b813be,
(immediately before commit 258453716f001eab1288d99765213 bisected in
the bug report above). The infrequent segfaults are already present
there.

So I believe the patches I have posted are correct, (I'll push the
original one following some code review from Eric).

But there appears to be an independent, pre-existing bug leading to
the infrequent segfaults.

-Carl
Comment 8 Carl Worth 2013-02-21 21:52:02 UTC
(In reply to comment #7)
> But there appears to be an independent, pre-existing bug leading to
> the infrequent segfaults.

And here's a valgrind report that seems consistent with the bug,
obtained by running:

    valgrind ./oglconform -suite all -v 2 -test draw-buffers2 advanced.blending.none

On this run, 13 of the 16 testcases passed without valgrind reporting
any problems. Then, on the 14th testcase the error below was reported.

This looks to me like a bug in oglconform itself, where it is reusing
some resource in calling Context::make_current, where that resource was
freed previously by a call to glXDestroyWindow.

Now that I have pushed the fix for the originally bisected issue, I am
closing this bug report. The oglconform problem should be tracked
separately, (but it's not a bug in our driver).

-Carl

==13683== Invalid read of size 8
==13683==    at 0x6B44312: dri2FlushFrontBuffer (dri2_glx.c:676)
==13683==    by 0x82108F6: intel_flush_front (intel_context.c:296)
==13683==    by 0x8211007: intel_glFlush (intel_context.c:545)
==13683==    by 0x85FE758: _mesa_flush (context.c:1693)
==13683==    by 0x85FE05E: _mesa_make_current (context.c:1479)
==13683==    by 0x821220F: intelUnbindContext (intel_context.c:904)
==13683==    by 0x82E2442: driUnbindContext (dri_util.c:421)
==13683==    by 0x6B431ED: dri2_unbind_context (dri2_glx.c:195)
==13683==    by 0x6B003C4: MakeContextCurrent (glxcurrent.c:255)
==13683==    by 0x1A4B984: Context::make_current(Drawable_ const&) const (windowing.cpp:1440)
==13683==    by 0x1A82025: (anonymous namespace)::run_schedule(Display_ const&, boost::shared_ptr<FbConfig>, std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > > const&, std::vector<driverRec*, std::allocator<driverRec*> > const&, std::vector<boost::shared_ptr<PrePostAction>, std::allocator<boost::shared_ptr<PrePostAction> > > const&, std::vector<boost::shared_ptr<PrePostTestAction>, std::allocator<boost::shared_ptr<PrePostTestAction> > > const&, std::vector<boost::shared_ptr<PrePostTestcaseAction>, std::allocator<boost::shared_ptr<PrePostTestcaseAction> > > const&) (executer.cpp:200)
==13683==    by 0x1A81992: ExecutionManager::execute_schedules() (executer.cpp:84)
==13683==  Address 0xc5346e8 is 24 bytes inside a block of size 208 free'd
==13683==    at 0x4C28F5C: free (vg_replace_malloc.c:446)
==13683==    by 0x6B436CB: dri2DestroyDrawable (dri2_glx.c:358)
==13683==    by 0x6B3B089: DestroyDRIDrawable (glx_pbuffer.c:230)
==13683==    by 0x6B3B85C: DestroyDrawable (glx_pbuffer.c:468)
==13683==    by 0x6B3C3DC: glXDestroyWindow (glx_pbuffer.c:938)
==13683==    by 0x1A4ABCA: Drawable_::~Drawable_() (windowing.cpp:1292)
==13683==    by 0x1A83A5A: void boost::checked_delete<Drawable_>(Drawable_*) (checked_delete.hpp:34)
==13683==    by 0x1A83DBB: boost::detail::sp_counted_impl_p<Drawable_>::dispose() (sp_counted_impl.hpp:78)
==13683==    by 0x5426EB: boost::detail::sp_counted_base::release() (sp_counted_base_gcc_x86.hpp:145)
==13683==    by 0x542764: boost::detail::shared_count::~shared_count() (shared_count.hpp:217)
==13683==    by 0x82F7B5: boost::shared_ptr<Drawable_>::~shared_ptr() (shared_ptr.hpp:168)
==13683==    by 0x1A8220F: (anonymous namespace)::run_schedule(Display_ const&, boost::shared_ptr<FbConfig>, std::vector<std::pair<std::string, std::string>, std::allocator<std::pair<std::string, std::string> > > const&, std::vector<driverRec*, std::allocator<driverRec*> > const&, std::vector<boost::shared_ptr<PrePostAction>, std::allocator<boost::shared_ptr<PrePostAction> > > const&, std::vector<boost::shared_ptr<PrePostTestAction>, std::allocator<boost::shared_ptr<PrePostTestAction> > > const&, std::vector<boost::shared_ptr<PrePostTestcaseAction>, std::allocator<boost::shared_ptr<PrePostTestcaseAction> > > const&) (executer.cpp:264)
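
The ordering problem suggested by the valgrind report above can be modelled with a small standalone sketch. Everything here is hypothetical: `struct drawable`, `struct context`, and the three functions are toy stand-ins and are not oglconform, GLX, or Mesa code. The sketch assumes, following the reading in comment 8, that a drawable was destroyed while a context still referred to it, and shows a teardown that unbinds first.

```c
#include <stddef.h>

/* Toy stand-ins (assumptions for illustration only). */
struct drawable { int alive; };
struct context  { struct drawable *bound; };

/* Bind a drawable to the context; NULL unbinds. */
static void make_current(struct context *ctx, struct drawable *d)
{
    ctx->bound = d;
}

/* Destroy a drawable. Returns 0 and does nothing if the context still
 * refers to it, since destroying it would leave ctx->bound dangling. */
static int destroy_drawable(struct context *ctx, struct drawable *d)
{
    if (ctx->bound == d)
        return 0;
    d->alive = 0;
    return 1;
}

/* Teardown in the safe order: unbind first, then destroy. */
static int safe_teardown(struct context *ctx, struct drawable *d)
{
    if (ctx->bound == d)
        make_current(ctx, NULL);
    return destroy_drawable(ctx, d);
}
```

In this model the unsafe sequence is refused rather than allowed to crash; the real code paths in the trace have no such guard, which would be consistent with the invalid read being intermittent.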
Comment 9 lu hua 2013-04-22 07:39:35 UTC
Verified. Fixed.