Bug 48218 - brw_fs_schedule_instructions.cpp segfault due to accessing not allocated last_mrf_write[16]
brw_fs_schedule_instructions.cpp segfault due to accessing not allocated last...
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
8.0
x86 (IA32) Linux (All)
: medium normal
Assigned To: Kenneth Graunke
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-04-02 15:19 UTC by youonly@mailinator.com
Modified: 2012-04-27 16:54 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description youonly@mailinator.com 2012-04-02 15:19:04 UTC
archlinux, mesa 8.0.2, libgl 8.0.2, intel-dri 8.0.2 at i965 chipset

mesa was recompiled with debug enabled

running gzdoom game (gl-vesion zdoom)

got segfault, gdb backtrace:



Program terminated with signal 11, Segmentation fault.
#0  instruction_scheduler::add_dep (this=0xbfdd5084, before=0xe4431200, after=0x9a3ee00) at brw_fs_schedule_instructions.cpp:207

#0  instruction_scheduler::add_dep (this=0xbfdd5084, before=0xe4431200, after=0x9a3ee00) at brw_fs_schedule_instructions.cpp:207
#1  0xb52a8c1a in instruction_scheduler::calculate_deps (this=0xbfdd5084) at brw_fs_schedule_instructions.cpp:298
#2  0xb52a94e0 in fs_visitor::schedule_instructions (this=0xbfdf5608) at brw_fs_schedule_instructions.cpp:516
#3  0xb528e500 in fs_visitor::run (this=0xbfdf5608) at brw_fs.cpp:1783
#4  0xb528ee96 in brw_wm_fs_emit (brw=0xa42b010, c=0x994a440, prog=0x9d97948) at brw_fs.cpp:1866
#5  0xb525f2de in do_wm_prog (brw=0xa42b010, prog=0x9d97948, fp=0xaade400, key=0xbfe15b7c) at brw_wm.c:282
#6  0xb525fa5e in brw_upload_wm_prog (brw=0xa42b010) at brw_wm.c:546
#7  0xb5254f0f in brw_upload_state (brw=0xa42b010) at brw_state_upload.c:503
#8  0xb523c5e7 in brw_try_draw_prims (max_index=3, min_index=0, ib=0x0, nr_prims=1, prim=0xa52f1c0, arrays=0xa5308d8, ctx=0xa42b010) at brw_draw.c:482
#9  brw_draw_prims (ctx=0xa42b010, arrays=0xa5308d8, prim=0xa52f1c0, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=3, tfb_vertcount=0x0) at brw_draw.c:566
#10 0xb502eaac in vbo_exec_vtx_flush (exec=0xa52eda0, keepUnmapped=1 '\001') at vbo/vbo_exec_draw.c:407
#11 0xb50233e7 in vbo_exec_FlushVertices_internal (exec=0xa52eda0, unmap=<optimized out>) at vbo/vbo_exec_api.c:444
#12 0xb502ba78 in vbo_exec_FlushVertices (ctx=0xa42b010, flags=1) at vbo/vbo_exec_api.c:1195
#13 0xb4f3b5f1 in _mesa_BlendFuncSeparateEXT (sfactorRGB=770, dfactorRGB=771, sfactorA=770, dfactorA=771) at main/blend.c:205
#14 0xb4f3b64b in _mesa_BlendFunc (sfactor=770, dfactor=771) at main/blend.c:155




After some research:

brw_fs_schedule_instructions.cpp:298 with debug code inserted

      for (int i = 0; i < inst->mlen; i++) {
        /* It looks like the MRF regs are released in the send
         * instruction once it's sent, not when the result comes
         * back.
         */
printf("debug: mlen=%d inst->base_mrf=%d i=%d\n",inst->mlen,inst->base_mrf,i);
        add_dep(last_mrf_write[inst->base_mrf + i], n);
      }



output when gzdoom crashes:

...

debug: mlen=15 inst->base_mrf=2 i=10
debug: mlen=15 inst->base_mrf=2 i=11
debug: mlen=15 inst->base_mrf=2 i=12
debug: mlen=15 inst->base_mrf=2 i=13
debug: mlen=15 inst->base_mrf=2 i=14



last_mrf_write has maximum size of BRW_MAX_MRF (16)

BAD: inst->base_mrf + i = 14+2 = 16
BAD: accessing out of array size (0..15) and crashed




I unpacked mesa sources first time so do not have enough knowlege how to find real reason of this problem.

But changed array size +1 allows me to run gzdoom greatly:



-schedule_node *last_mrf_write[BRW_MAX_MRF];
+schedule_node *last_mrf_write[BRW_MAX_MRF+1];



Please check this bug. If you need more information I can do more research.
Comment 1 Kenneth Graunke 2012-04-04 16:06:45 UTC
Could you please send us the output of:

   glxinfo | grep renderer

(I'm guessing you're using Sandybridge, but it'd be nice to double check.)  

Thanks!
Comment 2 youonly@mailinator.com 2012-04-05 01:24:32 UTC
OpenGL renderer string: Mesa DRI Intel(R) Ironlake Mobile x86/MMX/SSE2

this is lenovo laptop
Comment 3 youonly@mailinator.com 2012-04-12 03:13:00 UTC
updating to kernel 3.3.1 do not solve this issue. only patched version of brw_fs_schedule_instructions.cpp allows to run gzdoom
Comment 4 Kenneth Graunke 2012-04-24 00:37:08 UTC
Sorry for not getting to this sooner...

The bug is that we were creating a message that tried to use m2..m16, when there technically is no m16 register on the hardware.  This caused us to make an out-of-bounds address during instruction scheduling, and thus crash.  However, simply making the array bigger is not a good solution, since it still means we'd try to use a non-existent register.  According to the documentation, it -may- wrap around and use m0, but that's not guaranteed and we're told not to rely on it.

Instead, I changed it to use m1..m15.

Mesa patches to fix the bug:
http://lists.freedesktop.org/archives/mesa-dev/2012-April/021113.html
http://lists.freedesktop.org/archives/mesa-dev/2012-April/021114.html

Piglit test:
http://lists.freedesktop.org/archives/piglit/2012-April/002251.html

Assuming it passes the review, I'll commit these soon.  Thanks again for the report!
Comment 5 Kenneth Graunke 2012-04-27 16:54:18 UTC
Fixed on Mesa master by:

commit b443ca96a55a06ee215a3f9a9e7dba558deeb58c
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Tue Apr 24 14:09:13 2012 -0700

    i965/fs: Fix FB writes that tried to use the non-existent m16 register.
    
    A little analysis shows that the worst-case value for "nr" is 17:
    - base_mrf = 2                       ... 2
    - header present (say gen == 5)      ... 4
    - aa_dest_stencil_reg (stencil test) ... 5
    - SIMD16 mode: += 4 * reg_width      ... 13
    - source_depth_to_render_target      ... 15
    - dest_depth_reg                     ... 17
    
    This resulted in us setting base_mrf to 2 and mlen to 15.  In other
    words, we'd try to use m2..m16.  But m16 doesn't exist pre-Gen6.  Also,
    the instruction scheduler data structures use arrays of size 16, so this
    would cause us to access them out of bounds.
    
    While the debugger system routine may need m0 and m1, we don't use it
    today, so the simplest solution is just to move base_mrf back to 1.
    That way, our worst case message fits in m1..m15, which is legal.
    
    An alternative would be to fail on SIMD16 in this case, but that seems
    a bit unfortunate if there's no real need to reserve m0 and m1.
    
    Fixes new piglit test shaders/depth-test-and-write on Ironlake.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48218
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Eric Anholt <eric@anholt.net>

which I also cherry-picked to the 8.0 branch as commit bcc5caf642a9beec324657becac501944ce4dc23.