Bug 61036 - Shader fails to build in LLVMpipe, aborts program
Status: RESOLVED FIXED
Product: Mesa
Classification: Unclassified
Component: Other
Version: 9.0
Hardware: x86 (IA32) Linux (All)
Importance: medium major
Assigned To: Jose Fonseca
Reported: 2013-02-18 03:39 UTC by Brian Crowell
Modified: 2013-02-20 18:40 UTC (History)

Attachments
Shader test program (3.63 KB, text/plain)
2013-02-18 03:39 UTC, Brian Crowell
Error message produced by LLVM 3.0 (62.97 KB, text/plain)
2013-02-18 03:55 UTC, Brian Crowell
fix indirect fetches requiring bitcasts (3.12 KB, patch)
2013-02-19 01:37 UTC, Roland Scheidegger

Description Brian Crowell 2013-02-18 03:39:02 UTC
Created attachment 75019 [details]
Shader test program

I'm aware that some of the problems with shaders in LLVMpipe depend on the version of LLVM installed, so I tested the attached program with LLVM 3.0 and 3.2, and both fail in Mesa 9.0.1.

Building with LLVM 3.2 and running the attached program yields:

shader1: /home/james/software/extern/llvm-3.2.src/lib/VMCore/Instructions.cpp:1487: llvm::InsertElementInst::InsertElementInst(llvm::Value*, llvm::Value*, llvm::Value*, const llvm::Twine&, llvm::Instruction*): Assertion `isValidOperands(Vec, Elt, Index) && "Invalid insertelement instruction operands!"' failed.
Aborted


The backtrace:

#0  0xb7fe1430 in __kernel_vsyscall ()
#1  0xb5eb6941 in *__GI_raise (sig=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb5eb9d72 in *__GI_abort () at abort.c:92
#3  0xb5eafb58 in *__GI___assert_fail (
    assertion=0xb7975fa8 "isValidOperands(Vec, Elt, Index) && \"Invalid insertelement instruction operands!\"", 
    file=0xb7975180 "/home/james/software/extern/llvm-3.2.src/lib/VMCore/Instructions.cpp", line=1487, 
    function=0xb797c9c0 "llvm::InsertElementInst::InsertElementInst(llvm::Value*, llvm::Value*, llvm::Value*, const llvm::Twine&, llvm::Instruction*)")
    at assert.c:81
#4  0xb7249aff in llvm::InsertElementInst::InsertElementInst(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Twine const&, llvm::Instruction*) ()
   from /home/james/software/extern/mesa/build/linux-x86-debug/gallium/targets/libgl-xlib/libGL.so.1
#5  0xb6e43a63 in llvm::InsertElementInst::Create(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Twine const&, llvm::Instruction*) ()
   from /home/james/software/extern/mesa/build/linux-x86-debug/gallium/targets/libgl-xlib/libGL.so.1
#6  0xb6e4453b in llvm::IRBuilder<true, llvm::ConstantFolder, llvm::IRBuilderDefaultInserter<true> >::CreateInsertElement(llvm::Value*, llvm::Value*, llvm::Value*, llvm::Twine const&) ()
   from /home/james/software/extern/mesa/build/linux-x86-debug/gallium/targets/libgl-xlib/libGL.so.1
#7  0xb71b4777 in LLVMBuildInsertElement ()
   from /home/james/software/extern/mesa/build/linux-x86-debug/gallium/targets/libgl-xlib/libGL.so.1
#8  0xb684f29f in build_gather (bld=0xbfffb66c, base_ptr=0x81b530c, 
    indexes=0x817df50) at src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:451
#9  0xb684f9d9 in emit_fetch_constant (bld_base=0xbfffb624, reg=0x8331a80, 
    stype=TGSI_TYPE_SIGNED, swizzle=0)
    at src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:640
#10 0xb688562f in lp_build_emit_fetch (bld_base=0xbfffb624, inst=0x8331a50, 
    src_op=0, chan_index=0) at src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:305
#11 0xb688513b in lp_build_fetch_args (bld_base=0xbfffb624, 
    emit_data=0xbfffb4c8) at src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:177
#12 0xb688538b in lp_build_tgsi_inst_llvm (bld_base=0xbfffb624, 
    inst=0x8331a50) at src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:245
#13 0xb6885c30 in lp_build_tgsi_llvm (bld_base=0xbfffb624, tokens=0x81cade8)
    at src/gallium/auxiliary/gallivm/lp_bld_tgsi.c:475
#14 0xb6853f29 in lp_build_tgsi_soa (gallivm=0x81c0138, tokens=0x81cade8, 
    type=..., mask=0x0, consts_ptr=0x81b7514, system_values=0xbfffed1c, 
    pos=0x0, inputs=0xbfffe714, outputs=0xbfffed28, sampler=0x81c1f70, 
    info=0x81ca75c) at src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c:2561
#15 0xb6855f1e in generate_vs (variant=0x81a2b78, builder=0x82c87f8, 
    vs_type=..., outputs=0xbfffed28, inputs=0xbfffe714, 
    system_values=0xbfffed1c, context_ptr=0x81b2380, draw_sampler=0x81c1f70, 
    clamp_vertex_color=1 '\001') at src/gallium/auxiliary/draw/draw_llvm.c:494
#16 0xb6858895 in draw_llvm_generate (llvm=0x805b010, variant=0x81a2b78, 
    elts=0 '\000') at src/gallium/auxiliary/draw/draw_llvm.c:1342
#17 0xb6855d61 in draw_llvm_create_variant (llvm=0x805b010, num_inputs=6, 
    key=0xbffff1b8) at src/gallium/auxiliary/draw/draw_llvm.c:449
#18 0xb685a3f3 in llvm_middle_end_prepare (middle=0x80694c8, in_prim=6, 
    opt=3, max_vertices=0x80668cc)
    at src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:170
#19 0xb67a452f in vsplit_prepare (frontend=0x80668b0, in_prim=6, 
    middle=0x80694c8, opt=3)
    at src/gallium/auxiliary/draw/draw_pt_vsplit.c:175
# And so on


I'm not sure if you care what LLVM 3.0 yields, but it's a very long IR printout and complaint list. I'll attach it separately.
Comment 1 Brian Crowell 2013-02-18 03:55:01 UTC
Created attachment 75025 [details]
Error message produced by LLVM 3.0

This is the text produced by LLVM 3.0 from the program that was the source of the test case. I don't know if it's useful at all; it seems to just be saying that LLVM has no idea what to do with the shader.

The stack trace on that one is a simple verification failure:

#0  0xb704a5bb in _debug_assert_fail (expr=0xb7a8af3c "0", 
    file=0xb7a8ae84 "src/gallium/auxiliary/gallivm/lp_bld_init.c", line=587, 
    function=0xb7a8afa6 "gallivm_verify_function") at src/gallium/auxiliary/util/u_debug.c:278
#1  0xb70c252f in gallivm_verify_function (gallivm=0x8196640, func=0x819bf40)
    at src/gallium/auxiliary/gallivm/lp_bld_init.c:587
#2  0xb70dd891 in draw_llvm_generate (llvm=0x805b200, variant=0x81a0740, elts=0 '\000')
    at src/gallium/auxiliary/draw/draw_llvm.c:1401
#3  0xb70daa7d in draw_llvm_create_variant (llvm=0x805b200, num_inputs=6, key=0xbffff1b8)
    at src/gallium/auxiliary/draw/draw_llvm.c:449
#4  0xb70df10f in llvm_middle_end_prepare (middle=0x80696b8, in_prim=6, opt=3, max_vertices=0x8066abc)
    at src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:170
#5  0xb702924f in vsplit_prepare (frontend=0x8066aa0, in_prim=6, middle=0x80696b8, opt=3)
    at src/gallium/auxiliary/draw/draw_pt_vsplit.c:175
#6  0xb701dcb7 in draw_pt_arrays (draw=0x805a4b8, prim=6, start=0, count=4)
    at src/gallium/auxiliary/draw/draw_pt.c:134
#7  0xb701e901 in draw_vbo (draw=0x805a4b8, info=0xbffff6e4)
    at src/gallium/auxiliary/draw/draw_pt.c:539
#8  0xb6d30b48 in llvmpipe_draw_vbo (pipe=0x80595a0, info=0xbffff6e4)
    at src/gallium/drivers/llvmpipe/lp_draw_arrays.c:103
#9  0xb7009cc7 in cso_draw_vbo (cso=0x811ed08, info=0xbffff6e4)
    at src/gallium/auxiliary/cso_cache/cso_context.c:1347
#10 0xb6e8110f in st_draw_vbo (ctx=0x80c5208, prims=0xbffff774, nr_prims=1, ib=0x0, 
    index_bounds_valid=1 '\001', min_index=0, max_index=3, tfb_vertcount=0x0)
    at src/mesa/state_tracker/st_draw.c:287
#11 0xb6f334c2 in vbo_draw_arrays (ctx=0x80c5208, mode=6, start=0, count=4, numInstances=1, 
    baseInstance=0) at src/mesa/vbo/vbo_exec_array.c:619
#12 0xb6f335b4 in vbo_exec_DrawArrays (mode=6, start=0, count=4) at src/mesa/vbo/vbo_exec_array.c:649
#13 0x08048cdf in main (argc=1, argv=0xbffff944) at mesa-test/shader1.c:127
Comment 2 Roland Scheidegger 2013-02-18 12:40:58 UTC
Should be fixed by c25ae5d27b114e23d5734f846002df1a05759658.
Comment 3 Roland Scheidegger 2013-02-18 12:41:39 UTC
(In reply to comment #2)
> Should be fixed by c25ae5d27b114e23d5734f846002df1a05759658.
Sorry, that was the wrong bug.
Comment 4 Roland Scheidegger 2013-02-19 01:37:55 UTC
Created attachment 75081 [details] [review]
fix indirect fetches requiring bitcasts

The problem is the generated shader contains indirect constant fetches with signed/unsigned src components (btw the tgsi shader looks really complicated):
 12:   UARL ADDR[0].x, TEMP[4].xxxx
 13:   I2F TEMP[9].xy, CONST[ADDR[0].x+3].xyyy

This patch should fix it and indeed it no longer aborts - instead it segfaults within the compiled shader...
Dunno if that's because the test program doesn't quite seem correct:
Mesa: User error: GL_INVALID_ENUM in glGetString(GL_EXTENSIONS)
Mesa: User error: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)

In any case, it looked like not only indirect constant fetches but also indirect temp fetches wouldn't work, so this should fix both.
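
To illustrate why a bitcast rather than a value conversion matters for such fetches, here is a minimal C sketch. It is not Mesa's code: it assumes a hypothetical gather that always produces float-typed lanes from a buffer of raw 32-bit words, and the names `gather_as_float` and `bitcast_to_int` are made up for this example. An integer consumer such as I2F needs the original bit pattern back, which a reinterpretation provides and a float-to-int conversion does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 4 };

/* Hypothetical gather: the constant buffer is a run of raw 32-bit words,
 * and the fetch path hands back float-typed lanes, one per index. */
void gather_as_float(const uint32_t *const_words,
                     const uint32_t indexes[LANES], float out[LANES])
{
    for (size_t lane = 0; lane < LANES; lane++) {
        /* Copy the bits unchanged; only the static type becomes float. */
        memcpy(&out[lane], &const_words[indexes[lane]], sizeof(float));
    }
}

/* Bitcast: reinterpret the same 32 bits as signed integers.
 * This is different from (int32_t)value, which converts the number. */
void bitcast_to_int(const float in[LANES], int32_t out[LANES])
{
    assert(sizeof(float) == sizeof(int32_t));
    memcpy(out, in, LANES * sizeof(int32_t));
}
```

If the word at a gathered index holds the integer 10, `bitcast_to_int` returns 10 for that lane, whereas `(int32_t)` applied to the float lane yields 0, because the bit pattern of a small integer reads as a tiny denormal float.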
Comment 5 Brian Crowell 2013-02-19 03:25:20 UTC
You are quite awesome. The shader runs on my system now.

It's odd, though. On my system, my Mesa doesn't report any errors from
the test program. In fact, in my larger test case, not only does the
shader run, but I see that some Mesa bugs have been fixed since 8.0.5
and the shader output is no longer clamped or incorrectly quantized
(awesome!). This is after applying your patch to master.

You're probably segfaulting because the glVertexAttribPointer isn't
going through, but the GL_INVALID_OPERATION from it is especially
weird, because it would have to be complaining about a missing VAO,
and VAOs aren't around in OpenGL 2.1.

I'm worried when you say the TGSI shader is complicated. Is there
something I'm doing that's unnecessarily complicated?
Comment 6 Roland Scheidegger 2013-02-19 03:56:45 UTC
(In reply to comment #5)
> You are quite awesome. The shader runs on my system now.
> 
> It's odd, though. On my system, my Mesa doesn't report any errors from
> the test program.
Well, if you don't compile with the debug option you won't see those, as you'd then only see errors if you query them from the app.
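
Querying errors from the app usually means calling `glGetError` in a loop until it returns `GL_NO_ERROR`. The following is a rough C sketch of that idiom, written so that it builds without GL headers: the error values are the standard OpenGL ones, but the `SKETCH_` names, `drain_gl_errors`, `gl_error_name`, and `fake_get_error` are assumptions made for this example. In a real program you would pass a thin wrapper around `glGetError` instead of the fake getter, with a current GL context.

```c
#include <assert.h>
#include <stddef.h>

/* Standard OpenGL error values, repeated here so the sketch
 * builds without GL headers. */
enum {
    SKETCH_GL_NO_ERROR          = 0,
    SKETCH_GL_INVALID_ENUM      = 0x0500,
    SKETCH_GL_INVALID_VALUE     = 0x0501,
    SKETCH_GL_INVALID_OPERATION = 0x0502,
    SKETCH_GL_OUT_OF_MEMORY     = 0x0505
};

/* Same shape as glGetError once GLenum is read as unsigned int. */
typedef unsigned int (*get_error_fn)(void);

const char *gl_error_name(unsigned int code)
{
    switch (code) {
    case SKETCH_GL_NO_ERROR:          return "GL_NO_ERROR";
    case SKETCH_GL_INVALID_ENUM:      return "GL_INVALID_ENUM";
    case SKETCH_GL_INVALID_VALUE:     return "GL_INVALID_VALUE";
    case SKETCH_GL_INVALID_OPERATION: return "GL_INVALID_OPERATION";
    case SKETCH_GL_OUT_OF_MEMORY:     return "GL_OUT_OF_MEMORY";
    default:                          return "unknown GL error";
    }
}

/* Drain every pending error. Stores up to max_codes of them in codes
 * and returns how many were pending in total. */
size_t drain_gl_errors(get_error_fn get_error,
                       unsigned int *codes, size_t max_codes)
{
    size_t count = 0;
    unsigned int code;

    assert(get_error != NULL);
    while ((code = get_error()) != SKETCH_GL_NO_ERROR) {
        if (codes != NULL && count < max_codes)
            codes[count] = code;
        count++;
    }
    return count;
}

/* Stand-in for glGetError so the sketch runs without a GL context.
 * It replays the two errors quoted in comment 4, then reports none. */
unsigned int fake_get_error(void)
{
    static const unsigned int queued[] = {
        SKETCH_GL_INVALID_ENUM, SKETCH_GL_INVALID_OPERATION
    };
    static size_t next = 0;

    if (next < sizeof queued / sizeof queued[0])
        return queued[next++];
    return SKETCH_GL_NO_ERROR;
}
```

Calling `drain_gl_errors` right after a suspect call such as `glVertexAttribPointer` narrows down which call raised the error, independent of whether the driver was built with debug output.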

> In fact, in my larger test case, not only does the
> shader run, but I see that some Mesa bugs have been fixed since 8.0.5
> and the shader output is no longer clamped or incorrectly quantized
> (awesome!). This is after applying your patch to master.
Yes, I think this is not very surprising.

> You're probably segfaulting because the glVertexAttribPointer isn't
> going through, but the GL_INVALID_OPERATION from it is especially
> weird, because it would have to be complaining about a missing VAO,
> and VAO's aren't around in OpenGL 2.1.

Yes, that's probably one of the cases where we can't avoid a crash.

> I'm worried when you say the TGSI shader is complicated. Is there
> something I'm doing that's unnecessarily complicated?
It's probably just glsl to tgsi (or the glsl compiler itself) doing a somewhat poor job.
For this rather simple shader,

"void main() {\n"
"    gl_Position = vec4(position * vec2(2.0, 2.0) + vec2(-1.0, -1.0), 0.0, 1.0);\n"
"    frame_coord = position * vec2(frame_size) + frame_offset;\n"
"\n"
"    for( int i = 0; i < input_count; i++ ) {\n"
"        tex_coord[i] = position * vec2(frame_size) + frame_offset + tex_offset[i];\n"
"    }\n"
"}\n";

the generated tgsi looked like this:

IMM[0] FLT32 {    0.0000,     1.0000,     2.0000,    -1.0000}
IMM[1] INT32 {0, 1, 0, 0}
  0: MOV TEMP[1].zw, IMM[0].yyxy
  1: MAD TEMP[1].xy, IN[0].xyyy, IMM[0].zzzz, IMM[0].wwww
  2: MOV TEMP[3], TEMP[1]
  3: MOV TEMP[4].x, IMM[1].xxxx
  4: BGNLOOP :0
  5:   ISGE TEMP[5].x, TEMP[4].xxxx, CONST[2].xxxx
  6:   IF TEMP[5].xxxx :0
  7:     BRK
  8:   ENDIF
  9:   I2F TEMP[6].xy, CONST[1].xyyy
 10:   I2F TEMP[7].xy, CONST[0].xyyy
 11:   MAD TEMP[8].xy, IN[0].xyyy, TEMP[6].xyyy, TEMP[7].xyyy
 12:   UARL ADDR[0].x, TEMP[4].xxxx
 13:   I2F TEMP[9].xy, CONST[ADDR[0].x+3].xyyy
 14:   UARL ADDR[0].x, TEMP[4].xxxx
 15:   ADD TEMP[ADDR[0].x+11].xy, TEMP[8].xyyy, TEMP[9].xyyy
 16:   UADD TEMP[4].x, TEMP[4].xxxx, IMM[1].yyyy
 17: ENDLOOP :0
 18: MOV TEMP[20].xy, TEMP[11].xyxx
 19: MOV TEMP[20].zw, TEMP[12].yyxy
 20: MOV TEMP[21].xy, TEMP[13].xyxx
 21: MOV TEMP[21].zw, TEMP[14].yyxy
 22: MOV TEMP[22].xy, TEMP[15].xyxx
 23: MOV TEMP[22].zw, TEMP[16].yyxy
 24: MOV TEMP[23].xy, TEMP[17].xyxx
 25: MOV TEMP[23].zw, TEMP[18].yyxy
 26: MOV OUT[1], TEMP[20]
 27: MOV OUT[2], TEMP[21]
 28: MOV OUT[3], TEMP[22]
 29: MOV OUT[0], TEMP[3]
 30: MOV OUT[4], TEMP[23]
 31: END

(leaving out all the DCL stuff obviously).
But on closer look it isn't really all that complicated - the I2F and 2->4 element vector stuff just blows the instruction count up. Nothing to worry about really.
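
For readers who want to follow the loop body, here is a scalar C sketch of what instructions 4 to 17 appear to compute. The constant layout is inferred from matching the GLSL against the TGSI above and is an assumption: CONST[0] as `frame_offset`, CONST[1] as `frame_size`, CONST[2].x as `input_count`, CONST[3 + i] as `tex_offset[i]`, and TEMP[11 + i] as `tex_coord[i]`. The types and the function name `emulate_tex_coord_loop` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { float x, y; } vec2;
typedef struct { int32_t x, y; } ivec2;

/* Assumed layout: consts[0] = frame_offset, consts[1] = frame_size,
 * consts[2].x = input_count, consts[3 + i] = tex_offset[i]. */
void emulate_tex_coord_loop(vec2 position, const ivec2 *consts,
                            vec2 *tex_coord)
{
    assert(consts != NULL && tex_coord != NULL);

    /* 4-8, 16-17: loop while i < input_count, incrementing i by 1. */
    for (int32_t i = 0; i < consts[2].x; i++) {
        /* 9-10: I2F converts the integer constants to float. */
        vec2 size   = { (float)consts[1].x, (float)consts[1].y };
        vec2 offset = { (float)consts[0].x, (float)consts[0].y };

        /* 11: MAD, position * frame_size + frame_offset. */
        vec2 base = { position.x * size.x + offset.x,
                      position.y * size.y + offset.y };

        /* 12-13: indirect constant fetch CONST[ADDR[0].x+3], then I2F. */
        vec2 tex_off = { (float)consts[3 + i].x, (float)consts[3 + i].y };

        /* 14-15: indirect temp store, TEMP[ADDR[0].x+11] in the listing. */
        tex_coord[i].x = base.x + tex_off.x;
        tex_coord[i].y = base.y + tex_off.y;
    }
}
```

Instruction 13 is the fetch that combines an indirect index with an integer-typed source, which is the combination named in comment 4.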
Comment 7 Brian Crowell 2013-02-19 05:48:59 UTC
On Mon, Feb 18, 2013 at 9:56 PM,  <bugzilla-daemon@freedesktop.org> wrote:
>> It's odd, though. On my system, my Mesa doesn't report any errors from
>> the test program.
> Well if you don't compile with debug option you won't see those, as you'd
> only
> see errors if you query them from the app.

I'm using the debug build from master, and glGetError returns zero
after that line. By my (admittedly poor) understanding of GL, the
behavior you're seeing only applies at GL 3.1 core or later, which is
when a VAO binding was required. Which would be weird in the llvmpipe.


> But on closer look it isn't really all that complicated - the I2F and 2->4
> element vector stuff just blows the instruction count up. Nothing to worry
> about really.

Good to know I'm not doing anything wacky.

Anyhow, I'll stop bugging you. Thanks so much for the fast fix.
Comment 8 Roland Scheidegger 2013-02-19 16:59:43 UTC
(In reply to comment #7)
> On Mon, Feb 18, 2013 at 9:56 PM,  <bugzilla-daemon@freedesktop.org> wrote:
> >> It's odd, though. On my system, my Mesa doesn't report any errors from
> >> the test program.
> > Well if you don't compile with debug option you won't see those, as you'd
> > only
> > see errors if you query them from the app.
> 
> I'm using the debug build from master, and glGetError returns zero
> after that line. By my (admittedly poor) understanding of GL, the
> behavior you're seeing only applies at GL 3.1 core or later, which is
> when a VAO binding was required. Which would be weird in the llvmpipe.

You're right, my bad. I was using some GL version overrides by accident.
Comment 9 Roland Scheidegger 2013-02-20 18:40:35 UTC
Fixed by 83f7cde1821d8e004d49fb966a323c037631b9a2.