Bug 35603

Summary: GLSL compiler freezes compiling shaders
Product: Mesa
Reporter: sheeettin
Component: glsl-compiler
Assignee: Ian Romanick <idr>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: kenneth, mat
Version: 7.10
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:

Description sheeettin 2011-03-23 13:31:54 UTC
Summary says it all. Start Cube 2: Sauerbraten in anything other than fixed-function mode and it always freezes. (I'm using Cube 2 SVN, but it shouldn't matter.)

Backtrace in frozen state: 

#0  __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:69
#1  0x00007fffef6033a5 in ralloc_vasprintf_append (str=0x7ffffffe6658, fmt=0x7fffef71c384 "%s%d.%02d%s", args=0x7ffffffe6570) at ralloc.c:432
#2  0x00007fffef6034d4 in ralloc_asprintf_append (str=<value optimized out>, fmt=<value optimized out>) at ralloc.c:413
#3  0x00007fffef54dac9 in _mesa_glsl_parse_state::_mesa_glsl_parse_state (this=0x1147820, ctx=<value optimized out>, target=<value optimized out>, mem_ctx=<value optimized out>) at glsl_parser_extras.cpp:110
#4  0x00007fffef609bf7 in read_builtins (target=35633, protos=0x7fffef762820 "(\n(function radians\n  (signature float\n    (parameters\n      (declare (in) float degrees))\n    ())\n  (signature vec2\n    (parameters\n      (declare (in) vec2 degrees))\n    ())\n  (signature vec3\n    (p"..., functions=0x7fffefa43ba0, count=72) at builtin_function.cpp:42
#5  0x00007fffef609da0 in _mesa_read_profile (state=0x1147170, profile_index=3, prototypes=<value optimized out>, functions=<value optimized out>, count=<value optimized out>, instructions=<value optimized out>) at builtin_function.cpp:13535
#6  0x00007fffef60a011 in _mesa_glsl_initialize_functions (instructions=<value optimized out>, state=0x1147170) at builtin_function.cpp:13580
#7  0x00007fffef60467a in _mesa_ast_to_hir (instructions=0x112c380, state=0x1147170) at ast_to_hir.cpp:63
#8  0x00007fffef602420 in _mesa_glsl_compile_shader (ctx=0x1071370, shader=0x1142b70) at program/ir_to_mesa.cpp:2973
#9  0x0000000000559a86 in compileglslshader (type=35633, obj=@0x14cbc58, def=0x11f0f40 "\n    void main(void)\n    {\n        gl_Position = ftransform();\n        gl_TexCoord[0] = gl_MultiTexCoord0;\n        gl_FrontColor = gl_Color;\n    }\n", tname=0x60e5bf "VS", name=0x10fc0b0 "default", msg=true) at engine/shader.cpp:160
#10 0x000000000055c718 in Shader::compile (this=0x14cbc28) at engine/shader.cpp:756
#11 0x000000000055d6d6 in newshader (type=4, name=0x1266f40 "default", vs=0x11f0e20 "\n    void main(void)\n    {\n        gl_Position = ftransform();\n        gl_TexCoord[0] = gl_MultiTexCoord0;\n        gl_FrontColor = gl_Color;\n    }\n", ps=0x1147710 "\n    uniform sampler2D tex0;\n    void main(void)\n    {\n        gl_FragColor = gl_Color * texture2D(tex0, gl_TexCoord[0].xy);\n    }\n", variant=0x0, row=0) at engine/shader.cpp:882
#12 0x000000000056168c in shader (type=0x7fffffffda80, name=0x1266f40 "default", vs=0x11f0e20 "\n    void main(void)\n    {\n        gl_Position = ftransform();\n        gl_TexCoord[0] = gl_MultiTexCoord0;\n        gl_FrontColor = gl_Color;\n    }\n", ps=0x1147710 "\n    uniform sampler2D tex0;\n    void main(void)\n    {\n        gl_FragColor = gl_Color * texture2D(tex0, gl_TexCoord[0].xy);\n    }\n") at engine/shader.cpp:1575
#13 0x00000000004341b7 in runcode (code=0x126706c, result=...) at engine/command.cpp:1433
#14 0x0000000000436148 in execute (    p=0x13c3dc0 "// standard shader definitions\n\nlazyshader = [\n    defershader $arg1 $arg2 [\n        shader @arg1 @arg2 [@@arg3] [@@arg4]\n    ]\n]\n\nlmcoordscale = (divf 1 32767)\n\n", '/' <repeats 38 times>...) at engine/command.cpp:1660
#15 0x0000000000436267 in execfile (cfgfile=0x60e340 "data/glsl.cfg", msg=true) at engine/command.cpp:1679
#16 0x0000000000559468 in loadshaders () at engine/shader.cpp:67
#17 0x00000000004667c1 in main (argc=7, argv=0x7fffffffe258) at engine/main.cpp:1085

The file containing the shaders: http://sauerbraten.svn.sourceforge.net/viewvc/sauerbraten/data/glsl.cfg?content-type=text%2Fplain
(The freeze happens on the very first shader, "default", which is not complicated at all.)
Comment 1 Ian Romanick 2011-03-23 15:42:40 UTC
Looking at that backtrace, the compiler is still bootstrapping itself.  I don't think that it has gotten into Sauerbraten's shader at all.  This makes me a bit suspicious about your Mesa build.  Is this the distro install, or did you build it yourself?
Comment 2 sheeettin 2011-03-23 18:24:07 UTC
The distribution's (Ubuntu Natty; mesa 7.10.1-0ubuntu3).
Comment 3 sheeettin 2011-03-24 11:57:15 UTC
Yeah, exact same behavior when switching Blender's display to GLSL as well. Definitely not a Cube 2 issue.
Comment 4 Kenneth Graunke 2011-03-25 00:03:22 UTC
Cube 2: Sauerbraten works fine for me with the 7.10 branch, i965 driver, on Arch Linux.  I don't have an Ubuntu box handy (much less an alpha).

What driver are you using?  The only thing I can think of is that the driver isn't setting ctx->Const.GLSLVersion and it contains (large) rubbish, so it's spinning around the loop of supported versions for a really really long time.  The code your backtrace points to is just doing simple string appends...
Comment 5 Mathieu Virbel 2011-05-17 03:36:42 UTC
Hi, I have a user who gets exactly the same backtrace. It's not with the game Sauerbraten, but with the Kivy framework.

OpenGL information:
* OpenGL version 2.1 Mesa 7.10.2
* OpenGL vendor Tungsten Graphics, Inc
* OpenGL renderer Mesa DRI Intel(R) 965GM GEM 20100330 DEVELOPMENT x86/MMX/SSE2

The shaders come from:
* https://github.com/tito/kivy/raw/master/kivy/data/glsl/default.fs
* https://github.com/tito/kivy/raw/master/kivy/data/glsl/default.vs

And here is the backtrace:

(gdb) bt
#0  0xb7cf076f in ?? () from /lib/i386-linux-gnu/libc.so.6
#1  0xb5d56957 in ralloc_vasprintf_append () from /usr/lib/dri/libglsl.so
#2  0xb5d56a4b in ralloc_asprintf_append () from /usr/lib/dri/libglsl.so
#3  0xb5d77cd2 in _mesa_glsl_parse_state::_mesa_glsl_parse_state(gl_context*, unsigned int, void*) () from /usr/lib/dri/libglsl.so
#4  0xb5d67533 in read_builtins(unsigned int, char const*, char const**, unsigned int) () from /usr/lib/dri/libglsl.so
#5  0xb5d67704 in ?? () from /usr/lib/dri/libglsl.so
#6  0xb5d67adf in _mesa_glsl_initialize_functions(exec_list*, _mesa_glsl_parse_state*) () from /usr/lib/dri/libglsl.so
#7  0xb5d61725 in _mesa_ast_to_hir(exec_list*, _mesa_glsl_parse_state*) ()
   from /usr/lib/dri/libglsl.so
#8  0xb5fbfcb2 in _mesa_glsl_compile_shader () from /usr/lib/dri/libdricore.so
#9  0xb5ec96bd in ?? () from /usr/lib/dri/libdricore.so
#10 0xb758e03c in __pyx_f_4kivy_8graphics_6shader_6Shader_compile_shader (
    __pyx_v_self=0x85c8ca4, 
    __pyx_v_source=0x83af65c "#ifdef GL_ES\n    precision highp float;\n#endif\n\n/* Outputs from the vertex shader */\nvarying vec4 frag_color;\nvarying vec2 tex_coord0;\n\n/* uniform texture samplers */\nuniform sampler2D texture0;\n\nvoid"..., __pyx_v_shadertype=0x853966c)
    at /tmp/easy_install-mAI7rI/Kivy-1.0.6/kivy/graphics/shader.c:2655
#11 0xb758d62e in __pyx_f_4kivy_8graphics_6shader_6Shader_build_fragment (
    __pyx_v_self=0x85c8ca4)
    at /tmp/easy_install-mAI7rI/Kivy-1.0.6/kivy/graphics/shader.c:2305
#12 0xb758b1da in __pyx_pf_4kivy_8graphics_6shader_6Shader_2fs_1__set__ (
    o=0x85c8ca4, v=0x82417d8, x=0x0)
    at /tmp/easy_install-mAI7rI/Kivy-1.0.6/kivy/graphics/shader.c:3230
#13 __pyx_setprop_4kivy_8graphics_6shader_6Shader_fs (o=0x85c8ca4, 
    v=0x82417d8, x=0x0)
    at /tmp/easy_install-mAI7rI/Kivy-1.0.6/kivy/graphics/shader.c:3366
Comment 6 Mathieu Virbel 2011-05-17 06:49:18 UTC
The previous backtrace is from someone running 2.6.38-8-generic-pae #42-Ubuntu SMP on Ubuntu 11.04 (so the 32-bit version).

I have another user on x86_64 / Mageia. Even though the backtrace doesn't look the same, it also seems to be stuck in strlen() during compilation. I'm not entirely sure it's related to this bug: http://pastebin.com/Rz8T0zFw
Comment 7 Mathieu Virbel 2011-05-21 15:08:42 UTC
Got another user with the same backtrace too... http://paste.pocoo.org/show/392792/

What information do you need to look closer at the bug?

Comment 8 Kenneth Graunke 2011-05-21 18:46:22 UTC
Where did you get your Mesa from?  Stock Ubuntu Natty distro packages, or what?

Clearly your Mesa has been patched somehow - unpatched Mesa 7.10.2 doesn't mention "GEM 20100330 DEVELOPMENT" in the renderer string.  I can't reproduce any of this with unpatched, vanilla Mesa and I have _no idea_ what could cause this.  I'm wondering if there's a bad patch in your distro package or something.
Comment 9 Mathieu Virbel 2011-05-22 02:52:45 UTC
(In reply to comment #8)
> Where did you get your Mesa from?  Stock Ubuntu Natty distro packages, or what?
> 
> Clearly your Mesa has been patched somehow - unpatched Mesa 7.10.2 doesn't
> mention "GEM 20100330 DEVELOPMENT" in the renderer string.  I can't reproduce
> any of this with unpatched, vanilla Mesa and I have _no idea_ what could cause
> this.  I'm wondering if there's a bad patch in your distro package or
> something.

Thomas said it's the default version shipped with Ubuntu Natty.
Comment 10 Mathieu Virbel 2011-05-26 07:22:26 UTC
OK, I got hold of a computer to play with and track down this issue. (Note: it happens almost every time, around 99%, but occasionally it works.)

After looking through the Mesa commits, I found this one:

"glsl: Fix memory error when creating the supported version string."

http://cgit.freedesktop.org/mesa/mesa/commit/?id=a7d350790b4d0416117bc785aa77de52e9298a01

Judging by the commit message, it looks like exactly the issue we are hitting, so I checked my Ubuntu version and its source code:

$ apt-cache show libgl1-mesa-dri|grep Version
Version: 7.10.2-0ubuntu2

After getting the source code, I see that they are still using:
char *supported = (char*)ralloc_context(this);

Could that be it?
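As a rough illustration of why the starting value of such a string matters, here is a minimal self-contained sketch. It assumes the append helper measures the existing string with strlen() first, as frame #0 of the backtrace suggests. The names append and empty_string are hypothetical stand-ins, not Mesa or ralloc functions, and plain malloc/realloc is used in place of ralloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical analogue of an "append to a heap string" helper. It measures
// the existing string before growing it, so *str must already point to a
// NUL-terminated buffer. An allocation that never had a terminator written
// into it would make strlen() read past the end.
void append(char **str, const char *tail)
{
    assert(str != nullptr && *str != nullptr);
    std::size_t old_len = std::strlen(*str);   // scans until the first '\0'
    std::size_t tail_len = std::strlen(tail);
    char *grown = static_cast<char *>(std::realloc(*str, old_len + tail_len + 1));
    assert(grown != nullptr);
    std::memcpy(grown + old_len, tail, tail_len + 1);  // copies the terminator too
    *str = grown;
}

// Safe starting point: a one-byte allocation holding only '\0'.
char *empty_string()
{
    char *s = static_cast<char *>(std::malloc(1));
    assert(s != nullptr);
    s[0] = '\0';
    return s;
}
```

Starting from empty_string() and appending "1.10" and then ", 1.20" yields "1.10, 1.20". Starting from a buffer that was never terminated would be undefined behaviour in this sketch.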
Comment 11 Mathieu Virbel 2011-05-26 08:30:47 UTC
OK, no: after recompiling, the error is still there. However, when I look at the generated supported string, it's:

(gdb) p supported
$21 = 0x8baf658 "1.10, 1.20, 1.30, 1.40, 1.50, 1.60, 1.70, 1.80, 1.90, 2.00, 2.10, 2.20, 2.30, 2.40, 2.50, 2.60, 2.70, 2.80, 2.90, 3.00, 3.10, 3.20, 3.30, 3.40, 3.50, 3.60, 3.70, 3.80, 3.90, 4.00, 4.10, 4.20, 4.30, 4."...

Looking at how the loop works, even though I can't print highest_version, here is what I see:

(gdb) p ctx->API == API_OPENGL
$30 = true
(gdb) p (unsigned int)ctx->Const.GLSLVersion
$31 = 135840965

lowest_version will always be 100 or 110,
but with such a big highest_version, is the generated string too big to handle? :/

(gdb) call (unsigned int)strlen(supported)
$35 = 3083806448

Checking deeper where this const is set...
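To get a feel for the numbers, here is a small sketch of a version-list loop like the one described above. It is an assumption-laden stand-in, not the Mesa code: build_supported and entry_count are hypothetical names, std::string replaces the ralloc string, and the max_entries cap exists only so the sketch terminates quickly.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the supported-version loop. Versions are encoded
// as integers (110 means "1.10") and advance in steps of 10. max_entries is
// an artificial cap added for this sketch only.
std::string build_supported(unsigned lowest, unsigned highest, unsigned max_entries)
{
    std::string supported;
    unsigned emitted = 0;
    for (unsigned ver = lowest; ver <= highest && emitted < max_entries;
         ver += 10, ++emitted) {
        char buf[32];
        std::snprintf(buf, sizeof buf, "%s%u.%02u",
                      emitted ? ", " : "", ver / 100, ver % 100);
        supported += buf;
    }
    return supported;
}

// Number of entries an uncapped loop would emit for the given bounds.
unsigned long entry_count(unsigned lowest, unsigned highest)
{
    assert(lowest <= highest);
    return (highest - lowest) / 10ul + 1;
}
```

With lowest = 110 and the GLSLVersion value printed by gdb above (135840965), entry_count gives 13584086 entries. If every append first runs strlen() over the whole string built so far, as the backtrace suggests, the total work grows roughly quadratically, which would look like a freeze.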
Comment 12 Mathieu Virbel 2011-05-26 09:05:59 UTC
OK, I got it. This is due to the changes made by
http://cgit.freedesktop.org/mesa/mesa/commit/?id=14880a510a1a288df0778395097d5a52806abfb0

Everything happens in src/glsl/builtin_function.cpp. At the start, a fakeCtx is declared (but not initialized...) and passed directly to _mesa_glsl_parse_state.
This uninitialized fakeCtx was enough before, but that commit reads more information from it, namely Const.GLSLVersion.

Unfortunately, this code is not what current Mesa trunk has, but it is what Mesa 7.10.2 actually uses.
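A minimal sketch of the difference between a declared-only context and a value-initialized one. The FakeContext types below are hypothetical, heavily reduced stand-ins for struct gl_context (only the three fields mentioned in this report are modelled), and make_builtin_context is an illustrative helper, not a Mesa function.

```cpp
#include <cassert>

// Hypothetical, reduced stand-ins for the Mesa types. The real struct
// gl_context has many more members.
struct FakeConstants  { unsigned GLSLVersion; };
struct FakeExtensions { unsigned char ARB_ES2_compatibility; };
struct FakeContext {
    int API;
    FakeConstants Const;
    FakeExtensions Extensions;
};

// A plain `FakeContext ctx;` at block scope leaves every member with an
// indeterminate value. Value-initializing with `= {}` zeroes all members,
// after which the fields the parser reads are set explicitly.
FakeContext make_builtin_context(int api, unsigned glsl_version)
{
    FakeContext ctx = {};
    ctx.API = api;
    ctx.Const.GLSLVersion = glsl_version;
    return ctx;
}
```

Zeroing the whole struct first means a field that later code starts reading defaults to 0 rather than to whatever happened to be on the stack.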


Comment 13 Mathieu Virbel 2011-05-26 10:03:41 UTC
--- src/glsl/builtin_function.cpp.before	2011-05-26 18:54:01.299953729 +0200
+++ src/glsl/builtin_function.cpp	2011-05-26 18:54:44.730017427 +0200
@@ -37,6 +37,8 @@
 {
    struct gl_context fakeCtx;
    fakeCtx.API = API_OPENGL;
+   fakeCtx.Const.GLSLVersion = 120;
+   fakeCtx.Extensions.ARB_ES2_compatibility = 0;
    gl_shader *sh = _mesa_new_shader(NULL, 0, target);
    struct _mesa_glsl_parse_state *st =
       new(sh) _mesa_glsl_parse_state(&fakeCtx, target, sh);


Would that be OK for read_builtins as a temporary fix for Ubuntu's Mesa 7.10.2?

Comment 14 Kenneth Graunke 2011-05-26 14:46:40 UTC
Mathieu, thanks so much for this detailed analysis!

As you found, there are two issues:

First, we missed commit a7d350790b4 (ralloc_context -> ralloc_strdup) when cherry-picking patches to the 7.10 branch.  I went ahead and cherry-picked it today.

Secondly, builtin_function.cpp in the _release tarballs_ is not setting Const.GLSLVersion and Extensions.ARB_ES2_compatibility correctly.  This results in the highest supported language version being undefined, and thus potentially huge, so it generates an enormously long string.  I fixed this in dfdb9fda8 on master, which was cherry-picked to 7.10 as ab58b21634 ages ago.

The problem is that builtin_function.cpp has __not been regenerated in the tarballs__, so it doesn't contain the bug fix (made in generate_builtins.py).  If you build from git, the file is properly generated and works fine---which is why none of us saw the issues.

Reassigning to Ian since he made the tarballs.  Ian, what happened?
Comment 15 Kenneth Graunke 2011-05-31 12:08:13 UTC
Embarrassing.  We forgot to run 'make builtins' after cherry-picking.  I just did that and pushed.

So now we just need a 7.10.3 release.
