Bug 71870 - Metro: Last Light rendering issues
Summary: Metro: Last Light rendering issues
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Mesa core (show other bugs)
Version: git
Hardware: All All
: medium normal
Assignee: mesa-dev
QA Contact:
URL:
Whiteboard:
Keywords:
: 71239 (view as bug list)
Depends on:
Blocks:
 
Reported: 2013-11-21 11:05 UTC by Tapani Pälli
Modified: 2014-02-21 10:31 UTC (History)
9 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Metro LL runnung on intel mesa (1.19 MB, image/png)
2014-02-15 14:05 UTC, Darius Spitznagel
Details
Patch for ignoring macros starting with __ (630 bytes, patch)
2014-02-16 12:49 UTC, Mike Lothian
Details | Splinter Review
ast_to_hir.cpp also needs a patch? (740 bytes, patch)
2014-02-18 05:43 UTC, Krzysztof A. Sobiecki
Details | Splinter Review

Description Tapani Pälli 2013-11-21 11:05:28 UTC
Game has several rendering issues, some of they may be Mesa Core but filing bug for i965 component for now.

Discovered issues:

1. all lighting and shadows are missing making the game look terrible
2. a flash of 'outside world' visible in the metro tunnel when game menu comes up
3. text/graphics missing from game menus
4. sudden graphics glitches, looks like parts of low resolution fbo rendered on top of scene now and then when moving the camera
5. flashes of objects rendered with solid red color (just when they become visible for the first time?)

I've dumped all shaders of the game and they look like converted from D3D to GL, I don't see anything obvious there but then again there's ~1600 of them to look at. Game requires at least OpenGL 3.2 and GLSL 1.50. I have also apitrace dump which reveals that at least DXT1 and DXT5 formats are utilized. There's quite a lot of small textures which look like would contribute to shadow/lighting. DXT textures cannot be shown by apitrace gui so I'll try to get raw dumps from the driver to see what's in them.
Comment 1 Tapani Pälli 2013-11-21 11:06:38 UTC
There's related bug #71239 but that's for issues with r600g.
Comment 2 Tapani Pälli 2013-11-22 06:19:55 UTC
update .. now Metro has started to crash on startup, bisected this to:

commit 068a073c1d4853b5c8f33efdeb481026f42e23a5
Author: Paul Berry <stereotype441@gmail.com>
Date:   Tue Nov 19 13:31:20 2013 -0800

    meta: fix meta clear of layered framebuffers

backtrace:

(gdb) bt
#0  0xf7747430 in __kernel_vsyscall ()
#1  0xf73c11ef in ?? ()
#2  0xf441f0fb in intel_upload_map () from /opt/lib/dri/i965_dri.so
#3  0xf441f1bf in intel_upload_unmap () from /opt/lib/dri/i965_dri.so
#4  0xf44a86d2 in brw::vec4_visitor::get_timestamp() () from /opt/lib/dri/i965_dri.so
#5  0xf4443ab2 in brw_draw_destroy () from /opt/lib/dri/i965_dri.so
#6  0xf4443e41 in brw_get_vertex_surface_type () from /opt/lib/dri/i965_dri.so
#7  0xf42736b8 in vbo_VertexAttribP4uiv () from /opt/lib/dri/i965_dri.so
#8  0xf4274402 in vbo_VertexAttribP4uiv () from /opt/lib/dri/i965_dri.so
#9  0xf42747c7 in vbo_Materialfv () from /opt/lib/dri/i965_dri.so

If I revert this (and bb354c6c279031dafc08029a62cd3e76a6c1ca71) game works fine again. Will investigate what goes wrong.
Comment 3 Paul Berry 2013-11-22 14:57:34 UTC
(In reply to comment #2)
> update .. now Metro has started to crash on startup, bisected this to:
> 
> commit 068a073c1d4853b5c8f33efdeb481026f42e23a5
> Author: Paul Berry <stereotype441@gmail.com>
> Date:   Tue Nov 19 13:31:20 2013 -0800
> 
>     meta: fix meta clear of layered framebuffers
> 
> backtrace:
> 
> (gdb) bt
> #0  0xf7747430 in __kernel_vsyscall ()
> #1  0xf73c11ef in ?? ()
> #2  0xf441f0fb in intel_upload_map () from /opt/lib/dri/i965_dri.so
> #3  0xf441f1bf in intel_upload_unmap () from /opt/lib/dri/i965_dri.so
> #4  0xf44a86d2 in brw::vec4_visitor::get_timestamp() () from
> /opt/lib/dri/i965_dri.so
> #5  0xf4443ab2 in brw_draw_destroy () from /opt/lib/dri/i965_dri.so
> #6  0xf4443e41 in brw_get_vertex_surface_type () from
> /opt/lib/dri/i965_dri.so
> #7  0xf42736b8 in vbo_VertexAttribP4uiv () from /opt/lib/dri/i965_dri.so
> #8  0xf4274402 in vbo_VertexAttribP4uiv () from /opt/lib/dri/i965_dri.so
> #9  0xf42747c7 in vbo_Materialfv () from /opt/lib/dri/i965_dri.so
> 
> If I revert this (and bb354c6c279031dafc08029a62cd3e76a6c1ca71) game works
> fine again. Will investigate what goes wrong.

The function calls in this backtrace don't make sense (looking at the source code, vbo_VertexAttribP4uiv() never calls brw_get_vertex_surface_type(), brw_get_vertex_surface_type() never calls brw_draw_destroy(), and so on).  Is it possible that you've built libdrm without debugging symbols, and that's causing you to get a corrupted backtrace?

If you can provide an apitrace that reproduces the problem, I'd be glad to help debug this.
Comment 4 Tapani Pälli 2013-11-22 15:40:00 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > update .. now Metro has started to crash on startup, bisected this to:
> > 
> > commit 068a073c1d4853b5c8f33efdeb481026f42e23a5
> > Author: Paul Berry <stereotype441@gmail.com>
> > Date:   Tue Nov 19 13:31:20 2013 -0800
> > 
> >     meta: fix meta clear of layered framebuffers
> > 
> > backtrace:
> > 
> > (gdb) bt
> > #0  0xf7747430 in __kernel_vsyscall ()
> > #1  0xf73c11ef in ?? ()
> > #2  0xf441f0fb in intel_upload_map () from /opt/lib/dri/i965_dri.so
> > #3  0xf441f1bf in intel_upload_unmap () from /opt/lib/dri/i965_dri.so
> > #4  0xf44a86d2 in brw::vec4_visitor::get_timestamp() () from
> > /opt/lib/dri/i965_dri.so
> > #5  0xf4443ab2 in brw_draw_destroy () from /opt/lib/dri/i965_dri.so
> > #6  0xf4443e41 in brw_get_vertex_surface_type () from
> > /opt/lib/dri/i965_dri.so
> > #7  0xf42736b8 in vbo_VertexAttribP4uiv () from /opt/lib/dri/i965_dri.so
> > #8  0xf4274402 in vbo_VertexAttribP4uiv () from /opt/lib/dri/i965_dri.so
> > #9  0xf42747c7 in vbo_Materialfv () from /opt/lib/dri/i965_dri.so
> > 
> > If I revert this (and bb354c6c279031dafc08029a62cd3e76a6c1ca71) game works
> > fine again. Will investigate what goes wrong.
> 
> The function calls in this backtrace don't make sense (looking at the source
> code, vbo_VertexAttribP4uiv() never calls brw_get_vertex_surface_type(),
> brw_get_vertex_surface_type() never calls brw_draw_destroy(), and so on). 
> Is it possible that you've built libdrm without debugging symbols, and
> that's causing you to get a corrupted backtrace?

True, I'll try to get proper backtrace. This triggers through brw_clear as I can workaround by using the non-glsl clear (the one for gles1)


> If you can provide an apitrace that reproduces the problem, I'd be glad to
> help debug this.

Yes I can do this
Comment 5 Tapani Pälli 2013-11-24 16:29:34 UTC
hmm still getting weird backtrace with unstripped libdrm, I noticed also on console this message:

MetroLL: brw_state_batch.c:127: brw_state_batch: Assertion `size < batch->bo->size' failed.

Paul, I'll get a apitrace for you tomorrow and debug some more.
Comment 6 Tapani Pälli 2013-11-25 05:52:50 UTC
new backtrace

---- 8< ---
Program received signal SIGABRT, Aborted.
0xf7791430 in __kernel_vsyscall ()
(gdb) bt
#0  0xf7791430 in __kernel_vsyscall ()
#1  0xf73d5aff in raise () from /lib/i386-linux-gnu/libc.so.6
#2  0xf73d9083 in abort () from /lib/i386-linux-gnu/libc.so.6
#3  0xf73ce857 in ?? () from /lib/i386-linux-gnu/libc.so.6
#4  0xf73ce907 in __assert_fail () from /lib/i386-linux-gnu/libc.so.6
#5  0xf44735a8 in brw_state_batch (brw=0xaa7d8d8, type=AUB_TRACE_BINDING_TABLE, size=171503404, alignment=32, out_offset=0xaab85f8) at brw_state_batch.c:127
#6  0xf43e5f85 in brw_upload_binding_table (brw=0xaa7d8d8, brw_new_binding_table=2048, stage_state=0xaab85dc) at brw_binding_tables.c:76
#7  0xf43e6046 in brw_gs_upload_binding_table (brw=0xaa7d8d8) at brw_binding_tables.c:134
#8  0xf4477519 in brw_upload_state (brw=0xaa7d8d8) at brw_state_upload.c:557
#9  0xf440c32c in brw_try_draw_prims (ctx=0xaa7d8d8, arrays=0xaad5224, prims=0xffe8c040, nr_prims=1, ib=0x0, min_index=10, max_index=12) at brw_draw.c:421
#10 0xf440c6ef in brw_draw_prims (ctx=0xaa7d8d8, prims=0xffe8c040, nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=10, max_index=12, 
    unused_tfb_object=0x0) at brw_draw.c:508
#11 0xf421f38c in vbo_draw_arrays (ctx=0xaa7d8d8, mode=4, start=10, count=3, numInstances=1, baseInstance=0) at vbo/vbo_exec_array.c:660
#12 0xf421fccc in vbo_exec_DrawArrays (mode=4, start=10, count=3) at vbo/vbo_exec_array.c:812
Comment 7 Tapani Pälli 2013-11-25 13:12:23 UTC
Game hits assert in brw_state_batch, the incoming size parameter varies and seems uninitialized data. Seems like this is done by 'mark_surface_used' in vs and fs generators but not sure yet why/how this is not happening.
Comment 8 Paul Berry 2013-11-25 16:35:09 UTC
(In reply to comment #7)
> Game hits assert in brw_state_batch, the incoming size parameter varies and
> seems uninitialized data. Seems like this is done by 'mark_surface_used' in
> vs and fs generators but not sure yet why/how this is not happening.

I haven't been able to reproduce the problem yet, but based on the backtrace I have a guess as to why it's occurring:

We have an existing bug in which brw_gs_upload_binding_table() incorrectly uses a NULL check on brw->gs.prog_data to determine whether a geometry shader is currently active.  This doesn't work: brw->gs.prog_data is non-NULL if a geometry shader has *ever* been active.  The correct way to determine whether a geometry shader is currently active is to check whether brw->geometry_program is NULL.

As a result of this bug, if no geometry shader is inactive, but one was active in the past, brw_gs_upload_binding_table() will refer to stale data from a previously active geometry shader.  Most of the time this is benign because the stale data is not used by the hardware (since no geometry shader is actually active).  However, if the program cache gets flushed, then the stale data now resides in free memory blocks.  If that memory gets reallocated, then we will get data corruption when brw_gs_upload_binding_table() tries to refer to it.

Commit 068a073c1d4853b5c8f33efdeb481026f42e23a5 makes the situation worse because it introduces a geometry shader into the Meta clear program.  As a result, even client applications that don't use geometry shaders will run into this problem.

Tapani, can you please try running Metro again with http://lists.freedesktop.org/archives/mesa-dev/2013-November/049081.html applied?  I'm hoping that this will fix the crash issue and take you back to the state where there are just rendering issues.
Comment 9 Tapani Pälli 2013-11-25 19:30:08 UTC
(In reply to comment #8)
> (In reply to comment #7)
 
> Tapani, can you please try running Metro again with
> http://lists.freedesktop.org/archives/mesa-dev/2013-November/049081.html
> applied?  I'm hoping that this will fix the crash issue and take you back to
> the state where there are just rendering issues.

Yes, just confirmed that this patch fixes the crash
Comment 10 Paul Berry 2013-11-26 21:53:14 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
>  
> > Tapani, can you please try running Metro again with
> > http://lists.freedesktop.org/archives/mesa-dev/2013-November/049081.html
> > applied?  I'm hoping that this will fix the crash issue and take you back to
> > the state where there are just rendering issues.
> 
> Yes, just confirmed that this patch fixes the crash

Ok, the crash is fixed by commit 2714ca81b9bad3dec3894fac97f34502c80b1697
Author: Paul Berry <stereotype441@gmail.com>
Date:   Mon Nov 25 08:03:24 2013 -0800

    i965/gs: Properly skip GS binding table upload when no GS active.
    
    Previously, in brw_gs_upload_binding_table(), we checked whether
    brw->gs.prog_data was NULL in order to determine whether a geometry
    shader was active.  This didn't work: brw->gs.prog_data starts off as
    NULL, but it is set to non-NULL when a geometry shader program is
    built, and then never set to NULL again.  As a result, if we called
    brw_gs_upload_binding_table() while there was no geometry shader
    active, but a geometry shader had previously been active, it would
    refer to a stale (and possibly freed) prog_data structure.
    
    This patch fixes the problem by modifying
    brw_gs_upload_binding_table() to use the proper technique to determine
    whether a geometry shader is active: by checking whether
    brw->geometry_program is NULL.
    
    This fixes the crash reported in comment 2 of bug 71870 (the incorrect
    rendering remains, however).    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
    
    Cc: mesa-stable@lists.freedesktop.org
    
    Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
    Reviewed-by: Eric Anholt <eric@anholt.net>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Comment 11 Thomas Rohloff 2013-12-25 09:40:46 UTC
Maybe the apitrace from here https://bugs.freedesktop.org/show_bug.cgi?id=71239#c16 is helpfull.
Comment 12 Andreas Pokorny 2014-01-06 12:43:33 UTC
Not sure if you need another one. This apitrace was done with the intel driver from mesa 10.0.1 which shows the missing render passes for light effects.

http://ubuntuone.com/1bjCam2I24y4YDDT2ZrDdD

I can also provide another one captured with the windows/wine version of the game on the same system. But apitrace complains "error: unknnown event 4" on the that trace file. Would that file be helpful?
Comment 13 Andreas Pokorny 2014-01-06 12:57:01 UTC
+ the wine version does not show the missing light renderig.
Comment 14 Andreas Pokorny 2014-01-06 16:09:09 UTC
here it is: http://ubuntuone.com/2EZfBdqbuQythW2vCkO8FJ (wine apitrace on same system without the light renderig issues)
Comment 15 Bernd Kosmahl 2014-01-09 09:31:59 UTC
I made an api trace as well. Perhaps it's helpful. The game works just fine for me:

OpenGL renderer string: GeForce GTX 470/PCIe/SSE2
OpenGL version string: 4.2.0 NVIDIA 304.88 

http://beko.maxr.org/MetroLL.trace.tar.gz (674M)
http://beko.maxr.org/MetroLL.trace.tar.gz.sha1

I started the game, changed the "gfx settings slider" to MAX and loaded a savegame. After that I quit the game.
Comment 16 Tapani Pälli 2014-01-09 10:11:57 UTC
(In reply to comment #15)
> I made an api trace as well. Perhaps it's helpful. The game works just fine
> for me:
> 
> OpenGL renderer string: GeForce GTX 470/PCIe/SSE2
> OpenGL version string: 4.2.0 NVIDIA 304.88 
> 
> http://beko.maxr.org/MetroLL.trace.tar.gz (674M)
> http://beko.maxr.org/MetroLL.trace.tar.gz.sha1
> 
> I started the game, changed the "gfx settings slider" to MAX and loaded a
> savegame. After that I quit the game.

Bernd, would it be possible for you to try traces captured by Andreas?
Comment 17 Bernd Kosmahl 2014-01-09 11:53:58 UTC
How to do that? Please point me to a howto. I know next to nothing about this and just followed a call for apitraces from the Steam Community.
Comment 18 Tapani Pälli 2014-01-10 09:20:06 UTC
(In reply to comment #17)
> How to do that? Please point me to a howto. I know next to nothing about
> this and just followed a call for apitraces from the Steam Community.

Documentation exists as part of apitrace in file README.markdown, problem is that you might have different version and the replay of the file won't work (like is happening to me with these traces) but you could have a try. It would be interesting to know if Nvidia rendering displays correct with the trace file taken on Intel.
Comment 19 Bernd Kosmahl 2014-01-11 13:19:25 UTC
This is what I get with the trace http://ubuntuone.com/1bjCam2I24y4YDDT2ZrDdD from Andreas:

beko@daedalus:~/apitrace-tmp/apitrace/build$ ./apitrace replay ~/metroLastLight.trace 
0 49 glXSwapIntervalMESA(interval = 0) = 0
49: warning: unsupported glXSwapIntervalMESA call
1 143004 glXSwapIntervalMESA(interval = 0) = 0
143004: warning: unsupported glXSwapIntervalMESA call
Rendered 7 frames in 5.92302 secs, average of 1.18183 fps

It's a really short replay. Dunno what I might be looking for too ;)

I had to compile apitrace from GIT because the package 3.0+git20121018.d1c301f7-0ubuntu1 from repository had no replay function at all. The GUI simply crashed with seg fault on loading.

I can replay my own trace just fine with the compiled version fresh from GIT.
Comment 20 Andreas Pokorny 2014-01-11 17:47:58 UTC
@Bernd Kosmahl: the intersting question here is, whether the execution of my traces creates proper rendering results on your system. The short trace should show the intro camera flight that leads to the main menu, rendered with the engine. If you also have the visual artifacts it is clearly a bug inside the OpenGL backend in metro that causes the issue. With the version git, you should first see the embedded movie with company logos, and then it switches to the engine and lights should be gone..
Comment 21 Bernd Kosmahl 2014-01-11 18:33:08 UTC
@Andreas: Too bad, I see nothing :( The window pops up and closes a second later. I think I can see the fog in the right lower corner on black ground for a second. After this flash apitrace terminates with the output shown.

This reminds me: I removed the intro video from my installation and put an empty textfile in place because I didn't want to watch the movie every time. Crowbar tactics, I know.

If this means that my trace is useless I can put the video back in place and make a new trace.
Comment 22 Andreas Pokorny 2014-01-19 18:54:45 UTC
Nope, your trace will work in both cases. I tried the trace from your system:

It begins with warnings regarding __, and then:

0 14363 glLinkProgram(program = 616)
14363: warning: error: linking with uncompiled shader
14365: glDebugOutputCallback: High severity API error 1, GL_INVALID_OPERATION in glGetUniformLocation(program not linked)
0 14365 glGetUniformLocation(program = 616, name = "m_W") = -1
14365: warning: glGetError(glGetUniformLocation) = GL_INVALID_OPERATION
14366: glDebugOutputCallback: High severity API error 1, GL_INVALID_OPERATION in glGetUniformLocation(program not linked)
0 14366 glGetUniformLocation(program = 616, name = "m_iW") = -1
14366: warning: glGetError(glGetUniformLocation) = GL_INVALID_OPERATION
14367: glDebugOutputCallback: High severity API error 1, GL_INVALID_OPERATION in glGetUniformLocation(program not linked)
.. and a lot more of that. 

Then during rendering:
16650435: glDebugOutputCallback: Medium severity API performance issue 9, Stalling on the GPU for mapping a busy buffer object

16652177: glDebugOutputCallback: Medium severity API performance issue 12, Flushing before mapping a referenced bo.

16652177: glDebugOutputCallback: Medium severity API performance issue 11, Mapping a busy BO, causing a stall on the GPU.

16656049: glDebugOutputCallback: High severity API error 1, GL_INVALID_OPERATION in glUseProgram(program 1716 not linked)
0 16656049 glUseProgram(program = 1716)
16656049: warning: glGetError(glUseProgram) = GL_INVALID_OPERATION

The light passes show up correctly on screen. But the gamma value is far too high.

So that means Metro LL uses different shaders or assembles shaders in a different way depending on the GL version/vendor/extension info provided by the driver?
Comment 23 Darius Spitznagel 2014-02-15 14:05:38 UTC
Created attachment 94117 [details]
Metro LL runnung on intel mesa
Comment 24 Tapani Pälli 2014-02-15 14:07:21 UTC
(In reply to comment #23)
> Created attachment 94117 [details]
> Metro LL runnung on intel mesa

Which Intel GPU and what version of Mesa?
Comment 25 Darius Spitznagel 2014-02-15 14:27:33 UTC
@Andreas: I also checked the trace from Bernd and have discoverd the same problems.

The most important hint is in your 2nd line...
	 
.....It begins with warnings regarding __,...

This is because you mesa developers decided to forbid macro names starting with "__".

Please, don't misuderstand me. I don't wanna blame here anybody an you make a great job, but the specs for GLSL say that macro names starting with "__" * or "GL_" are reserved for FUTURE use.

During my internet exercise regarding this problem I have found more games having the same "lightning" problems when using mesa...second live, rust (maybe), ...

So since GLSL doesn't FORBID macro names with "__" there should be some kind of check against it.
For example:
If games run with GLSL 1.50 allow double underscores but disallow with GLSL X.X where double underscores are really DISALLOWED by specs.
Maybe there is a smarter way, I don't know, this is up to you.

For now I solved this by removing the check from src/glsl/glcpp/glcpp-parse.c.
Compiled mesa after that, and voila Metro LL has no lightning problems anymore (look at my attachment).

<<<<
	if (strstr(identifier, "__")) {
		glcpp_error (loc, parser, "Macro names containing \"__\" are reserved.\n");
	}
>>>>

Would be great if you find a solution for this, so I don't need to remove this on every new mesa release I wanna use to play Metro LL.

Beside this...
Thanks to Xonar (a steam user) who showed me a path to solve this problem for me through the steam community.
Comment 26 Darius Spitznagel 2014-02-15 14:29:49 UTC
(In reply to comment #24)
> (In reply to comment #23)
> > Created attachment 94117 [details]
> > Metro LL runnung on intel mesa
> 
> Which Intel GPU and what version of Mesa?

darius@pc1:~$ glxinfo | grep OpenGL
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Desktop x86/MMX/SSE2
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.0.3
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 10.0.3
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:

This all is running on Debian Jessie on Intel(R) Core(TM) i3-3225 CPU @ 3.30GHz
Comment 27 Tapani Pälli 2014-02-15 14:40:16 UTC
(In reply to comment #26)
> (In reply to comment #24)
> > (In reply to comment #23)
> > > Created attachment 94117 [details]
> > > Metro LL runnung on intel mesa
> > 
> > Which Intel GPU and what version of Mesa?
> 
> darius@pc1:~$ glxinfo | grep OpenGL
> OpenGL vendor string: Intel Open Source Technology Center
> OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Desktop x86/MMX/SSE2
> OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.0.3
> OpenGL core profile shading language version string: 3.30
> OpenGL core profile context flags: (none)
> OpenGL core profile profile mask: core profile
> OpenGL core profile extensions:
> OpenGL version string: 3.0 Mesa 10.0.3
> OpenGL shading language version string: 1.30
> OpenGL context flags: (none)
> OpenGL extensions:
> 
> This all is running on Debian Jessie on Intel(R) Core(TM) i3-3225 CPU @
> 3.30GHz

Thanks, I've just tested that patch indeed fixes the lighting. Will discuss how to proceed with '__'. We could have a driconf option to allow this if it is indeed a problem with many games.
Comment 28 Darius Spitznagel 2014-02-15 16:07:43 UTC
(In reply to comment #27)
> (In reply to comment #26)
> > (In reply to comment #24)
> > > (In reply to comment #23)
> > > > Created attachment 94117 [details]
> > > > Metro LL runnung on intel mesa
> > > 
> > > Which Intel GPU and what version of Mesa?
> > 
> > darius@pc1:~$ glxinfo | grep OpenGL
> > OpenGL vendor string: Intel Open Source Technology Center
> > OpenGL renderer string: Mesa DRI Intel(R) Ivybridge Desktop x86/MMX/SSE2
> > OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.0.3
> > OpenGL core profile shading language version string: 3.30
> > OpenGL core profile context flags: (none)
> > OpenGL core profile profile mask: core profile
> > OpenGL core profile extensions:
> > OpenGL version string: 3.0 Mesa 10.0.3
> > OpenGL shading language version string: 1.30
> > OpenGL context flags: (none)
> > OpenGL extensions:
> > 
> > This all is running on Debian Jessie on Intel(R) Core(TM) i3-3225 CPU @
> > 3.30GHz
> 
> Thanks, I've just tested that patch indeed fixes the lighting. Will discuss
> how to proceed with '__'. We could have a driconf option to allow this if it
> is indeed a problem with many games.

Glad to read this.
I have also tested it with mesa-10.2 from git but had some trouble with sound when starting a new game > I didn't hear anything from the guy who waked me up and got lost in an infinite loop.
Then I opend the game menu, went into sound menu, changed noting and got back to game and the loop ended with sound enabled. Maybe a minor bug?!

It would be great when you guys at mesa would release a workaround with mesa 10.1 in a few weeks.
Comment 29 José Suárez 2014-02-16 04:32:07 UTC
The game runs perfect with the workaround in comment 25. I'm on radeonsi.
Comment 30 José Suárez 2014-02-16 12:26:06 UTC
As stated by myself in https://bugs.freedesktop.org/show_bug.cgi?id=71239#c23 , this workaround seems to be causing some trouble with the radeonsi driver (or probably glamor), at least for me.

Is anyone here having the same problem I reported in the bug mentioned above?
Comment 31 Mike Lothian 2014-02-16 12:49:52 UTC
Created attachment 94158 [details] [review]
Patch for ignoring macros starting with __

The c file is generated by the src/glsl/glcpp/glcpp-parse.y file - this patch might make things easier for those building from scratch - you might need to make sure the c file is regenerated
Comment 32 José Suárez 2014-02-16 13:29:54 UTC
My bad... I had removed the whole fuction and the references to it instead of only the "__" subpart. Now it works properly and the xserver starts.
Thanks for the patch.
Comment 33 Ian Romanick 2014-02-16 20:19:27 UTC
I think it's pretty clear in the GLSL spec that using __ in a name is an error.  Section 3.6 (Keywords) says:

    "The following are the language's keywords and (after preprocessing)
    can only be used as described in this specification, or a compile-
    time error results:"

Later the same section also says:

    "In addition, all identifiers containing two consecutive underscores
    (__) are reserved as possible future keywords."

This language has existed since GLSL 1.10.

Section 3.3 (Preprocessor) says:

    "All macro names containing two consecutive underscores ( __ ) are
    reserved for future use as predefined macro names.   All macro
    names prefixed with “GL_” (“GL” followed by a single underscore)
    are also reserved."

The preprocessor restriction was added in GLSL 1.30, and it has always existed in GLSL ES.

There isn't any justification for a GLSL 1.50 shader using __, in any capacity, to compile.  This is an application bug, so we'll follow up with the application vendor.
Comment 34 Mike Lothian 2014-02-17 00:54:45 UTC
As this is one of the few AAA games available on Linux is there a way we could add an override in drirc to allow the game to function until it can be fixed up(down?)stream
Comment 35 Tapani Pälli 2014-02-17 11:44:19 UTC
Some more debug info, this bug seems to originate from Nvidia FXAA shader which has defines using '__', this same shader is used for multiple platforms and I guess multiple games. Anyway, let's see what the application vendor responds.
Comment 36 Thomas Rohloff 2014-02-17 16:07:21 UTC
Ian: So, as long as there are no future keyword/macros, where's the problem? Please correct me if I'm wrong but this is what I see:
Game uses(/defines) "__my_cool_macro".
Khronos says: "__my_cool_macro" could be something in the future, you shouldn't use it.
Mesa says: "__my_cool_macro" could be something in the future, you're not allowed to use it.

So, seriously, what's the point? The blobs (nvidia, fglrx) seem to allow it and I can't see any harm.
Comment 37 Alex Deucher 2014-02-17 17:21:16 UTC
*** Bug 71239 has been marked as a duplicate of this bug. ***
Comment 38 José Suárez 2014-02-17 19:15:50 UTC
I just wanted to give my my 2 cents on this topic. Be warned that I am no programmer nor do i have any knowledge of graphics drivers (so please, forgive me for any inaccuracy or incorrect technical point of view). I also know that the mesa policy is not to fix third programs bugs within mesa's code. I think that is the right choice, but maybe a compromise could be reached, especially in the light that both proprietary drivers seem to be more fault tolerant with this problem.

If the "__" macro names are reserved for _future_ use, I asume the reservation of certain "__" macro names should be explicitly made in a particular GLSL spec version (or subsequent versions if more need to be added). I don't know if that has happened yet, but I presume that once that happens there should be a list (as well as a spec) of those macro names. I also suppose that if no especific "__" macro names have been established yet, game programmers may not be too aware of this limitation even though the reservation has been established since long time ago, as Ian pointed out.

Therefore, if a compromise is to be reached I think there could be two alternatives for adressing this problem:

(i) Only enforce the "__" macro reservation beginning from the first GLSL version that establishes any of those macro names. I.e, if program uses GLSL version equal or greater than X.Y (being X.Y the first GLSL version with a reservation for a particular "__" macro), then disallow the use of those custom macros. Otherwise, allow the use. Mesa could output a warning instead of failing.

(ii) Once the GLSL specs begin expressly reserving certain "__" macro names, only disallow the use of them as custom macro names.

I believe the second approach is less desirable.

Of course not doing anything is also a possibility, but I believe trying to get a third party programmer to change a game's code is quite difficult. There are very collaborative game devs (such as, for instance, The Farm 51, who offered steam keys to mesa devs to sort out some bugs), but I'm sure there will be others who are not so friendly. And if this use of "__" macro names comes from using some develoment apps or techniques, I supose they would not be willing to risk messing the code just for satisfying a very small fraction of users who are gaming on Linux with mesa -dont't get me wrong, I wish we were a majority of the gamers- or (if not construed by the game developer that altruisticly) achieving 100% spec compatible code.
Comment 39 Darius Spitznagel 2014-02-17 23:09:43 UTC
@Ian: I'm very impressed by the fact, that you contact(ed) the vendor to report on this.
It's always a bad idea breaking or ignoring specs or rules, i know that.
But at the end, this will only fix the problem for Metro LL.

What about all the other games using the faulty FXAA shaders from nvidia?
Or other game developers simply misunderstanding GLSL specs?

Some of us mesa users here like Jose, Thomas and Mike have good ideas.
We all waited so long seeing Linux getting more and more attention on the desktop, so give us a hand.
Life is too short and world is far from being perfect.

I have 2 other ideas to solve this.

1.
There are already environment variables like MESA_GLSL_VERSION_OVERRIDE="XXX" inside mesa.
How about implement another one like MESA_GLSL_ALLOW_DOUBLE_UNDERSCORES="XXX".
Default is of course "no", override is "yes".
We could then write our own bash script or enter it in the launch options in steam.
I think, this is a good option.

2.
Look for a file like .glslrc (inspired by comment 34) in the games directory or where the games executable lie.
If it contains a string like ALLOW_DOUBLE_UNDERSOURCES="yes" then override.
This seams simpler to code but may break when the vendor is doing some kind on nasty check against foreign files.

So I think with both examples you don't break your rules and we are happy.
Comment 40 almos 2014-02-17 23:24:03 UTC
I'm not an expert on GLSL, but forbidding identifiers that contain '__' *anywhere* seems stupid to me. I'm not aware of any other programming language that imposes such a limitation on identifiers. What's the reason for this?
Comment 41 Thomas Rohloff 2014-02-17 23:38:52 UTC
(In reply to comment #40)
> I'm not an expert on GLSL, but forbidding identifiers that contain '__'
> *anywhere* seems stupid to me. I'm not aware of any other programming
> language that imposes such a limitation on identifiers. What's the reason
> for this?

    "All macro names containing two consecutive underscores ( __ ) are
    reserved for future use as predefined macro names.   All macro
    names prefixed with “GL_” (“GL” followed by a single underscore)
    are also reserved."
(source: Comment 33)

In other words: When you have a macro called "MY__SUPER_COOL_MACRO" Khronos could step up telling "MY__SUPER_COOL_MACRO" is now a predefined macro and as such your program is incompatible. I just fail to see why this is needed for already finished specifications.
Anyway in that example Mesa could just disallow "MY__SUPER_COOL_MACRO" (expose the internal macro instead) and not everything. Also I don't think the specification says you're not allowed to use "MY__SUPER_COOL_MACRO", it just tells you that it's a bad idea, so mesa is overstrict here.
Comment 42 Krzysztof A. Sobiecki 2014-02-18 01:42:51 UTC
I do not believe that requiring users to make any changes in theirs Mesa configuration to run some applications is a good idea. Mesa should be working out of the box, not "some assembly needed".

Especially when application authors have one simple answer: "use vendor supplied drivers". 

As I know nothing about GLSL I will now yell at some clouds:
*Do as big vendors(AMD, Nvidia) do. It's nice to follow specs, but when everyone that matters breaks(or interpret differently) them, then the spec is wrong. Imitate actual implementations. 
It's highly unlikely that Mesa will budge vendors and app authors in more desirable direction.

*Enforce "__" restrictions only in GLSL versions that actually use "__" stuff.

*Whitelisting of problematic apps brining more complexity(update that list without updating Mesa, otherwise why bother?) than it's worth.
Comment 43 Krzysztof A. Sobiecki 2014-02-18 05:43:50 UTC
Created attachment 94257 [details] [review]
ast_to_hir.cpp also needs a patch?
Comment 44 Kenneth Graunke 2014-02-18 07:50:21 UTC
Hi all,

We're actively engaged with Khronos on this, seeking clarification from the GLSL language group on what the intended behavior is.  Once a decision is reached, we will gladly change our drivers to implement the correct behavior, and can make sure those changes hit stable driver releases quickly.

As part of that, we investigated what other vendors do, and we seem to be alone in enforcing this.  So I suspect that this will be relaxed in one way or another.

In the meantime, thank you for your patience.  We hope to have this sorted shortly.  I apologize that it's taken so long.

--Ken
Comment 45 Ian Romanick 2014-02-19 23:11:26 UTC
The __ issue should be fixed by the following commits on Mesa master.  These are scheduled to inclusion in 10.1 and 10.0.4.

commit 2c85fd5a964a78c9f7a93994fb79f1723c6f45b5
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Tue Feb 18 09:36:08 2014 -0800

    glsl: Only warn for macro names containing __
    
    From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:
    
        "In addition, all identifiers containing two consecutive underscores
         (__) are reserved as possible future keywords."
    
    The intention is that names containing __ are reserved for internal use
    by the implementation, and names prefixed with GL_ are reserved for use
    by Khronos.  Names simply containing __ are dangerous to use, but should
    be allowed.
    
    Per the Khronos bug mentioned below, a future version of the GLSL
    specification will clarify this.
    
    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Tested-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
    Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
    Cc: Tapani Pälli <lemody@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
    Bugzilla: Khronos #11702

commit 0bd78926304e72ef3566e977d0cb5a959d86b809
Author: Ian Romanick <ian.d.romanick@intel.com>
Date:   Tue Feb 18 09:10:36 2014 -0800

    glcpp: Only warn for macro names containing __
    
    Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
    GLSL ES spec (all versions) say:
    
        "All macro names containing two consecutive underscores ( __ ) are
        reserved for future use as predefined macro names. All macro names
        prefixed with "GL_" ("GL" followed by a single underscore) are also
        reserved."
    
    The intention is that names containing __ are reserved for internal use
    by the implementation, and names prefixed with GL_ are reserved for use
    by Khronos.  Since every extension adds a name prefixed with GL_ (i.e.,
    the name of the extension), that should be an error.  Names simply
    containing __ are dangerous to use, but should be allowed.  In similar
    cases, the C++ preprocessor specification says, "no diagnostic is
    required."
    
    Per the Khronos bug mentioned below, a future version of the GLSL
    specification will clarify this.
    
    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Tested-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
    Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
    Cc: Tapani Pälli <lemody@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
    Bugzilla: Khronos #11702
Comment 46 Tapani Pälli 2014-02-20 03:09:52 UTC
I've verified with git master that issues 1, 3 and 4 are gone. 2 and 5 still hold but let's handle those separately from this bug.
Comment 47 Thomas Rohloff 2014-02-20 22:36:00 UTC
(In reply to comment #40)
>     Bugzilla: Khronos #11702

https://www.khronos.org/bugzilla/show_bug.cgi?id=11702 gives me "Bug #11702 does not exist."
Comment 48 Kenneth Graunke 2014-02-20 22:40:08 UTC
It's not in the public Khronos bug tracker.
Comment 49 José Suárez 2014-02-21 10:31:35 UTC
I confirm that the issue is fixed for radeonsi too (as expected).
Thank you very much to the devs and for the other users reporting and helping to sort this out.

Regards


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.