Created attachment 142728 [details]
For some time now this game has been crashing with an out-of-memory error (it's a 32-bit binary). When the issue was reported to the game devs, this is the explanation they gave to users (https://steamcommunity.com/app/362890/discussions/1/3307213006836757658/?ctp=7):
"... From what I noticed, the more shaders the game had, the faster out of memory occurred. Black Mesa has full (or almost full) support of SM 3.0 on Linux, which allowed us to use the same shaders with the same code on both platforms, while opening new horizons for our new dynamic lighting implementation. After first discovering the issues with the Mesa driver, to avoid out of memory I had to cut some shader combinations completely, specifically for the OpenGL version, by introducing shader platform flags that decide whether a particular part of the shader with some visual feature will be compiled. That helped quite a lot; we were able to run Black Mesa for a while on AMD with lowest settings and CSM turned off. But since Mesa shares the same memory location in the application's address space, it's pretty hard to go further being 32-bit."
They also mention that with the Nvidia proprietary driver the game doesn't go over 1.2 GB of RAM at full settings (on my system, I see a first peak of almost 3 GB of RAM usage before the crash). So, could the Mesa drivers somehow be leaking about 2 GB of RAM? I understand that memory usage between the proprietary driver and Mesa can differ quite a bit, but such a large amount of RAM for what seems to be shader compile output strikes me as suspicious.
Finally, while running an apitrace I noticed a lot of these errors: "major shader compiler issue 21156: 0:4(12): warning: extension `GL_EXT_gpu_shader4' unsupported in vertex shader". Not sure if it could be related to any leak though.
I'm attaching my system information plus the apitrace I got up to the crash. Let me know if you need any other information or if I should test this game with a newer version of Mesa.
Created attachment 142729 [details]
I can also confirm that this is not a Mesa 18.2-specific bug, as I am hitting it with Black Mesa on Mesa 19.0.0-devel. I will attach my own apitraces and terminal output. Hopefully this can be solved soon, as the BM dev says this memory leak should be "well known" to Mesa devs.
Created attachment 142780 [details]
Terminal Output upon executing bms.sh
Created attachment 142781 [details]
BMS apitrace with default arguments
Created attachment 142782 [details]
BMS apitrace with -dev argument
With the -dev option the menu screen shows a static image instead of the active scene you get without that argument.
(I'm quite sure I submitted a similar comment earlier today, but it seems to have just vanished.)
I ran the API trace in the Valgrind massif memory profiler. As far as I can tell, the application never calls glDeleteShader. We don't release any of the memory because the app hasn't told us that we can. In that trace there is about 2.6 GB (on a 64-bit build of apitrace) of intermediate IR that could be released by calling glDeleteShader on all the shaders that have been linked.
It's possible that this is a quirk of the API trace. Someone would have to run the app with Valgrind massif to say for sure.
(In reply to MGG from comment #0)
> Finally, while running an apitrace I detected a lot this errors: "major
> shader compiler issue 21156: 0:4(12): warning: extension
> `GL_EXT_gpu_shader4' unsupported in vertex shader". Not sure if it could be
> related with any leak though.
This is unrelated. It's a bit misleading that apitrace lists this as a "major shader compiler issue." It is perfectly valid to enable extensions that the driver does not support. GLSL is intentionally designed so that shaders can have multiple code paths that are used based on what extensions are available at compile time.
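As an illustration (a made-up shader, not one from the game), the GLSL preprocessor defines a macro for every extension the driver supports, so one source file can carry both an extension path and a fallback:

```glsl
#version 120
// Hypothetical shader, not taken from the game: request the extension
// with "enable" (warn if unavailable) rather than "require" (error out).
#extension GL_EXT_gpu_shader4 : enable

varying vec4 color;

void main()
{
#ifdef GL_EXT_gpu_shader4
    // Path compiled when the driver supports the extension.
    color = vec4(1.0);
#else
    // Fallback path compiled when the driver lacks it.
    color = vec4(0.5);
#endif
    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
```

On a driver without GL_EXT_gpu_shader4 this compiles fine and just takes the fallback branch; the only visible side effect is the warning apitrace flags.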
I see your point about glDeleteShader, but since the trace includes a game crash, I think it is quite possible that the API call is missing because the engine never got to run its resource release path. Anyway, in my current setup I can keep the game running for some time (at least in the main menu), so I'll capture an apitrace without letting the game crash. With a graceful run of the game we should be able to see whether the leak is related to what you pointed out.
By the way, how useful would it be to get an apitrace with an Nvidia card (using the proprietary driver)? Could that give you any extra information regarding the shader memory usage of each driver?
(In reply to Ian Romanick from comment #6)
> (I'm quite sure I submitted a similar comment earlier today, but it seems to
> have just vanished.)
> I ran the API trace in Valgrind massif memory profiler. As far as I can
> tell, the application never calls glDeleteShader. We don't release any of
> the memory because the app hasn't told us that we can. In that trace there
> is is about 2.6GB (on a 64-bit build of apitrace) of intermediate IR that
> could be released by calling glDeleteShader on all the shaders that have
> been linked.
Do you really need to keep all Intermediate Representations of all shaders?
The standard does not deal with internal stuff and IR is just that.
I thought that in a core context you do not need to recompile them, and that there is a method to patch the prologue of the binary in compatibility mode (at least for radeonsi).
(In reply to iive from comment #8)
> (In reply to Ian Romanick from comment #6)
> > (I'm quite sure I submitted a similar comment earlier today, but it seems to
> > have just vanished.)
> > I ran the API trace in Valgrind massif memory profiler. As far as I can
> > tell, the application never calls glDeleteShader. We don't release any of
> > the memory because the app hasn't told us that we can. In that trace there
> > is is about 2.6GB (on a 64-bit build of apitrace) of intermediate IR that
> > could be released by calling glDeleteShader on all the shaders that have
> > been linked.
> Do you really need to keep all Intermediate Representations of all shaders?
> The standard does not deal with internal stuff and IR is just that.
> I thought that in core context you do not need to recompile them and that
> there is a method to patch the prologue of the binary in compatibility mode
> (at least for radeon si).
The IR is needed in case the same shader is used to link with another program. It is common for the same vertex shader to be used with many different fragment shaders. Until the application calls glDeleteShader, it is impossible for the driver to know that the application is done with it.
Separate data is stored with the linked programs for state-based recompiles.
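To make the lifetime rule concrete, here is a toy model (my own sketch, with illustrative names, not Mesa internals): the IR attached to a shader object can only be freed once the app calls glDeleteShader, because any future program might still link against it.

```python
# Toy model of driver-side shader lifetime. Names are illustrative,
# not Mesa internals.

class Shader:
    def __init__(self, source):
        self.ir = f"IR({source})"   # intermediate representation kept by the driver
        self.deleted = False

class Program:
    def __init__(self, *shaders):
        # Linking snapshots what it needs from each shader's IR; the shader
        # objects themselves must stay valid for future link requests.
        self.binary = "+".join(s.ir for s in shaders)

class Driver:
    def __init__(self):
        self.live_ir = set()

    def create_shader(self, source):
        sh = Shader(source)
        self.live_ir.add(sh)
        return sh

    def link(self, *shaders):
        return Program(*shaders)

    def delete_shader(self, sh):
        # Only now does the driver know the app is done with this shader,
        # so only now can its IR be released.
        sh.deleted = True
        self.live_ir.discard(sh)

drv = Driver()
vs = drv.create_shader("vs")          # one vertex shader...
fs1 = drv.create_shader("fs1")
fs2 = drv.create_shader("fs2")
p1 = drv.link(vs, fs1)                # ...linked into many programs
p2 = drv.link(vs, fs2)
print(len(drv.live_ir))               # 3: all IR still resident
for sh in (vs, fs1, fs2):
    drv.delete_shader(sh)
print(len(drv.live_ir))               # 0: IR can now be freed
```

In this model, deferring delete_shader forever is exactly the pattern Ian describes: the IR set only ever grows.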
Created attachment 142819 [details]
apitrace without crash
Sorry for the delay in my answer. I'm attaching a new apitrace showing a run where the game is shut down properly (i.e. I closed the game before it ran out of RAM). As expected, they do call glDeleteShader on exit. I count 15433 glCreateShaderObjectARB calls and 15429 glDeleteShader calls, so they definitely leak 4 shaders (or don't bother releasing them on exit). In any case, is there something wrong with these numbers?
Finally, how useful would it be to create an application that compiles all the shaders this game uses and checks the memory usage of each one? I mean, if that test showed a high memory usage (let's say more than 1 GB), would that mean there is a problem in Mesa's shader compiler?
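For what it's worth, the create/delete counts above can be reproduced with a small script over the text output of `apitrace dump` (a sketch of my own; it just matches call names, assuming one call per line as apitrace prints them):

```python
import re
from collections import Counter

def count_shader_calls(dump_text):
    """Count create/delete shader calls in `apitrace dump` text output."""
    # apitrace prints one call per line, roughly like:
    #   100 glCreateShaderObjectARB(shaderType = GL_VERTEX_SHADER_ARB) = 1
    pattern = re.compile(r"\b(glCreateShader(?:ObjectARB)?|glDeleteShader)\(")
    counts = Counter()
    for line in dump_text.splitlines():
        m = pattern.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Tiny fabricated sample just to show the shape of the output:
sample = """\
100 glCreateShaderObjectARB(shaderType = GL_VERTEX_SHADER_ARB) = 1
101 glShaderSourceARB(shaderObj = 1, count = 1, string = ...)
102 glCreateShaderObjectARB(shaderType = GL_FRAGMENT_SHADER_ARB) = 2
200 glDeleteShader(shader = 1)
"""
print(count_shader_calls(sample))
```

Running it against the real dump of the attached trace should give the 15433/15429 split mentioned above.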
On December 17 we pushed a new beta into the public-beta branch. It's a major ToGL layer rework, which should address the memory issues with the open-source Mesa drivers. The actual issue is much deeper than one might think, but the suggestions from some advanced forum users have indeed been helpful.
In internal testing on the development branch, where we took some of the biggest uncut shaders we have, the following results were achieved on NVIDIA hardware on Windows with OpenGL as the main renderer:
* Main Menu (before) - using compiled binaries as intermediate data for shaders that go into actual programs: 1.5 GiB of RAM used
* Main Menu (after) - using the text representation of GLSL as intermediate data for shaders that would go into actual programs on request: 450 MiB of RAM used
* Main Menu (after, reworked) - using a compressed text representation of GLSL as intermediate data for shaders that would go into actual programs on request: 360 MiB of RAM used
After the patch was applied on the current public-beta branch on Steam, with NVIDIA hardware on Linux we saw around 200-300 MiB less RAM usage in game. If that was actually it, as was suggested on Mesa's bug tracker, AMD users should see similar (even more pronounced) results, which should address the out-of-memory issue.
P.S. I may add more information later; it would be great if any of you could try the new public-beta out.
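As a rough illustration of why the compressed-text intermediate step helps (my own toy numbers on made-up shader text, not the game's data): GLSL source is highly repetitive, so it deflates very well.

```python
import zlib

# Made-up, repetitive GLSL-like source; real shader text compresses
# similarly well because of repeated keywords and identifiers.
shader_src = ("uniform vec4 lightColor; uniform vec4 lightDir; "
              "varying vec4 vertexColor; ") * 200

raw = shader_src.encode()
packed = zlib.compress(raw, level=9)
print(len(raw), len(packed))   # compressed form is a small fraction of the original
```

Keeping the deflated text and inflating it only when a program actually needs to be linked trades a little CPU for a large, steady RAM saving, which matches the 450 MiB -> 360 MiB step reported above.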
Hey, the new beta version has dramatically reduced the memory usage! Nice work there!
Please don't forget to give us the extra info about the problem/bug you found so we can close this ticket (in case it isn't a problem on Mesa's side).
This definitely wasn't a problem related to Mesa, so I will set it as resolved.