Hello, I am on mesa-git, built from around 1AM this morning. This has the patches that add the 32/64b folders, but it's still not working in with the game StarCraft II running through wine. First I remove the ~/.cache/mesa folder, then start the Battle.Net client with wine. This seems to work fine and I see a new 32 folder in ~/.cache/mesa. Here's the output of file on that binary: PE32 executable (GUI) Intel 80386, for MS Windows Then from the Battle.Net client I start StarCraft II. Per the configuration it starts the 64b client. Then my terminal gets flooded with these: Failed to create ~/.cache/mesa/64/1488793729_1488791612/AMD TAHITI/89 for shader cache (No such file or directory)---disabling. The issue is, as the message says, I do not have that folder on disk but: ~/.cache/mesa/64/1488793730_1488791612 notice the 30 instead of 29, the rest matches. The output of file on the binary that is now in memory (but maybe not the one directly called by the launcher): PE32+ executable (GUI) x86-64, for MS Windows. I am not sure what else to provide but would be happy to help with whatever is needed. Thanks! John
I'm guessing there is no ~/.cache/mesa/64 just ~/.cache/mesa/32?
(In reply to Timothy Arceri from comment #1) > I'm guessing there is no ~/.cache/mesa/64 just ~/.cache/mesa/32? Oh you said you have ~/.cache/mesa/64/1488793729_1488791612 that is very odd.
That's right I have both folders. How do you create the identifier? I find it quite strange that the folder's number on disk is only one away from the one expected. Last week when I reported a similar issue on phoronix, I believe it was similar (but without the 32/64b folders). Thanks!
(In reply to John from comment #3) > That's right I have both folders. > > How do you create the identifier? I find it quite strange that the folder's > number on disk is only one away from the one expected. Last week when I > reported a similar issue on phoronix, I believe it was similar (but without > the 32/64b folders). > > Thanks! That id is the time at which the Mesa binary was last modified. Also once the cache is created the path is stored in memory so that makes what you are reporting doubly odd. Do you have any programming experience? Any chance of you being able to a debug build through gdb? The functions you would want to look at are in disk_cache.c. I tried to install the trail but I get stuck with missing buttons and text boxes in the Battle.Net app.
(In reply to John from comment #0) > Failed to create ~/.cache/mesa/64/1488793729_1488791612/AMD TAHITI/89 for shader cache (No such file or directory)---disabling. > ... > The issue is, as the message says, I do not have that folder on disk but: > ~/.cache/mesa/64/1488793730_1488791612 > notice the 30 instead of 29, the rest matches. If you try to run the game again (without updating mesa), do you get the exact same numbers or a new set?
(In reply to John from comment #0) > Hello, > > I am on mesa-git, built from around 1AM this morning. > This has the patches that add the 32/64b folders, but it's still not working > in with the game StarCraft II running through wine. Did you restart x so it also used the new mesa? I saw a few of these late last night, now I normally remember to re-start x but it's possible I forgot. My test case was using a video player with a shader de-interlace - other settings that (I guess) also use shaders didn't produce the logging. Anyway after booting up today they are gone. Trying to reproduce by building newer mesa again today and deliberately not re-starting x doesn't re-produce, but then maybe I would have to put mesa back to where it was commit wise before I updated last night to properly re-produce. In summary - if you re-boot do the errors go away?
Oh I see, a timestamp. Now I've seen something interesting. I started B.Net and it created this folder: ~/.cache/mesa/32/1488874151_1488872775/ While launching SCII I saw some similar error message as before but about the 32b folder, so something 32b happens when I click on launching SCII. I then looked again in the cache folder but the ID folder had changed to this: ~/.cache/mesa/32/1488874150_1488872775/ Here's the current 64 one: 1488873224_1488871428/ So it's not like at some point one arch's ID gets overwritten by the other, but somehow the ID changes, and it seems always by just 1 before the underscore. I am a dev too, though with very little mesa experience :) I'll try to look at that. As for your errors, I haven't had issues with wine-staging 2.1, wine 2.2 and wine-staging 2.2, I think these were more common pre 2. Oh also all my cache tests happened with nine. Let me quickly try without it and see if that changes anything.
(In reply to Grazvydas Ignotas from comment #5) > If you try to run the game again (without updating mesa), do you get the > exact same numbers or a new set? I get the same numbers. 1488874151_1488872775/ when starting BNET which changes to 1488874150_1488872775/ when starting SCII (In reply to Andy Furniss from comment #6) > Did you restart x so it also used the new mesa? > > I saw a few of these late last night, now I normally remember to re-start x > but it's possible I forgot. > > My test case was using a video player with a shader de-interlace - other > settings that (I guess) also use shaders didn't produce the logging. > > Anyway after booting up today they are gone. > > Trying to reproduce by building newer mesa again today and deliberately not > re-starting x doesn't re-produce, but then maybe I would have to put mesa > back to where it was commit wise before I updated last night to properly > re-produce. > > In summary - if you re-boot do the errors go away? I have not, I thought mesa only required the application to be restarted. I will try that and report. Actually, I just tried without nine and the id staid the same! So I guess the problem is around that.
(In reply to John from comment #8) > (In reply to Grazvydas Ignotas from comment #5) > > If you try to run the game again (without updating mesa), do you get the > > exact same numbers or a new set? > I get the same numbers. > > 1488874151_1488872775/ when starting BNET > which changes to > 1488874150_1488872775/ when starting SCII > > (In reply to Andy Furniss from comment #6) > > Did you restart x so it also used the new mesa? > > > > I saw a few of these late last night, now I normally remember to re-start x > > but it's possible I forgot. > > > > My test case was using a video player with a shader de-interlace - other > > settings that (I guess) also use shaders didn't produce the logging. > > > > Anyway after booting up today they are gone. > > > > Trying to reproduce by building newer mesa again today and deliberately not > > re-starting x doesn't re-produce, but then maybe I would have to put mesa > > back to where it was commit wise before I updated last night to properly > > re-produce. > > > > In summary - if you re-boot do the errors go away? > I have not, I thought mesa only required the application to be restarted. > I will try that and report. > > > > Actually, I just tried without nine and the id staid the same! So I guess > the problem is around that. Yeah that makes sense. It's probably grabbing a different time for nine vs opengl because they are separate dynamic libraries I'll take a look at this tomorrow.
(In reply to Timothy Arceri from comment #9) > > Yeah that makes sense. It's probably grabbing a different time for nine vs > opengl because they are separate dynamic libraries I'll take a look at this > tomorrow. That's what I would expect from the 1s difference. I looked at libGL and d3dadapter9.so but their creation time is 3 minutes different, so these are probably not the 2 binaries used. Thank you!
Try this: stat -c %Y d3d/d3dadapter9.so dri/radeonsi_dri.so although I'm not sure if TAHITI uses radeonsi_dri or r600_dri.
(In reply to Grazvydas Ignotas from comment #11) > Try this: > stat -c %Y d3d/d3dadapter9.so dri/radeonsi_dri.so > > although I'm not sure if TAHITI uses radeonsi_dri or r600_dri. 1488874138 1488874151 The 2nd one is correct, but strangely not the first one. TAHITI is on radeonsi.
(In reply to John from comment #12) > (In reply to Grazvydas Ignotas from comment #11) > > Try this: > > stat -c %Y d3d/d3dadapter9.so dri/radeonsi_dri.so > > > > although I'm not sure if TAHITI uses radeonsi_dri or r600_dri. > > 1488874138 > 1488874151 > > The 2nd one is correct, but strangely not the first one. > TAHITI is on radeonsi. Ok the other timestamp is from /usr/lib32/d3d/d3dadapter9.so.1.0.0 not the links to it. stat -c %Y /usr/lib32/d3d/d3dadapter9.so.1.0.0 /usr/lib32/xorg/modules/dri/radeonsi_dri.so 1488874150 1488874151
You could try to set the modified time to the so files as a workaround. Something like this: touch -m -t 201708080000.00 /usr/lib32/d3d/d3dadapter9.so.1.0.0 /usr/lib32/xorg/modules/dri/radeonsi_dri.so
(In reply to Edmondo Tommasina from comment #14) > You could try to set the modified time to the so files as a workaround. > > Something like this: > > touch -m -t 201708080000.00 /usr/lib32/d3d/d3dadapter9.so.1.0.0 > /usr/lib32/xorg/modules/dri/radeonsi_dri.so That works (well I used -r instead...). I don't really need a workaround, as the current speed is good enough for me. I'm just trying to help fix out the bug :) This does prove what we were all guessing though!
Created attachment 130129 [details] [review] HACK: disable dynamic, enable symbolic Some ideas: - a non-timestamp better solution might be applicable/in the works (guessing here) - changing timestamps manually is a _very_ bad idea Please do NOT mention it nor do it. Some poor soul will start copying the commands having zero knowledge of the side effects. - add a bunch of printfs and establish which timestamp gets picked at which point - last but not least I suspect that the symbols are getting stomped over another. Can you try if the attached patch works ? Note that it will break GL <> VDPAU interop for DRI2, so it's not a real fix.
Emil, I don't think there's any mystery anymore given comment 13 — the different cache timestamps are due to *_dri.so and d3dadapter9.so.1.0.0 having different timestamps. (In reply to Emil Velikov from comment #16) > - a non-timestamp better solution might be applicable/in the works > (guessing here) If you mean build-id, that won't help for this issue, because different binaries will also have different build-ids.
(In reply to Emil Velikov from comment #16) > Some ideas: > - a non-timestamp better solution might be applicable/in the works > (guessing here) A simple workaround would be to use a combination of the timestamp and the binary's name (or anything else that identifies the binary, the checksum wouldn't work as it'd change with each build), and allow more than one base folder. So instead of: ~/.cache/mesa/1488874151_1488872775 then overwritten by ~/.cache/mesa/1488874150_1488872775 we'd have: ~/.cache/mesa/radeonsi_dri_1488874151_1488872775 and ~/.cache/mesa/d3dadapter9_1488874150_1488872775 or ~/.cache/mesa/radeonsi_dri/1488874151_1488872775 and ~/.cache/mesa/d3dadapter9/1488874150_1488872775
(In reply to Michel Dänzer from comment #17) > Emil, I don't think there's any mystery anymore given comment 13 — the > different cache timestamps are due to *_dri.so and d3dadapter9.so.1.0.0 > having different timestamps. > IMHO the [file] source of the timestamps is not the bigger mystery, but why we end up with "no such file or directory...---disabling". Afaics that should _never_ happen. The only thing that comes to mind is that the "wrong" library/symbol gets picked to get the timestamp from. This may happen due to the dynamic-list we have for the dri module (and the fact that it lacks Bsymbolic). A trivial test is the patch attached. Tl;Dr; "---disabling." should _never_ kick-in. Drop any hacks and give the patch a try, please ?
(In reply to John from comment #18) > (In reply to Emil Velikov from comment #16) > > Some ideas: > > - a non-timestamp better solution might be applicable/in the works > > (guessing here) > > A simple workaround would be to use a combination of the timestamp and the > binary's name (or anything else that identifies the binary, the checksum > wouldn't work as it'd change with each build), and allow more than one base > folder. > > So instead of: > ~/.cache/mesa/1488874151_1488872775 > then overwritten by > ~/.cache/mesa/1488874150_1488872775 > > we'd have: > ~/.cache/mesa/radeonsi_dri_1488874151_1488872775 > and > ~/.cache/mesa/d3dadapter9_1488874150_1488872775 > > or > ~/.cache/mesa/radeonsi_dri/1488874151_1488872775 > and > ~/.cache/mesa/d3dadapter9/1488874150_1488872775 I would really appreciate if one can spend 2 minutes and test the patch in question. Pretty please ? Do move/prune your cache and let me know if you still get the "---disabled" message(s).
(In reply to Emil Velikov from comment #20) > I would really appreciate if one can spend 2 minutes and test the patch in > question. Pretty please ? > > Do move/prune your cache and let me know if you still get the "---disabled" > message(s). I still do get these (and the different folders).
In case it helps, here are the calls for disk_cache_create (always called by r600_disk_cache_create). First when starting the Battle.Net launcher: - Called for radeonsi_dri.so - Called for d3dadapter9.so.1.0.0 - Called for d3dadapter9.so.1.0.0 - Called for d3dadapter9.so.1.0.0 - Called for radeonsi_dri.so When starting StarCraft II: - Called for d3dadapter9.so.1.0.0 - Called for d3dadapter9.so.1.0.0 - Called for radeonsi_dri.so The order seems to stay the same in my various tests. I've tried all 3 ifs related to path, ie with only MESA_GLSL_CACHE_DIR set, with only XDG_CACHE_HOME set and with no variable set, the behavior is the same. Without nine the calls are: First when starting the Battle.Net launcher: - Called for radeonsi_dri.so - Called for radeonsi_dri.so When starting StarCraft II: - Called for radeonsi_dri.so I've verified that these were the right libraries in r600_disk_cache_create (since the timestamps are shared I wanted to verify with the filenames.). And here is a stack trace for each library at that point: /usr/lib32/d3d/d3dadapter9.so.1(+0x300323) [0x71859323]. /usr/lib32/d3d/d3dadapter9.so.1(+0x30e165) [0x71867165]. /usr/lib32/d3d/d3dadapter9.so.1(+0x3df181) [0x71938181]. /usr/lib32/d3d/d3dadapter9.so.1(+0x274388) [0x717cd388]. /usr/lib32/d3d/d3dadapter9.so.1(+0x43c4c) [0x7159cc4c]. /usr/lib32/d3d/d3dadapter9.so.1(+0x2fdc9) [0x71588dc9]. /usr/bin/../lib32/wine/d3d9-nine.dll.so(present_create_adapter9+0x128) [0x7d581c58]. /usr/bin/../lib32/wine/d3d9-nine.dll.so(+0x1794d) [0x7d57b94d]. /usr/bin/../lib32/wine/d3d9-nine.dll.so(d3dadapter9_new+0x7c) [0x7d57c18c]. /usr/bin/../lib32/wine/d3d9-nine.dll.so(Direct3DCreate9+0x5c) [0x7d579fac]. and /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x65d793) [0x6ae86793]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x675365) [0x6ae9e365]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x6a3081) [0x6aecc081]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(radeon_drm_winsys_create+0x8e8) [0x6add6718]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x4e264b) [0x6ad0b64b]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x60b656) [0x6ae34656]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x4ec550) [0x6ad15550]. /usr/lib32/xorg/modules/dri/radeonsi_dri.so(+0x4edc20) [0x6ad16c20]. /usr/lib32/libGL.so.1(+0x2713c) [0x7d33613c]. /usr/lib32/libGL.so.1(+0x49678) [0x7d358678]. Should libGL really be loaded when using nine? I hope something is useful in there. Thanks, John
(In reply to Emil Velikov from comment #19) > IMHO the [file] source of the timestamps is not the bigger mystery, but why > we end up with "no such file or directory...---disabling". Because the shader cache code deletes all but one timestamp tuple directory.
(In reply to Michel Dänzer from comment #23) > (In reply to Emil Velikov from comment #19) > > IMHO the [file] source of the timestamps is not the bigger mystery, but why > > we end up with "no such file or directory...---disabling". > > Because the shader cache code deletes all but one timestamp tuple directory. Right, I must have misread 512c046eddc16bec9c309a3c7b4205d94ffc4ef3. Was under the impression that 'old' versions are purged only as we hit the size limit. Maybe it's worth exploring, maybe not. Another alternative is to "disable" disk caching all together for anyone but the dri module(s).
After a bunch of discussion it's been decided to move all cache entries to a single cache dir and instead use the timestamp (in future maybe build-id hash) and gpu id as part of the cache entries sha input. We will remove old entries as the cache approaches its max size. This solution should solve the issues for nine. It makes things a little harder for collection/distribution of precompilled shaders but not unsolvable. Thanks to everyone for there input, I'll try to get some patches this coming week :)
Timothy, do you have any patch you'd like me to test? Thanks! John
https://patchwork.freedesktop.org/series/21582/(In reply to John from comment #26) > Timothy, do you have any patch you'd like me to test? > > Thanks! > John This should do it: https://patchwork.freedesktop.org/series/21582/
That seems to work indeed! Thank you!
Should be fixed by: commit feb716239e1a318eb844fd3bcaca1ffd6903067c Author: Grazvydas Ignotas <notasas@gmail.com> Date: Thu Mar 16 01:09:29 2017 +0200 util/disk_cache: hash timestamps into the cache keys Instead of using a directory, hash the timestamps into the cache keys themselves. Since there is no more timestamp directory, there is no more need for deleting the cache of other mesa versions and we rely on eviction to clean up the old cache entries. This solves the problem of using several incarnations of disk_cache at the same time, where one deletes a directory belonging to the other, like when both OpenGL and gallium nine are used simultaneously (or several different mesa installations). v2: using additional blob instead of trying to clone sha1 state v3: (Timothy Arceri) don't use an opaque data type to store timestamp. V4: (Timothy Arceri) use blob to store driver keys just make sure to store null terminator for strings, and make sure blob is defined by disk_cache and not it's users. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100091
That seems to be working fine. Thanks to Timothy, Grazvydas and all others involved on fixing this!
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.