Created attachment 136931 [details]
mostly useless gdb backtrace
I think it's related to the series that ended with this commit "st/glsl_to_tgsi: add ARB_get_program_binary support using TGSI"
but I have to try around a bit more.
It doesn't always happen. I think it's related to the shader cache. Sometimes deleting the shader cache helps, sometimes it doesn't. I think MESA_EXTENSION_OVERRIDE=-GL_ARB_get_program_binary helps.
Usually, pointing LD_LIBRARY_PATH=... and LIBGL_DRIVERS_PATH=... at another mesa build, starting the application once, and then starting it with the system mesa again helps. It could be related to mesa first deleting the other shader cache before making a new one. Some weird race condition?
Applications I've seen affected:
systemsettings (rendering black window content, segfaulting)
krunner (rendering breaking, segfaulting)
As far as I can tell it doesn't happen with mesa debug builds...
More investigation is needed.
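For anyone who wants to try the override mentioned above, here is a minimal sketch. The wrapper function name is made up for this comment, and systemsettings5 is just one of the affected applications; only the environment variable and its value come from this report.

```shell
# Hypothetical helper: run a command with GL_ARB_get_program_binary hidden
# from the application via MESA_EXTENSION_OVERRIDE.
run_without_program_binary() {
    env MESA_EXTENSION_OVERRIDE=-GL_ARB_get_program_binary "$@"
}

# Example (not executed here):
#   run_without_program_binary systemsettings5
```

The variable is set only for that one process, so starting the application normally afterwards still shows whether the problem comes back.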
Can you try running an affected application in valgrind (either with a debug build of radeonsi_dri.so, or at least a release build with -g)?
Created attachment 136932 [details]
I think I have tried running with valgrind before. I tried valgrind with -Og mesa rendering the black window, but no (relevant) errors come up.
To describe one case:
I just tried a mesa debug build and everything worked normally.
Then the same options, but as a release build with CFLAGS=-Og CXXFLAGS=-Og added, renders systemsettings5 and plasmashell completely black. No segfaults this time. Starting plasmashell and systemsettings with MESA_EXTENSION_OVERRIDE=-GL_ARB_get_program_binary makes them display normally. But starting them without the variable after that makes them black again. Deleting the shader cache does not help in this case.
But after starting systemsettings5 once with another mesa installation, it starts working normally again, even with the mesa build that had just produced the black window.
It's possible the segfaults and the black windows are different problems but they started happening at the same time.
I made this go away by deleting ~/.cache/qtshadercache and ~/.cache/mesa_shader_cache
Created attachment 136940 [details]
better gdb backtrace
Managed to get a segfault (happens every time now) with the same mesa build that previously was just showing a black screen, now with a better backtrace.
Deleting only the qt shadercache (TIL that's a thing) doesn't help. Deleting only the mesa shader cache doesn't help. Deleting both does help. Weird.
Created attachment 136950 [details] [review]
I wasn't able to reproduce the issue, but can you give this patch a try?
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
I have had it since the commit went in.
Deleting both, like Mike suggested, has worked here since that time, too.
I ignored it 'cause I thought that I'm running devel stuff and my son's account switched daily from Mesa devel to release back and forth...;-)
(In reply to Timothy Arceri from comment #5)
> Created attachment 136950 [details] [review]
> possible fix
> I wasn't able to reproduce the issue, but can you give this patch a try?
Preliminary result: it does not help. After booting today with the patch, plasmashell was segfaulting again until
rm -rf ~/.cache/mesa_shader_cache/ ~/.cache/qtshadercache
Sometimes I wiped this, too.
All of the above files are regenerated automatically.
After this KDE5/Plasma5 started all the time with a nice/clean desktop, again.
It might be worth keeping these somewhere rather than deleting them. Once everything is working, copy them back and see if the issue can be reproduced that way.
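A minimal sketch of stashing the caches instead of removing them, assuming the two default cache locations named earlier in this report. The function names and the backup directory are made up; pick any location outside ~/.cache.

```shell
# Move the current caches into a backup directory ($1, a made-up location)
# so the applications regenerate fresh ones on the next start.
stash_shader_caches() {
    backup_dir=$1
    mkdir -p "$backup_dir"
    for name in mesa_shader_cache qtshadercache; do
        if [ -d "$HOME/.cache/$name" ]; then
            rm -rf "$backup_dir/$name"
            mv "$HOME/.cache/$name" "$backup_dir/$name"
        fi
    done
}

# Copy the stashed caches back over whatever has been regenerated since,
# to check whether the old cache contents bring the problem back.
restore_shader_caches() {
    backup_dir=$1
    mkdir -p "$HOME/.cache"
    for name in mesa_shader_cache qtshadercache; do
        if [ -d "$backup_dir/$name" ]; then
            rm -rf "$HOME/.cache/$name"
            cp -a "$backup_dir/$name" "$HOME/.cache/$name"
        fi
    done
}

# Example (not executed here):
#   stash_shader_caches "$HOME/shader-cache-backup"
#   ... log in again, confirm the desktop works ...
#   restore_shader_caches "$HOME/shader-cache-backup"
```

The restore step copies rather than moves, so the stashed copy stays available for further attempts.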
I was able to reproduce the problem. Fix sent to list:
Should be fixed by the following commit. Please reopen if the issue continues.
Author: Timothy Arceri <firstname.lastname@example.org>
Date: Fri Jan 26 11:56:50 2018 +1100
st/shader_cache: restore num_tgsi_tokens when loading from cache
Without this we will fail to correctly serialise programs when
using glGetProgramBinary() if the program was retrieved from
the disk cache rather than freshly compiled.
Fixes: c69b0dd6817b "st/glsl_to_tgsi: store num_tgsi_tokens in st_*_program"
Reviewed-by: Gert Wollny <email@example.com>
*** Bug 104806 has been marked as a duplicate of this bug. ***
That commit needs to be cherry-picked to the 18.0 branch.
Confirming: 041b18cf23a0acf7b0eddf63cd7a2a10192432a1 applied to 18.0.0_rc3, followed by cleaning ~/.cache of root, sddm and my user stops the crashes.
I still experience this with mesa 18.0.0_rc3 and r600 driver.
The resolution looked OK at first: user processes (plasmashell etc.) stopped crashing, but the screen locker and logout screen still crash (not right away, but after a few hours of working).
As always, clearing the caches in /root/.cache and /var/lib/sddm/.cache helps, but the problem then occurs again.
I use Qt 5.9.4.
Backtrace of /usr/lib64/libexec/ksmserver-logout-greeter mentions /usr/lib64/dri/r600_dri.so.
(In reply to Fireball from comment #15)
> I still experience this with mesa 18.0.0_rc3 and r600 driver.
It is not supposed to be fixed by rc3, since Timothy's patch 041b18cf23a0acf7b0eddf63cd7a2a10192432a1 was only applied after that version was released. It also has not yet been backported to the 18.0 branch, so for now you need to apply it manually (e.g. by placing it in /etc/portage/patches/media-libs/mesa-18.0.0_rc3, if you are on Gentoo).
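A minimal sketch of that manual step, assuming you have a local git checkout of mesa master that contains the commit. The function name and the checkout path in the example are made up; the commit id and the patch directory are the ones given above.

```shell
# Hypothetical helper: export one commit from a git checkout as a patch file.
#   $1 = path to the git checkout
#   $2 = commit id
#   $3 = patch directory (use an absolute path, since git -C changes directory)
export_user_patch() {
    mkdir -p "$3" &&
    git -C "$1" format-patch -1 "$2" -o "$3"
}

# Example (not executed here; writing to /etc/portage needs root):
#   export_user_patch "$HOME/src/mesa" 041b18cf23a0acf7b0eddf63cd7a2a10192432a1 \
#       /etc/portage/patches/media-libs/mesa-18.0.0_rc3
# followed by rebuilding media-libs/mesa.
```

Afterwards the caches still need to be cleared once, as described in the earlier comments.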
Please don't reopen this bug, the fix is already in master and has been tagged as a fix in the commit message so it will be picked up in the next stable version.
To be fair, the segfaults are fixed, but sddm and plasmashell randomly not working/rendering until the shader cache(s) are deleted is still happening. Unfortunately that's a bit harder to debug, but there might still be a similar issue hiding somewhere.
(In reply to Christoph Haag from comment #18)
> To be fair, the segfaults are fixed, but sddm and plasmashell randomly not
> working/rendering until the shader cache(s) are deleted is still happening.
> Unfortunately that's a bit harder to debug, but there might still be a
> similar issue hiding somewhere.
This may rather be bug 105065, which is really a Qt bug: https://bugreports.qt.io/browse/QTBUG-66420