Bug 110814

Summary: KWin compositor crashes on launch
Product: Mesa
Reporter: Eugene Shalygin <eugene.shalygin+bugzilla.FDO>
Component: Drivers/DRI/i965
Assignee: Kenneth Graunke <kenneth>
Status: RESOLVED FIXED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: critical
Priority: medium
CC: andreas.sturmlechner, bvbfan, kenneth
Version: 19.1
Keywords: bisected, regression
Hardware: x86-64 (AMD64)
OS: Linux (All)
See Also: https://bugs.kde.org/show_bug.cgi?id=408333
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 111444
Attachments: lspci
Xorg.log
apitrace
patch-add-log
stderr log with the patch
log+potentialfix
log with "log+potentialfix" applied
logs extra
log extra
kwinrc

Description Eugene Shalygin 2019-06-02 18:04:13 UTC
#5  0x00007effa92d2fae in _mesa_glthread_finish () from /usr/lib64/dri/i965_dri.so
#6  0x00007effa9181334 in intelUnbindContext () from /usr/lib64/dri/i965_dri.so
#7  0x00007effa923a5ca in driUnbindContext () from /usr/lib64/dri/i965_dri.so
#8  0x00007effc1dcd059 in dri2_make_current () from /usr/lib64/libEGL.so.1
#9  0x00007effc1dbdfe8 in eglMakeCurrent () from /usr/lib64/libEGL.so.1
#10 0x00007effb980e20e in non-virtual thunk to KWin::AbstractEglBackend::makeCurrent() () from /usr/lib64/qt5/plugins/org.kde.kwin.platforms/KWinX11Platform.so
#11 0x00007eff902e9ab4 in KWin::SceneOpenGLShadow::prepareBackend() () from /usr/lib64/qt5/plugins/org.kde.kwin.scenes/KWinSceneOpenGL.so
#12 0x00007effc6bb0104 in KWin::Shadow::init(KDecoration2::Decoration*) () from /usr/lib64/libkwin.so.5
#13 0x00007effc6bb04ba in KWin::Shadow::createShadowFromDecoration(KWin::Toplevel*) () from /usr/lib64/libkwin.so.5
#14 0x00007effc6bb5e3e in KWin::Shadow::createShadow(KWin::Toplevel*) () from /usr/lib64/libkwin.so.5
#15 0x00007effc6b74675 in KWin::Toplevel::getShadow() () from /usr/lib64/libkwin.so.5
#16 0x00007effc6b6d83a in KWin::Scene::addToplevel(KWin::Toplevel*) () from /usr/lib64/libkwin.so.5
#17 0x00007effc6b75ad3 in KWin::Toplevel::setupCompositing() () from /usr/lib64/libkwin.so.5
#18 0x00007effc6b78428 in KWin::Compositor::startupWithWorkspace() () from /usr/lib64/libkwin.so.5
#19 0x00007effc6b79050 in KWin::Compositor::slotCompositingOptionsInitialized() () from /usr/lib64/libkwin.so.5
#20 0x00007effc6b7966c in KWin::Compositor::setup() () from /usr/lib64/libkwin.so.5
#21 0x00007effc59087ba in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib64/libQt5Core.so.5
#22 0x00007effc66b5bb2 in QAction::triggered(bool) () from /usr/lib64/libQt5Widgets.so.5
#23 0x00007effc66c0bab in QAction::activate(QAction::ActionEvent) () from /usr/lib64/libQt5Widgets.so.5
#24 0x00007effc3c4c790 in KGlobalAccelPrivate::_k_invokeAction(QString const&, QString const&, long long) () from /usr/lib64/libKF5GlobalAccel.so.5
#25 0x00007effc5908675 in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib64/libQt5Core.so.5
#26 0x00007effc3c4e866 in OrgKdeKglobalaccelComponentInterface::qt_static_metacall(QObject*, QMetaObject::Call, int, void**) () from /usr/lib64/libKF5GlobalAccel.so.5
#27 0x00007effc3c4ff13 in OrgKdeKglobalaccelComponentInterface::qt_metacall(QMetaObject::Call, int, void**) () from /usr/lib64/libKF5GlobalAccel.so.5
#28 0x00007effc3a6e98b in QDBusConnectionPrivate::deliverCall(QObject*, int, QDBusMessage const&, QVector<int> const&, int) [clone .constprop.61] () from /usr/lib64/libQt5DBus.so.5
#29 0x00007effc5906ebb in QObject::event(QEvent*) () from /usr/lib64/libQt5Core.so.5
#30 0x00007effc66c69e1 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib64/libQt5Widgets.so.5
#31 0x00007effc66d7760 in QApplication::notify(QObject*, QEvent*) () from /usr/lib64/libQt5Widgets.so.5
#32 0x00007effc592ae80 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /usr/lib64/libQt5Core.so.5
#33 0x00007effc592b1dd in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () from /usr/lib64/libQt5Core.so.5
#34 0x00007effc58e895a in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/libQt5Core.so.5
#35 0x00007effbcbd551e in QXcbUnixEventDispatcher::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/libQt5XcbQpa.so.5
#36 0x00007effc592b913 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/libQt5Core.so.5
#37 0x00007effc592bcb2 in QCoreApplication::exec() () from /usr/lib64/libQt5Core.so.5
#38 0x00007effc6d0eea5 in kdemain () from /usr/lib64/libkdeinit5_kwin_x11.so
#39 0x00007effc6d39f1b in __libc_start_main () from /lib64/libc.so.6
#40 0x0000559a766bf09a in _start ()

Works fine with Mesa 19.0.
Comment 1 Denis 2019-06-03 06:56:23 UTC
Hi Eugene. Please provide your hardware information, Mesa version, and kernel version.
It would also be great to see your Xorg.1.log (if you are using an X server).
Knowing your current OS would be helpful too.
Thanks.
Comment 2 Eugene Shalygin 2019-06-03 11:08:55 UTC
Created attachment 144422 [details]
lspci
Comment 3 Eugene Shalygin 2019-06-03 11:10:02 UTC
Created attachment 144423 [details]
Xorg.log
Comment 4 Eugene Shalygin 2019-06-03 11:12:56 UTC
$ uname -a
Linux tiger 5.1.5-gentoo #1 SMP PREEMPT Mon May 27 17:27:22 CEST 2019 x86_64 Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz GenuineIntel GNU/Linux

Mesa versions 19.1.0 RC1 to RC4 all show the same behaviour. In a Wayland Plasma session everything is extremely clumsy: the screen updates irregularly and the mouse cursor moves in big leaps (maybe 0.5 seconds).
Comment 5 Denis 2019-06-03 15:08:16 UTC
Thanks. I have a Haswell-based CPU too, with the same GPU on board:
Intel Core i5-4300M
Intel® HD Graphics 4600
Fedora 29

And I can't reproduce this issue. According to your Xorg log, you also have a Radeon GPU on board. Could you please check KWin on it?
Something like:
>>DRI_PRIME=1 kwin_x11 --replace
Comment 6 Eugene Shalygin 2019-06-03 15:36:39 UTC
With DRI_PRIME kwin does not crash, but not all composite effects can be enabled, so maybe it simply does not touch the crashing path? However, window shadows (the backtrace contains window shadows) are drawn normally.
Comment 7 Denis 2019-06-03 16:06:02 UTC
Aha, OK, thanks for testing this. One more idea came to mind: please try to make an apitrace of it. I tested this on my machine and successfully created two apitraces, so it is possible.

Command:

>apitrace trace kwin_x11 --replace
To stop tracing, use Ctrl+Z (Ctrl+C will crash apitrace and no trace will be generated). It would also be helpful to check my apitrace on your hardware (I will upload it later).

btw - did you try to disable compositing for windows? In my case I checked both - with and without compositing
Comment 8 Eugene Shalygin 2019-06-03 17:26:18 UTC
Thank you, I'll try to run apitrace. In the meantime I recompiled Mesa with debug options, and here is the updated stack trace:

Application: KWin (kwin_x11), signal: Segmentation fault
 
Thread 1 (Thread 0x7f314646c8c0 (LWP 24002)):
[KCrash Handler]
#5  0x00007f312e10d124 in _mesa_glthread_finish (ctx=0x0) at ../mesa-19.1.0-rc4/src/mesa/main/glthread.c:184
#6  0x00007f312e00a8a4 in intelUnbindContext (driContextPriv=<optimized out>) at ../mesa-19.1.0-rc4/src/mesa/drivers/dri/i965/brw_context.c:1232
#7  0x00007f312e0987ea in driUnbindContext (pcp=0x55acf0ede080) at ../mesa-19.1.0-rc4/src/mesa/drivers/dri/common/dri_util.c:615
#8  0x00007f314ab53425 in dri2_make_current (drv=0x55acee384c30, disp=0x55acf1aecd90, dsurf=0x55acf1c030d0, rsurf=0x55acf1c030d0, ctx=<optimized out>) at ../mesa-19.1.0-rc4/src/egl/drivers/dri2/egl_dri2.c:1479
#9  0x00007f314ab491d7 in eglMakeCurrent (dpy=0x55acf1aecd90, draw=0x55acf1c030d0, read=0x55acf1c030d0, ctx=<optimized out>) at ../mesa-19.1.0-rc4/src/egl/main/eglapi.c:869
#10 0x00007f313e58020e in non-virtual thunk to KWin::AbstractEglBackend::makeCurrent() () from /usr/lib64/qt5/plugins/org.kde.kwin.platforms/KWinX11Platform.so
#11 0x00007f308d61388e in KWin::SceneOpenGLShadow::~SceneOpenGLShadow() () from /usr/lib64/qt5/plugins/org.kde.kwin.scenes/KWinSceneOpenGL.so
Comment 9 Eugene Shalygin 2019-06-03 17:29:52 UTC
(In reply to Denis from comment #7)
> btw - did you try to disable compositing for windows? In my case I checked
> both - with and without compositing

It crashes only with compositing enabled, as I wrote in the bug title. Perhaps you mean something else?
Comment 10 Eugene Shalygin 2019-06-03 17:38:12 UTC
Created attachment 144428 [details]
apitrace
Comment 11 Sergii Romantsov 2019-06-04 08:37:25 UTC
Hello, Eugene.
Could you please apply the patch '110814_log.diff' and provide the dumped log?

Additionally, could you please try the following as a workaround: in an X11 config file such as '/usr/share/X11/xorg.conf.d/20-intel.conf', apply the settings below (it is better to reboot afterwards):
'''
Section "Device"
  Identifier  "Intel Graphics"
  Driver      "intel"
  Option      "TearFree" "true"
  Option      "DRI" "3"
EndSection
'''
Comment 12 Sergii Romantsov 2019-06-04 08:38:08 UTC
Created attachment 144440 [details] [review]
patch-add-log
Comment 13 Eugene Shalygin 2019-06-04 11:28:15 UTC
Created attachment 144443 [details]
stderr log with the patch

I had to change the Xorg driver from "modesetting" to "intel" to apply the configuration options. The log is from the "intel" driver.
Comment 14 Sergii Romantsov 2019-06-05 10:52:30 UTC
One more request: please check with the new patch '110814_log+pfix.diff' (we can't reproduce the crash on our side).
It potentially contains a fix, or it may trigger another assert, in which case we will need a new stack trace.
Comment 15 Sergii Romantsov 2019-06-05 10:52:57 UTC
Created attachment 144456 [details] [review]
log+potentialfix
Comment 16 Eugene Shalygin 2019-06-05 11:27:39 UTC
Created attachment 144457 [details]
log with "log+potentialfix" applied

Thank you, with this patch it does not crash and compositing seems to be working fine.
Comment 17 Sergii Romantsov 2019-06-05 11:32:53 UTC
Good, I will submit an MR to Mesa soon.
Thanks.
Comment 18 andreas.sturmlechner 2019-06-05 12:30:13 UTC
Also confirming this patch fixes a KWin crash with EGL platform interface on my system.
Comment 19 Eugene Shalygin 2019-06-05 12:55:16 UTC
I use the EGL backend too.
Comment 20 Sergii Romantsov 2019-06-05 13:03:01 UTC
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1020

However, it may just be hiding another issue.
So I would like to ask for help collecting more logs with the patch '110814_log+ext.diff' (it will crash).

Eugene, could you help with that?
Comment 21 Sergii Romantsov 2019-06-05 13:03:42 UTC
Created attachment 144458 [details] [review]
logs extra
Comment 22 Eugene Shalygin 2019-06-05 14:24:09 UTC
Created attachment 144459 [details]
log extra

Here is the log.
Comment 23 Denis 2019-06-05 14:28:36 UTC
One more request from me, Eugene: please provide the contents of
>~/.config/kwinrc
Comment 24 Eugene Shalygin 2019-06-05 14:33:41 UTC
Created attachment 144460 [details]
kwinrc

Maybe you are looking for this too?

$ env | grep KWIN
KWIN_OPENGL_WS=egl
Comment 25 Denis 2019-06-05 14:55:39 UTC
Yes, that too. My current config file is much smaller than yours.
I tried launching KWin with the flag you mentioned, and everything still worked fine. I will try applying your config (with some changes for my PC).
Comment 26 Denis 2019-06-05 15:05:21 UTC
Hmm, looks like I am on the right track :)
https://gist.github.com/DenKos363/48663a808e8c910bca636a91f6ffa641
I got one segfault, very similar to yours. I am investigating and trying to find reliable reproduction steps.
Comment 27 Denis 2019-06-21 12:33:13 UTC
Hi, sorry for the late reply. We were able to find and reproduce this issue.
I also ran a bisect, and it led to this commit:

commit dca36d5516d0fdaf012b4476975c5d585c2d1a09
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Sun Jul 9 23:03:44 2017 -0700

i965: Implement threaded GL support.

Now i965 supports mesa_glthread=true like Gallium drivers do.

According to Markus (degasus), the Citra emulator now runs ~30% faster.
Emmanuel (linkmauve) also reported that the Dolphin emulator improved
by 2.8x on one game. (Both of those still need to be added to drirc.)

An Intel Mesa CI run with mesa_glthread=true appears to be happy.

Bioshock Infinite's benchmark mode seems to be around 15-20% faster
on my Skylake GT4 at 1920x1080.

Tested-by: Markus Wick <markus@selfnet.de>
Tested-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

We also created a piglit test that should trigger this bug:
https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87

As far as I know, the only working solution so far is still waiting for review:
https://bugs.freedesktop.org/show_bug.cgi?id=110814#c20
Comment 28 Paul 2019-08-02 09:02:34 UTC
*** Bug 111271 has been marked as a duplicate of this bug. ***
Comment 29 Anthony 2019-08-09 07:41:51 UTC
Is there anything blocking this patch from getting in? KWin crashes on every eglMakeCurrent. This is easy to test when EGL and GLX are in use at the same time, which is always the case with the modesetting DDX driver.
Comment 30 Anthony 2019-08-30 11:24:24 UTC
This is the second released version without this patch. I still can't understand what is blocking it.
Comment 31 Denis 2019-08-30 12:16:15 UTC
(In reply to Anthony from comment #30)
> This is the second released version without this patch. I still can't
> understand what is blocking it.

https://bugs.freedesktop.org/show_bug.cgi?id=111444 is the release tracking ticket. I asked the developers to review the patch and add it to the release blockers if they decide it should be one.
Comment 32 Mark Janes 2019-09-12 21:25:46 UTC
fixed by

1dce75c1839f08cfa78400367019f998c258eff5
Author:     Sergii Romantsov <sergii.romantsov@globallogic.com>
CommitDate: Thu Sep 5 09:04:12 2019 -0700

intel/dri: finish proper glthread

KWin was able to get NULL-context in the call
intelUnbindContext. But a call _mesa_glthread_finish
is not resistent to such case.
Case can be catched with steps:
	1. Create both glx and egl contexts
	2. Make glx as current
	3. Make egl as current
	4. Reset glx context
	5. Make egl as current

Solution adds proper finishing of glthread-context
(context will be taken from the requested dri-context
for unbinding, but not from the saved current context).

Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
Fixes: dca36d5516d0 (i965: Implement threaded GL support)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
