Bug 92103 - [G45] Segmentation fault in get_stencil_miptree
Summary: [G45] Segmentation fault in get_stencil_miptree
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: 10.6
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Keywords: patch
Depends on:
Reported: 2015-09-24 12:33 UTC by Giulio Bernardi
Modified: 2019-09-25 18:54 UTC


Stack trace (5.87 KB, text/plain)
2015-09-24 12:33 UTC, Giulio Bernardi
The patch (537 bytes, patch)
2015-09-24 12:33 UTC, Giulio Bernardi
debugging output (307.01 KB, text/plain)
2016-02-12 08:03 UTC, Marek Chalupa

Description Giulio Bernardi 2015-09-24 12:33:24 UTC
Created attachment 118428 [details]
Stack trace

This bug was originally reported on RedHat's bugzilla for Fedora 22:

Description of problem:
On the machine I use at work sometimes plasma crashes after the login, before showing the KDE desktop. Investigation with gdb might have shown the cause of the problem.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. In KDM, perform the login

Actual results:
Sometimes plasma crashes before showing the desktop

Expected results:
Plasma should start and show the desktop

Additional info:
I launched gdb from the KDE crash handler (drkonqi?); see the attached stack trace. The crash happened at brw_misc_state.c:215, that is:

(gdb) list
210     static struct intel_mipmap_tree *
211     get_stencil_miptree(struct intel_renderbuffer *irb)
212     {
213        if (!irb)
214           return NULL;
215        if (irb->mt->stencil_mt)
216           return irb->mt->stencil_mt;
217        return irb->mt;
218     }

It turns out that irb->mt was null:

(gdb) print irb->mt
$3 = (struct intel_mipmap_tree *) 0x0

I modified the line to read

if (irb->mt && irb->mt->stencil_mt)

(see the attached patch) and so far (a couple of restarts, a shutdown, and about ten logouts/logins) no crash has happened.

However, since the crash is not always reproducible, I cannot be 100% sure.
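
The guard described above can be tried outside Mesa with a small stand-alone sketch. The two structs below are simplified stand-ins that keep only the fields the function reads (an assumption for illustration; the real Mesa definitions have many more members). The function body is the one from the gdb listing with the modified condition.

```c
#include <stddef.h>

/* Simplified stand-ins for the Mesa structs (assumption: only the
 * fields read by get_stencil_miptree are kept). */
struct intel_mipmap_tree {
   struct intel_mipmap_tree *stencil_mt;
};

struct intel_renderbuffer {
   struct intel_mipmap_tree *mt;
};

static struct intel_mipmap_tree *
get_stencil_miptree(struct intel_renderbuffer *irb)
{
   if (!irb)
      return NULL;
   /* Check irb->mt before reading through it. */
   if (irb->mt && irb->mt->stencil_mt)
      return irb->mt->stencil_mt;
   /* Still NULL when the renderbuffer has no miptree attached. */
   return irb->mt;
}
```

With this version a renderbuffer whose mt is NULL yields NULL instead of faulting, so every caller has to cope with a NULL result.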

The video card is

00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)

I am using the intel driver with UXA acceleration method (not SNA).
Glxinfo says:

server glx vendor string: SGI
server glx version string: 1.4
client glx vendor string: Mesa Project and SGI
client glx version string: 1.4
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) G45/G43 
OpenGL version string: 2.1 Mesa 10.6.3 (git-ccef890)
OpenGL shading language version string: 1.20
OpenGL ES profile version string: OpenGL ES 2.0 Mesa 10.6.3 (git-ccef890)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16

I think this is not a duplicate of bugs like this:
because it was fixed long ago.
Comment 1 Giulio Bernardi 2015-09-24 12:33:58 UTC
Created attachment 118429 [details] [review]
The patch
Comment 2 Giulio Bernardi 2015-09-24 13:16:41 UTC
It looks like this fix is not enough. The crash happened again after a shutdown and a power-on (30 minutes later).

In brw_workaround_depthstencil_alignment the pointers depth_mt and stencil_mt are null, so the crash is only postponed until some other statement tries to access a member of these structures.

Maybe I'm wrong, but it looks like this crash only happens at first boot after power on...
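
To illustrate why guarding get_stencil_miptree alone only moves the crash, here is a sketch of a caller-side check. Note that combined_tile_mask and the tile_mask field are hypothetical names made up for this example, not Mesa code; the point is only that every consumer of depth_mt and stencil_mt has to test the pointer before reading a member.

```c
#include <stddef.h>

/* Simplified stand-in for the Mesa struct. tile_mask is a hypothetical
 * field used only to have a member to read. */
struct intel_mipmap_tree {
   struct intel_mipmap_tree *stencil_mt;
   unsigned tile_mask;
};

/* Hypothetical consumer: ORs together the masks of whichever miptrees
 * exist, and returns 0 when both pointers are NULL. */
static unsigned
combined_tile_mask(const struct intel_mipmap_tree *depth_mt,
                   const struct intel_mipmap_tree *stencil_mt)
{
   unsigned mask = 0;

   if (depth_mt)
      mask |= depth_mt->tile_mask;
   if (stencil_mt)
      mask |= stencil_mt->tile_mask;
   return mask;
}
```

Without the two pointer tests, the first member access would fault in the same way as the original line 215.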
Comment 3 Marek Chalupa 2016-02-12 08:02:17 UTC

I'm experiencing the same bug, 100% reproducible. I have gnome-shell and mesa compiled from source, and every time I try to run gnome-shell, I hit this bug.

When I add a check for irb->mt being NULL in the brw_clear() function, the crash doesn't happen, but the external monitor is fuzzy and blinking (the one on the laptop is fine).

mesa: 0f3cea95 (and earlier)
drm: f884af9b (earlier too)
Fedora 23
Comment 4 Marek Chalupa 2016-02-12 08:03:54 UTC
Created attachment 121701 [details]
debugging output
Comment 5 Marek Chalupa 2016-02-15 10:03:03 UTC
(In reply to Marek Chalupa from comment #3)

> When I add check for irb->mt being NULL in brw_clear() function, the crash
> won't happen, but the external monitor is just fuzzy and blinking (the one
> on laptop is fine)

After I change the external monitor from primary to secondary, it stops flickering and looks just fine (even after changing back to primary).
Comment 6 Bernd Buschinski 2016-12-29 13:26:50 UTC
I also run into this problem with Plasma 5.8.5, using mesa-13.0.2, kernel 4.9.0, xorg-server-1.18.4 and dri3.

Plasma crashes under some conditions when trying to render a preview; it is not 100% reproducible.

GlxInfo stuff:

Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Intel Open Source Technology Center (0x8086)
    Device: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2)  (0x1916)
    Version: 13.0.2
    Accelerated: yes
    Video memory: 3072MB
    Unified memory: yes
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 3.0
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL vendor string: Intel Open Source Technology Center
OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2) 
OpenGL core profile version string: 4.5 (Core Profile) Mesa 13.0.2
OpenGL core profile shading language version string: 4.50


Thread 1 (Thread 0x7fca8d3e9780 (LWP 6440)):
[KCrash Handler]
#6  0x00007fc9dbb6099f in get_stencil_miptree (irb=<optimized out>) at /var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/mesa/drivers/dri/i965/brw_misc_state.c:229
#7  brw_workaround_depthstencil_alignment (brw=brw@entry=0x14cfa78, clear_mask=clear_mask@entry=50) at /var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/mesa/drivers/dri/i965/brw_misc_state.c:245
#8  0x00007fc9dbb44c87 in brw_clear (ctx=0x14cfa78, mask=50) at /var/tmp/portage/media-libs/mesa-13.0.2/work/mesa-13.0.2/src/mesa/drivers/dri/i965/brw_clear.c:233
#9  0x00007fca8b60aa5a in QSGBatchRenderer::Renderer::renderBatches() () from /usr/lib64/libQt5Quick.so.5
#10 0x00007fca8b610354 in QSGBatchRenderer::Renderer::render() () from /usr/lib64/libQt5Quick.so.5
#11 0x00007fca8b61b9af in QSGRenderer::renderScene(QSGBindable const&) () from /usr/lib64/libQt5Quick.so.5
#12 0x00007fca8b61c06b in QSGRenderer::renderScene(unsigned int) () from /usr/lib64/libQt5Quick.so.5
#13 0x00007fca8b62bc7e in QSGRenderContext::renderNextFrame(QSGRenderer*, unsigned int) () from /usr/lib64/libQt5Quick.so.5
#14 0x00007fca8b675719 in QQuickWindowPrivate::renderSceneGraph(QSize const&) () from /usr/lib64/libQt5Quick.so.5
#15 0x00007fca8b642935 in ?? () from /usr/lib64/libQt5Quick.so.5
#16 0x00007fca8b643b68 in ?? () from /usr/lib64/libQt5Quick.so.5
#17 0x00007fca88993085 in QWindow::event(QEvent*) () from /usr/lib64/libQt5Gui.so.5
#18 0x00007fca8b67fec5 in QQuickWindow::event(QEvent*) () from /usr/lib64/libQt5Quick.so.5
#19 0x00007fca8cf7eafb in PlasmaQuick::Dialog::event(QEvent*) () from /usr/lib64/libKF5PlasmaQuick.so.5
#20 0x00007fc9d83d67ac in ?? () from /usr/lib64/qt5/qml/org/kde/plasma/core/libcorebindingsplugin.so
#21 0x00007fca88e88acc in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib64/libQt5Widgets.so.5
#22 0x00007fca88e9046e in QApplication::notify(QObject*, QEvent*) () from /usr/lib64/libQt5Widgets.so.5
#23 0x00007fca88654c8a in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /usr/lib64/libQt5Core.so.5
#24 0x00007fca88988c4d in QGuiApplicationPrivate::processExposeEvent(QWindowSystemInterfacePrivate::ExposeEvent*) () from /usr/lib64/libQt5Gui.so.5
#25 0x00007fca8898987d in QGuiApplicationPrivate::processWindowSystemEvent(QWindowSystemInterfacePrivate::WindowSystemEvent*) () from /usr/lib64/libQt5Gui.so.5
#26 0x00007fca8896a6cb in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/libQt5Gui.so.5
#27 0x00007fca765ba030 in ?? () from /usr/lib64/libQt5XcbQpa.so.5
#28 0x00007fca83bb75fd in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#29 0x00007fca83bb78e0 in ?? () from /usr/lib64/libglib-2.0.so.0
#30 0x00007fca83bb798c in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
#31 0x00007fca886a20af in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/libQt5Core.so.5
#32 0x00007fca88653c3a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /usr/lib64/libQt5Core.so.5
#33 0x00007fca8865b64c in QCoreApplication::exec() () from /usr/lib64/libQt5Core.so.5
#34 0x000000000041ca98 in ?? ()
#35 0x00007fca87c99650 in __libc_start_main (main=0x41bf30, argc=2, argv=0x7ffcb2204f28, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcb2204f18) at ../csu/libc-start.c:289
#36 0x000000000041ce19 in _start ()
Comment 7 Ben Widawsky 2016-12-30 00:57:27 UTC
You could try:

diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 40a8d07bfb..ae4967f15d 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -224,7 +224,7 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree *depth_mt,
 static struct intel_mipmap_tree *
 get_stencil_miptree(struct intel_renderbuffer *irb)
 {
-   if (!irb)
+   if (!irb || !irb->mt)
       return NULL;
    if (irb->mt->stencil_mt)
       return irb->mt->stencil_mt;
Comment 8 Bernd Buschinski 2016-12-30 09:53:03 UTC
OK, I did, but as this is not 100% reproducible for me, I wonder if I can give a reliable "it is fixed for me".
Comment 9 Ben Widawsky 2016-12-30 19:30:30 UTC
It will certainly fix the segfault. The question is if it breaks something else.
Comment 10 Bernd Buschinski 2017-01-06 12:31:33 UTC
So after 1 week of no crashes, I would say yes, the crash is fixed for me.

But I got some "spam" in dmesg:

[16128.464257] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe C (start=142813 end=142814) time 147 us, min 1073, max 1079, scanline start 1072, end 1082
[16163.647659] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe C (start=144924 end=144925) time 169 us, min 1073, max 1079, scanline start 1072, end 1083
[16208.947705] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe C (start=147642 end=147643) time 143 us, min 1073, max 1079, scanline start 1071, end 1081
[16229.047678] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe C (start=148848 end=148849) time 149 us, min 1073, max 1079, scanline start 1069, end 1080
[16414.864584] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe C (start=159997 end=159998) time 162 us, min 1073, max 1079, scanline start 1069, end 1080
[16419.897913] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe C (start=160299 end=160300) time 167 us, min 1073, max 1079, scanline start 1068, end 1080
[16455.046980] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe B (start=162411 end=162412) time 157 us, min 1073, max 1079, scanline start 1069, end 1080

Not sure if this is related or not, I could double check, but afaik it was not present before the patch.
Comment 11 Ben Widawsky 2017-01-06 19:04:58 UTC
It's probably unrelated.
Comment 12 GitLab Migration User 2019-09-25 18:54:42 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1494.
