I have been testing wayland on r300g and things have been working reasonably until now. I get a segfault when trying to start any weston client except simple-egl. Here is the back trace with mesa, cairo and weston built with -O0:

$ gdb ./clients/weston-desktop-shell
GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/scott/src/wayland/weston/clients/weston-desktop-shell...done.
(gdb) run
Starting program: /home/scott/src/wayland/weston/clients/weston-desktop-shell
[Thread debugging using libthread_db enabled]
XDG_RUNTIME_DIR not set, falling back to .

Program received signal SIGSEGV, Segmentation fault.
0x00d9488d in st_framebuffer_validate (stfb=0x184b700, st=0x80e11d8) at state_tracker/st_manager.c:186
186 int32_t new_stamp = p_atomic_read(&stfb->iface->stamp);
(gdb) bt
#0 0x00d9488d in st_framebuffer_validate (stfb=0x184b700, st=0x80e11d8) at state_tracker/st_manager.c:186
#1 0x00d958a3 in st_api_make_current (stapi=0x1802620, stctxi=0x80e11d8, stdrawi=0x0, streadi=0x0) at state_tracker/st_manager.c:731
#2 0x00d35ac6 in dri_make_current (cPriv=0x8073800, driDrawPriv=0x0, driReadPriv=0x0) at dri_context.c:216
#3 0x00d30bfc in driBindContext (pcp=0x8073800, pdp=0x0, prp=0x0) at ../../../../src/mesa/drivers/dri/common/dri_util.c:330
#4 0x00149efa in dri2_make_current (drv=0x8064910, disp=0x8063c38, dsurf=0x0, rsurf=0x0, ctx=0x8073858) at egl_dri2.c:818
#5 0x0013eb9b in eglMakeCurrent (dpy=0x8063c38, draw=0x0, read=0x0, ctx=0x8073858) at eglapi.c:502
#6 0x0028e4d2 in _egl_make_current_surfaceless (ctx=0x81bc548) at cairo-egl-context.c:127
#7 0x0028e5b9 in cairo_egl_device_create (dpy=0x8063c38, egl=0x8073858) at cairo-egl-context.c:160
#8 0x08050fe6 in init_egl (d=0x805ef00) at window.c:2787
#9 0x08051362 in display_create (argc=0xbffff350, argv=0xbffff354, option_entries=0x0) at window.c:2891
#10 0x0804c1f6 in main (argc=1, argv=0xbffff3f4) at desktop-shell.c:669
(gdb) bt full
#0 0x00d9488d in st_framebuffer_validate (stfb=0x184b700, st=0x80e11d8) at state_tracker/st_manager.c:186
        textures = {0x1842ff4, 0xbffff058, 0xd943dc, 0xbffff0ac, 0x184b700, 0x5f03, 0x184b700}
        width = 3221221512
        height = 25474816
        i = 3221221548
        changed = 0 '\000'
        new_stamp = 14242151
#1 0x00d958a3 in st_api_make_current (stapi=0x1802620, stctxi=0x80e11d8, stdrawi=0x0, streadi=0x0) at state_tracker/st_manager.c:731
        st = 0x80e11d8
        stdraw = 0x184b700
        stread = 0x184b700
        ret = 8 '\b'
#2 0x00d35ac6 in dri_make_current (cPriv=0x8073800, driDrawPriv=0x0, driReadPriv=0x0) at dri_context.c:216
        ctx = 0x80720c0
        draw = 0x0
        read = 0x0
        old_st = 0x80e11d8
#3 0x00d30bfc in driBindContext (pcp=0x8073800, pdp=0x0, prp=0x0) at ../../../../src/mesa/drivers/dri/common/dri_util.c:330
No locals.
#4 0x00149efa in dri2_make_current (drv=0x8064910, disp=0x8063c38, dsurf=0x0, rsurf=0x0, ctx=0x8073858) at egl_dri2.c:818
        dri2_drv = 0x8064910
        dri2_dpy = 0x8064b28
        dri2_dsurf = 0x0
        dri2_rsurf = 0x0
        dri2_ctx = 0x8073858
        old_ctx = 0x8073858
        old_dsurf = 0x0
        old_rsurf = 0x0
        ddraw = 0x0
        rdraw = 0x0
        cctx = 0x8073800
        __PRETTY_FUNCTION__ = "dri2_make_current"
#5 0x0013eb9b in eglMakeCurrent (dpy=0x8063c38, draw=0x0, read=0x0, ctx=0x8073858) at eglapi.c:502
        disp = 0x8063c38
        context = 0x8073858
        draw_surf = 0x0
        read_surf = 0x0
        drv = 0x8064910
        ret = 0
        __FUNCTION__ = "eglMakeCurrent"
#6 0x0028e4d2 in _egl_make_current_surfaceless (ctx=0x81bc548) at cairo-egl-context.c:127
        extensions = 0x80644a0 "EGL_MESA_drm_image EGL_WL_bind_wayland_display EGL_KHR_image_base EGL_KHR_image_pixmap EGL_KHR_image EGL_KHR_gl_renderbuffer_image EGL_KHR_surfaceless_gles1 EGL_KHR_surfaceless_gles2 EGL_KHR_surfacele"...
#7 0x0028e5b9 in cairo_egl_device_create (dpy=0x8063c38, egl=0x8073858) at cairo-egl-context.c:160
        ctx = 0x81bc548
        status = 134690904
        attribs = {12375, 1, 12374, 1, 12344}
        config = 0x123c40
        numConfigs = -1073745240
#8 0x08050fe6 in init_egl (d=0x805ef00) at window.c:2787
        major = 1
        minor = 4
        n = 1
        argb_cfg_attribs = {12339, 6, 12324, 1, 12323, 1, 12322, 1, 12321, 1, 12325, 1, 12352, 8, 12344}
        rgb_cfg_attribs = {12339, 6, 12324, 1, 12323, 1, 12322, 1, 12321, 0, 12325, 1, 12352, 8, 12344}
#9 0x08051362 in display_create (argc=0xbffff350, argv=0xbffff354, option_entries=0x0) at window.c:2891
        d = 0x805ef00
        context = 0x805e990
        xkb_option_group = 0x805e9c0
        error = 0xbffff348
#10 0x0804c1f6 in main (argc=1, argv=0xbffff3f4) at desktop-shell.c:669
        desktop = {display = 0x0, shell = 0x0, unlock_dialog = 0x0, unlock_task = {run = 0x804be34 <unlock_dialog_finish>, link = {prev = 0x0, next = 0x0}}, outputs = {prev = 0xbffff31c, next = 0xbffff31c}}
        config_file = 0x76e324 ""
        output = 0xbffff348
(gdb) q
A debugging session is active.

        Inferior 1 [process 24323] will be killed.

Quit anyway? (y or n) y

After fiddling around a bit, I found a good mesa commit and bisected to arrive at the following:

c87247f6a8c5505fea3fa29dac372f9f5316a118 is the first bad commit
commit c87247f6a8c5505fea3fa29dac372f9f5316a118
Author: Brian Paul <brianp@vmware.com>
Date: Fri Jan 6 12:42:40 2012 -0700

    mesa: remove gl_framebuffer:_DepthBuffer, _StencilBuffer fields

    These were used by swrast to make a combined depth+stencil buffer look
    like separate depth and stencil buffers. But that's no longer needed
    after rewriting the depth/stencil code in swrast.

    Reviewed-by: Eric Anholt <eric@anholt.net>

I double checked and the previous commit does indeed work, while this one causes the issue. System is x86 32 bit with RV350. Kernel 2.6.38. Please let me know if any further information or testing is needed.
Additionally, I've built mesa with the following configuration:

--with-egl-platforms=wayland,drm,x11 --disable-gallium-egl --with-dri-drivers="" --enable-gles1 --enable-gles2 --with-gallium-drivers=r300,swrast --enable-shared-glapi --enable-gbm

I've also tried --enable-gallium-egl with the same result, though the bt was using gallium paths.
It's hard to see how that commit could break anything. Have you made sure everything was rebuilt to match the new layout of struct gl_framebuffer, e.g. with make clean?
Yes, I have a script that builds the entire stack from wayland to mesa, cairo, weston and everything in between. For each component it does git reset --hard origin/master as well as git clean -fdx and installs to a nonstandard prefix. When I first found this bug, I removed the prefix and built the entire stack fresh. I can reliably reproduce the issue or not by toggling between the bad and previous commits respectively.
Hi, I can confirm this issue also happens with a r600 card. Here's an example of my backtrace from the weston-desktop-shell client crashing:

#0 0x00007ffff4601fba in st_framebuffer_validate.isra.3 () from /home/damien/lib/dri/r600_dri.so
#1 0x00007ffff4603469 in st_api_make_current () from /home/damien/lib/dri/r600_dri.so
#2 0x00007ffff45bbe8f in driBindContext () from /home/damien/lib/dri/r600_dri.so
#3 0x00007ffff71bde90 in dri2_make_current () from /home/damien/lib/libEGL.so.1
#4 0x00007ffff71b6159 in eglMakeCurrent () from /home/damien/lib/libEGL.so.1
#5 0x00007ffff771e58d in cairo_egl_device_create () from /home/damien/lib/libcairo.so.2
#6 0x0000000000409545 in init_egl (d=0x620630) at window.c:2822
#7 display_create (argc=0x7fffffffde0c, argv=0x7fffffffde00, option_entries=<optimized out>) at window.c:2926
#8 0x00000000004040a6 in main (argc=1, argv=0x7fffffffdf58) at desktop-shell.c:672

When I build mesa from commit 21b28d520ff218d165e86aa71dbd02050a3aa0cd (just before the first bad commit), then it works fine.
I can also confirm the bad commit, with a different codebase than Wayland (but exactly the same mesa backtrace). I use nouveau.
It would help if you posted the backtrace, said which program you're running, and gave more details about your system.
Another report from irc:

<stfacc> hi, whenever I try to run any wayland client I get a segfault
<stfacc> here is the bt http://dpaste.com/691450/
<stfacc> this happens for all clients using cairo (simple-egl works for example)

bt paste contents:

#0 st_framebuffer_validate (stfb=0x7ffff0acbce0, st=<optimized out>) at state_tracker/st_manager.c:186
#1 0x00007fffefc40e68 in st_api_make_current (stapi=<optimized out>, stctxi=0x7a36e0, stdrawi=<optimized out>, streadi=<optimized out>) at state_tracker/st_manager.c:731
#2 0x00007fffefc0238f in driBindContext (pcp=<optimized out>, pdp=<optimized out>, prp=<optimized out>) at ../../../../src/mesa/drivers/dri/common/dri_util.c:330
#3 0x00007ffff55ac670 in dri2_make_current (drv=0x623120, disp=0x6223a0, dsurf=0x0, rsurf=0x0, ctx=0x62b3c0) at egl_dri2.c:818
#4 0x00007ffff55a5829 in eglMakeCurrent (dpy=0x6223a0, draw=0x0, read=0x0, ctx=0x62b3c0) at eglapi.c:502
#5 0x00007ffff61effcd in _egl_make_current_surfaceless (ctx=<optimized out>) at cairo-egl-context.c:127
#6 cairo_egl_device_create (dpy=0x6223a0, egl=0x62b3c0) at cairo-egl-context.c:160
#7 0x00000000004093f7 in init_egl (d=0x61d200) at window.c:2822
#8 display_create (argc=0x7fffffffdb1c, argv=0x7fffffffdb10, option_entries=<optimized out>) at window.c:2926
#9 0x0000000000404767 in main (argc=1, argv=0x7fffffffdc38) at gears.c:373
Sorry, here are some more details.

ran@ran:~$ uname -sr
Linux 3.2.1-1-ARCH
ran@ran:~$ lspci | grep nVi
01:00.0 VGA compatible controller: nVidia Corporation G94 [GeForce 9600 GT] (rev a1)
ran@ran:~$ glxinfo | grep nouveau -A3
OpenGL vendor string: nouveau
OpenGL renderer string: Gallium 0.4 on NV94
OpenGL version string: 2.1 Mesa 8.0-devel (git-c25e5300)
OpenGL shading language version string: 1.20

Mesa config:
--with-dri-drivers= --with-gallium-drivers=nouveau --with-egl-platforms=drm,x11 --enable-gallium-egl --enable-shared-dricore --enable-shared-glapi --enable-egl --enable-gles2 --enable-glx-tls --enable-xcb --enable-texture-float

And the backtrace:

Core was generated by `./test_terminal'.
Program terminated with signal 11, Segmentation fault.
#0 st_framebuffer_validate (stfb=0x7f89888e1e60, st=<optimized out>) at state_tracker/st_manager.c:186
186 int32_t new_stamp = p_atomic_read(&stfb->iface->stamp);
(gdb) bt
#0 st_framebuffer_validate (stfb=0x7f89888e1e60, st=<optimized out>) at state_tracker/st_manager.c:186
#1 0x00007f8987a5ca28 in st_api_make_current (stapi=<optimized out>, stctxi=0x1588910, stdrawi=<optimized out>, streadi=<optimized out>) at state_tracker/st_manager.c:731
#2 0x00007f89879b47cf in driBindContext (pcp=<optimized out>, pdp=<optimized out>, prp=<optimized out>) at ../../../../src/mesa/drivers/dri/common/dri_util.c:330
#3 0x00007f898c1aba60 in dri2_make_current (drv=0x14a4a70, disp=0x149eb20, dsurf=0x0, rsurf=0x0, ctx=0x14a5690) at egl_dri2.c:818
#4 0x00007f898c1a4d39 in eglMakeCurrent (dpy=0x149eb20, draw=0x0, read=0x0, ctx=0x14a5690) at eglapi.c:502
#5 0x00000000004065b2 in context_use (ctx=0x149c700) at src/output_context.c:589
#6 0x0000000000405206 in compositor_use (comp=0x146cf50) at src/output.c:936
#7 0x00000000004039e0 in setup_app (app=0x7fff094f6440) at tests/test_terminal.c:224
#8 0x0000000000403b98 in main (argc=1, argv=0x7fff094f6588) at tests/test_terminal.c:273

This only happens if eglMakeCurrent is called twice, which is the case in my program and in wayland also (e.g. there's a call to eglMakeCurrent followed by a call to cairo_egl_device_create, which also calls eglMakeCurrent).

Since we use the surfaceless extension the first call to st_manager.c:st_api_make_current uses an incomplete buffer as a dummy (I think?), so then:

(gdb) print stfb == &IncompleteFramebuffer
$11 = 1

In the next call the following check at st_manager.c:730 :

if (stdraw && stread) {

passes but:

(gdb) print stfb->iface
$28 = (struct st_framebuffer_iface *) 0x0

So there's a null dereference. I'm not familiar with mesa so I can't help with a (correct) patch.
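To make the failure mode above concrete, here is a minimal standalone sketch of it. This is not Mesa code and not the eventual fix: the types fb_iface and framebuffer, the object incomplete_framebuffer and the function framebuffer_validate are hypothetical, simplified stand-ins for st_framebuffer_iface, st_framebuffer, IncompleteFramebuffer and st_framebuffer_validate. It only shows that a validate step which reads iface->stamp has to skip a dummy framebuffer whose iface is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the window-system interface of a framebuffer. */
struct fb_iface {
    int32_t stamp;            /* bumped whenever the drawable changes */
};

/* Hypothetical stand-in for the state tracker's framebuffer object. */
struct framebuffer {
    struct fb_iface *iface;   /* NULL for the dummy "incomplete" framebuffer */
    int32_t iface_stamp;      /* last stamp observed by validation */
};

/* Dummy object bound when a context is made current without any surface. */
struct framebuffer incomplete_framebuffer = { NULL, 0 };

/*
 * Refresh the cached stamp from the interface.
 * Returns true if the framebuffer was validated, false if it was skipped
 * because it has no interface (the case that crashed in the backtrace).
 */
bool framebuffer_validate(struct framebuffer *fb)
{
    if (fb == NULL || fb->iface == NULL)
        return false;         /* without this guard, fb->iface->stamp faults */

    fb->iface_stamp = fb->iface->stamp;
    return true;
}
```

With this guard, calling framebuffer_validate(&incomplete_framebuffer) returns false instead of dereferencing a null pointer, while a framebuffer with a real interface is validated as before. Whether the real fix belongs in the validate function or in the caller's check is a design decision for the Mesa developers.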
Possible fix: http://lists.freedesktop.org/archives/mesa-dev/2012-January/018029.html
(In reply to comment #9)
> Possible fix:
> http://lists.freedesktop.org/archives/mesa-dev/2012-January/018029.html

I tested this patch and it solves the issue with weston clients here on r300g. Thanks Alex.
The tested patch is committed as 36fb83e4a868e047521b3d5e0edc4d7a77a96aaf, closing.