This problem is reproduced with the same procedure as in #27332.

I am working on a "salome" package for Mandriva (http://www.salome-platform.org) that uses opencascade in the "geometry" module. It appears to work correctly with software rendering (i.e. forcing it to use swraster, usually by switching the Xorg module to fbdev or vesa as a "fast hack"), but with radeon drm it doesn't work correctly.

Running the command:

% MESA_DEBUG=FP LIBGL_DEBUG=verbose runSalome

then loading a sample file and attaching gdb (gdb -p) to the pid of SALOME_Session_Server, it crashes when I select the geometry module (which causes it to load the opencascade module/library), before reporting any debug output from Mesa:

(gdb) c
Continuing.
[New Thread 0xb1717b70 (LWP 28308)]

Program received signal SIGSEGV, Segmentation fault.
0xa8e003d0 in _glapi_set_dispatch () from /usr/lib/dri/i965_dri.so
(gdb) bt
#0 0xa8e003d0 in _glapi_set_dispatch () from /usr/lib/dri/i965_dri.so
#1 0xa8df3e93 in _glapi_set_dispatch () from /usr/lib/dri/i965_dri.so
#2 0xa8df6948 in _glapi_set_dispatch () from /usr/lib/dri/i965_dri.so
#3 0xa8e581b3 in _glapi_set_dispatch () from /usr/lib/dri/i965_dri.so
#4 0xa9536ee8 in TelInitWS (ws=203, w=1008, h=406, bgcolr=0.600000024, bgcolg=0.600000024, bgcolb=0.600000024)
    at ../../../src/OpenGl/OpenGl_telem_util.c:433
#5 0xa9534587 in call_subr_open_ws (aview=0xbfcf0920)
    at ../../../src/OpenGl/OpenGl_subrvis.c:285
#6 0xa954409c in call_togl_view (aview=0xbfcf0920)
    at ../../../src/OpenGl/OpenGl_togl_view.c:50
#7 0xa94fa297 in OpenGl_GraphicDriver::View (this=0x2001cfcc, ACView=<value optimized out>)
    at ../../../src/OpenGl/OpenGl_GraphicDriver_7.cxx:424
#8 0xaf544d3b in Visual3d_View::SetWindow (this=0x9079a7c, AWindow=...)
    at ../../../src/Visual3d/Visual3d_View.cxx:501
#9 0xaf5293af in V3d_View::SetWindow (this=0x8efbbc4, TheWindow=...)
    at ../../../src/V3d/V3d_View.cxx:485
#10 0xaf6a00fc in OCCViewer_ViewPort3d::attachWindow(Handle_V3d_View const&, Handle_Aspect_Window const&) ()
    from /usr/lib/salome/libOCCViewer.so.0

I don't see anything suspicious, i.e.:

(gdb) frame 4
#4 0xa9536ee8 in TelInitWS (ws=203, w=1008, h=406, bgcolr=0.600000024, bgcolg=0.600000024, bgcolb=0.600000024)
    at ../../../src/OpenGl/OpenGl_telem_util.c:433
433 glClear(GL_COLOR_BUFFER_BIT);
(gdb) l
428 }
429 else
430 {
431 glDrawBuffer(GL_FRONT_AND_BACK);
432 glClearColor(bgcolr, bgcolg, bgcolb, ( float )1.0);
433 glClear(GL_COLOR_BUFFER_BIT);
434 glDrawBuffer(GL_BACK);
435 }
436 #else
437 glDrawBuffer(GL_FRONT_AND_BACK);

If I tell gdb to continue, the interface shows a popup about a segfault at address 34, and then some debug output from Mesa is printed:

libGL: OpenDriver: trying /usr/lib/dri/i965_dri.so
th. 3036104400 - Trace SALOME_Session_Server.cxx [97] : Debug: connect
libGL: OpenDriver: trying /usr/lib/dri/i965_dri.so
Mesa warning: couldn't open libtxc_dxtn.so, software DXTn compression/decompression unavailable
libGL: Can't open configuration file /etc/drirc: No such file or directory.
libGL: Can't open configuration file /home/pcpa/.drirc: No such file or directory.
th. 3036104400 - Trace SALOME_Session_Server.cxx [100] : Warning: QWidget::repaint: Recursive repaint detected

But it doesn't create a display window or draw anything; it only gives errors about a segfault or an attempt to access a null object.
I tried to reproduce the bug with the "default" stuff in the distro (not forcing anything, but with runSalome still doing LIBGL_ALWAYS_INDIRECT=true).

Distro is using mesa 7.8.1, x11-server 1.7.7, drm 2.4.20, kernel 2.6.33, intel 2.11.0.

This is the backtrace I got:

#0 intel_region_buffer (intel=0x2403ee0, region=0x0, flag=2) at intel_regions.c:498
#1 0x00007fbb6301ba56 in intelClearWithBlit (ctx=0x2403ee0, mask=3) at intel_blit.c:268
#2 0x00007fbb6301434a in intelClear (ctx=0x2403ee0, mask=<value optimized out>) at intel_clear.c:169
#3 0x00007fbb6456b293 in __glXDisp_Render (cl=<value optimized out>, pc=0x20367b8 "\b") at glxcmds.c:1823
#4 0x00007fbb6456f4be in __glXDispatch (client=0x20363e0) at glxext.c:578
#5 0x000000000043886c in Dispatch () at dispatch.c:439
#6 0x000000000042271c in main (argc=1, argv=0x7d9408, envp=<value optimized out>) at main.c:285

The segfault happens because at intel_regions.c:498 region=0x0 and we have this:

if (region->pbo) {
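To illustrate the failure mode in frame #0 of the backtrace above, here is a minimal sketch of a region check that validates the pointer before reading the pbo field. Everything in it is hypothetical: struct sketch_region and both functions are stand-ins invented for illustration, not the driver's struct intel_region or its API, and this is not the change that was eventually made in Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's region object; the real
 * struct intel_region has more fields and a different layout. */
struct sketch_region {
    void *buffer;   /* backing buffer object (illustrative) */
    void *pbo;      /* attached pixel buffer object, or NULL */
};

/* Mirrors the shape of the faulting check: requires a non-NULL region. */
bool sketch_region_has_pbo(const struct sketch_region *region)
{
    assert(region != NULL);   /* the precondition the crashing call violated */
    return region->pbo != NULL;
}

/* Hypothetical caller that validates the region before using it.
 * Returns -1 when there is no region to clear, otherwise 1 if a PBO
 * is attached and 0 if not. */
int sketch_clear_region(const struct sketch_region *region)
{
    if (region == NULL)
        return -1;            /* nothing bound: skip instead of dereferencing */
    return sketch_region_has_pbo(region) ? 1 : 0;
}
```

Calling sketch_clear_region(NULL) returns -1 rather than faulting, which is the behaviour a guard of this kind would give; whether the right fix belongs in the callee or in the clear path that passes the NULL region is a separate question for the driver.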
Just for clarification: I asked Paulo Zanoni to look at it because my previous correction/workaround of setting LIBGL_ALWAYS_INDIRECT=true to get a working package now causes the X Server to crash...

Without LIBGL_ALWAYS_INDIRECT set, it now appears to behave very similarly to the ati driver, which always triggers a SIGFPE, but in the client code, which handles it...

And the root cause appears to be the same; at least the same Mesa debug output is visible:

Mesa: User error: GL_INVALID_ENUM in glIsEnabled(0xb72)
Mesa: User error: GL_INVALID_ENUM in glEnable(0xb72)

https://bugs.freedesktop.org/show_bug.cgi?id=27332 describes the problem with the ati driver.

(Changing Importance to highest because the package is now unusable with the intel driver.)
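As a reading aid for the two GL_INVALID_ENUM messages: in the standard OpenGL headers the value 0x0B72 is GL_DEPTH_WRITEMASK, which is a state query (read with glGetBooleanv, set with glDepthMask) and not a capability accepted by glEnable or glIsEnabled; its neighbour 0x0B71 is GL_DEPTH_TEST, which is. The sketch below only encodes those two values under renamed constants so it builds without GL headers; the function names are hypothetical, and which call site in the application passes 0xb72 is not identified in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values copied from the standard OpenGL header; renamed with a SKETCH_
 * prefix so this file compiles without <GL/gl.h>. */
enum {
    SKETCH_GL_DEPTH_TEST      = 0x0B71, /* capability: valid for glEnable */
    SKETCH_GL_DEPTH_WRITEMASK = 0x0B72  /* state query: not a capability */
};

/* Name lookup limited to the two values discussed here. */
const char *sketch_enum_name(unsigned value)
{
    switch (value) {
    case SKETCH_GL_DEPTH_TEST:      return "GL_DEPTH_TEST";
    case SKETCH_GL_DEPTH_WRITEMASK: return "GL_DEPTH_WRITEMASK";
    default:                        return NULL;
    }
}

/* Of the two values above, only GL_DEPTH_TEST may be passed to
 * glEnable/glIsEnabled. This deliberately knows nothing about other enums. */
bool sketch_is_capability(unsigned value)
{
    return value == SKETCH_GL_DEPTH_TEST;
}

/* Small self-check that can be called from a test or from main(). */
void sketch_enum_self_check(void)
{
    assert(!sketch_is_capability(0xb72));
    assert(sketch_is_capability(SKETCH_GL_DEPTH_TEST));
}
```

If the application really does call glEnable with this value, the messages are application errors that Mesa reports correctly, which would make them a separate matter from the driver crash.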
Created attachment 35503
Mesa-7.8.1-salome.patch

I suggested applying this patch in Mandriva, based on the backtrace Pzanoni reported, but any feedback about it is welcome.

The patch corrects the problem, and at least on the computer I tested, it did not cause problems with multiple outputs either.
I will check on Monday whether opencascade sources are available for the prebuilt salome binaries (I asked in the salome-platform forum for some hints just after I got a working package, but have had no responses so far: http://www.salome-platform.org/forum/forum_12/459079973 ). I am almost sure the prebuilt binaries use OSMesa, as the problem doesn't happen there, but they are built for a much older Mandriva release and require installing some old packages/libraries to get them working in Mandriva cooker.

Changing priority because the intel driver is very common, and the current version makes the salome package almost unusable (almost, because only the geometry module has the problem; but I understand it is not a "common" package, nor does it have many users...).
Sounds like a DRI driver bug...
BTW, the attached patch was applied to mandriva packages, and the related mandriva bug marked as resolved/fixed. https://qa.mandriva.com/show_bug.cgi?id=59084
A crash in intel_region_buffer can still be obtained with the latest DRI driver as of 2010-08-16 (post glsl2 merge) through kwin-4.5.0 when full-screen OpenGL applications (e.g. FlightGear's fgfs) exit. I got this backtrace:

[KCrash Handler]
#6 intel_region_buffer (intel=0x2dabaf0, region=0x0, flag=2) at intel_regions.c:505
#7 0x00007fa97ad44ab9 in intelClearWithBlit (ctx=0x2dabaf0, mask=2) at intel_blit.c:266
#8 0x00007fa97ad46c3a in intelClear (ctx=0x2dabaf0, mask=<value optimized out>) at intel_clear.c:173
#9 0x00007fa9918a2bb5 in KWin::SceneOpenGL::paintBackground (this=<value optimized out>, region=<value optimized out>)
    at /usr/src/debug/kde-base/kwin-4.5.0/kwin-4.5.0/kwin/scene_opengl.cpp:892
#10 0x00007fa99189a5ce in KWin::Scene::paintGenericScreen (this=0x21f1170, orig_mask=32)
    at /usr/src/debug/kde-base/kwin-4.5.0/kwin-4.5.0/kwin/scene.cpp:187
#11 0x00007fa9918990ca in KWin::Scene::finalPaintScreen (this=0x21f1170, mask=32, region=<value optimized out>, data=<value optimized out>)
    at /usr/src/debug/kde-base/kwin-4.5.0/kwin-4.5.0/kwin/scene.cpp:177
#12 0x00007fa9918afd6f in KWin::EffectsHandlerImpl::paintScreen (this=<value optimized out>, mask=32, region=<value optimized out>, data=...)
Is this still present in master?
(In reply to comment #8)
> Is this still present in master?

Sorry for the delay. I was waiting for the xorg/mesa packages to be updated, but they are not git master; I have not been following Xorg/Mesa closely for some time.

The problem still happens with these packages:

$ rpm -q x11-server-xorg x11-driver-video-intel mesa libdrm2
x11-server-xorg-1.9.0.902-1mdv2011.0
x11-driver-video-intel-2.13.0-4mdv2011.0
mesa-7.9-1mdv2011.0
libdrm2-2.4.22-1mdv2011.0

[ I guess I will need to rewrite the patch at some point so that users with an intel card can use the salome package; actually, the last updates made me run my own computer with the vesa driver, as the ati driver broke for me, segv at startup... ]

After installing the -debug packages and attaching gdb to the X server, the backtrace is:

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
intel_region_buffer (intel=0xbf246f8, region=0x0, flag=2) at intel_regions.c:514
514 if (region->pbo) {
(gdb) bt
#0 intel_region_buffer (intel=0xbf246f8, region=0x0, flag=2) at intel_regions.c:514
#1 0xb6c0c95a in intelClearWithBlit (ctx=0xbf246f8, mask=3) at intel_blit.c:266
#2 0xb6c0ee68 in intelClear (ctx=0xbf246f8, mask=3) at intel_clear.c:173
#3 0xb6c87638 in _mesa_Clear (mask=<value optimized out>) at main/clear.c:179
#4 0xb7251037 in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#5 0xb727c659 in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#6 0xb727f2bf in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#7 0x0806f777 in ?? ()
#8 0x080625e5 in _start ()
(gdb) p region
$1 = (struct intel_region *) 0x0

If I do the pseudo patch:

--- /usr/bin/runSalome
#export LIBGL_ALWAYS_INDIRECT=true
+#export LIBGL_ALWAYS_INDIRECT=true

it shows a dialog message about a segmentation violation at address 0x30 (which happens to be the offset of the pbo field), and pressing OK just brings up another "Attempt to access null object" dialog. So the "solution" I found the first time would still require remaking my mesa 7.8.1 patch.
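The remark that 0x30 is the offset of the pbo field follows from how a field read through a NULL pointer is addressed: the access lands at base address 0 plus the field's offset. The sketch below shows that arithmetic with a made-up layout. struct sketch_region32 is hypothetical and was chosen only so that the offset comes out as 0x30; it is not the real struct intel_region, whose layout depends on the Mesa version and architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: twelve 4-byte fields ahead of `pbo`, chosen only so
 * that the offset is 0x30. The real struct intel_region differs. */
struct sketch_region32 {
    uint32_t earlier_fields[12];
    uint32_t pbo;
};

/* Address touched when reading `field_offset` bytes into an object
 * located at `base`. With base == 0 (a NULL pointer) the result is
 * simply the field offset, which is what the fault dialog reports. */
uintptr_t sketch_fault_address(uintptr_t base, size_t field_offset)
{
    return base + field_offset;
}

/* Small self-check that can be called from a test or from main(). */
void sketch_offset_self_check(void)
{
    size_t off = offsetof(struct sketch_region32, pbo);
    assert(off == 0x30);
    assert(sketch_fault_address(0, off) == 0x30);
}
```

The same reasoning works in reverse when debugging: a fault at a small nonzero address usually means a field of a NULL struct pointer was accessed, and comparing the address against offsetof values (or gdb's ptype /o output) can identify which field.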
Tested with a build of salome version 5.1.4 and it no longer crashes when using LIBGL_ALWAYS_INDIRECT. Only other difference is x11-server 1.9.2. But it still causes the segvs if not setting LIBGL_ALWAYS_INDIRECT. Also, the image in the opencascade viewer most times "keeps jumping", like when attempting to make a selection.
I'm pretty sure this is fixed by:

commit 94ed481131e4f5ba2c83fe7f3b12715661660133
Author: Eric Anholt <eric@anholt.net>
Date: Sun Jan 2 17:04:57 2011 -0800

    intel: Handle forced swrast clears before other clear bits.

    Fixes a potential segfault on a non-native depthbuffer, and possible
    accidental swrast fallback on extra color buffers.

I couldn't get the app installed to try it myself.
(In reply to comment #11)
> I'm pretty sure this is fixed by:
>
> commit 94ed481131e4f5ba2c83fe7f3b12715661660133
> Author: Eric Anholt <eric@anholt.net>
> Date: Sun Jan 2 17:04:57 2011 -0800
>
> intel: Handle forced swrast clears before other clear bits.
>
> Fixes a potential segfault on a non-native depthbuffer, and possible
> accidental swrast fallback on extra color buffers.
>
> I couldn't get the app installed to try it myself.

Thanks. Looking at the patch, it appears that it will correct the issue. In that case, I will probably remove or comment out the "export LIBGL_ALWAYS_INDIRECT=true" line in the script. I will test it as soon as possible; updated Mandriva mesa packages should be available soon.

To test the application you would probably want to use Mandriva Cooker, as I don't know if any other distro packages salome, and afaik the binaries from www.salome-platform.org have some patches to render to an offscreen pixmap, or something related. The sources are available, but you need to register a free account to download them, and I only took a quick look at them some months ago (there are plenty of other things to look at when packaging such a large package :-)...
(In reply to comment #11)
> I'm pretty sure this is fixed by:
>
> commit 94ed481131e4f5ba2c83fe7f3b12715661660133
> Author: Eric Anholt <eric@anholt.net>
> Date: Sun Jan 2 17:04:57 2011 -0800
>
> intel: Handle forced swrast clears before other clear bits.
>
> Fixes a potential segfault on a non-native depthbuffer, and possible
> accidental swrast fallback on extra color buffers.
>
> I couldn't get the app installed to try it myself.

I tested with:

$ rpm -q x11-server-xorg libdri-drivers mesa
x11-server-xorg-1.9.3-3mdv2011.0.i586
libdri-drivers-7.10-1-mdv2011.0.i586
mesa-7.10-1-mdv2011.0.i586

(the mesa and dri rpms are experimental rpms from pzanoni)

It would always fail with a weird "Unknwon Exception" dialog message, and attaching gdb to all salome related processes would not help. So it was failing with some error code somewhere that would not trigger a signal.

But when doing some minor testing with driconf options, enabling "Enable flush batchbuffer after each draw call" crashes the X Server, with the backtrace:

Program received signal SIGSEGV, Segmentation fault.
intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2)
    at intel_regions.c:514
514 if (region->pbo) {
(gdb) bt
#0 intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2)
    at intel_regions.c:514
#1 0xb6b383fa in intelClearWithBlit (ctx=0xbb363d0, mask=3)
    at intel_blit.c:262
#2 0xb6b3ad5b in intelClear (ctx=0xbb363d0, mask=3) at intel_clear.c:174
#3 0xb6d4e2f8 in _mesa_Clear (mask=<value optimized out>) at main/clear.c:241
#4 0xb71c9fe7 in __glXDisp_Clear (pc=0xbb15fc4 "") at indirect_dispatch.c:1335
#5 0xb71f5609 in __glXDisp_Render (cl=0xbb0c1e8, pc=<value optimized out>)
    at glxcmds.c:1847
#6 0xb71f826f in __glXDispatch (client=0xbb0c110) at glxext.c:600
#7 0x0806f777 in Dispatch () at dispatch.c:432
#8 0x080625e5 in main (argc=8, argv=0xbf8f4234, envp=0xbf8f4258) at main.c:291

Weirdly enough, removing ~/.drirc or changing the driconf option back does not return it to the state of not crashing the X Server; rebooting, powering down, etc. does not revert it either (I guess it may need powering down and leaving it off for a significant time...).

I tested Mesa 7.10 on an x86_64 machine with an ati card and the salome package works.

If I remake the Mesa-7.8.1-salome.patch (to apply on 7.10) and rebuild the package, it will work again. But I need to set "export LIBGL_ALWAYS_INDIRECT=true" or it will fail with SIGFPEs when loading a sample file/project.
(In reply to comment #13)
> Program received signal SIGSEGV, Segmentation fault.
> intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2)
>     at intel_regions.c:514
> 514 if (region->pbo) {
> (gdb) bt
> #0 intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2)
>     at intel_regions.c:514
> #1 0xb6b383fa in intelClearWithBlit (ctx=0xbb363d0, mask=3)
>     at intel_blit.c:262
> #2 0xb6b3ad5b in intelClear (ctx=0xbb363d0, mask=3) at intel_clear.c:174
> #3 0xb6d4e2f8 in _mesa_Clear (mask=<value optimized out>) at main/clear.c:241
> #4 0xb71c9fe7 in __glXDisp_Clear (pc=0xbb15fc4 "") at indirect_dispatch.c:1335
> #5 0xb71f5609 in __glXDisp_Render (cl=0xbb0c1e8, pc=<value optimized out>)
>     at glxcmds.c:1847
> #6 0xb71f826f in __glXDispatch (client=0xbb0c110) at glxext.c:600
> #7 0x0806f777 in Dispatch () at dispatch.c:432
> #8 0x080625e5 in main (argc=8, argv=0xbf8f4234, envp=0xbf8f4258) at main.c:291
>

I was able to see this backtrace on a SandyBridge, mesa 7.10, when playing extremetuxracer. Maybe the driver developers will find it easier to debug with etracer. Please see bug #33422
The code appearing in the last backtrace is gone -- do you still have problems? Also, seriously, stop setting LIBGL_ALWAYS_INDIRECT. We don't handle bugs when you do that.
(In reply to comment #15)
> The code appearing in the last backtrace is gone -- do you still have
> problems?

I do not know whether, or how soon, I will be able to rebuild the dependencies to have it working again. It should be ok to close the bug now.

> Also, seriously, stop setting LIBGL_ALWAYS_INDIRECT. We don't handle bugs
> when you do that.

At that time, I used it as a workaround to just display a dialog message about a segmentation fault at address 0x30, as described in #c9, instead of crashing the X Server.