Summary: | Problem with libopencascade - salome-platform | ||
---|---|---|---|
Product: | Mesa | Reporter: | Paulo César Pereira de Andrade <pcpa> |
Component: | Drivers/DRI/i965 | Assignee: | Eric Anholt <eric> |
Status: | RESOLVED MOVED | QA Contact: | |
Severity: | critical | ||
Priority: | medium | CC: | geromanas, przanoni |
Version: | git | ||
Hardware: | x86 (IA32) | ||
OS: | Linux (All) | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=30509 | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: | Mesa-7.8.1-salome.patch |
Description
Paulo César Pereira de Andrade
2010-03-26 12:39:49 UTC
I tried to reproduce the bug with the "default" stuff in the distro (not forcing anything, but with runSalome still doing LIBGL_ALWAYS_INDIRECT=true). The distro is using mesa 7.8.1, x11-server 1.7.7, drm 2.4.20, kernel 2.6.33, intel 2.11.0.

This is the backtrace I got:

#0 intel_region_buffer (intel=0x2403ee0, region=0x0, flag=2) at intel_regions.c:498
#1 0x00007fbb6301ba56 in intelClearWithBlit (ctx=0x2403ee0, mask=3) at intel_blit.c:268
#2 0x00007fbb6301434a in intelClear (ctx=0x2403ee0, mask=<value optimized out>) at intel_clear.c:169
#3 0x00007fbb6456b293 in __glXDisp_Render (cl=<value optimized out>, pc=0x20367b8 "\b") at glxcmds.c:1823
#4 0x00007fbb6456f4be in __glXDispatch (client=0x20363e0) at glxext.c:578
#5 0x000000000043886c in Dispatch () at dispatch.c:439
#6 0x000000000042271c in main (argc=1, argv=0x7d9408, envp=<value optimized out>) at main.c:285

The segfault happens because at intel_regions.c:498 region=0x0 and we have this:

if (region->pbo) {

Just for clarification: I asked Paulo Zanoni to look at it because my previous correction/workaround of setting LIBGL_ALWAYS_INDIRECT=true to get a working package now causes the X Server to crash...

Without LIBGL_ALWAYS_INDIRECT set, it now appears to behave very much like the ati driver, that is, it always triggers a SIGFPE, but in the client code, which handles it... And the root cause appears to be the same; at least the same mesa debug output is visible:

Mesa: User error: GL_INVALID_ENUM in glIsEnabled(0xb72)
Mesa: User error: GL_INVALID_ENUM in glEnable(0xb72)

https://bugs.freedesktop.org/show_bug.cgi?id=27332 describes the problem with the ati driver.

(changing Importance to highest because now the package is unusable with the intel driver)

Created attachment 35503: Mesa-7.8.1-salome.patch

I suggested applying this patch in Mandriva, based on the backtrace Pzanoni reported, but any feedback about it is welcome.
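For readers following the backtrace above, here is a minimal sketch of the failure mode and of one defensive way to avoid it. It is not the attached patch and not Mesa code: `struct region_sketch`, `region_buffer_checked` and `clear_with_blit_checked` are hypothetical names, and only the `pbo` field is taken from the report. The point is simply that a NULL region has to be rejected before `region->pbo` is read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the driver's region type.
 * Only the "pbo" field name comes from the bug report. */
struct region_sketch {
    void *pbo;     /* non-NULL when the region is backed by a PBO */
    void *buffer;  /* backing buffer object */
};

/* Returns the buffer to blit into, or NULL when no region is attached.
 * The crashing code read region->pbo without the first check. */
static void *region_buffer_checked(struct region_sketch *region)
{
    if (region == NULL)
        return NULL;
    if (region->pbo != NULL) {
        /* A real driver would detach the PBO here; omitted in this sketch. */
        region->pbo = NULL;
    }
    return region->buffer;
}

/* Returns true if the blit clear would run, false if it was skipped. */
static bool clear_with_blit_checked(struct region_sketch *region)
{
    void *buf = region_buffer_checked(region);
    if (buf == NULL) {
        fprintf(stderr, "clear: no region attached, skipping blit\n");
        return false;
    }
    /* ... the blit itself would be emitted here ... */
    return true;
}
```

Skipping the blit only hides the symptom; whether a buffer without a region should have reached the blit path at all is a separate question for the driver.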
The patch corrects the problem, and at least on the computer I tested, it did not have problems with multiple outputs either.

I will check Monday if there are opencascade sources available for the prebuilt salome binaries (I asked in the salome-platform forum for some hints just after I got a working package, but got no responses so far... http://www.salome-platform.org/forum/forum_12/459079973), but I am almost sure it uses OSMesa, as the problem doesn't happen there. However, the binaries are for a much older Mandriva distro, and require installing some old packages/libraries to get them working in Mandriva cooker.

Changing priority because the intel driver is very common, but the current version makes the salome package almost unusable (almost, because only the geometry module has the problem; but I understand it is not a "common" package, nor does it have many users...)

Sounds like a DRI driver bug...

BTW, the attached patch was applied to mandriva packages, and the related mandriva bug marked as resolved/fixed. https://qa.mandriva.com/show_bug.cgi?id=59084

A crash in intel_region_buffer can still be obtained with the latest DRI driver as of 2010-08-16 (post glsl2 merge) through kwin-4.5.0 when full-screen OpenGL applications (e.g. FlightGear's fgfs) exit.
I got this backtrace:

[KCrash Handler]
#6 intel_region_buffer (intel=0x2dabaf0, region=0x0, flag=2) at intel_regions.c:505
#7 0x00007fa97ad44ab9 in intelClearWithBlit (ctx=0x2dabaf0, mask=2) at intel_blit.c:266
#8 0x00007fa97ad46c3a in intelClear (ctx=0x2dabaf0, mask=<value optimized out>) at intel_clear.c:173
#9 0x00007fa9918a2bb5 in KWin::SceneOpenGL::paintBackground (this=<value optimized out>, region=<value optimized out>) at /usr/src/debug/kde-base/kwin-4.5.0/kwin-4.5.0/kwin/scene_opengl.cpp:892
#10 0x00007fa99189a5ce in KWin::Scene::paintGenericScreen (this=0x21f1170, orig_mask=32) at /usr/src/debug/kde-base/kwin-4.5.0/kwin-4.5.0/kwin/scene.cpp:187
#11 0x00007fa9918990ca in KWin::Scene::finalPaintScreen (this=0x21f1170, mask=32, region=<value optimized out>, data=<value optimized out>) at /usr/src/debug/kde-base/kwin-4.5.0/kwin-4.5.0/kwin/scene.cpp:177
#12 0x00007fa9918afd6f in KWin::EffectsHandlerImpl::paintScreen (this=<value optimized out>, mask=32, region=<value optimized out>, data=...)

Is this still present in master?

(In reply to comment #8)
> Is this still present in master?

Sorry for the delay, I was waiting for xorg/mesa packages to be updated, but not git master, sorry I am not following closely Xorg/Mesa for some time.

The problem still happens, with these packages:

$ rpm -q x11-server-xorg x11-driver-video-intel mesa libdrm2
x11-server-xorg-1.9.0.902-1mdv2011.0
x11-driver-video-intel-2.13.0-4mdv2011.0
mesa-7.9-1mdv2011.0
libdrm2-2.4.22-1mdv2011.0

[ guess I will need to rewrite the patch at some point so that users with an intel card can use the salome package; actually, last updates made me need to run my own computer with the vesa driver as the ati driver broke for me, segv at startup...]

After installing the -debug packages, and attaching gdb to the Xserver, the backtrace is:

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
intel_region_buffer (intel=0xbf246f8, region=0x0, flag=2) at intel_regions.c:514
514 if (region->pbo) {
(gdb) bt
#0 intel_region_buffer (intel=0xbf246f8, region=0x0, flag=2) at intel_regions.c:514
#1 0xb6c0c95a in intelClearWithBlit (ctx=0xbf246f8, mask=3) at intel_blit.c:266
#2 0xb6c0ee68 in intelClear (ctx=0xbf246f8, mask=3) at intel_clear.c:173
#3 0xb6c87638 in _mesa_Clear (mask=<value optimized out>) at main/clear.c:179
#4 0xb7251037 in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#5 0xb727c659 in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#6 0xb727f2bf in ?? () from /usr/lib/xorg/modules/extensions/libglx.so
#7 0x0806f777 in ?? ()
#8 0x080625e5 in _start ()
(gdb) p region
$1 = (struct intel_region *) 0x0

If I do the pseudo patch:

--- /usr/bin/runSalome
#export LIBGL_ALWAYS_INDIRECT=true
+#export LIBGL_ALWAYS_INDIRECT=true

it shows a dialog message about segmentation violation at address 0x30 (that happens to be the offset of the pbo field), and pressing ok just keeps showing another "Attempt to access null object" dialog, so, the "solution" I had found the first time still would need remaking my mesa 7.8.1 patch.

Tested with a build of salome version 5.1.4 and it no longer crashes when using LIBGL_ALWAYS_INDIRECT. Only other difference is x11-server 1.9.2. But it still causes the segvs if not setting LIBGL_ALWAYS_INDIRECT. Also, the image in the opencascade viewer most times "keeps jumping", like when attempting to make a selection.

I'm pretty sure this is fixed by:

commit 94ed481131e4f5ba2c83fe7f3b12715661660133
Author: Eric Anholt <eric@anholt.net>
Date: Sun Jan 2 17:04:57 2011 -0800

intel: Handle forced swrast clears before other clear bits.

Fixes a potential segfault on a non-native depthbuffer, and possible accidental swrast fallback on extra color buffers.

I couldn't get the app installed to try it myself.
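The commit itself is not reproduced in this report, so the following is only a guess at the kind of ordering its subject line describes, not the actual change. In this sketch the bit values, `struct clear_plan` and `plan_clear` are all hypothetical: buffers that have no native region are moved into a software-clear mask first, so the blit path only ever sees buffers it can actually dereference.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer bits; these are not Mesa's real BUFFER_BIT_* values. */
enum {
    BUF_FRONT = 1u << 0,
    BUF_BACK  = 1u << 1,
    BUF_DEPTH = 1u << 2,
    BUF_COUNT = 3
};

/* Result of splitting a clear request between the two paths. */
struct clear_plan {
    unsigned swrast_mask;  /* buffers cleared by the software fallback */
    unsigned blit_mask;    /* buffers safe to clear with the blitter */
};

/* Pull out every requested buffer that lacks a native region BEFORE
 * deciding what goes to the blitter, so the blit path never receives
 * a buffer whose region pointer would be NULL. */
static struct clear_plan plan_clear(unsigned mask,
                                    const bool has_native_region[BUF_COUNT])
{
    struct clear_plan plan = { 0u, 0u };

    for (unsigned i = 0; i < BUF_COUNT; i++) {
        unsigned bit = 1u << i;
        if ((mask & bit) != 0u && !has_native_region[i]) {
            plan.swrast_mask |= bit;
            mask &= ~bit;
        }
    }
    plan.blit_mask = mask;  /* whatever remains has a region */
    return plan;
}
```

For example, requesting `BUF_FRONT | BUF_BACK` when only the front buffer has a native region leaves `BUF_FRONT` in `blit_mask` and moves `BUF_BACK` to `swrast_mask`.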
(In reply to comment #11)
> I'm pretty sure this is fixed by:
>
> commit 94ed481131e4f5ba2c83fe7f3b12715661660133
> Author: Eric Anholt <eric@anholt.net>
> Date: Sun Jan 2 17:04:57 2011 -0800
>
> intel: Handle forced swrast clears before other clear bits.
>
> Fixes a potential segfault on a non-native depthbuffer, and possible
> accidental swrast fallback on extra color buffers.
>
> I couldn't get the app installed to try it myself.

Thanks. Looking at the patch, it appears that it will correct the issue. In that case, I will probably remove/comment the "export LIBGL_ALWAYS_INDIRECT=true" from the script. I will test it as soon as possible. Mandriva updated mesa packages should be available soon.

To test the application you probably would want to test on Mandriva Cooker, as I don't know if any other distro packages salome, and afaik, the binaries from www.salome-platform.org have some patches to render to an offscreen pixmap, or something related; the sources are available but you need to register a free account to download them, and I only had a quick look at them some months ago (there are plenty of other things to look at when packaging such a large package :-)...

(In reply to comment #11)
> I'm pretty sure this is fixed by:
>
> commit 94ed481131e4f5ba2c83fe7f3b12715661660133
> Author: Eric Anholt <eric@anholt.net>
> Date: Sun Jan 2 17:04:57 2011 -0800
>
> intel: Handle forced swrast clears before other clear bits.
>
> Fixes a potential segfault on a non-native depthbuffer, and possible
> accidental swrast fallback on extra color buffers.
>
> I couldn't get the app installed to try it myself.

I tested with:

$ rpm -q x11-server-xorg libdri-drivers mesa
x11-server-xorg-1.9.3-3mdv2011.0.i586
libdri-drivers-7.10-1-mdv2011.0.i586
mesa-7.10-1-mdv2011.0.i586

(the mesa and dri rpms are experimental rpms from pzanoni)

And it would always fail with a weird "Unknwon Exception" dialog message, and attaching gdb to all salome related processes would not help.
So, it was failing with some error code somewhere, that would not trigger a signal. But, when doing some minor testing with driconf options, if enabling "Enable flush batchbuffer after each draw call" it crashes the X Server, with the backtrace:

Program received signal SIGSEGV, Segmentation fault.
intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2) at intel_regions.c:514
514 if (region->pbo) {
(gdb) bt
#0 intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2) at intel_regions.c:514
#1 0xb6b383fa in intelClearWithBlit (ctx=0xbb363d0, mask=3) at intel_blit.c:262
#2 0xb6b3ad5b in intelClear (ctx=0xbb363d0, mask=3) at intel_clear.c:174
#3 0xb6d4e2f8 in _mesa_Clear (mask=<value optimized out>) at main/clear.c:241
#4 0xb71c9fe7 in __glXDisp_Clear (pc=0xbb15fc4 "") at indirect_dispatch.c:1335
#5 0xb71f5609 in __glXDisp_Render (cl=0xbb0c1e8, pc=<value optimized out>) at glxcmds.c:1847
#6 0xb71f826f in __glXDispatch (client=0xbb0c110) at glxext.c:600
#7 0x0806f777 in Dispatch () at dispatch.c:432
#8 0x080625e5 in main (argc=8, argv=0xbf8f4234, envp=0xbf8f4258) at main.c:291

Weirdly enough, removing ~/.drirc or changing back the driconf option does not revert it to the state of not crashing the X Server; rebooting, powering down, etc. does not revert it either (guess it may need powering down and leaving it so for a significant time...).

I tested Mesa 7.10 on an x86_64 with an ati card and the salome package works.

If remaking the Mesa-7.8.1-salome.patch (to apply on 7.10) and rebuilding the package, it will work again. But I need to set "export LIBGL_ALWAYS_INDIRECT=true" or it will fail with SIGFPEs when loading a sample file/project.

(In reply to comment #13)
> Program received signal SIGSEGV, Segmentation fault.
> intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2)
> at intel_regions.c:514
> 514 if (region->pbo) {
> (gdb) bt
> #0 intel_region_buffer (intel=0xbb363d0, region=0x0, flag=2)
> at intel_regions.c:514
> #1 0xb6b383fa in intelClearWithBlit (ctx=0xbb363d0, mask=3)
> at intel_blit.c:262
> #2 0xb6b3ad5b in intelClear (ctx=0xbb363d0, mask=3) at intel_clear.c:174
> #3 0xb6d4e2f8 in _mesa_Clear (mask=<value optimized out>) at main/clear.c:241
> #4 0xb71c9fe7 in __glXDisp_Clear (pc=0xbb15fc4 "") at indirect_dispatch.c:1335
> #5 0xb71f5609 in __glXDisp_Render (cl=0xbb0c1e8, pc=<value optimized out>)
> at glxcmds.c:1847
> #6 0xb71f826f in __glXDispatch (client=0xbb0c110) at glxext.c:600
> #7 0x0806f777 in Dispatch () at dispatch.c:432
> #8 0x080625e5 in main (argc=8, argv=0xbf8f4234, envp=0xbf8f4258) at main.c:291
>

I was able to see this backtrace on a SandyBridge, mesa 7.10, when playing extremetuxracer. Maybe the driver developers will find it easier to debug with etracer. Please see bug #33422

The code appearing in the last backtrace is gone -- do you still have problems?

Also, seriously, stop setting LIBGL_ALWAYS_INDIRECT. We don't handle bugs when you do that.

(In reply to comment #15)
> The code appearing in the last backtrace is gone -- do you still have
> problems?

I do not know if I will be able to rebuild the dependencies to have it working again, or how long that would take. It should be OK to close the bug now.

> Also, seriously, stop setting LIBGL_ALWAYS_INDIRECT. We don't handle bugs
> when you do that.

At that time, I used it as a workaround to just display a dialog message about a segmentation fault at address 0x30 as described in #c9, instead of crashing the X Server.
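As a side note on the "address 0x30" observation: reading a field through a NULL struct pointer faults at an address equal to that field's offset within the struct. The sketch below illustrates this with a hypothetical layout. `struct layout_sketch` is not the real intel_region definition, and its padding is chosen purely so that `pbo` lands at 0x30; the real layout is not shown in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 12 x 4-byte words (48 = 0x30 bytes) ahead of pbo.
 * The real struct intel_region fields are not given in this report. */
struct layout_sketch {
    uint32_t header[12];
    void *pbo;
};

/* Address the CPU touches when a field at `field_offset` is read
 * through a NULL base pointer: 0 + field_offset. */
static uintptr_t null_field_address(size_t field_offset)
{
    const uintptr_t null_base = 0;
    return null_base + (uintptr_t)field_offset;
}
```

With this layout, `null_field_address(offsetof(struct layout_sketch, pbo))` evaluates to 0x30, which is how a small non-zero fault address can point back at a NULL pointer plus a field offset.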