Bug 29251

Summary: xorg-server-1.9 randomly crashes with segmentation fault
Product: xorg
Reporter: Alexander Sidorov <alex.n.sidorov>
Component: Server/General
Assignee: Kristian Høgsberg <krh>
Status: RESOLVED MOVED
QA Contact: Xorg Project Team <xorg-team>
Severity: major
Priority: high
CC: alex.n.sidorov, denis.kot, dlaroche7, Hugo.Mildenberger, jeremyhu, matthieu.herrb, tnnn
Version: 7.4 (2008.09)
Keywords: regression
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard: 2012BRB_Reviewed
i915 platform:
i915 features:
Bug Depends on:
Bug Blocks: 44202
Attachments:
Description | Flags
Xorg.0.log.old | none
cat of dmesg command | none
emerge --info command | none
Xorg.0.log | none
xorg config | none
2010-12-17 emerge --info | none
versions of packages | none
X.org config file | none
uname -a command responce | none
Xorg.0.og.old with error message and backtrace | none
Xorg.0.log with X server compiled with C flags "-march=native -pipe -O0 -ggdb3" | none

Description Alexander Sidorov 2010-07-25 15:13:24 UTC
Created attachment 37383 [details]
Xorg.0.log.old

The xorg-server randomly crashes with a segmentation fault. I can usually reproduce the bug with these steps: 1. open Chromium or Opera; 2. type something in the search field. When the history list drops down, the X server crashes. This method works every time. The X server also crashes at random for unknown reasons; for example, when I seek within a track in Amarok, or just look at the monitor and drag the mouse up and down :) In those cases the X server may or may not crash, but with the Chromium or Opera steps described above it always crashes.

When I check the Xorg log, I see these lines:

Backtrace:
0: /usr/bin/X (xorg_backtrace+0x35) [0x486efd]
1: /usr/bin/X (0x400000+0x55e61) [0x455e61]
2: /lib/libpthread.so.0 (0x7fa4a5582000+0xf120) [0x7fa4a5591120]
3: /usr/bin/X (dixLookupPrivate+0xa) [0x440bf2]
4: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fa4a2dcb000+0x34982)
5: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fa4a2dcb000+0x2d7ed)
6: /usr/bin/X (FreeResource+0xae) [0x4431fe]
7: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fa4a2dcb000+0x2b29b)
8: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fa4a2dcb000+0x2d607)
9: /usr/bin/X (0x400000+0x2d4a3) [0x42d4a3]
10: /usr/bin/X (0x400000+0x24d1e) [0x424d1e]
11: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7fa4a3c82bbd]
12: /usr/bin/X (0x400000+0x247f9) [0x4247f9]
  Segmentation fault at address 0x290
Fatal server error:
  Caught signal 11 (Segmentation fault). Server aborting

I run Gentoo x86_64, and I build xorg-server with CFLAGS = CXXFLAGS = "-Os -march=core2 -pipe"
More information about my system, plus the logs, is in the attached files.
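Editorial note: the unsymbolized frames in the backtrace above have the form "module (base+offset) [absolute address]", where the base is either a load address or a symbol name. The following is a minimal Python sketch of how such a line can be taken apart; `parse_frame` and `FRAME_RE` are hypothetical names introduced for illustration, not part of the X server or any tool mentioned in this report.

```python
import re

# Matches frames such as:
#   1: /usr/bin/X (0x400000+0x55e61) [0x455e61]
#   3: /usr/bin/X (dixLookupPrivate+0xa) [0x440bf2]
#   [   185.850] 1: /usr/bin/X (0x400000+0x62849) [0x462849]
FRAME_RE = re.compile(
    r"^\s*(?:\[\s*[\d.]+\]\s*)?"                 # optional log timestamp
    r"(?P<index>\d+):\s+(?P<module>\S+)\s+"      # frame number and module path
    r"\((?P<base>[^+()]+)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"(?:\s+\[(?P<addr>0x[0-9a-fA-F]+)\])?\s*$"  # absolute address, if printed
)


def parse_frame(line):
    """Split one backtrace line into its parts, or return None if it does not match."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    base = m.group("base")
    is_load_base = base.startswith("0x")
    addr = m.group("addr")
    return {
        "index": int(m.group("index")),
        "module": m.group("module"),
        # A non-hex base is a symbol name; the offset is then relative to it.
        "symbol": None if is_load_base else base,
        "load_base": int(base, 16) if is_load_base else None,
        "offset": int(m.group("offset"), 16),
        "address": int(addr, 16) if addr else None,
    }
```

For frame 1 above, load base plus offset equals the bracketed address (0x400000 + 0x55e61 = 0x455e61). Frames 4, 5, 7 and 8 carry no bracketed address in this log, so only the module-relative offset is available for them.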
Comment 1 Alexander Sidorov 2010-07-25 15:16:55 UTC
Created attachment 37384 [details]
cat of dmesg command
Comment 2 Alexander Sidorov 2010-07-25 15:17:20 UTC
Created attachment 37385 [details]
emerge --info command
Comment 3 Alexander Sidorov 2010-07-25 15:18:19 UTC
Created attachment 37386 [details]
Xorg.0.log
Comment 4 Luis 2010-07-26 20:49:31 UTC
I tested all other versions released in Gentoo after 1.8.0, and all of them crash the same way on my hardware. The last stable version for me was 1.8.0.

Turning off compositing solves the problem (in KDE: alt+shift+F12). In addition, it crashes when I start Firefox, but sometimes it crashes only after Firefox has started, when I put my cursor over a button that shows a help/info message.

Using the radeonhd or ati (open source) driver gives the same crash and error message in Xorg.log. I'm running Linux 2.6.34 vanilla sources from Gentoo on an AMD64 laptop.

My uname -a:
Linux XXXX 2.6.34 #1 SMP Mon May 17 20:56:09 CDT 2010 x86_64 AMD Turion(tm) 64 X2 Mobile Technology TL-60 AuthenticAMD GNU/Linux


My video chip is ATI Radeon X 1200. My lspci -v says:
01:05.0 VGA compatible controller: ATI Technologies Inc RS690M [Radeon X1200 Series] (prog-if 00 [VGA controller])

GCC version: gcc (Gentoo 4.3.4 p1.1, pie-10.1.5) 4.3.4

My CFLAGS= CXXFLAGS = "-O2 -march=k8 -msse3 -pipe"

Not compiled with kdrive option.
Comment 5 Alexander Sidorov 2010-08-05 16:14:51 UTC
This bug occurs on xorg-server-1.8.2 and xorg-server-1.8.99.905

I have:
* 01:00.0 VGA compatible controller: ATI Technologies Inc M92 LP [Mobility Radeon HD 4300 Series]
* Linux bio 2.6.34-gentoo-r2 #1 SMP Sun Jul 25 02:36:58 EEST 2010 x86_64 Intel(R) Core(TM)2 Duo CPU T5870 @ 2.00GHz GenuineIntel GNU/Linux

* x11-base/xorg-x11-7.4-r1                                       
* x11-drivers/xf86-video-ati-6.13.1
* media-libs/mesa-7.8.2
* x11-libs/libdrm-2.4.21-r1
Comment 6 Alexander Sidorov 2010-08-05 16:19:58 UTC
UPD: the whole system is built with:

app-shells/bash:      4.1_p7
dev-java/java-config: 2.1.11
dev-lang/python:      2.6.5-r3, 3.1.2-r4
dev-util/cmake:       2.8.1-r2
sys-apps/baselayout:  2.0.1
sys-apps/openrc:      0.6.1-r1
sys-apps/sandbox:     2.2
sys-devel/autoconf:   2.65-r1
sys-devel/automake:   1.7.9-r2, 1.8.5-r4, 1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:   2.20.1-r1
sys-devel/gcc:        4.4.4-r1
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:    2.2.10
virtual/os-headers:   2.6.34

CHOST="x86_64-pc-linux-gnu"
CFLAGS="-Os -march=core2 -pipe"
CXXFLAGS="-Os -march=core2 -pipe"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
OPENGL_PROFILE="xorg-x11"
Comment 7 Alexander Sidorov 2010-08-05 16:22:34 UTC
Created attachment 37604 [details]
xorg config
Comment 8 Michel Dänzer 2010-08-06 01:04:24 UTC
(In reply to comment #4)
> I tested all other versions released in Gentoo after 1.8.0 and all had the same
> crash in my hardware. The last stable version to me was 1.8.0. 

Can you try and bisect which commit from Git server-1.8-branch introduced the problem?

(In reply to comment #5)
> This bug occurs on xorg-server-1.8.2 and xorg-server-1.8.99.905

Thus marking as a blocker for 1.9.
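Editorial note: the bisection requested above is a binary search over the commit history between a known-good and a known-bad build, which is what `git bisect` automates. A minimal Python sketch of the idea follows; `first_bad` and the `is_bad` callback are hypothetical names, and the sketch assumes that once a commit is bad, every later commit is bad too.

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit in an ordered list.

    Assumes commits[0] is known good and commits[-1] is known bad,
    and that badness never disappears again once it has appeared.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):   # e.g. build this commit and try to crash X
            bad = mid
        else:
            good = mid
    return bad
```

Each test halves the remaining range, so even a few hundred commits need only a handful of rebuilds.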
Comment 9 Alexander Sidorov 2010-08-06 02:45:11 UTC
(In reply to comment #8)
As far as I know, the last working release was xorg-server-1.8.0, so my guess is that it was this commit: http://cgit.freedesktop.org/xorg/xserver/commit/?h=server-1.8-branch&id=495cec794dad95ed0c79048f3c410ad23e7d5ea4

As far as I remember, there were xorg-server-1.8.0 and xorg-server-1.8.1 ebuilds in Gentoo repo, without any intermediate releases.
Comment 10 Hugo Mildenberger 2010-08-07 03:55:20 UTC
Probably another symptom of the glx related memory bug described by me in bug #28181, which persists on xorg-server-1.8.2 also on X86. You may also try to revert commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56 (and fix a single rejection manually) to see if these are really related.
Comment 11 Alexander Sidorov 2010-08-26 07:04:52 UTC
This bug also reproduces on the latest xorg-server-1.9 built from this source: http://xorg.freedesktop.org/releases/individual/xserver/xorg-server-1.9.0.tar.bz2
Comment 12 Luis 2010-09-16 23:20:52 UTC
(In reply to comment #10)
> Probably another symptom of the glx related memory bug described by me in bug
> #28181, which persists on xorg-server-1.8.2 also on X86. You may also try to
> revert commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56 (and fix a single
> rejection manually) to see if these are really related.

Yes, that fixed it! Apparently there are no more crashes due to that problem. What I did was download the 1.8 branch and revert just this patch. Now I can enjoy desktop effects with no crashes.
Thank you all, guys!
Comment 13 Luis 2010-09-16 23:32:05 UTC
Also, I'd like to add one more thing. When testing with the patch in place (and crashing), every time I logged into my account using KDE 4.4 I saw a brief flicker of an old full-screen image (usually the last image shown), even after reboots! Without the patch I see a blank (black) screen instead of an old one. That may indicate that some part of the video memory is being used without initialization or clearing. That's my guess.
Comment 14 Michel Dänzer 2010-09-17 00:14:47 UTC
Kristian, any ideas for fixing this? If not, maybe this should be reverted from the 1.8 and maybe 1.9 branch for now, though that would probably regress other stuff...
Comment 15 Jeremy Huddleston Sequoia 2010-09-17 11:49:04 UTC
Just quickly reviewing the patch itself, this looks wrong:

+    if (pDraw->id != glxDrawableId &&
+	!AddResource(pDraw->id, __glXDrawableRes, pGlxDraw)) {

shouldn't that be an *OR* rather than an *AND* ?
Comment 16 Jeremy Huddleston Sequoia 2010-09-17 11:51:35 UTC
ah wait, no... nevermind... looking at the code in full rather than the change, that hunk makes more sense... hmm...
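Editorial note: the difference between the *AND* and *OR* readings discussed above comes down to short-circuit evaluation. The following Python sketch models only the condition in the quoted hunk; `evaluate` and `add_resource` are hypothetical stand-ins, and what the body of the `if` does is not shown in the hunk and is not modelled here.

```python
def evaluate(ids_differ, add_succeeds, use_or=False):
    """Return (branch_taken, add_resource_called) for the hunk's condition."""
    calls = []

    def add_resource():
        # Stand-in for AddResource(); records that it was called.
        calls.append("AddResource")
        return add_succeeds

    if use_or:
        taken = ids_differ or (not add_resource())
    else:
        # As written in the patch: only add when the ids differ,
        # and take the branch only when that add fails.
        taken = ids_differ and (not add_resource())
    return bool(taken), bool(calls)
```

With *AND*, the resource is added only when the two ids differ and the branch is taken only when that add fails. With *OR*, differing ids would take the branch without ever calling the add, which matches the later conclusion that the hunk makes more sense as written.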
Comment 17 Kristian Høgsberg 2010-09-23 06:53:59 UTC
Does this help?

http://lists.x.org/archives/xorg-devel/2010-September/013349.html
Comment 18 Alexander Sidorov 2010-12-16 15:52:54 UTC
Hello again.

I'm using xorg-server-1.9 and KDE 4.5.4. Other system parameters are in the attached files.

This bug is still reproducible.

I notice that if I switch off all compositing effects in KDE, all the trouble disappears.

Kristian,
How can I apply this patch?
Comment 19 Alexander Sidorov 2010-12-16 15:54:29 UTC
Created attachment 41194 [details]
2010-12-17 emerge --info
Comment 20 Alexander Sidorov 2010-12-16 15:55:14 UTC
Created attachment 41195 [details]
versions of packages
Comment 21 Alexander Sidorov 2010-12-16 15:56:00 UTC
Created attachment 41196 [details]
X.org config file
Comment 22 Alexander Sidorov 2010-12-16 15:56:42 UTC
Created attachment 41197 [details]
uname -a command responce
Comment 23 Alexander Sidorov 2010-12-16 15:57:24 UTC
Created attachment 41198 [details]
Xorg.0.og.old with error message and backtrace
Comment 24 Alexander Sidorov 2010-12-16 16:04:13 UTC
FYI, this is related issue in KDE's bugzilla:
https://bugs.kde.org/show_bug.cgi?id=260334
Comment 25 Michel Dänzer 2010-12-17 01:56:31 UTC
This looks like bug 28181. Does http://lists.x.org/archives/xorg-devel/2010-December/016969.html help?
Comment 26 Michel Dänzer 2010-12-17 02:00:01 UTC
(In reply to comment #18)
> How can I apply this patch?

The patches are for the xserver repository. Please ask Gentoo people for help if you don't know how to apply patches to your packages.
Comment 27 Alexander Sidorov 2010-12-17 12:18:44 UTC
(In reply to comment #25)

Michel,
I've just rebuilt xorg-server with this patch, but it doesn't help me...
The trace and error are the same (see Xorg.0.log.old)
Comment 29 Michel Dänzer 2010-12-20 07:39:15 UTC
(In reply to comment #27)
> I've just rebuild xorg-server with this patch, but it doesn't help me...
> The trace and error are the same (see Xorg.0.log.old)

Hmm, looks like you may actually be triggering (at least) two separate crashes... The one in Xorg.0.log.old indeed looks more like bug 29792 than bug 28181.

Would be great if you could get more information about that crash with gdb.
Comment 30 Luis 2010-12-26 01:39:52 UTC
(In reply to comment #17)
> Does this help?
> 
> http://lists.x.org/archives/xorg-devel/2010-September/013349.html

Kristian, AFAIK your patch is included in the latest xserver version available in Gentoo Portage, 1.9.902; however, it doesn't fix the problem.

The crash message I get now is:
[   185.850] 0: /usr/bin/X (xorg_backtrace+0x28) [0x49f198]
[   185.850] 1: /usr/bin/X (0x400000+0x62849) [0x462849]
[   185.850] 2: /lib/libpthread.so.0 (0x7f12f8854000+0xf120) [0x7f12f8863120]
[   185.850] 3: /usr/lib64/xorg/modules/libexa.so (exaMoveInPixmap+0x2d) [0x7f12f4f2f19d]
[   185.850] 4: /usr/lib64/xorg/modules/drivers/radeon_drv.so (0x7f12f5997000+0x99a28) [0x7f12f5a30a28]
[   185.850] 5: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f12f6ae4000+0x3f31a) [0x7f12f6b2331a]
[   185.851] 6: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f12f6ae4000+0x36238) [0x7f12f6b1a238]
[   185.851] 7: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f12f6ae4000+0x368d2) [0x7f12f6b1a8d2]
[   185.851] 8: /usr/bin/X (0x400000+0x2eda9) [0x42eda9]
[   185.851] 9: /usr/bin/X (0x400000+0x2475a) [0x42475a]
[   185.851] 10: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7f12f7a48bbd]
[   185.851] 11: /usr/bin/X (0x400000+0x242f9) [0x4242f9]
[   185.851] Segmentation fault at address 0xe8
[   185.851] 
Fatal server error:
[   185.851] Caught signal 11 (Segmentation fault). Server aborting

Any ideas?
Comment 31 Luis 2010-12-26 01:45:24 UTC
(In reply to comment #25)
> This looks like bug 28181. Does
> http://lists.x.org/archives/xorg-devel/2010-December/016969.html help?

I applied this patch but it doesn't help here. I guess this is something else.
Comment 32 Luis 2010-12-27 22:16:41 UTC
However, I have to mention that this bug is not present with KMS on kernel 2.6.36-gentoo-r5.
Comment 33 D. Laroche 2011-03-01 02:41:18 UTC
*** Bug 34854 has been marked as a duplicate of this bug. ***
Comment 34 D. Laroche 2011-03-01 03:01:44 UTC
(In reply to comment #32)
> However I have to mention that this bug is not present in the KMS, kernel
> 2.6.36-gentoo-r5.

I cannot say for sure, but the bug probably appeared for me when I upgraded to kernel 2.6.37.
Comment 35 D. Laroche 2011-03-01 03:04:41 UTC
(In reply to comment #24)
> FYI, this is related issue in KDE's bugzilla:
> https://bugs.kde.org/show_bug.cgi?id=260334

The suggestion in that bug report to add the options nomodeset radeon.modeset=0 to the kernel command line worked for me. My notebook doesn't crash any more.
Comment 36 Hugo Mildenberger 2011-03-01 04:44:43 UTC
(In reply to comment #32)
> (In reply to comment #25)
> > This looks like bug 28181. Does
> > http://lists.x.org/archives/xorg-devel/2010-December/016969.html help?
> 
> I applied this this patch but it doesn't help here. I guess this is something
> else.

I suppose that applying e.g. the "Refcnt the drawable" patch attached to bug #32822 by Chris Wilson should cure the symptom here too. However, that patch in turn causes a quite large kernel memory hole each time an application is started. Depending on usage pattern, X crashes are at least delayed by hours or even days.

(This patch: https://bugs.freedesktop.org/attachment.cgi?id=41618)
Comment 37 D. Laroche 2011-03-01 07:03:22 UTC
(In reply to comment #35)
> (In reply to comment #24)
> > FYI, this is related issue in KDE's bugzilla:
> > https://bugs.kde.org/show_bug.cgi?id=260334
> 
> The suggestion in this bug report to add options nomodeset radeon.modeset=0 to
> the kernel worked for me. My notebook doesn't crash any more.

I spoke too fast... the X server is still crashing.
Comment 38 Jeremy Huddleston Sequoia 2011-04-11 14:17:41 UTC
Can you build your server with debugging symbols (CFLAGS="-O0 -ggdb3")?  That will provide a much more useful backtrace.
Comment 39 D. Laroche 2011-04-11 15:37:50 UTC
Ok, I first have to find how to do that. My distribution is Funtoo, and I
never had to pass special C flags to ebuild packages.

Thanks.

--
Denis Laroche

Comment 40 Jeremy Huddleston Sequoia 2011-04-11 22:05:59 UTC
If Funtoo is a Gentoo derivative, you should be able to edit /etc/make.conf and set your CFLAGS there.  Set it to something like:

CFLAGS="-O0 -ggdb3 -Wall -pipe"
#CFLAGS="-Os -ggdb3 -Wall -pipe"
CXXFLAGS="$CFLAGS"
OBJCFLAGS="$CFLAGS"

Then when you want to switch from debug to production CFLAGS, just change which line you have commented out.  You can also specify a "-march=___" value to produce code optimized for your CPU.  See this URL for more info on picking the correct value for -march.

http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html
Comment 41 D. Laroche 2011-04-17 12:54:15 UTC
Created attachment 45741 [details]
Xorg.0.log with X server compiled with C flags "-march=native -pipe -O0 -ggdb3"

Actually there's a way to specify the CFLAGS for a specific package; this is explained at http://sergiosdj.wordpress.com/2008/06/24/how-to-personalize-a-packages-cflags-in-gentoo/. 

I rebuilt the X server with C flags "-march=native -pipe -O0 -ggdb3". I let the server crash, and here's the new stack trace:

Backtrace:
[   446.998] 0: /usr/bin/X (xorg_backtrace+0x31) [0x47c2a9]
[   446.998] 1: /usr/bin/X (0x400000+0x82aeb) [0x482aeb]
[   446.998] 2: /lib/libpthread.so.0 (0x7f55bc9d5000+0xf120) [0x7f55bc9e4120]
[   446.998] 3: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4c7f2) [0x7f55ba3de7f2]
[   446.998] 4: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4c850) [0x7f55ba3de850]
[   446.998] 5: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4c8f2) [0x7f55ba3de8f2]
[   446.998] 6: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4c976) [0x7f55ba3de976]
[   446.998] 7: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x586e0) [0x7f55ba3ea6e0]
[   446.998] 8: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4ad12) [0x7f55ba3dcd12]
[   446.999] 9: /usr/bin/X (FreeResource+0x13d) [0x45c637]
[   446.999] 10: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4310b) [0x7f55ba3d510b]
[   446.999] 11: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x431bb) [0x7f55ba3d51bb]
[   446.999] 12: /usr/lib64/xorg/modules/extensions/libglx.so (0x7f55ba392000+0x4b68c) [0x7f55ba3dd68c]
[   446.999] 13: /usr/bin/X (0x400000+0x2c7c1) [0x42c7c1]
[   446.999] 14: /usr/bin/X (0x400000+0x24968) [0x424968]
[   446.999] 15: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7f55bb95ebbd]
[   446.999] 16: /usr/bin/X (0x400000+0x24339) [0x424339]
[   446.999] Segmentation fault at address 0x240
[   446.999] 
Fatal server error:
[   446.999] Caught signal 11 (Segmentation fault). Server aborting

I also attached the whole Xorg.0.log.
Comment 42 Jeremy Huddleston Sequoia 2011-04-17 14:02:42 UTC
It looks like you're still missing debug symbols.  Make sure you have this set in /etc/make.conf, or portage will strip your installed binaries of the debug symbols you worked so hard to create ;)

FEATURES="nostrip"
Comment 43 D. Laroche 2011-04-18 08:06:29 UTC
I just rebuilt the server with FEATURES="nostrip", and the resulting executables report as not stripped:

$ file /usr/bin/Xorg
/usr/bin/Xorg: setuid ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped

$ file /usr/lib64/opengl/xorg-x11/extensions/libglx.so 
/usr/lib64/opengl/xorg-x11/extensions/libglx.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped

but still, the stack trace has no symbolic information:

Backtrace:
[   199.919] 0: /usr/bin/X (xorg_backtrace+0x31) [0x4d444d]
[   199.920] 1: /usr/bin/X (0x400000+0x81b6f) [0x481b6f]
[   199.920] 2: /lib/libpthread.so.0 (0x7fd34a2e3000+0xf120) [0x7fd34a2f2120]
[   199.920] 3: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4c7f2) [0x7fd347cec7f2]
[   199.920] 4: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4c850) [0x7fd347cec850]
[   199.920] 5: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4c8f2) [0x7fd347cec8f2]
[   199.920] 6: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4c976) [0x7fd347cec976]
[   199.920] 7: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x586e0) [0x7fd347cf86e0]
[   199.920] 8: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4ad12) [0x7fd347cead12]
[   199.920] 9: /usr/bin/X (FreeResource+0x13d) [0x45c457]
[   199.920] 10: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4310b) [0x7fd347ce310b]
[   199.920] 11: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x431bb) [0x7fd347ce31bb]
[   199.920] 12: /usr/lib64/xorg/modules/extensions/libglx.so (0x7fd347ca0000+0x4b68c) [0x7fd347ceb68c]
[   199.920] 13: /usr/bin/X (0x400000+0x2c5e1) [0x42c5e1]
[   199.920] 14: /usr/bin/X (0x400000+0x24968) [0x424968]
[   199.920] 15: /lib/libc.so.6 (__libc_start_main+0xfd) [0x7fd34926cbbd]
[   199.920] 16: /usr/bin/X (0x400000+0x24339) [0x424339]
[   199.920] Segmentation fault at address 0x240
[   199.920] 
Fatal server error:
[   199.920] Caught signal 11 (Segmentation fault). Server aborting
[   199.920]
Comment 44 Jeremy Huddleston Sequoia 2011-04-18 12:05:18 UTC
Weird... well, at least we know which commit introduced the problem... Hopefully that's enough for krh... Any thoughts on this, Kristian?
Comment 45 Hugo Mildenberger 2011-04-18 15:10:21 UTC
(In reply to comment #43)
> I just rebuilt the server with FEATURES="nostrip", and the resulting
> executables report as not stripped:
> 
> $ file /usr/bin/Xorg
> /usr/bin/Xorg: setuid ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
> dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped
> 
> $ file /usr/lib64/opengl/xorg-x11/extensions/libglx.so 
> /usr/lib64/opengl/xorg-x11/extensions/libglx.so: ELF 64-bit LSB shared object,
> x86-64, version 1 (SYSV), dynamically linked, not stripped
> 
> but still, the stack trace has no symbolic information:
> 

The problem is probably due to the symlinks introduced by eselect opengl:

/usr/lib64/xorg/modules/extensions/libglx.so
/usr/lib64/opengl/xorg-x11/extensions/libglx.so

I don't know if you also have splitdebug in FEATURES. I had to copy or symlink the debug symbol files to the corresponding library path for libglx, libdri and (I think) libdri2. Anyway, this is a genuine Gentoo problem. Virtually no Gentoo user has GL-related symbols ready when reporting xorg errors.

Just recovered the script I used to repair OpenGL-related debug symbols:

#!/bin/bash
# Replace the libdri*/libglx* debug symbol files with the copies from
# the opengl/xorg-x11 directory, so gdb finds them under the module path.
# Stop if the debug directory is missing, so rm never runs elsewhere.
pushd /usr/lib64/debug/usr/lib64/xorg/modules/extensions || exit 1
rm -f libdri* libglx*
cp -a ../../../opengl/xorg-x11/extensions/* .
popd
Comment 46 D. Laroche 2011-04-21 13:22:01 UTC
> I don't know if you also have splitdebug in FEATURES. I had to copy or symlink
> the debug symbol files to the corresponding library path for libglx,libdri and
> (I think) libdri2. Anyway, this is a genuin Gentoo problem. Virtually no Gentoo
> user has gl-related symbols ready when reporting xorg errors. 
> 
> Just recovered the script I used to repair opengl -related debug symbols:
> 
> #!/bin/bash
> pushd /usr/lib64/debug/usr/lib64/xorg/modules/extensions
> rm libdri* libglx*
> cp ../../../opengl/xorg-x11/extensions/* . -a
> popd

I added splitdebug and rebuilt the Xorg server, still no symbols in the stack trace.
There's no directory /usr/lib64/debug on my system.
Comment 47 Hugo Mildenberger 2011-04-22 08:25:02 UTC
> I added splitdebug and rebuilt the Xorg server, still no symbols in the stack
> trace.
> There's no directory /usr/lib64/debug on my system.

1/ you would need to install dev-util/debugedit first, else splitdebug 
   has no effect. 
2/ in your comment #43 above, you checked the file  
   "/usr/lib64/opengl/xorg-x11/extensions/libglx.so" for debug symbols.
   But X did load the module from "/usr/lib64/xorg/modules/extensions/libglx.so".
   So please check if these entries really represent the same file.
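Editorial note: one way to check whether the two libglx.so paths named above are the same file on disk is to resolve symlinks and compare. This is a minimal Python sketch using only the standard library; `same_file` and `describe` are hypothetical helper names.

```python
import os


def same_file(path_a, path_b):
    """True if both paths exist and resolve to the same file on disk."""
    return (os.path.exists(path_a)
            and os.path.exists(path_b)
            and os.path.samefile(path_a, path_b))


def describe(path):
    """Show where a (possibly symlinked) path really points."""
    return f"{path} -> {os.path.realpath(path)}"
```

Calling `same_file("/usr/lib64/xorg/modules/extensions/libglx.so", "/usr/lib64/opengl/xorg-x11/extensions/libglx.so")` on the affected system would answer the question directly, and `describe()` on the first path would show the symlink target.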
Comment 48 D. Laroche 2011-04-23 07:05:20 UTC
(In reply to comment #47)
> 1/ you would need to install dev-util/debugedit first, else splitdebug 
>    has no effect. 
> 2/ in your comment #43 above, you checked the file  
>    "/usr/lib64/opengl/xorg-x11/extensions/libglx.so" for debug symbols.
>    But X did load the module
> from"/usr/lib64/xorg/modules/extensions/libglx.so".
>    So please check if these entries really represent the same file.

Ok, installed package debugedit (version 5.1.9), rebuilt xorg-server with FEATURES="nostrip splitdebug" in /etc/make.conf, but alas, still no symbolic info in Xorg.0.log after X server crashes.

/usr/lib64/xorg/modules/extensions/libglx.so is a symbolic link to /usr/lib64/opengl/xorg-x11/extensions/libglx.so, which exists and is not stripped.
Comment 49 Julien Cristau 2011-04-27 00:29:58 UTC
> --- Comment #43 from D. Laroche <dlaroche7@gmail.com> 2011-04-18 08:06:29 PDT ---
> but still, the stack trace has no symbolic information:
> 
the backtrace in the log is never going to have that.  you need to use
gdb for debug info.
http://www.x.org/wiki/Development/Documentation/ServerDebugging
Comment 50 D. Laroche 2011-04-30 15:34:49 UTC
(In reply to comment #49)
> the backtrace in the log is never going to have that.  you need to use
> gdb for debug info.
> http://www.x.org/wiki/Development/Documentation/ServerDebugging

Great! Finally here's a stack trace:

Program received signal SIGSEGV, Segmentation fault.
0x00007f7bd39747f2 in dixGetPrivateAddr (privates=0x240, key=0x7f7bd3bb2620)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/include/privates.h:117
117	/var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/include/privates.h: No such file or directory.
	in /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/include/privates.h
(gdb) bt f
#0  0x00007f7bd39747f2 in dixGetPrivateAddr (privates=0x240, key=0x7f7bd3bb2620)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/include/privates.h:117
        __PRETTY_FUNCTION__ = "dixGetPrivateAddr"
#1  0x00007f7bd3974850 in dixGetPrivate (privates=0x240, key=0x7f7bd3bb2620)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/include/privates.h:131
        __PRETTY_FUNCTION__ = "dixGetPrivate"
#2  0x00007f7bd39748f2 in dixLookupPrivate (privates=0x240, key=0x7f7bd3bb2620)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/include/privates.h:161
No locals.
#3  0x00007f7bd3974976 in glxGetScreen (pScreen=0x0) at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/glx/glxscreens.c:200
No locals.
#4  0x00007f7bd39806e0 in __glXDRIdrawableDestroy (drawable=0x235a430)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/glx/glxdri.c:233
        private = 0x235a430
        screen = 0x1e251f0
        i = 1
#5  0x00007f7bd3972d12 in DrawableGone (glxPriv=0x235a430, xid=23068891)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/glx/glxext.c:171
        c = 0x0
        next = 0x0
#6  0x000000000045c457 in FreeResource (id=23068891, skipDeleteFuncType=0)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/dix/resource.c:596
        rtype = 58
        cid = 11
        res = 0x235ac40
        prev = 0x4c377d0
        head = 0x4c377d0
        eltptr = 0x874568
        elements = 371
#7  0x00007f7bd396b10b in DoDestroyDrawable (cl=0x2a3a748, glxdrawable=23068891, type=1)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/glx/glxcmds.c:1340
        pGlxDraw = 0x235a430
        err = -1
#8  0x00007f7bd396b1bb in __glXDisp_DestroyPixmap (cl=0x2a3a748, pc=0x2875764 "\232\027\002")
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/glx/glxcmds.c:1364
        client = 0x2a3a630
        req = 0x2875764
#9  0x00007f7bd397368c in __glXDispatch (client=0x2a3a630) at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/glx/glxext.c:583
        rendering = 0 '\000'
        stuff = 0x2875764
        opcode = 23 '\027'
        proc = 0x7f7bd396b16a <__glXDisp_DestroyPixmap>
        cl = 0x2a3a748
        retval = 31634464
#10 0x000000000042c5e1 in Dispatch () at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/dix/dispatch.c:431
        clientReady = 0x2488c10
        result = 0
        client = 0x2a3a630
        nready = 0
        icheck = 0x87f200
        start_tick = 340
#11 0x0000000000424968 in main (argc=10, argv=0x7fff9b588608, envp=0x7fff9b588660)
    at /var/tmp/portage/x11-base/xorg-server-1.10.1/work/xorg-server-1.10.1/dix/main.c:287
        i = 1
        alwaysCheckForInput = {0, 1}
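Editorial note: in the trace above, frame #3 shows glxGetScreen called with pScreen=0x0, and the frames below it receive privates=0x240, the same value as the "Segmentation fault at address 0x240" lines in the earlier logs. That pattern is what one would expect when the address of a struct member is computed from a NULL struct pointer: the result is simply the member's offset. The following Python sketch illustrates the arithmetic with ctypes. `FakeScreen` is an invented layout, not the real ScreenRec, and its 0x240 bytes of padding are an assumption chosen to match the trace.

```python
import ctypes


class FakeScreen(ctypes.Structure):
    # Hypothetical layout: 0x240 bytes of other members, then the
    # member whose address gets passed down the call chain.
    _fields_ = [
        ("other_members", ctypes.c_ubyte * 0x240),
        ("devPrivates", ctypes.c_void_p),
    ]


def member_address(struct_base, struct_type, member):
    """Address of `member` inside a struct located at `struct_base`."""
    return struct_base + getattr(struct_type, member).offset
```

With a struct base of 0, `member_address(0, FakeScreen, "devPrivates")` yields 0x240, so dereferencing that value faults at a small address rather than at 0.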
Comment 51 Jeremy Huddleston Sequoia 2011-05-28 17:48:04 UTC
If you install your packages with FEATURES="noclean" enabled in /etc/make.conf, portage will not delete the sources after merge, so you won't have all the "can't find source ..." warnings.
Comment 52 Jeremy Huddleston Sequoia 2011-07-29 11:18:27 UTC
Kristian?
Comment 53 debguy 2014-11-28 15:24:54 UTC
see below.  i built /xorg/xorg-server-1.9.3 from scratch

Mesa-7.8.2/ was a total "B", needed patches, and prevented nearby versions of other things (such as glew) from building after they'd already built successfully

https://bugs.freedesktop.org/show_bug.cgi?id=86810

ANSWER:

$ mv /usr/local /usr/local.old

# ./configure, make clean, make, make install: all targets in depends order
# cd /usr/lib ; rm (all libs+bins that are now in /usr/local)

$ ldconfig

you now have glx that works

however, you of course know the Mesa project has had a huge number of bug fixes since X11R7.6

-----------------------
problem is: to use newer mesa you need to upgrade drm, which likely means getting the newest X11R7 (Xorg will fail to build with a drm that is too early or too late), and knowing specifically what version of every x.org package (plus their depends) will build without fail (or have patches for).  because getting "the drm+mesa" that works is a "b" and very version-sensitive

all that to upgrade mesa for fixes.  why? because they keep changing the API in incompatible ways (and/or the Makefile(s) lie about necessary versions - but you'd need to dig in real deep and test every API function to know if they're lying)

------------------------
alternate: do not use drm, use "software only mesa", and maybe (maybe) you can upgrade mesa without re-depends (find all the right versions and download) and a re-build of Xorg + all that depends on Xorg's version 

i have an older "all in silicon GL" card.  excellent when released.  problem was that rendering software was soon released that needed a higher GL version than the hw had.  and "hal" software meant to fix the problem?  never works.

and for gaming ?  when i game (seldom) i use PS2 (wii, iphone, or what have you) i'd never use a pc - too much damage.

------------------------
PS  i have no idea why you automagically blame Mesa

what application is running when the crash happens ?  what else is "going on" on the desktop ?

it's more likely your application doesn't follow all of the OpenGL specification and takes short-cuts - thus crashes the driver.  luckily for you it didn't freeze the video card !  or did it ?
Comment 54 debguy 2014-11-28 15:28:06 UTC
OOPS

# cd /usr/lib ; rm (all libs+bins+files in /usr/ and /etc/ that are now in /usr/local) (n.i. /etc/fonts which you have to cp from /usr/local/etc)

ldconfig
Comment 55 debguy 2014-11-28 15:39:01 UTC
both X and GL call a program's "call backs"

so GL (hw or software) may pass the instruction pointer to the program running: which may segfault.  (or the driver could seg if program gave GL commands wrong.  much recent work on Mesa is protecting a good api from bad applications)

often programs are not written as client/server and are all messed up

NOT X client server.  Xorg does that well

an application should not be "inside an X callback".  callbacks may or may not happen - or may happen while being done.  callbacks must be "re-entrant code" (see wikipedia on re-entrant code, recursion, assembler language re-entrant interrupts / code writing)

rather: an application should be stand alone and talk to callbacks / get feedback from callbacks that X invoked (and all callbacks must be re-entrant)

the app must be prepared for the event that the X client or server forgets it or calls it many times, and do the right thing.  better yet you should be able to talk to the application from a shell: its only interface shouldn't be just X-activated button callbacks.  and CERTAINLY, CERTAINLY the program's "code" should not be in any part of the X windows code

see: Open Desktop Administrator's Guide  (ODT)

(ps THAT GOES DOUBLE FOR MICROSOFT WINDOWS whose callbacks (in earlier versions) were infamous for weirdness and whose libs WERE the cause of failure and whose bluescreens ALWAYS blamed the application when in fact (i can prove it, know it for sure, and still have the evidence) it was errors microsoft released)

see: Open Desktop Administrator's Guide  (ODT)

RTFM.  read the f manual

now back to GL hardware.  if you have more than one GL window open, the program (which isn't inside an X widget - which is autonomous from X) should fork a subprogram for each open GL window, and talk only via/through atomic locked memory / thread locking

my guess is your application does FEW if any of the above
Comment 56 debguy 2014-11-28 15:47:21 UTC
some "trivial" X applications have X code inside widgets (?xmessage?)

however these apps are not as trivial as they appear.  and killing them has no overall effect on X in the worst case

-----------------------
something like "GL DOOM" can really take down X if it messes up - leave a ton of allocations un-freed and threads open / waiting / looping callbacks

-----------------------
now a GTK widget on the other hand coddles software and is more forgiving (takes care of many details of handling X).  still, even GTK widget code should be aware that the program and x widgets should be client server (in the same way X is designed) with itself.  because even "high level widgets" are not fool proof against a program that doesn't understand re-entrance and client-server separation
Comment 57 GitLab Migration User 2018-12-13 22:24:15 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/xserver/issues/403.
