Bug 28181 - Combination of foss ati driver 6.13.0 and xorg-server 1.8.1 gives seg faults.
Status: NEW
Product: xorg
Classification: Unclassified
Component: Server/General
Version: unspecified
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assigned To: Kristian Høgsberg
QA Contact: Xorg Project Team
Duplicates: 28616 30026 33897 36153

Reported: 2010-05-19 22:00 UTC by No Tellin
Modified: 2013-03-08 12:38 UTC

Attachments
Xorg log showing segfault (94.91 KB, text/plain), 2010-05-20 03:51 UTC, No Tellin
vanilla kernel 2.6.33.4 .config (64.70 KB, application/octet-stream), 2010-05-27 03:58 UTC, No Tellin
vanilla kernel 2.6.34 .config (65.73 KB, application/octet-stream), 2010-05-27 04:05 UTC, No Tellin
patch to detect and work around the problem (587 bytes, patch), 2010-12-04 02:50 UTC, Andriy Gapon
Possible fix (540 bytes, patch), 2010-12-08 09:52 UTC, Michel Dänzer
The patch that I am using with 1.10.3 (2.46 KB, patch), 2011-07-30 13:23 UTC, Andriy Gapon

Description No Tellin 2010-05-19 22:00:13 UTC
There is an ongoing thread describing the problem on the Gentoo forums board here: http://forums.gentoo.org/viewtopic-t-828553.html

Thread posts include some Xorg.0.log info, system configuration info and problem description.

Basically, if you use the 'xf86-video-ati' open source radeon driver after upgrading from xorg-server-1.8.0 to xorg-server-1.8.1, you may have 'blind' random segfaulting of xorg-server.
Comment 1 Julien Cristau 2010-05-20 02:14:59 UTC
On Wed, May 19, 2010 at 22:00:13 -0700, bugzilla-daemon@freedesktop.org wrote:

> Thread posts include some Xorg.0.log info, system configuration info and
> problem description.

Please include the relevant information here, to avoid us trolling through
random forum posts.  Thanks.
Comment 2 No Tellin 2010-05-20 03:51:16 UTC
Created attachment 35770 [details]
Xorg log showing segfault

This is with xorg-server-1.8.1 and with xf86-video-ati-6.13.0.

The video driver was re-compiled after the install of 1.8.1.

The comparable log of the combination of 1.8.0 and this same video driver is identical up to the point of the blank line just before the backtrace message.

Other people on the noted forum thread are also reporting the same foss ati driver/server combination segfault. However, the fault doesn't happen at the same instance in time for everyone.

In my case, it is after kde-4.4.3 starts and various programs included in my saved desktop are launching. When I start xfce4, the segfault doesn't occur.

The implication is that the segfault occurs when some specific video function or combination of functions is requested.

People with other video cards are reporting they don't have a problem.

This includes people running the proprietary ati drivers.

You may still want to peruse the noted forum thread.

Thank you for looking at this. It's really appreciated. If you need further information, please let me know.

Here is some configuration information on my system:

Portage 2.2_rc67 (default/linux/amd64/10.0, gcc-4.4.3, glibc-2.11.1-r0, 2.6.33.4 x86_64)
=================================================================
System uname: Linux-2.6.33.4-x86_64-AMD_Phenom-tm-_9600_Quad-Core_Processor-with-gentoo-2.0.1
Timestamp of tree: Wed, 19 May 2010 01:00:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:     4.1_p5
dev-java/java-config: 2.1.11
dev-lang/python:     2.5.4-r4, 2.6.5-r2, 3.1.2-r3
dev-python/pycrypto: 2.1.0
dev-util/ccache:     2.4-r8
dev-util/cmake:      2.8.1-r1
sys-apps/baselayout: 2.0.1
sys-apps/openrc:     0.6.1-r1
sys-apps/sandbox:    2.2
sys-devel/autoconf:  2.13, 2.65
sys-devel/automake:  1.8.5-r4, 1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.33
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA dlj-1.1 PUEL"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=opteron -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
Comment 3 No Tellin 2010-05-23 13:08:36 UTC
In KDE4 System Settings > General > Display > Advanced > Compositing Type

Changing from OpenGL to XRender 'fixes' the problem. 

I.e., this appears to be an OpenGL problem with the foss radeon driver in combination with xorg-server-1.8.1.

Switching to XRender solves the problem for me.
Comment 4 Paweł Rumian 2010-05-26 15:19:41 UTC
I'm not 100% sure if it's related, but:

I'm using Gentoo with kernel 2.6.34, xf86-video-ati-6.13.0, xorg-server-1.8.1-r1, mesa-7.8.1 with a R300 (R9500), KMS enabled.

KDE 4.4.3 starts with no problems and runs stable, but the screen is totally corrupted beyond a width of about 2600px (I'm using a dual-screen config).

After changing desktop effects rendering to XRender or disabling them completely everything is OK. 
So I also suspect OpenGL to play some role here, but I have no idea how to debug this. The logs are clear.

Any help will be appreciated :)

greetings,
Paweł
Comment 5 Alex Deucher 2010-05-26 15:27:33 UTC
(In reply to comment #4)
> I'm not 100% sure if it's related, but:
> 

It's not. This bug is about a segfault.

> I'm using Gentoo with kernel 2.6.34, xf86-video-ati-6.13.0,
> xorg-server-1.8.1-r1, mesa-7.8.1 with a R300 (R9500), KMS enabled.
> 
> KDE 4.4.3 starts with no problems an runs stable, but the screen is totally
> corrupted beyond width about 2600px (I'm using dual-screen config).
> 
> After changing desktop effects rendering to XRender or disabling them
> completely everything is OK. 
> So I also suspect OpenGL to play some role here, but I have no idea how to
> debug this. The logs are clear.

That's not a bug; it's a hardware limitation. The max render target size on r3xx is 2560x2560 pixels and the max texture size is 2048x2048 pixels. If your desktop is larger than that, it will not draw correctly.
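
To make the limits above concrete, here is a minimal C sketch that checks a desktop size against the r3xx render-target and texture sizes quoted in this comment. The constant and function names are made up for illustration; they are not part of the radeon driver or Mesa.

```c
#include <stdbool.h>

/* Limits quoted in the comment above for r3xx hardware (pixels per side). */
enum {
    R3XX_MAX_RENDER_TARGET = 2560,
    R3XX_MAX_TEXTURE       = 2048
};

/* Hypothetical helper: true if a desktop of this size fits in a single
 * r3xx render target. */
bool r3xx_fits_render_target(int width, int height)
{
    return width > 0 && width <= R3XX_MAX_RENDER_TARGET &&
           height > 0 && height <= R3XX_MAX_RENDER_TARGET;
}

/* Hypothetical helper: true if a desktop of this size fits in a single
 * r3xx texture. */
bool r3xx_fits_texture(int width, int height)
{
    return width > 0 && width <= R3XX_MAX_TEXTURE &&
           height > 0 && height <= R3XX_MAX_TEXTURE;
}
```

A desktop wider than 2560 pixels, such as the dual-screen setup described in comment #4, fails the first check.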
Comment 6 Paweł Rumian 2010-05-26 15:38:51 UTC
Thanks for the reply and sorry for messing up those two things.

Anyway, I can report that xf86-video-ati-6.13.0 works for me with xorg-server-1.8.1-r1, although I didn't try it without KMS.

greetings,
Paweł
Comment 7 No Tellin 2010-05-27 03:58:52 UTC
Created attachment 35881 [details]
vanilla kernel 2.6.33.4 .config

Vanilla kernel settings used in version 2.6.33.4 kernel.
Comment 8 No Tellin 2010-05-27 04:05:10 UTC
Created attachment 35882 [details]
vanilla kernel 2.6.34 .config

Vanilla kernel 2.6.34 configuration settings.

This is just loverly.

I upgraded my kernel from 2.6.33.4 to 2.6.34. I am no longer having segfaults with the xorg-server/foss ati driver combination for which I originally opened this bug.

I've included as attachments the kernel .config from both kernel versions.

I guess this makes this bug moot.

Thanks for your time and attention. It really is appreciated.
Comment 9 No Tellin 2010-05-27 16:34:18 UTC
{sigh}

I'm back to kernel 2.6.33.5 and xorg-server-1.8.0.

While the segfaults for xorg-server-1.8.1 went away under kernel 2.6.34, opening kmplayer {and friends} resulted in mplayer opening its own window instead of playing within kmplayer. None of the playback flags which kmplayer would normally pass to mplayer got passed. This results in a really sucky movie watching experience. {sorry about the really technical description}
Comment 10 Nikos Chantziaras 2010-06-09 09:04:49 UTC
Arch Linux has the same problem:

http://bugs.archlinux.org/task/19271

I also get crashes, exactly as described here and in the forums.

1.8.0 works fine.  1.8.1 and 1.8.1.901 will segfault at random.

Gentoo Linux AMD64
Kernel: 2.6.34
GPU: ATI Radeon HD4870 (R770)
GPU driver: xf86-video-ati from Git
Mesa: 7.8.1

I'm not using KMS, but UMS.  KMS doesn't crash but I don't want to use it because it's too slow to be useful.
Comment 11 Hugo Mildenberger 2010-06-20 07:53:44 UTC
http://bugs.gentoo.org/show_bug.cgi?id=320055  and  
https://bugzilla.redhat.com/show_bug.cgi?id=588845 are presumably related. 

I'm just copying relevant parts of my comment from redhat:

I encountered a very similar problem using an ATI Radeon IGP 9100 (R200) and
xorg server version 1.8.1-901 on Gentoo. An X crash is often triggered when Konqueror's address bar opens a drop-down list. Quickly minimizing and maximizing windows, or switching back and forth with the 3D desktop cube, may also trigger it, but only if compositing via OpenGL is enabled within KDE. Compositing via XRender is unaffected. xorg-server-1.8.0 is the last working
release tested by me. Using xorg-1.8.1-901, the stack was

 #9  <signal handler called>
 #10 0x517d8210 in DrawableGone (glxPriv=0x12b06288, xid=20971921)
     at glxext.c:133
 #11 0x1221cfaf in FreeResource (id=20971921, skipDeleteFuncType=0)
     at resource.c:560
 #12 0x517d5112 in DoDestroyDrawable (cl=<value optimized out>, 
     glxdrawable=20971921, type=1) at glxcmds.c:1275
 #13 0x517d80ab in __glXDispatch (client=0x126a10b0) at glxext.c:601
 #14 0x12200ced in Dispatch () at dispatch.c:439
 #15 0x121f64b5 in main (argc=10, argv=0x5ea4adc4, envp= Cannot access 
     memory at address 0x3e

This appears to have a similar fingerprint. In my environment, DrawableGone
failed due to an invalid value for glxPriv->pDraw:

 133     if (glxPriv->drawId != glxPriv->pDraw->id) {
 134         if (xid == glxPriv->drawId)
 135             FreeResourceByType(glxPriv->pDraw->id, __glXDrawableRes,
                                    TRUE);

 print glxPriv->drawId
 $1 = 20971921
 print glxPriv->pDraw
 $2 = (DrawablePtr) 0x3cbaf008
 print glxPriv->pDraw->id
 Cannot access memory at address 0x3cbaf00c


Here is also some info on the resource id and type gained from the FreeResource
frame:
  550 #ifdef XSERVER_DTRACE
  551   XSERVER_RESOURCE_FREE(res->id, res->type,
  552          res->value, TypeNameString(res->type));
  553 #endif      
  554   *prev = res->next;

  print *res
  $1 = {next = 0x1280d760, id = 20971921, type = 54, value = 0x12b06288}


Bisecting xorg-server between 1.8.0 and 1.8.1-901 finally revealed that

  0460a76b9ae25fe26f683f0cbff1e4157287cf56 is the first bad commit
  commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56
  Author: Kristian Høgsberg <krh@bitplanet.net>
  Date:   Fri Apr 16 05:55:33 2010 -0400

  glx: Let the resource system destroy pixmaps

  GLX pbuffers are implemented using a pixmap allocated by the server.
  With the change to DRI2 to track DRI2 drawables as resources, we need to make
  sure that every drawable we create a DRI2 drawable for has an XID.  By
  using the XID of the pbuffer, the resource system will automatically
  reclaim the hidden pixmap and the DRI2 drawable when the pbuffer is
  destroyed or the client exits.

  Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
  Signed-off-by: Keith Packard <keithp@keithp.com>
  (cherry picked from commit 22da7aa9d743deee198aaf6df5d370a446db9763)

  :040000 040000 47f59391028a3c792c3ea22a0eb65a65c9f414c4
   ac43336bcc8ee3545f1b673affc5d2121f9054c7 M      glx

The problem disappeared after reverting that particular commit.
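
The gdb session above shows glxPriv->pDraw pointing at memory that can no longer be accessed by the time DrawableGone() dereferences it. As a rough illustration only, the following self-contained C model shows how a cached drawable pointer can outlive the drawable, and one possible way to notice that before dereferencing: look the drawable up by XID first. All types and functions here (ToyDrawable, ToyGlxDrawable, toy_lookup and so on) are hypothetical stand-ins, not the X server's real structures or API, and this is not the fix that was eventually applied.

```c
#include <stddef.h>

#define TOY_MAX_RESOURCES 8

/* Hypothetical stand-in for a server-side drawable. */
typedef struct {
    unsigned id;
    int in_use;
} ToyDrawable;

/* Hypothetical stand-in for glxPriv: keeps the XID and a cached pointer. */
typedef struct {
    unsigned drawId;
    ToyDrawable *pDraw;
} ToyGlxDrawable;

/* Toy resource table; static storage, so stale pointers stay readable here. */
static ToyDrawable toy_table[TOY_MAX_RESOURCES];

/* Register a drawable under the given XID; NULL if the table is full. */
ToyDrawable *toy_create(unsigned id)
{
    for (size_t i = 0; i < TOY_MAX_RESOURCES; i++) {
        if (!toy_table[i].in_use) {
            toy_table[i].id = id;
            toy_table[i].in_use = 1;
            return &toy_table[i];
        }
    }
    return NULL;
}

/* Look a drawable up by XID; NULL once it has been destroyed. */
ToyDrawable *toy_lookup(unsigned id)
{
    for (size_t i = 0; i < TOY_MAX_RESOURCES; i++) {
        if (toy_table[i].in_use && toy_table[i].id == id)
            return &toy_table[i];
    }
    return NULL;
}

/* Destroy a drawable. Any pointer cached elsewhere is now stale. */
void toy_destroy(unsigned id)
{
    ToyDrawable *d = toy_lookup(id);
    if (d != NULL) {
        d->in_use = 0;
        d->id = 0;
    }
}

/* Return 1 if the cached pointer still refers to a live drawable with the
 * expected XID, 0 if it has gone stale. */
int toy_glx_drawable_is_live(const ToyGlxDrawable *glxPriv)
{
    ToyDrawable *current = toy_lookup(glxPriv->drawId);
    return current != NULL && current == glxPriv->pDraw;
}
```

In this toy model the check returns 1 while the drawable exists and 0 after toy_destroy(), so a caller could skip the dereference instead of faulting. Whether such a lookup is appropriate in the real DrawableGone() is a separate question that this sketch does not answer.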
Comment 12 Alex Deucher 2010-06-20 22:29:22 UTC
Re-assigning to xserver.  Is this also an issue with xserver master/1.9.x?
Comment 13 Michel Dänzer 2010-06-21 02:13:36 UTC
Might be worth trying current upstream Git server-1.8-branch as well.
Comment 14 Alex Deucher 2010-06-21 06:49:22 UTC
*** Bug 28616 has been marked as a duplicate of this bug. ***
Comment 15 Nikos Chantziaras 2010-06-21 13:40:11 UTC
Not able to test xorg-server from Git, since it gets caught in an infinite loop of crashes in the KDE startup screen.

I will wait for the next RC or something.
Comment 16 Nikos Chantziaras 2010-07-16 01:10:12 UTC
I just updated to latest libdrm, xorg-server, mesa and xf86-video-ati from Git, as well as kernel 2.6.35-rc5. It still crashes with UMS (and KMS is not working correctly with KDE.)
Comment 17 Anton Shterenlikht 2010-07-20 00:54:30 UTC
I think I might be seeing the same problem on FreeBSD.

On FreeBSD 9.0-CURRENT #0 r210229 amd64
I launch a graphical app on another host via ssh -X.
In case it matters, the app is Paraview 3.8.0 (www.paraview.org)
Closing any window in Paraview results in Xorg crash.

On the desktop I have:

xf86-video-ati-6.13.0
xorg-server-1.7.5,1
xdm-1.1.8_2

After a crash I get this backtrace from Xorg.core:

*skip few lines*

Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x0000000801a30d8c in kill () from /lib/libc.so.7
[New Thread 801c071c0 (LWP 100079)]
(gdb) bt full
#0  0x0000000801a30d8c in kill () from /lib/libc.so.7
No symbol table info available.
#1  0x0000000801a2fb4b in abort () from /lib/libc.so.7
No symbol table info available.
#2  0x0000000000476744 in ddxGiveUp ()
No symbol table info available.
#3  0x000000000046fcdd in AbortServer ()
No symbol table info available.
#4  0x000000000047035f in FatalError ()
No symbol table info available.
#5  0x000000000046a581 in OsInit ()
No symbol table info available.
#6  <signal handler called>
No symbol table info available.
#7  0x0000000802257ddb in DrawableGone () from /usr/local/lib/xorg/modules/extensions/libglx.so
No symbol table info available.
#8  0x00000000004524cc in FreeResource ()
No symbol table info available.
#9  0x0000000000434c2b in ProcDestroyWindow ()
No symbol table info available.
#10 0x0000000000437460 in Dispatch ()
No symbol table info available.
#11 0x000000000042d7ba in main ()
No symbol table info available.
(gdb)

It's this line:
#7  0x0000000802257ddb in DrawableGone () from /usr/local/lib/xorg/modules/extensions/libglx.so

that led me to suggest the problem might be
with the GLX module.

I apologise if my problem has nothing to do with this bug.

many thanks
anton
Comment 18 Nikos Chantziaras 2010-07-22 00:33:58 UTC
I just tried server 1.8.2. Problem persists and I'm still stuck at 1.8.0, the only 1.8 version that doesn't crash.

Any chance to see this fixed?
Comment 19 Michel Dänzer 2010-07-22 01:08:24 UTC
(In reply to comment #18)
> I just tried server 1.8.2. Problem persists and I'm still stuck at 1.8.0, the
> only 1.8 version that doesn't crash.

Would be great if someone could bisect which commit from Git server-1.8-branch broke this.
Comment 20 Hugo Mildenberger 2010-07-22 10:08:20 UTC
(In reply to comment #19)
> (In reply to comment #18)
> > I just tried server 1.8.2. Problem persists and I'm still stuck at 1.8.0, the
> > only 1.8 version that doesn't crash.
> 
> Would be great if someone could bisect which commit from Git server-1.8-branch
> broke this.

Comment #11 above seems to have bypassed your attention:

  Bisecting xorg-server between 1.8.0 and 1.8.1-901 finally revealed that

     0460a76b9ae25fe26f683f0cbff1e4157287cf56 is the first bad commit
     commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56
     Author: Kristian Høgsberg <krh@bitplanet.net>
     Date:   Fri Apr 16 05:55:33 2010 -0400

     glx: Let the resource system destroy pixmaps


As I already said, reverting this commit cured the symptom, if not the cause.
Comment 21 Nikos Chantziaras 2010-07-22 10:44:28 UTC
(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > I just tried server 1.8.2. Problem persists and I'm still stuck at 1.8.0, the
> > > only 1.8 version that doesn't crash.
> > 
> > Would be great if someone could bisect which commit from Git server-1.8-branch
> > broke this.
> 
> Comment #11 above seems to have bypassed your attention:
> 
>   Bisecting xorg-server between 1.8.0 and 1.8.1-901 finally revealed that
> 
>      0460a76b9ae25fe26f683f0cbff1e4157287cf56 is the first bad commit
>      commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56
>      Author: Kristian Høgsberg <krh@bitplanet.net>
>      Date:   Fri Apr 16 05:55:33 2010 -0400
> 
>      glx: Let the resource system destroy pixmaps
> 
> 
> As I already said, reverting this commit cured the symptom, if not the cause.

I'm not able to reverse apply that patch though:

patching file glx/glxcmds.c
Reversed (or previously applied) patch detected!  Assume -R? [n] y
Hunk #1 succeeded at 1102 (offset 1 line).
Hunk #2 succeeded at 1119 (offset 2 lines).
Hunk #3 FAILED at 1135.
Hunk #4 succeeded at 1154 (offset 3 lines).
Hunk #5 succeeded at 1167 with fuzz 1 (offset 3 lines).
Hunk #6 succeeded at 1310 (offset 3 lines).
Hunk #7 succeeded at 1320 with fuzz 2 (offset 3 lines).
Hunk #8 succeeded at 1428 (offset 4 lines).
1 out of 8 hunks FAILED -- saving rejects to file glx/glxcmds.c.rej

Any way I can get that patch to reverse-apply to 1.8.2?
Comment 22 Hugo Mildenberger 2010-07-22 15:00:41 UTC
(In reply to comment #21)

> Any way I can get that patch to reverse-apply to 1.8.2?
Obviously not without manual intervention. That file simply had been changed too much in the meantime. Try this instead:

git clone git://cgit.freedesktop.org/xorg/xserver/
cd xserver
git checkout xorg-server-1.8.1.901
git revert 0460a76b9ae25fe26f683f0cbff1e4157287cf56
Comment 23 Nikos Chantziaras 2010-07-23 04:33:02 UTC
Thanks. I can now confirm too that reverting that commit fixes all the crashes.
Comment 24 Michel Dänzer 2010-09-05 03:35:17 UTC
*** Bug 30026 has been marked as a duplicate of this bug. ***
Comment 25 Andriy Gapon 2010-11-15 06:17:06 UTC
I can still reproduce the issue at will with both xorg-server-1.8.2 and xorg-server-1.9.2 on FreeBSD/amd64. ati driver version is at 6.13.1.
All I have to do is minimize and un-minimize an application in KDE4 by clicking twice on its taskbar icon.
Comment 26 Herton Ronaldo Krzesinski 2010-11-15 11:08:33 UTC
Hi,

I have a machine here which is apparently also affected by the same issue, but my backtrace is different. I never used xserver 1.8 on it; I was using 1.7, jumped directly to 1.9 and started to have segfaults.

I can only reproduce it on this machine, which has an ATI card (Radeon 4670). I trigger the crash easily while typing in the Firefox or Chrome address bar: X crashes while they do auto-completion.

With latest xserver and xf86-video-ati of today, this is what I get in gdb when X crashes:

Program received signal SIGSEGV, Segmentation fault.
0xb725b19a in __glXDRIreleaseTexImage (baseContext=0x9ec6460, buffer=8414, pixmap=0x94447f8) at glxdri.c:586
586             (__GLXDRIscreen *) glxGetScreen(pixmap->pDraw->pScreen);
(gdb) bt
#0  0xb725b19a in __glXDRIreleaseTexImage (baseContext=0x9ec6460, buffer=8414, pixmap=0x94447f8) at glxdri.c:586
#1  0xb724e872 in __glXDisp_ReleaseTexImageEXT (cl=0x9a5fbd8, pc=0x9b636cc "\217\001\200\001\336 ") at glxcmds.c:1634
#2  0xb724f61a in __glXDisp_VendorPrivate (cl=0x9a5fbd8, pc=0x9b636c0 "\234\020\005") at glxcmds.c:2314
#3  0xb725160f in __glXDispatch (client=0x9a5fb00) at glxext.c:600
#4  0x0806f0b7 in Dispatch () at dispatch.c:431
#5  0x080620b5 in main (argc=7, argv=0xbfb45de4, envp=0xbfb45e04) at main.c:287
(gdb) l
581     __glXDRIreleaseTexImage(__GLXcontext *baseContext,
582                             int buffer,
583                             __GLXdrawable *pixmap)
584     {
585         __GLXDRIscreen *screen =
586             (__GLXDRIscreen *) glxGetScreen(pixmap->pDraw->pScreen);
587         __GLXDRIdrawable *drawable = (__GLXDRIdrawable *) pixmap;
588
589         __glXDRIdoReleaseTexImage(screen, drawable);
590
(gdb) p pixmap->pDraw
$1 = (DrawablePtr) 0x92051008
(gdb) p pixmap->pDraw->pScreen
Cannot access memory at address 0x92051018

Going back to older xserver releases, I can get the crash down to xserver 1.8.1; xserver 1.8.0 doesn't crash. Doing a bisect between them, I also found the same commit 0460a76b9ae25fe26f683f0cbff1e4157287cf56 ("glx: Let the resource system destroy pixmaps") as causing this crash. Maybe this new backtrace helps track down the real issue.
Comment 27 Andriy Gapon 2010-11-17 05:28:34 UTC
BTW, I wonder if there could be some deep connection between this issue and the following KDE-triggered bug:
https://bugs.kde.org/show_bug.cgi?id=256359

In one case it's accessing stale TexImage, in the other case it's running out of slots for TexImage-es.
Hm.
Comment 28 Andriy Gapon 2010-12-04 02:50:34 UTC
Created attachment 40794 [details] [review]
patch to detect and work around the problem

I think that I've made a new step in debugging of this problem.
It seems that it is caused by Drawable-s that have id of zero.
That is, I see that sometimes dixLookupDrawable() call in DoCreateGLXPixmap() returns a drawable such that pDraw->id == 0.
Please see the patch that detects this condition and provides a very naive workaround for it.

With this patch I get messages like the following from time to time:
[952178.180] (EE) DANGER: DoCreateGLXPixmap - drawableId 27282812, pDraw->id 0
[952178.180] (EE) ... fixing up
[952188.769] (EE) DANGER: DoCreateGLXPixmap - drawableId 27283144, pDraw->id 0
[952188.770] (EE) ... fixing up
[952189.271] (EE) DANGER: DoCreateGLXPixmap - drawableId 27283147, pDraw->id 0
[952189.271] (EE) ... fixing up

I was not able to fully understand how those drawables originate.
It seems that they are created in compAllocPixmap => compNewPixmap => CreatePixmap. It looks like at this point the pixmap is returned with an id field of zero and is then associated with a window in compSetPixmap.
ProcCompositeNameWindowPixmap then gets the drawable/pixmap via GetWindowPixmap and adds it as an RT_PIXMAP resource; this is where the drawable gets registered under a non-zero XID.

I am not sure exactly which step causes the trouble.

Please bear in mind that the above debugging was done on FreeBSD, so no newer features like KMS/GEM/DRI2.
The patch is against xorg server 1.8.2 sources.
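
The detection described above can be sketched as follows. This is not the attached patch; it is a minimal, self-contained C model of the check implied by the log messages, using a hypothetical type (ToyDraw) and a hypothetical function name. It assumes the fix-up assigns the requested XID to pDraw->id, which the "fixing up" message suggests but this comment does not spell out.

```c
#include <stdio.h>

/* Hypothetical stand-in for a server-side drawable; only the id matters. */
typedef struct {
    unsigned id;
} ToyDraw;

/* Detect a drawable whose id is still zero and, as a naive workaround,
 * set it to the XID the client asked for (an assumption, see above).
 * Returns 1 if a fix-up was needed, 0 otherwise. */
int toy_fixup_drawable_id(ToyDraw *pDraw, unsigned drawableId)
{
    if (pDraw->id != 0)
        return 0;

    /* Message format modelled on the log lines quoted in this comment. */
    fprintf(stderr, "DANGER: DoCreateGLXPixmap - drawableId %u, pDraw->id %u\n",
            drawableId, pDraw->id);
    fprintf(stderr, "... fixing up\n");

    pDraw->id = drawableId;
    return 1;
}
```

A drawable that already carries a non-zero id is left untouched, so the check only affects the cases that produce the DANGER messages.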
Comment 29 Michel Dänzer 2010-12-08 09:52:57 UTC
Created attachment 40927 [details] [review]
Possible fix

I just noticed something that looks like a stale reference to pDraw->id. Does this patch instead of the previous one fix the problem?
Comment 30 Andriy Gapon 2010-12-08 11:19:49 UTC
(In reply to comment #29)
> I just noticed something that looks like a stale reference to pDraw->id. Does
> this patch instead of the previous one fix the problem?

Will do, but please note that DrawableGone() in glxext.c also uses glxPriv->pDraw->id and I don't see an easy way to fix that besides fixing up pDraw->id.
Comment 31 Andriy Gapon 2010-12-14 22:53:00 UTC
Tested the patch - the original crashes seem to be gone, but now I am getting some new ones - more random and more mysterious.

Here is a stack trace:
#0  0x00000008170a0f8a in bo_open () from /usr/local/lib/dri/r600_dri.so
#1  0x00000008170a6a4c in radeonRefillCurrentDmaRegion () from /usr/local/lib/dri/r600_dri.so
#2  0x00000008170a6d5c in radeonAllocDmaRegion () from /usr/local/lib/dri/r600_dri.so
#3  0x000000081709a671 in r700DrawPrims () from /usr/local/lib/dri/r600_dri.so
#4  0x0000000817144b40 in vbo_exec_vtx_flush () from /usr/local/lib/dri/r600_dri.so
#5  0x0000000817142178 in vbo_exec_FlushVertices_internal () from /usr/local/lib/dri/r600_dri.so
#6  0x00000008171421b0 in vbo_exec_FlushVertices () from /usr/local/lib/dri/r600_dri.so
#7  0x00000008171180b3 in _mesa_set_scissor () from /usr/local/lib/dri/r600_dri.so
#8  0x0000000803217c98 in __glXDisp_Scissor (pc=0x802e224e8 "�\003") at indirect_dispatch.c:1054
#9  0x0000000803249699 in __glXDisp_Render (cl=0x829fe38e0, pc=0x802e224e4 "\024") at glxcmds.c:1822
#10 0x0000000803250a4d in __glXDispatch (client=0x802c1f800) at glxext.c:586
#11 0x000000000046dc58 in Dispatch () at dispatch.c:439
#12 0x0000000000421a12 in main (argc=8, argv=0x7fffffffecb0, envp=0x7fffffffecf8) at main.c:286

Going back to my original hack I get no crashes at all.
Comment 32 Michel Dänzer 2010-12-15 03:23:41 UTC
Does http://lists.x.org/archives/xorg-devel/2010-December/016969.html help?
Comment 33 Andriy Gapon 2010-12-16 04:53:38 UTC
(In reply to comment #32)
> Does http://lists.x.org/archives/xorg-devel/2010-December/016969.html help?

Yes! Thanks a lot!
Comment 34 Chris Wilson 2011-02-04 03:30:23 UTC
*** Bug 33897 has been marked as a duplicate of this bug. ***
Comment 35 Toralf Förster 2011-02-11 02:36:59 UTC
When I applied this patch on an almost stable Gentoo system, compositing was completely broken under KDE 4.4.5. With the common Gentoo xorg-server package (plus this patch http://www.mail-archive.com/xorg-devel@lists.x.org/msg18111.html) I can use compositing effects; however, I have to avoid starting OpenGL screen savers - see bug #32822
Comment 36 Michael Lorenz 2011-03-16 15:09:16 UTC
This bug exists in NetBSD -current as well:

Core was generated by `Xorg'.
Program terminated with signal 6, Aborted.
#0  0x00007f7ff4b506ba in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007f7ff4b506ba in _lwp_kill () from /usr/lib/libc.so.12
#1  0x00007f7ff4b5067e in raise (s=6) at /s/jmmv/os/netbsd/src/lib/libc/gen/raise.c:48
#2  0x00007f7ff4b4fd73 in abort () at /s/jmmv/os/netbsd/src/lib/libc/stdlib/abort.c:74
#3  0x0000000000607672 in OsAbort () at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/os/utils.c:1263
#4  0x0000000000487a43 in ddxGiveUp () at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/hw/xfree86/common/xf86Init.c:940
#5  0x0000000000487b4e in AbortDDX () at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/hw/xfree86/common/xf86Init.c:986
#6  0x00000000005fe556 in AbortServer () at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/os/log.c:424
#7  0x00000000005fea4e in FatalError (f=0x6a2b98 "Caught signal %d (%s). Server aborting\n") at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/os/log.c:552
#8  0x00000000006081e5 in OsSigHandler (signo=11, sip=0x7f7fffffd620, unused=0x7f7fffffd6a0) at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/os/osinit.c:156
#9  <signal handler called>
#10 0x000000000061e9ca in DrawableGone (glxPriv=0x7f7fd8cfe080, xid=20973460) at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/glx/glxext.c:133
#11 0x0000000000446a21 in FreeResource (id=20973460, skipDeleteFuncType=0) at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/dix/resource.c:601
#12 0x0000000000627259 in DoDestroyDrawable (cl=0x7f7ff7769948, glxdrawable=20973460, type=1) at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/glx/glxcmds.c:1281
#13 0x00000000006272c1 in __glXDisp_DestroyPixmap (cl=0x7f7ff7769948, pc=0x7f7fd500703c "\234\027\003")
   at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/glx/glxcmds.c:1297
#14 0x000000000061f59e in __glXDispatch (client=0x7f7ff7769820) at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/glx/glxext.c:600
#15 0x0000000000460c40 in Dispatch () at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/dix/dispatch.c:432
#16 0x000000000042e088 in main (argc=4, argv=0x7f7fffffdcb0, envp=0x7f7fffffdcd8) at /home/jmmv/os/netbsd/xsrc/external/mit/xorg-server/dist/dix/main.c:291
(gdb) 

[ 86612.709] 
X.Org X Server 1.9.2
Release Date: 2010-10-30
[ 86612.710] X Protocol Version 11, Revision 0
[ 86612.710] Build Operating System: NetBSD/amd64  - 
[ 86612.710] Current Operating System: NetBSD desky 5.99.48 NetBSD 5.99.48 (GENERIC) #4: Mon Mar 14 21:41:36 GMT 2011 jmmv@desky:/s/jmmv/os/netbsd/obj.amd64/s/jmmv/os/netbsd/src/sys/arch/amd64/compile/GENERIC amd64
[ 86612.710] Build Date: 23 November 2010  01:33:05AM
Comment 37 Julio Merino 2011-03-18 02:47:49 UTC
I am the original reporter of the logs submitted in comment #36.  I sent those as a NetBSD bug which can be accessed here for further reference:

http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=44730

I was experiencing very frequent random crashes of Xorg when running KDE 4.5 on my NetBSD-current machine, a MacBookPro2,2 with an ATI Radeon Mobility X1600.

I applied the patch posted in this bug report and my machine has been rock solid since then, with no apparent problems elsewhere.
Comment 38 Chris Wilson 2011-04-12 05:27:54 UTC
*** Bug 36153 has been marked as a duplicate of this bug. ***
Comment 39 Andriy Gapon 2011-07-27 07:00:16 UTC
I can still reproduce the problem with xorg-server-1.10.3.
The patch referenced in Comment 32 no longer applies to the code.
Comment 40 Andriy Gapon 2011-07-30 13:23:11 UTC
Created attachment 49751 [details] [review]
The patch that I am using with 1.10.3
Comment 41 Adam 2013-03-08 12:38:53 UTC
Hi, just a thanks for the patch in comment 40.  This bug is still present in *ubuntu 12.04 and debian wheezy (when I tested it a few weeks ago) with radeon UMS.  Ubuntu used to apply the previous patch in this report, but it was dropped during the 11.10 cycle.

Un-minimising a window such as Konsole in open-gl 12.04 Kubuntu pretty much guarantees a crash of Xorg. If you use compiz 0.8.* with emerald or the metacity window decorators, then resizing a file manager in 'normal' mode again pretty much guarantees you being chucked out of your X session. Crashes are far fewer (or zero?) with the compiz cairo window decorator. Anyway, the patch in comment 40 solves the problems for me in *ubuntu 12.04. Many thanks!