Bug 14004 - using xrestop to monitor memory usage and starting glxgears, xorg crashes always
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/General
Version: git
Hardware: x86 (IA32) Linux (All)
Importance: medium normal
Assignee: Xorg Project Team
QA Contact: Xorg Project Team
 
Reported: 2008-01-10 02:16 UTC by Jens Stroebel
Modified: 2008-04-09 06:35 UTC


Attachments
Fix off-by-1 error in ProcXResQueryClients() (418 bytes, patch)
2008-04-09 03:36 UTC, Michel Dänzer

Description Jens Stroebel 2008-01-10 02:16:12 UTC
With xorg from git (2008-01-07), when I use xrestop to monitor memory usage and start glxgears, xorg crashes sometimes/often; starting a second glxgears crashes xorg quite reliably (read: nearly always).

A gdb backtrace of that:
#0  0xb7f68410 in __kernel_vsyscall ()
#1  0xb7b4b7b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b4cfc8 in *__GI_abort () at abort.c:88
#3  0xb7b8136b in __libc_message (do_abort=2, fmt=0xb7c34888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7b86fe0 in malloc_printerr (action=2, str=0xb7c31729 "corrupted double-linked list",
    ptr=<value optimized out>) at malloc.c:5758
#5  0xb7b88690 in _int_free (av=0xb7c4b120, mem=0x863c848) at malloc.c:4548
#6  0xb7b88ac9 in *__GI___libc_free (mem=0x863c848) at malloc.c:3541
#7  0x0815c872 in Xfree (ptr=0x863c848) at utils.c:1451
#8  0xb7a9dfcf in ProcXResQueryClients (client=0x8724598) at xres.c:105
#9  0xb7a9e971 in ProcResDispatch (client=0x8724598) at xres.c:316
#10 0x08187962 in XaceCatchExtProc (client=0x8724598) at xace.c:307
#11 0x08083750 in Dispatch () at dispatch.c:467
#12 0x0806b8a8 in main (argc=5, argv=0xbf8d5974, envp=0xbf8d598c) at main.c:448
Comment 1 Jens Stroebel 2008-01-10 06:51:04 UTC
(In reply to comment #0)
> With xorg from git (2008-01-07), when I use xrestop to monitor memory usage and
> start glxgears, xorg crashes sometimes/often; starting a second glxgears
> crashes xorg quite reliably (read= nearly always).

For what it's worth:
While toying around with this bug, xorg crashed on me with nothing but two xterms running, as soon as I started xrestop in one of them.

gdb backtrace:
#0  0xb7f57410 in __kernel_vsyscall ()
#1  0xb7b3a7b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b3bfc8 in *__GI_abort () at abort.c:88
#3  0xb7b7036b in __libc_message (do_abort=2, fmt=0xb7c23888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7b75fe0 in malloc_printerr (action=2, str=0xb7c20729 "corrupted double-linked list",
    ptr=<value optimized out>) at malloc.c:5758
#5  0xb7b7609a in malloc_consolidate (av=0xb7c3a120) at malloc.c:4714
#6  0xb7b7806a in _int_malloc (av=0xb7c3a120, bytes=608) at malloc.c:4081
#7  0xb7b7951e in *__GI___libc_malloc (bytes=608) at malloc.c:3468
#8  0x0815c627 in Xalloc (amount=608) at utils.c:1332
#9  0x080847cd in ProcQueryTree (client=0x86195a0) at dispatch.c:892
#10 0x08187889 in XaceCatchDispatchProc (client=0x86195a0) at xace.c:285
#11 0x08083750 in Dispatch () at dispatch.c:467
#12 0x0806b8a8 in main (argc=5, argv=0xbffcc624, envp=0xbffcc63c) at main.c:448
Comment 2 Jens Stroebel 2008-01-30 04:42:50 UTC
With all parts of xorg from git (libs, xserver, drm, drm kernel modules, pixman, mesa, drivers) of 2008-01-29, the described effect still persists.
Comment 3 Jens Stroebel 2008-02-13 03:29:00 UTC
(In reply to comment #2)
> With all parts of xorg from git (libs, xserver, drm, drm kernel modules,
> pixman, mesa, drivers) of 2008-02-12, the described effect still persists.

But the crash happens reliably after a couple of seconds and the gdb backtrace looks different now:

#0  0xb7f2d410 in __kernel_vsyscall ()
#1  0xb7b057b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b06fc8 in *__GI_abort () at abort.c:88
#3  0xb7b3b36b in __libc_message (do_abort=2,
    fmt=0xb7bee888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7b40fe0 in malloc_printerr (action=2,
    str=0xb7bee8f8 "free(): invalid next size (normal)", ptr=<value optimized out>)
    at malloc.c:5758
#5  0xb7b42ac9 in *__GI___libc_free (mem=0x85ef180) at malloc.c:3541
#6  0x0815cfe2 in Xfree (ptr=0x85ef180) at utils.c:1451
#7  0x08141433 in miRegionValidate (badreg=0x8695398, pOverlap=0xbf8252e0) at miregion.c:1405
#8  0x0814170e in miRectsToRegion (nrects=2, prect=0x868aeb0, ctype=6) at miregion.c:1484
#9  0xb784e573 in exaPolyFillRect (pDrawable=0x8686000, pGC=0x8692f78, nrect=2,
    prect=0x868aea0) at exa_accel.c:777
#10 0xb784e1b7 in exaPolylines (pDrawable=0x8686000, pGC=0x8692f78, mode=0, npt=3,
    ppt=0x8629934) at exa_accel.c:694
#11 0x081b01da in damagePolylines (pDrawable=0x8686000, pGC=0x8692f78, mode=0, npt=3,
    ppt=0x8629934) at damage.c:993
#12 0x0808755e in ProcPolyLine (client=0x86a3340) at dispatch.c:1913
#13 0x080837e9 in Dispatch () at dispatch.c:468
#14 0x0806b918 in main (argc=5, argv=0xbf825614, envp=0xbf82562c) at main.c:448
Comment 4 Michel Dänzer 2008-02-13 04:07:39 UTC
(In reply to comment #3)
> But the crash happens reliably after a couple of seconds and the gdb backtrace
> looks different now:

[...]

> #4  0xb7b40fe0 in malloc_printerr (action=2,
>     str=0xb7bee8f8 "free(): invalid next size (normal)", ptr=<value optimized out>)
>     at malloc.c:5758
> #5  0xb7b42ac9 in *__GI___libc_free (mem=0x85ef180) at malloc.c:3541
> #6  0x0815cfe2 in Xfree (ptr=0x85ef180) at utils.c:1451
> #7  0x08141433 in miRegionValidate (badreg=0x8695398, pOverlap=0xbf8252e0) at miregion.c:1405
> #8  0x0814170e in miRectsToRegion (nrects=2, prect=0x868aeb0, ctype=6) at miregion.c:1484
> #9  0xb784e573 in exaPolyFillRect (pDrawable=0x8686000, pGC=0x8692f78, nrect=2,
>     prect=0x868aea0) at exa_accel.c:777

Hmm... does it also happen with XAA instead of EXA?

Although, it looks like there's just memory corruption occurring somewhere, not necessarily related to the places where it's being detected... running the X server in valgrind might be useful.
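
To illustrate why heap corruption tends to be reported far from where it happens, here is a minimal sketch in C. It does not reproduce glibc's allocator; it only mimics the idea that bookkeeping data sits right behind a block and is inspected later, not at the moment of the stray write. The names guarded_alloc, guard_intact, GUARD_LEN and GUARD_BYTE are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8     /* hypothetical: number of guard bytes after the block */
#define GUARD_BYTE 0xA5  /* hypothetical: pattern stored in the guard bytes */

/* Allocate n usable bytes followed by GUARD_LEN guard bytes.
 * The guard stands in for allocator bookkeeping behind the block. */
static unsigned char *guarded_alloc(size_t n)
{
    unsigned char *p = malloc(n + GUARD_LEN);
    if (p != NULL)
        memset(p + n, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return 1 if the guard behind the n usable bytes is intact, 0 otherwise.
 * A real allocator performs comparable consistency checks only when it
 * next touches the block, e.g. in a later free() or malloc(). */
static int guard_intact(const unsigned char *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < GUARD_LEN; i++)
        if (p[n + i] != GUARD_BYTE)
            return 0;
    return 1;
}
```

A write to index n of a block returned by guarded_alloc(n) goes unnoticed until guard_intact() is called, which may be in entirely unrelated code. That would be consistent with the aborts in this report showing up in different call paths.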
Comment 5 Jens Stroebel 2008-02-13 04:52:38 UTC
(In reply to comment #4)
> Hmm... does it also happen with XAA instead of EXA?

Unfortunately, XAA broke again somewhere along the way...
I had to reopen bug #12922 because of that.

> Although, it looks like there's just memory corruption occurring somewhere, not
> necessarily related to the places where it's being detected... running the X
> server in valgrind might be useful.

This will have to wait a little. I hope I'll find the time tomorrow. 

Comment 6 Jens Stroebel 2008-02-14 06:29:25 UTC
(In reply to comment #4)
  [...] 
> Although, it looks like there's just memory corruption occurring somewhere, not
> necessarily related to the places where it's being detected... running the X
> server in valgrind might be useful.

I tried that today, but unfortunately valgrind exits before the server has actually started, with:

valgrind: m_syswrap/syswrap-generic.c:1722 (vgModuleLocal_generic_POST_sys_shmdt): Assertion 's->kind == SkShmC' failed

So ... 

As the bug doesn't happen with what we actually use on/in bclinux now
  (server-1.4-branch, mesa_7_0_branch, xf86-video-intel-2.2-branch)
I can live with it. Sorry we couldn't help here.
Comment 7 Jens Stroebel 2008-02-27 08:12:47 UTC
new backtrace with xorg from git (2008-02-27, all parts, including mesa+drm-kernel-modules):

#0  0xb7f3d410 in __kernel_vsyscall ()
#1  0xb7b157b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b16fc8 in *__GI_abort () at abort.c:88
#3  0xb7b4b36b in __libc_message (do_abort=2,
    fmt=0xb7bfe888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7b50fe0 in malloc_printerr (action=2, str=0xb7bfb729 "corrupted double-linked list",
    ptr=<value optimized out>) at malloc.c:5758
#5  0xb7b52690 in _int_free (av=0xb7c15120, mem=0x86e2a88) at malloc.c:4548
#6  0xb7b52ac9 in *__GI___libc_free (mem=0x86e2a88) at malloc.c:3541
#7  0x0815c2d2 in Xfree (ptr=0x86e2a88) at utils.c:1451
#8  0xb7a67f17 in ProcXResQueryClients (client=0x8745080) at xres.c:105
#9  0xb7a688b9 in ProcResDispatch (client=0x8745080) at xres.c:316
#10 0x08083586 in Dispatch () at dispatch.c:469
#11 0x0806b686 in main (argc=5, argv=0xbfbc11b4, envp=0xbfbc11cc) at main.c:439
Comment 8 Jens Stroebel 2008-03-04 05:50:00 UTC
The torture never stops... errrr I mean:
The "corrupted double-linked list" abort is gone, but now there's something new.

new backtrace with xorg from git (2008-03-04, all parts, including
mesa+drm-kernel-modules):

#0  0xb7fbd410 in __kernel_vsyscall ()
#1  0xb7b957b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b96fc8 in *__GI_abort () at abort.c:88
#3  0xb7bcb36b in __libc_message (do_abort=2,
    fmt=0xb7c7e888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7bd0fe0 in malloc_printerr (action=2,
    str=0xb7c7e8f8 "free(): invalid next size (normal)", ptr=<value optimized out>)
    at malloc.c:5758
#5  0xb7bd2ac9 in *__GI___libc_free (mem=0x867ea30) at malloc.c:3541
#6  0x0815c4d2 in Xfree (ptr=0x867ea30) at utils.c:1451
#7  0x08191921 in FreePicture (value=0x867ea30, pid=0) at picture.c:1531
#8  0x0818b354 in miGlyphs (op=3 '\003', pSrc=0x8757800, pDst=0x86179e8,
    maskFormat=0x82737d0, xSrc=0, ySrc=0, nlist=-1, list=0xbfeb47f4, glyphs=0xbfeb448c)
    at glyph.c:767
#9  0x081ae139 in damageGlyphs (op=3 '\003', pSrc=0x8757800, pDst=0x86179e8,
    maskFormat=0x82737d0, xSrc=0, ySrc=0, nlist=1, list=0xbfeb47e8, glyphs=0xbfeb43e8)
    at damage.c:654
#10 0x0818adb1 in CompositeGlyphs (op=3 '\003', pSrc=0x8757800, pDst=0x86179e8,
    maskFormat=0x82737d0, xSrc=0, ySrc=0, nlist=1, lists=0xbfeb47e8, glyphs=0xbfeb43e8)
    at glyph.c:629
#11 0x08195ce2 in ProcRenderCompositeGlyphs (client=0x860abd8) at render.c:1461
#12 0x0819796e in ProcRenderDispatch (client=0x860abd8) at render.c:2086
#13 0x08083e6d in Dispatch () at dispatch.c:454
#14 0x0806b75b in main (argc=5, argv=0xbfeb4ca4, envp=0xbfeb4cbc) at main.c:441
Comment 9 Jens Stroebel 2008-03-18 06:49:54 UTC
In the mood for another backtrace?

xorg from git, all HEAD, from 2008-03-17, kernel 2.6.23.17, drm-modules from git HEAD:

#0  0xb7fb8410 in __kernel_vsyscall ()
#1  0xb7b907b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b91fc8 in *__GI_abort () at abort.c:88
#3  0xb7bc636b in __libc_message (do_abort=2,
    fmt=0xb7c79888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7bcbfe0 in malloc_printerr (action=2,
    str=0xb7c798b4 "munmap_chunk(): invalid pointer", ptr=<value optimized out>)
    at malloc.c:5758
#5  0x0815cd7a in Xfree (ptr=0x863fcf0) at utils.c:1451
#6  0x0813f1c3 in miRegionDestroy (pReg=0x863fcf0) at miregion.c:261
#7  0x08135c89 in miComputeCompositeClip (pGC=0x8600138, pDrawable=0x86932c8) at migc.c:214
#8  0xb78e9b76 in fbValidateGC (pGC=0x8600138, changes=524556, pDrawable=0x86932c8)
    at fbgc.c:215
#9  0xb78c7462 in exaValidateGC (pGC=0x8600138, changes=524556, pDrawable=0x86932c8)
    at exa.c:606
#10 0x081ae0f1 in damageValidateGC (pGC=0x8600138, changes=524556, pDrawable=0x86932c8)
    at damage.c:445
#11 0x0809dbf3 in ValidateGC (pDraw=0x86932c8, pGC=0x8600138) at gc.c:79
#12 0x08087543 in ProcPolyRectangle (client=0x87087c0) at dispatch.c:1718
#13 0x08083ead in Dispatch () at dispatch.c:454
#14 0x0806b79b in main (argc=5, argv=0xbfb55204, envp=0xbfb5521c) at main.c:441
Comment 10 Jens Stroebel 2008-04-09 03:13:28 UTC
(In reply to comment #9)
> In the mood for another backtrace?

and back to "corrupted double-linked list":

xorg from git, all HEAD, from 2008-04-09, kernel 2.6.23.17, drm-modules from
git HEAD:

###################################################
#0  0xb7fa2410 in __kernel_vsyscall ()
#1  0xb7b807b1 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2  0xb7b81fc8 in *__GI_abort () at abort.c:88
#3  0xb7bb636b in __libc_message (do_abort=2,
    fmt=0xb7c69888 "*** glibc detected *** %s: %s: 0x%s ***\n")
    at ../sysdeps/unix/sysv/linux/libc_fatal.c:170
#4  0xb7bbbfe0 in malloc_printerr (action=2, str=0xb7c66729 "corrupted double-linked list",
    ptr=<value optimized out>) at malloc.c:5758
#5  0xb7bbd690 in _int_free (av=0xb7c80120, mem=0x862da20) at malloc.c:4548
#6  0xb7bbdac9 in *__GI___libc_free (mem=0x862da20) at malloc.c:3541
#7  0x0815d4fa in Xfree (ptr=0x862da20) at utils.c:1458
#8  0xb7ad2f17 in ProcXResQueryClients (client=0x862e2e0) at xres.c:105
#9  0xb7ad38b9 in ProcResDispatch (client=0x862e2e0) at xres.c:316
#10 0x080842fd in Dispatch () at dispatch.c:454
#11 0x0806bbea in main (argc=5, argv=0xbfb14424, envp=0xbfb1443c) at main.c:455
###################################################
Comment 11 Michel Dänzer 2008-04-09 03:36:55 UTC
Created attachment 15780
Fix off-by-1 error in ProcXResQueryClients()

Hmm, looks like there's an off-by-1 error in ProcXResQueryClients()... does this patch help?
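
For readers unfamiliar with this class of bug, here is a minimal sketch in C of an off-by-one allocation. It is not the attached patch and not the actual xres.c code; collect_clients, slots and max_clients are hypothetical names. The assumption is only that a request handler gathers the indices of active clients into a temporary array and frees it afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a client table: slots[i] is non-zero when a
 * client occupies slot i. Stores the indices of occupied slots in a newly
 * allocated array (*out) and returns how many were found, or -1 on
 * allocation failure. The caller frees *out. */
static int collect_clients(const int *slots, int max_clients, int **out)
{
    int count = 0;
    int *ids;

    assert(max_clients > 0);

    /* An off-by-one variant would allocate (max_clients - 1) entries here.
     * The loop below can store up to max_clients indices, so with every
     * slot occupied the last store would land one int past the buffer and
     * could damage the allocator's data behind it. */
    ids = malloc((size_t)max_clients * sizeof *ids);
    if (ids == NULL)
        return -1;

    for (int i = 0; i < max_clients; i++)
        if (slots[i])
            ids[count++] = i;

    *out = ids;
    return count;
}
```

With a buffer that is one entry short, the overflow only occurs when all slots are in use. In a scenario like this one, that would fit crashes becoming more likely as more clients such as glxgears are started.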
Comment 12 Jens Stroebel 2008-04-09 04:24:10 UTC
(In reply to comment #11)
> Created an attachment (id=15780)
> Fix off-by-1 error in ProcXResQueryClients()
> 
> Hmm, looks like there's an off-by-1 error in ProcXResQueryClients()... does
> this patch help?


Yes, it seems that was it. :)

The bug had been there for a while...
Fixed now. Thanks.
Comment 13 Michel Dänzer 2008-04-09 06:35:41 UTC
Pushed to master and server-1.5-branch.

