RHEL4u3's crappy version of Metacity creates a 32-bit parent window for every
top-level window. Start the X server at depth 16 and run xcompmgr -c, metacity,
and xterm. Resize the xterm. Eventually, the server will crash with a
backtrace that looks like this:
Starting program: /home/aaron/X/7.1-root/bin/X -logfile /var/log/Xorg.0.log
-depth 16 -config xorg.conf.nv
X Window System Version 7.0.0
Release Date: 21 December 2005
X Protocol Version 11, Revision 0, Release 7.0
Build Operating System: Linux 2.6.16-gentoo-r7 i686
Current Operating System: Linux tenor 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:54:53
EST 2006 i686
Build Date: 05 July 2006
Before reporting problems, check http://wiki.x.org
to make sure that you have the latest version.
Module Loader present
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(++) Log file: "/var/log/Xorg.0.log", Time: Thu Jul 6 14:09:04 2006
(++) Using config file: "/etc/X11/xorg.conf.nv"
[tcsetpgrp failed in terminal_inferior: Operation not permitted]
(EE) Failed to initialize GLX extension (Compatible NVIDIA X driver not found)
Program received signal SIGABRT, Aborted.
0x0054c7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#0 0x0054c7a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x005907f5 in raise () from /lib/tls/libc.so.6
#2 0x00592199 in abort () from /lib/tls/libc.so.6
#3 0x005c44ea in __libc_message () from /lib/tls/libc.so.6
#4 0x005cc8c2 in _int_realloc () from /lib/tls/libc.so.6
#5 0x005cd786 in realloc () from /lib/tls/libc.so.6
#6 0x081e7679 in Xrealloc (ptr=0x98b5980, amount=48) at utils.c:1457
#7 0x0813f18f in miRegionValidate (badreg=0xbfe666b0, pOverlap=0xbfe6667c)
#8 0x0814446f in miValidateTree (pParent=0x987c4d0, pChild=0x9876918, kind=VTOther)
#9 0x0814b01e in miSlideAndSizeWindow (pWin=0x9876918, x=215, y=329, w=214, h=290,
pSib=0x0) at miwindow.c:764
#10 0x081db368 in compResizeWindow (pWin=0x9876918, x=5, y=21, w=214, h=290,
#11 0x080761fd in ConfigureWindow (pWin=0x9876918, mask=12, vlist=0x986a380,
client=0x986b560) at window.c:2479
#12 0x08084b3c in ProcConfigureWindow (client=0x986b560) at dispatch.c:761
#13 0x0808409c in Dispatch () at dispatch.c:459
#14 0x0806e326 in main (argc=7, argv=0xbfe66e34, envp=0xbfe66e54) at main.c:447
Tested with both the nv and nvidia drivers on X.org 6.8.2, 7.1, and git commit
Sorry about the phenomenal bug spam, guys. Adding xorg-team@ to the QA contact so bugs don't get lost in future.
Move to 7.3 tracker.
The issue appears to be caused by compNewPixmap. I was under the impression that this function copied the contents of the old pixmap into the new one. However, what it actually does is copy the contents of the *parent window* into the new pixmap. When the parent window is depth 32 and the child pixmap is depth 16, this overruns the pixmap by a factor of 2.
Created attachment 11157 [details] [review]
I was wrong: It ends up in fb24_32CopyMtoN so it overruns the destination pixmap by a factor of 3/2 rather than 2.
Anyone got any objections to an embargo until August 29th? Might as well do it at the same time.
And here I was thinking I was getting more competent with Bugzilla and security.
Comment on attachment 11157 [details] [review]
That patch is totally busted. I don't know what I was thinking. I'll fix it tomorrow.
Basically, the copyarea just has to be replaced with a composite, right?
That would work, though it would be slower.
Created attachment 11368 [details] [review]
Use Composite when depths don't match
We'll use Composite when the depths don't match. This patch seems obvious enough. I haven't tested it myself though.
That patch works for me.
well, the patch is now in git, so i guess we should drop any pretense of secrecy.
does anyone feel like backporting the fix?
i'm having what i believe to be the same problem, caught internally (we use a 16-bit server):
STEPS TO REPRODUCE
launch xcompmgr &
run attached testcase and provide it '-f' as parameter.
------- Comment #5 From Stone Daniel (Nokia) 2007-08-17 09:49 [reply] -------
Strange, this doesn't happen in Xephyr.
------- Comment #6 From Palli Tapani (Nokia) 2007-08-17 10:24 [reply] -------
(In reply to comment #5)
> Strange, this doesn't happen in Xephyr.
Yep, does not happen in Xephyr for me aswell :/ ... please try on target.
------- Comment #8 From Tamminen Eero (Nokia) 2007-08-17 11:22 [reply] -------
(In reply to comment #6)
> > Strange, this doesn't happen in Xephyr.
> Yep, does not happen in Xephyr for me aswell :/
If you Valgrind Xephyr, do you get any errors?
(the error could be there, but causing crash only on arm)
------- Comment #9 From Stone Daniel (Nokia) 2007-08-17 13:43 [reply] -------
This one's not going to be much fun. We die inside ValidateGC (specifically,
the composite wrapper), despite the ShmPutImage arguments being exactly the
same. When setup comes, we do everything right:
Breakpoint 2, cwSetWindowPixmap (pWindow=0x2be9a0, pPixmap=0x2a7ef8)
610 in ../../../miext/cw/cw.c
(gdb) print pWindow->drawable.id
$4 = 4194316 [note: this is 0x40000c]
But by the time we come down into ShmPutImage, we've lost out badly:
shm put image: format 2, depth 16, drawable 40000c, gc 40000d, shmseg 40000e,
offset (nil), width 800, height 480, srcx 0, srcy 0, srcw 800, srch 480
Program received signal SIGSEGV, Segmentation fault.
0x001cc390 in cwGetBackingDrawable (pDrawable=0x2be9a0, x_off=0xbeb97f98,
y_off=0xbeb97f94) at ../../../miext/cw/cw.c:96
96 in ../../../miext/cw/cw.c
(gdb) print pDrawable.id
$9 = 4194316
(gdb) print ((WindowPtr) pDrawable)->devPrivates[cwWindowIndex].ptr
$28 = (pointer) 0x404ce008
(Note that x_off and y_off are pointers for return values, not offsets outside
the co-ordinate limits.)
------- Comment #11 From Stone Daniel (Nokia) 2007-09-03 12:56 [reply] -------
==17151== Conditional jump or move depends on uninitialised value(s)
==17151== at 0x403D64F: pixman_region_translate (pixman-region.c:2017)
==17151== by 0x80D556A: miTranslateRegion (miregion.c:1586)
==17151== by 0x8388002: compSetRedirectBorderClip (compwindow.c:647)
==17151== by 0x80D9BCC: miComputeClips (mivaltree.c:258)
==17151== by 0x80DA67F: miValidateTree (mivaltree.c:770)
==17151== by 0x80869E4: MapWindow (window.c:2841)
==17151== by 0x8096975: ProcMapWindow (dispatch.c:740)
==17151== by 0x80F8440: XaceCatchDispatchProc (xace.c:281)
==17151== by 0x809712E: Dispatch (dispatch.c:502)
==17151== by 0x807F2E4: main (main.c:452)
only happens with testcase -f, not testcase. (note that the offending code is
in miTranslateRegion in our codebase, not pixman_translate_region.)
Created attachment 11397 [details]
testcase for crasher
Running with no arguments and xcompmgr running on a 16bpp server is fine, but using -f causes the server to die.
cherry-picked to server-1.4
daniels, I failed to reproduce your issue, with or without the cherry-pick, with a 16-bit XAA server.
This has been assigned CVE-2007-4730, according to http://lists.debian.org/debian-security-announce/debian-security-announce-2007/msg00134.html
I realize this bug is fixed, but I have to ask if there could be another way to close the security hole. As it is now, what Aaron Plattner said "That would work, though it would be slower." is far beyond true. It's to the point where it's impractical to use any composite aware terminal with compositing on.
Is there any way to avoid using compiz for the case of differing depths, or possibly to set the default depth for all windows to be 32 when compiz is active even if the window is not using all 32 bits?
If need be I can provide more information, but the heart of it is that I remember a time when a composited terminal was faster than non, but now my newer system with much faster graphics (geforce 8800gt vs geforce fx5200) can't display a terminal emulator without lagging... doesn't seem to sit right somehow.
I'd be happy to work on developing a patch, but I'm admittedly new to X11 development, so I wanted to ask you, the experts, whether there's something which can be done before I sink time into it.