Summary: | pixman crashes on bad trapezoids | ||
---|---|---|---|
Product: | pixman | Reporter: | Marcin Slusarz <marcin.slusarz> |
Component: | pixman | Assignee: | Søren Sandmann Pedersen <soren.sandmann> |
Status: | RESOLVED FIXED | QA Contact: | Søren Sandmann Pedersen <soren.sandmann> |
Severity: | critical | ||
Priority: | medium | CC: | fleming, leon+freedesktop, peak |
Version: | git master | ||
Hardware: | All | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
test case
wrong but good enough patch
Minimal crasher using pixman only |
Description
Marcin Slusarz
2008-06-29 06:55:21 UTC
render/picture.c:1784 is:

    PictureScreenPtr ps = GetPictureScreen(pDst->pDrawable->pScreen);

Which... really shouldn't ever segfault. Drawables can't be created without reference to a screen, but pictures can have a null drawable if they're source-only pictures like gradients. However, it's not supposed to be legal to render to those, and besides, ProcRenderTrapezoids already checks for that.

I can't reproduce this on server 1.5 with radeon and exa. exa doesn't invoke cw, but cw should have no effect here since pDst is the same in both CompositePicture and miTrapezoids.

Now it crashes differently:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7f3becf2f6f0 (LWP 12047)]
    0x0000000000512cc8 in ValidateOnePicture (pPicture=0xe3b1a0) at picture.c:1606
    1606        (*ps->ValidatePicture) (pPicture, pPicture->stateChanges);
    (gdb) bt
    #0  0x0000000000512cc8 in ValidateOnePicture (pPicture=0xe3b1a0) at picture.c:1606
    #1  0x0000000000512cf9 in ValidatePicture (pPicture=0xe3b1a0) at picture.c:1615
    #2  0x0000000000512dbd in CompositePicture (op=12 '\f', pSrc=0xe6c7f0, pMask=0xe3b1a0, pDst=0xe47c50, xSrc=0, ySrc=0, xMask=0, yMask=0, xDst=0, yDst=0, width=1, height=1) at picture.c:1788
    #3  0x0000000000512386 in miTrapezoids (op=12 '\f', pSrc=0xe6c7f0, pDst=0xe47c50, maskFormat=0xc37508, xSrc=0, ySrc=0, ntrap=0, traps=0xe6dd64) at mitrap.c:174
    #4  0x000000000052f1fd in cwTrapezoids (op=12 '\f', pSrcPicture=<value optimized out>, pDstPicture=0xe47c50, maskFormat=0xc37508, xSrc=0, ySrc=0, ntrap=3, traps=0xe6dcec) at cw_render.c:365
    #5  0x000000000051b316 in ProcRenderTrapezoids (client=0xc8ce30) at render.c:818
    #6  0x000000000044ebca in Dispatch () at dispatch.c:457
    #7  0x0000000000437d5b in main (argc=1, argv=0x7ffff5064828, envp=<value optimized out>) at main.c:445
    (gdb)

It looks like pScreen is garbage...
just to confirm that:

    (gdb) up
    #1  0x0000000000512cf9 in ValidatePicture (pPicture=0x230d080) at picture.c:1615
    1615        ValidateOnePicture (pPicture);
    (gdb) up
    #2  0x0000000000512dbd in CompositePicture (op=12 '\f', pSrc=0x230d950, pMask=0x230d080, pDst=0x230cfb0, xSrc=0, ySrc=0, xMask=0, yMask=0, xDst=0, yDst=0, width=1, height=1) at picture.c:1788
    1788        ValidatePicture (pMask);
    (gdb) print pSrc->pDrawable
    $1 = (DrawablePtr) 0x22f05a0
    (gdb) print pDst->pDrawable
    $2 = (DrawablePtr) 0x232b5b0
    (gdb) print pMask->pDrawable
    $3 = (DrawablePtr) 0x22c47b0
    (gdb) print pSrc->pDrawable->pScreen
    $4 = (ScreenPtr) 0x2058090
    (gdb) print pDst->pDrawable->pScreen
    $5 = (ScreenPtr) 0x2058090
    (gdb) print pMask->pDrawable->pScreen
    $6 = (ScreenPtr) 0xff020580ff
    (gdb)

I get the same thing on Debian etch.

Firefox: 3.0.2, downloaded from mozilla.com
gtk+: 2.10.14, downloaded from gtk.org, compiled, and installed into a non-system directory. The Firefox from mozilla.com requires a GTK+ newer than the one in Debian Etch.
xserver-xephyr: 1.1.1-21etch5, though recompiled to disable optimizations and enable debug symbols
everything else: current stock Debian etch

I am able to segfault Xephyr, Xnest, and the main xserver-org (running with the closed-source nvidia driver). I loaded the Xephyr core dump into ddd, and it indicates failure on line 1598 of xorg-server-1.1.1/render/picture.c. I'm viewing the Special:Allmessages page of my intranet mediawiki installation to trigger Xephyr to crash. Previously Xnest and the main xserver-org servers were crashing when I would close Firefox tabs. Is there something more I can do to help debug this? Thanks.

another webpage which crashes my xserver: http://www.alsa-project.org/main/index.php/Changes_v1.0.15_v1.0.16rc1_detail

I'm able to work around this by using Xvnc4.

I've seen the problem too.
X Window System Version 1.3.0
Release Date: 19 April 2007
X Protocol Version 11, Revision 0, Release 1.3
Current Operating System: Linux thinkpad 2.6.25-gentoo-r7 #12 PREEMPT Wed Sep 24 16:55:00 OMSST 2008 i686
Build Date: 07 August 2008
(II) Loading /usr/lib/xorg/modules/drivers//radeon_drv.so
(II) Module radeon: vendor="X.Org Foundation"
    compiled for 1.3.0, module version = 4.3.0
    Module class: X.Org Video Driver
    ABI class: X.Org Video Driver, version 1.2
from x11-drivers/xf86-video-ati-6.9.0

Browser: www-client/mozilla-firefox-bin-3.0.3
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.3) Gecko/2008092416 Firefox/3.0.3

I'm using x86 processor
model name : Intel(R) Pentium(R) M processor 1600MHz
flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 tm pbe bts est tm2
and most of my software is built with CFLAGS="-O2 -march=pentium-m -pipe"

VGA compatible controller: ATI Technologies Inc Radeon Mobility M7 LW [Radeon Mobility 7500]

Last lines from my Xorg.0.log are:
------8(------8(------8(------8(------8(------
Backtrace:
0: /usr/bin/X(xf86SigHandler+0x84) [0x80e3e84]
1: [0xb7f59400]
2: /usr/bin/X(ValidatePicture+0x11) [0x815d481]
3: /usr/bin/X(CompositePicture+0x8c) [0x815d53c]
4: /usr/bin/X(miTrapezoids+0x233) [0x815ca13]
5: /usr/bin/X(CompositeTrapezoids+0x93) [0x815d893]
6: /usr/bin/X [0x8166164]
7: /usr/bin/X [0x81607c5]
8: /usr/bin/X [0x8153c1e]
9: /usr/bin/X(Dispatch+0x1ab) [0x808cfcb]
10: /usr/bin/X(main+0x488) [0x8074d88]
11: /lib/libc.so.6(__libc_start_main+0xdc) [0xb7cd4fdc]
12: /usr/bin/X(FontFileCompleteXLFD+0x1f5) [0x80740b1]

Fatal server error: Caught signal 11.
Server aborting
(II) AIGLX: Suspending AIGLX clients for VT switch
(II) RADEON(0): RADEONRestoreMemMapRegisters() :
(II) RADEON(0): MC_FB_LOCATION : 0x1fff0000 0xe3ffe000
(II) RADEON(0): MC_AGP_LOCATION : 0x27ff2000
finished PLL2
Entering Restore TV
Restore TV PLL
Restore TVHV
Restore TV Restarts
Restore Timing Tables
Restore TV standard
Leaving Restore TV
------8(------8(------8(------8(------8(------

I could not reproduce the bug by visiting the same page again, but I could reproduce it with http://en.wikibooks.org/wiki/X86_Assembly/Print_Version; the backtrace is the same.

My xserver also crashes on some sites, using:

Gentoo Linux x86
mozilla-firefox-3.0.5, xulrunner-1.9.0.5
xorg-server-1.3.0
xf86-video-ati-6.6.3
Chipset ATI Radeon Mobility 9200 (M9+) 5C61 (AGP) found
gcc-4.1.2, glibc-2.6.1, gtk+-2.12.11, cairo-1.6.4, pango-1.20.5

my crash-site: http://www.wikileaks.net/wiki/Denmark:_3863_sites_on_censorship_list,_Feb_2008

X Window System Version 1.3.0
Release Date: 19 April 2007
X Protocol Version 11, Revision 0, Release 1.3
Build Operating System: UNKNOWN
Current Operating System: Linux 2.6.28 #3 Sat Dec 27 15:25:53 CET 2008 i686
Build Date: 28 October 2008

Backtrace:
0: X(xf86SigHandler+0x81) [0x80dec31]
1: [0xb7fa4400]
2: X(miTrapezoids+0x233) [0x815d8a3]
3: X(CompositeTrapezoids+0x93) [0x815e743]
4: X [0x816700f]
5: X [0x8161685]
6: X [0x8154aae]
7: X(Dispatch+0x19d) [0x808ee4d]
8: X(main+0x48e) [0x8076f1e]
9: /lib/libc.so.6(__libc_start_main+0xdc) [0xb7d1dfdc]
10: X(BitOrderInvert+0xbd) [0x8076241]

Fatal server error: Caught signal 11. Server aborting

Before the site is fully loaded and displayed, the xserver freezes.

I encountered a very similar bug (it crashed in the same place and pointers were mangled in the same way) in an older version of Xorg (1.1.1 shipped with CentOS 5). I discovered the problem occurs when fbRasterizeEdges() is called with a negative (and very large) value of parameter "t".
The function draws outside the allocated buffer ("buf") and causes a lot of collateral damage:

    (gdb) bt 3
    #0  fbRasterizeEdges (buf=0x8f81558, bpp=8, width=1, stride=1, l=0xbf83d314, r=0xbf83d2ec, t=-2147481464, b=63350) at fbedge.c:301
        [note the negative value of t here]
    #1  0x0047a7af in fbRasterizeTrapezoid (pPicture=0x8f81560, trap=0xae7c1cfc, x_off=<value optimized out>, y_off=0) at fbtrap.c:143
    #2  0x08146d3a in miTrapezoids (op=12 '\f', pSrc=0x8f90220, pDst=0x8f81488, maskFormat=<value optimized out>, xSrc=0, ySrc=0, ntrap=1, traps=0xae7c1d24) at mitrap.c:171

The source of the negative values of "t" is RenderSampleCeilY() called from fbRasterizeTrapezoid(). Something overflows and yields a negative result when it is called with a large positive value of its first parameter "y" (afaik >= 2147481463), and this happens when the client asks the server to draw a strange trapezoid very close to the edge of the coordinate space:

    (gdb) bt 1
    #0  fbRasterizeTrapezoid (pPicture=0x94d5dc8, trap=0x9509b1c, x_off=0, y_off=0) at fbtrap.c:137
    (gdb) p *trap
    $36 = {top = 2147483647, bottom = 2147483647, left = {p1 = {x = 0, y = 0}, p2 = {x = 0, y = 2147483647}}, right = {p1 = {x = 65536, y = 2147483647}, p2 = {x = 0, y = 2147483647}}}
    (gdb) p t
    $37 = -2147481464
    (gdb) print RenderSampleCeilY(2147483647, 8)
    $38 = -2147481464

The current code uses pixman_rasterize_edges() instead of fbRasterizeEdges(), pixman_rasterize_trapezoid() instead of fbRasterizeTrapezoid() and pixman_sample_ceil_y() instead of RenderSampleCeilY(), but the code looks almost identical and I have verified that pixman_sample_ceil_y() can return a negative result.

Created attachment 22507 [details]
test case
$ gcc trapezoid_of_death.c -lX11 -lXext -lXrender
$ DISPLAY=[the display you want to kill] ./a.out
(It is necessary to send two trapezoids, one with saner top/bottom, to get past
a check in miTrapezoids().)
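
For anyone who wants to see the arithmetic without crashing a server: below is a minimal sketch of the wraparound described in the fbRasterizeEdges analysis. It is a reconstruction, not the server's or pixman's source. The names (fixed_t, sample_ceil_y_wrapping and the macros) are hypothetical, and the sampling-grid constants are an assumption chosen because they reproduce the b=63350 and t=-2147481464 values from the gdb session. The carry is done in uint32_t so the wraparound is well defined; the final conversion back to a signed value assumes the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* 16.16 fixed point. All names here are hypothetical, not pixman's API. */
typedef int32_t fixed_t;
#define FIXED_1          ((fixed_t) 0x10000)

/* Assumed sampling grid for an n-bit mask (n = 8 gives 15 rows per pixel). */
#define N_Y_FRAC(n)      ((n) == 1 ? 1 : (1 << ((n) / 2)) - 1)
#define STEP_Y_SMALL(n)  (FIXED_1 / N_Y_FRAC(n))
#define Y_FRAC_FIRST(n)  (STEP_Y_SMALL(n) / 2)
#define Y_FRAC_LAST(n)   (Y_FRAC_FIRST(n) + (N_Y_FRAC(n) - 1) * STEP_Y_SMALL(n))

/* Round y up to the next sample row, with no guard on the integer carry. */
static fixed_t sample_ceil_y_wrapping(fixed_t y, int n)
{
    assert(n >= 1 && n <= 8);

    uint32_t f = (uint32_t) y & 0xffffu;      /* fractional part */
    uint32_t i = (uint32_t) y & 0xffff0000u;  /* integer part    */

    /* Snap the fraction up to the nearest sample row. */
    f = ((f + (uint32_t) Y_FRAC_FIRST(n)) / (uint32_t) STEP_Y_SMALL(n))
        * (uint32_t) STEP_Y_SMALL(n) + (uint32_t) Y_FRAC_FIRST(n);

    if (f > (uint32_t) Y_FRAC_LAST(n)) {
        /* Past the last row of this pixel: move to the first row of the next. */
        f = (uint32_t) Y_FRAC_FIRST(n);
        i += (uint32_t) FIXED_1;   /* 0x7fff0000 wraps to 0x80000000 here */
    }
    return (fixed_t) (i | f);
}
```

With these assumed constants, a y of 2147483647 at 8 bits comes out as -2147481464, and the first input that wraps is 2147481463, which agrees with the threshold quoted above.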
I have the freeze with Firefox too. The only thing I can do is a manual reboot. This happens every time I visit this site: www.labmacs.diiga.univpm.it and click on "Courses". I discovered that, unfortunately, I have the same problem when I try to open a Powerpoint presentation with Openoffice 3. I don't know where to retrieve useful information; if I have to look for something, just ask me. I have an ATI Radeon Mobility U1 as a video card. The bug is related to this one in Red Hat's Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=480349 submitted by me and closed upstream.

Program versions:
Fedora 10
firefox-3.0.7-1.fc10.i386
openoffice.org-impress-3.0.1-15.2.fc10.i386
xorg-x11-drv-ati-6.10.0-2.fc10.i386
xorg-x11-server-Xorg-1.5.3-13.fc10.i386

I've just found the presentation on a public site: http://www.dascaricare.net/file-pps/?nav=display&file=161 so you can enjoy freezing your PC with me! I don't know if the Openoffice bug is related to this Firefox one, but the symptom is the same one! Of course, if you want I can split my report into the Firefox part, to keep here, and the Openoffice part, to put into a new bug report.

I forgot: I have the freeze when viewing the presentation in fullscreen mode (I mean by pressing the F5 button once the file is loaded).

Sorry for my 3200 posts in an hour, but I forgot another important thing: I can't reproduce the bug with the sites provided by the other reporters. If I have to report it as a new bug, just say so.

I haven't seen this option... Anyway, it's disgraceful that an error of such importance hasn't been fixed since June! I must use Windows until this one has been fixed! Could you fix it before I go to the Apple store, please?

Does happen with the latest update: xorg-x11-server-Xorg-1.5.3-15.fc10.i386

Emanuele: I think your freezes are a different issue. I cannot crash or freeze my xserver with the address you provided, and the ppt file does not crash it either.
I confirm crashes with the address from comment 8 and with the attached test case. (I'm still on xorg-server-1.3.0.0.) My friend confirmed that the attached test case crashes xorg-server-1.5.3.

Thank you, I moved my bug here: http://bugs.freedesktop.org/show_bug.cgi?id=20682 Moreover I was wrong about this bug's severity (I'm new to this Bugzilla), sorry.

Created attachment 25044 [details] [review]
wrong but good enough patch

Clamps the span walk to saturate rather than overflow. Clearly not correct, but stops the crash here. Tested with Xvfb.

Created attachment 25045 [details]
Minimal crasher using pixman only
I applied the patch. It will only trigger for traps that fall entirely after the last subpixel sample row (or before the first), so they don't get rendered at all, which means any slope (or other) inaccuracy is irrelevant.

e483af47db769fcba559dda72699bc80d154b575
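
To illustrate what "saturate rather than overflow" can look like for the ceiling rounding, here is a minimal sketch. It is not the attached patch or the committed change. The names (fixed_t, sample_ceil_y_saturating and the macros) are hypothetical, and the sampling-grid constants are an assumption. The idea is only that when the carry would push the integer part past 0x7fff, the result is pinned to the largest representable coordinate instead of wrapping negative. Such a trapezoid lies past the last sample row and is never rendered, so the inexact value should not matter.

```c
#include <assert.h>
#include <stdint.h>

/* 16.16 fixed point. All names here are hypothetical, not pixman's API. */
typedef int32_t fixed_t;
#define FIXED_1          ((fixed_t) 0x10000)

/* Assumed sampling grid for an n-bit mask (n = 8 gives 15 rows per pixel). */
#define N_Y_FRAC(n)      ((n) == 1 ? 1 : (1 << ((n) / 2)) - 1)
#define STEP_Y_SMALL(n)  (FIXED_1 / N_Y_FRAC(n))
#define Y_FRAC_FIRST(n)  (STEP_Y_SMALL(n) / 2)
#define Y_FRAC_LAST(n)   (Y_FRAC_FIRST(n) + (N_Y_FRAC(n) - 1) * STEP_Y_SMALL(n))

/* Round y up to the next sample row, saturating at the top of the range. */
static fixed_t sample_ceil_y_saturating(fixed_t y, int n)
{
    assert(n >= 1 && n <= 8);

    uint32_t f = (uint32_t) y & 0xffffu;      /* fractional part */
    uint32_t i = (uint32_t) y & 0xffff0000u;  /* integer part    */

    /* Snap the fraction up to the nearest sample row. */
    f = ((f + (uint32_t) Y_FRAC_FIRST(n)) / (uint32_t) STEP_Y_SMALL(n))
        * (uint32_t) STEP_Y_SMALL(n) + (uint32_t) Y_FRAC_FIRST(n);

    if (f > (uint32_t) Y_FRAC_LAST(n)) {
        if (i == 0x7fff0000u) {
            /* No next pixel row exists: pin to the largest coordinate. */
            f = 0xffffu;
        } else {
            /* Ordinary case: first sample row of the next pixel. */
            f = (uint32_t) Y_FRAC_FIRST(n);
            i += (uint32_t) FIXED_1;
        }
    }
    return (fixed_t) (i | f);
}
```

Under these assumptions the trapezoid from the gdb session (top = bottom = 2147483647) stays at 2147483647 instead of turning into a large negative row index, while ordinary inputs round exactly as before.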