Bug 24274

Summary: reproducible X crash in uxa_check_poly_lines+0x138
Product: xorg Reporter: Luka Renko <lure>
Component: Server/GeneralAssignee: Xorg Project Team <xorg-team>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: major    
Priority: medium CC: 2, przanoni, soren.sandmann
Version: 7.7 (2012.06)Keywords: patch
Hardware: x86 (IA32)   
OS: Linux (All)   
URL: https://launchpad.net/bugs/415357
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 44202    
Attachments:
Description Flags
Xorg.0.log with backrace
none
Exercise fbBresSolid()
none
Assert that the read/write pointer is within bounds
none
gdb backtrace
none
reorder Bresenham error correction to avoid overshoot.
none
A backtrace of a crash caused by this problem none

Description Luka Renko 2009-10-02 05:01:34 UTC
Created attachment 29995 [details]
Xorg.0.log with backrace

HW: ThinkPad X200s with gm45
SW: Kubuntu Karmic up-to-date + update mesa & intel 2.9.0 driver from x-updates PPA

I would like to report reproducible crash from Ubuntu bug 415357 (there is similar crash in bug 416421, but w/o reproduction instructions):
https://launchpad.net/bugs/415357
https://launchpad.net/bugs/416421

Bug is reproducible by performing the following steps:
1. Open Kolourpaint (KDE drawing program), and select the pen tool.
2. Start drawing some doodle by pressing the left button of your mouse
   _without releasing it_ for at least 30 secs (keep the cursor moving).
3. X crashes.

In Xorg.0.log, I get the following backtrace:

0: /usr/bin/X(xorg_backtrace+0x26) [0x4f0136]
1: /usr/bin/X(xf86SigHandler+0x41) [0x4850f1]
2: /lib/libc.so.6 [0x7ffcbef32530]
3: /usr/lib/xorg/modules//libfb.so(fbBresSolid+0x1d6) [0x7ffcbcced336]
4: /usr/lib/xorg/modules//libfb.so(fbSegment+0x282) [0x7ffcbccec4d2]
5: /usr/lib/xorg/modules//libfb.so(fbZeroLine+0xfd) [0x7ffcbcce90ad]
6: /usr/lib/xorg/modules/drivers//intel_drv.so(uxa_check_poly_lines+0x138) [0x7ffcbd775f38]
7: /usr/bin/X [0x539947]
8: /usr/bin/X(ProcPolyLine+0xe2) [0x44be82]
9: /usr/bin/X(Dispatch+0x384) [0x44e044]
10: /usr/bin/X(main+0x3b5) [0x433fa5]
11: /lib/libc.so.6(__libc_start_main+0xfd) [0x7ffcbef1dabd]
12: /usr/bin/X [0x433429]
Saw signal 11.  Server aborting.
Comment 1 Chris Wilson 2009-10-02 15:42:28 UTC
Created attachment 30003 [details]
Exercise fbBresSolid()

The attached code exercises the call stack from the reported backtrace, but is insufficient to reproduce the crash here. 

Luka, do you mind trying the test case (with gcc large-poly-lines.c -lX11 && ./a.out) and seeing if it reproduces the crash on your machine?
Comment 2 Chris Wilson 2009-10-02 15:45:26 UTC
Ok, I can reproduce the crash here by wiggling kolourpaint...
Comment 3 Chris Wilson 2009-10-02 17:04:52 UTC
Created attachment 30004 [details] [review]
Assert that the read/write pointer is within bounds

A simple assertion to check that the pointer we are about to read from and write is valid (i.e points to within the destination drawable).
Comment 4 Chris Wilson 2009-10-02 17:07:47 UTC
X: fbseg.c:92: fbBresSolid: Assertion `dst >= start && dst < end' failed.

As the assertion above is being hit is not specific to the intel driver, I'm reassigning the bug to the core server.
Comment 5 Paulo Zanoni 2010-01-22 04:35:29 UTC
I can reproduce this problem on both 1.6.5 and 1.7.4. I didn't have time to test on git master yet.
I also tried Chris Wilson's patch on 1.6.5 and then instead of giving a backtrace X dies with this message:
X: fbseg.c:92: fbBresSolid: Assertion `dst >= start && dst < end' failed.

Link to Mandriva bug report:
https://qa.mandriva.com/show_bug.cgi?id=57105
Comment 6 Renato Caldas 2011-04-24 00:56:28 UTC
I cannot reproduce this with 1.9.5, and judging by the date of this bug I assume it can be closed, right?
Comment 7 Simon Schubert 2012-01-24 09:24:27 UTC
I experience a related bug when using KiCad:

[  2507.868] 
Backtrace:
[  2507.868] 0: /usr/bin/X (xorg_backtrace+0x26) [0x566a86]
[  2507.868] 1: /usr/bin/X (0x400000+0x16a6e9) [0x56a6e9]
[  2507.868] 2: /lib/libpthread.so.0 (0x7fa9d12c8000+0xf8a0) [0x7fa9d12d78a0]
[  2507.868] 3: /usr/lib/xorg/modules/libfb.so (fbBresSolid+0x21c) [0x7fa9cd94a1cc]
[  2507.868] 4: /usr/lib/xorg/modules/libfb.so (fbSegment+0x3f7) [0x7fa9cd94b667]
[  2507.868] 5: /usr/lib/xorg/modules/libfb.so (fbPolySegment32+0x4bd) [0x7fa9cd93f88d]
[  2507.868] 6: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fa9ce380000+0x380cc) [0x7fa9ce3b80cc]
[  2507.868] 7: /usr/lib/xorg/modules/drivers/intel_drv.so (0x7fa9ce380000+0x2f1ec) [0x7fa9ce3af1ec]
[  2507.868] 8: /usr/bin/X (0x400000+0xf9a3f) [0x4f9a3f]
[  2507.868] 9: /usr/bin/X (0x400000+0x302a3) [0x4302a3]
[  2507.868] 10: /usr/bin/X (0x400000+0x33cb9) [0x433cb9]
[  2507.868] 11: /usr/bin/X (0x400000+0x22eea) [0x422eea]
[  2507.868] 12: /lib/libc.so.6 (__libc_start_main+0xed) [0x7fa9d017f38d]
[  2507.868] 13: /usr/bin/X (0x400000+0x231dd) [0x4231dd]
[  2507.868] Segmentation fault at address 0x7fa9cc816ffc

It is reproducible.

No idea why the intel driver symbols don't show up, but addr2line shows me:

0x380cc = xf86-video-intel-2.17.0/uxa/uxa-unaccel.c:24 = uxa_check_poly_segment()
0x2f1ec = xf86-video-intel-2.17.0/uxa/uxa-accel.c:624 = uxa_poly_segment()
Comment 8 Simon Schubert 2012-01-24 15:44:57 UTC
Created attachment 56113 [details]
gdb backtrace

gdb backtrace of the bug.  dst is out of bounds.  I can provide core file and binaries if required.
Comment 9 Simon Schubert 2012-01-24 16:32:43 UTC
The problem seems to be that there are negative coordinates being passed in to ProcPolySegment:

(gdb) p/x *(xSegment*)&((xPolySegmentReq *)0x2918e1c)[1]                                              
$11 = {x1 = 0x24, y1 = 0x10, x2 = 0xfffe, y2 = 0xffff}

I don't know who is supposed to catch this.  Looking at the call sequence, nobody really makes sure that these values are in bounds.
Comment 10 Chris Wilson 2012-01-24 16:44:49 UTC
Clipping is performed in fbSegment(), see OUTCODES() and miZeroClipLine().
Comment 11 Simon Schubert 2012-01-24 18:22:04 UTC
Ah.  I believe this is the problem, or at least very closely related:

<http://cgit.freedesktop.org/xorg/xserver/tree/fb/fbseg.c#n693>:
	    if (clip2 != 0 || drawLast)
		len++;

in combination with these variables:

        new_x1 = 36
        new_x2 = 0
        new_y1 = 16
        new_y2 = 0
        clip2 = 10
        len = 37

This incremented len to 37, extending (in reverse) the line below (0, 0), which leads to a segmentation fault.
Comment 12 Simon Schubert 2012-01-25 04:37:50 UTC
Ok, I see what is going on there.

The len++ is to make the end coordinates inclusive, which they should be if drawLast is set, or if we clipped the end.

Now, we changed the end coordinates, but we keep the Bresenham error terms, because we want the same angle (I suppose).

However, if we look at fbBresSolid, we see this sequence:

	while (len--)
	{
...
/// (1) ///
		WRITE(dst, FbDoMaskRRop (READ(dst), and, xor, bits));
		bits = 0;
		dst += signdx;
...
	    e += e1;
/// (2) ///
	    if (e >= 0)
	    {
/// (3) ///
		WRITE(dst, FbDoMaskRRop (READ(dst), and, xor, bits));
		bits = 0;
		dst += dstStride;
		e += e3;
	    }
	}

Now assume we have arrived at len = 1.  We start the last iteration for the last pixel, at (x2,y2).  We draw the pixel (location (1)), and we *should* be done.  However, because of the previously unmodified Bresenham error terms, it can happen that the error total overflows (location (2)), and we will draw another pixel, now at (x2+signdx,y2), before adjusting the error terms and exiting the loop.

In short, it might happen that (I'm using signdx=-1, just because my case happens to be that way):

- (orig_x2, orig_y2) get clipped
- the algorithm then goes on to draw:

(x1,y1), (x1-1,y1), ..., (x2,y2), (x2-1,y2)

Now, if x2 = 0, y2 = 0, then we overshoot into negative address land (-1,0) and might segfault (actually do).


Solutions
=========

I don't directly see how this could be fixed:

a) Check dst for every Bresenham error pixel, but that seems excessive.
b) Adjust the error terms, but that would change the slope of the line (slightly)
c) Check for this case in advance and reduce len, but then you'd lose one pixel at the end
d) Rewrite fbseg to draw the error pixel in the next iteration, instead of in the same.  This touches central code though.
Comment 13 Simon Schubert 2012-01-25 04:49:47 UTC
Just a follow-up to say that solution (d) seems to work for me.
Comment 14 Simon Schubert 2012-02-16 06:28:22 UTC
Created attachment 57155 [details] [review]
reorder Bresenham error correction to avoid overshoot.

When fbBresSolid draws a line, it can happen that after the last
pixel, the Bresenham error term overflows, and fbBresSolid paints
another pixel before adjusting the error term.

However, if this happens on the last pixel (len=0), this extra pixel
might overshoot the boundary, and, in rare cases, lead to a segfault.

Fix this issue by adjusting for the Bresenham error term before
drawing the main pixel, not after.
Comment 15 Mitch Davis 2012-03-26 17:22:04 UTC
I am getting this crash when doing zone fills in KiCad.  Fedora 16, unaccelerated video.  I have applied Simon Schubert's patch, and the crashes no longer happen.
Comment 16 Mitch Davis 2012-03-26 17:23:41 UTC
Created attachment 59095 [details]
A backtrace of a crash caused by this problem
Comment 18 Jeremy Huddleston Sequoia 2012-08-31 06:05:50 UTC
This has caused a regression:
https://bugs.freedesktop.org/show_bug.cgi?id=54168
Comment 19 Sergey Lapin 2013-04-05 14:25:47 UTC
This bug still occurs with Kicad and recent X11
Comment 20 Adam Jackson 2018-06-11 20:50:49 UTC
commit 1b94fd77792310c80b0a2bcf4bf6d4e4c4c23bca
Author: Alex Orange <crazycasta@gmail.com>
Date:   Fri Oct 3 15:41:38 2014 -0600

    fb: Fix Bresenham algorithms for commonly used small segments.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.