Bug 101894

Summary: segfault in glamor_composite_clipped_region() (xorg-server-1.19.3, Fedora 26)
Product: xorg Reporter: Fabrice Bellet <fabrice>
Component: Server/Acceleration/glamorAssignee: Xorg Project Team <xorg-team>
Status: RESOLVED MOVED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
gdb backtrace of the crash
none
[PATCH xserver] glamor: handle NULL source pixmap
none
earlier errors in pixman
none
glamor: Avoid overflow between box32 and box16 box none

Description Fabrice Bellet 2017-07-24 11:42:41 UTC
Created attachment 132862 [details]
gdb backtrace of the crash

This is a crash happening on a Fedora 26, with an AMD card, while running eog on a large file. It seems that the case where source, mask and source_pixmap  are null (from the definition of the macro COMPOSITE_REGION) is not handled properly in function glamor_composite_clipped_region().
Comment 1 Olivier Fourdan 2017-07-24 13:32:29 UTC
Created attachment 132866 [details] [review]
[PATCH xserver] glamor: handle NULL source pixmap

Yeap, clearly COMPOSITE_REGION() can pass NULL as the source pixmap, and that won't fly with glamor_composite_clipped_region() as it is.

I cannot reproduce here (tried with Xwayland and Xephyr), so I am not sure the obvious, trivial patch attached fixes the issue (as it may not crash but still reach the assert(0) now in the macro, I am not sure).

Can you try that patch and see if that helps before I submit it to the devel ML?
Comment 2 Fabrice Bellet 2017-07-24 13:52:59 UTC
Created attachment 132867 [details]
earlier errors in pixman

I noticed this earlier error in pixman, that may be related. As I'm using the binary from my distribution, the code is optimized, and I don't have access to all temporary variables
Comment 3 Fabrice Bellet 2017-07-24 13:57:28 UTC
(In reply to Olivier Fourdan from comment #1)
> Created attachment 132866 [details] [review] [review]
> [PATCH xserver] glamor: handle NULL source pixmap
> 
> Yeap, clearly COMPOSITE_REGION() can pass NULL as the source pixmap, and
> that won't fly with glamor_composite_clipped_region() as it is.
> 
> I cannot reproduce here (tried with Xwayland and Xephyr), so I am not sure
> the obvious, trivial patch attached fixes the issue (as it may not crash but
> still reach the assert(0) now in the macro, I am not sure).
> 
> Can you try that patch and see if that helps before I submit it to the devel
> ML?

I put a breakpoint at the cause of the error, that directly jumps to the out label of the function, and it seems to work just like your patch (but the pixmap error remains). I'll test on an updated binary including  your patch just to be sure, and I'll confirm.
Comment 4 Fabrice Bellet 2017-07-24 14:01:59 UTC
I put the image causing the crash there : https://bellet.info/gstreamer/empathy-1500893915.png (it's a png file with size 32766x3550)
Comment 5 Olivier Fourdan 2017-07-24 14:14:44 UTC
Yeap, that image (from comment 4) crashes Xephyr -glamor as well.

As suspected,  even with my trivial patch, it crashes further down the line in glamor_composite_choose_shader() for the same reason (source being NULL)
Comment 6 Olivier Fourdan 2017-07-24 15:24:00 UTC
I suspect the problem lies in glamor_compute_transform_clipped_regions(), when copying the box32 to a box16, it overflows:

https://cgit.freedesktop.org/xorg/xserver/tree/glamor/glamor_largepixmap.c#n685

A pixman_box32 temp_box of {x1 = 32373, y1 = -2, x2 = 32768, y2 = 3103}

becomes:

$1 = {x1 = 32373, y1 = -2, x2 = -32768, y2 = 3103}

When passed to pixman_region_init_rects, and with x1 > x2 we hit a BAD_RECT() in pixman,
Comment 7 Fabrice Bellet 2017-07-24 17:11:52 UTC
Yep, I confirm the overflow (int)32768 -> (short)-32768. 

For the anecdotal aspect, graphviz truncated the generated png file from the gstreamer dot file (cairo limitation maybe, I don't remember precisely): one pixel smaller, and the bug would have remained unnoticed.
Comment 8 Olivier Fourdan 2017-07-25 07:50:46 UTC
Created attachment 132948 [details] [review]
glamor: Avoid overflow between box32 and box16 box

Right, this trivial patch is enough to avoid the crash with the given image (comment 4).

Submitted here: https://patchwork.freedesktop.org/series/27833/
Comment 9 Fabrice Bellet 2017-07-25 11:07:38 UTC
This may be a purely theoretical situation, but maybe we should keep short_box.x1 < short_box.x2, when both temp_box.x1 and temp_box.x2 overflow an int16, to prevent a resulting box with a width=0 ?
Comment 10 Fabrice Bellet 2017-07-25 11:48:42 UTC
The proposed patch works for me, on my sample image, the server doesn't crash anymore. Thanks!
Comment 11 Olivier Fourdan 2017-07-26 07:47:21 UTC
(In reply to Fabrice Bellet from comment #9)
> This may be a purely theoretical situation, but maybe we should keep
> short_box.x1 < short_box.x2, when both temp_box.x1 and temp_box.x2 overflow
> an int16, to prevent a resulting box with a width=0 ?

Yes, pixman has actually 3 cases, GOOD_RECT() when (x1,y1) < (x2,y2), BAD_RECT() when (x1,y1) > (x2,y2) and the third case when it's neither a GOOD_RECT() nor a BAD_RECT():

    https://cgit.freedesktop.org/pixman/tree/pixman/pixman-region.c#n84

So to be on the safe side, we could force (x1,y1) to be strictly lower than (x2,y2) in case we have an overflow, it's pretty trivial, I'll send an updated patch.

Now, how likely this can occur I don't know, eog will fail with a XError when trying to load an image larger than 32767 anyway.
Comment 12 Olivier Fourdan 2017-07-26 07:52:29 UTC
Updated patch sent to the ML, if that patch (still) works for you, would you mind replying the email on the xorg-devel ML with a "Tested-by: <yourself>" please?
Comment 13 Fabrice Bellet 2017-07-26 14:39:47 UTC
(In reply to Olivier Fourdan from comment #12)
> Updated patch sent to the ML, if that patch (still) works for you, would you
> mind replying the email on the xorg-devel ML with a "Tested-by: <yourself>"
> please?

done
Comment 14 GitLab Migration User 2018-12-13 18:26:02 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/xserver/issues/104.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.