Bug 23645

Summary: Crash uploading multiple glyphs
Product: xorg Reporter: Clemens Eisserer <linuxhippy>
Component: Server/GeneralAssignee: Peter Hutterer <peter.hutterer>
Status: RESOLVED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: peter.hutterer
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 23613    
Attachments:
Description Flags
0001-render-Fix-crash-in-RenderAddGlyphs-23645.patch none

Description Clemens Eisserer 2009-09-02 02:39:04 UTC
I tried to enhance the glyph-upload paths of my java2d-xrender backend
by uploading multiple glyphs at once, however doing so makes
xorg-1.6.99.1 (Fedora rawhide ~20090810) crash quite frequently.

I experienced those crashes with both vesa and the intel driver. Twice
the crashes happend in glibc's memory management, and another time in
some sha1 assembler. (guess the stuff which checks for equal glyphs).

It seems sha1_block_data_order is reading data from random locations:

==17163== Invalid read of size 4
==17163==    at 0x439E91A: sha1_block_data_order (sx86-elf.s:76)
==17163==    by 0xFA42F463: ???
==17163==  Address 0x4815360 is 0 bytes after a block of size 4,096 alloc'd
==17163==    at 0x4028D7E: malloc (vg_replace_malloc.c:207)
==17163==    by 0x80AE954: Xalloc (utils.c:1056)
==17163==    by 0x80AA42D: AllocateInputBuffer (io.c:1017)
==17163==    by 0x80A9545: InsertFakeRequest (io.c:498)

I had a look at the source but I have a pretty hard time figuring out
whats going on there :-/
The crash happens with a quite large framework I am working on, quite
hard to build your own. I could provide a binary package or wireshark
protocol if that would help?

GDB backtrace:
malloc_consolidate (av=<value optimized out>) at malloc.c:5114
5114              nextsize = chunksize(nextchunk);
(gdb) bt
#0  malloc_consolidate (av=<value optimized out>) at malloc.c:5114
#1  0x00b17359 in _int_malloc (av=<value optimized out>,
   bytes=<value optimized out>) at malloc.c:4348
#2  0x00b198ee in __libc_malloc (bytes=can't compute CFA for this frame
) at malloc.c:3638
#3  0x080a87ca in Xalloc (amount=3856) at utils.c:1070
#4  0x08088f43 in AllocatePixmap (pScreen=0x84faa18,
   pixDataSize=1879048192) at pixmap.c:116
#5  0x0049e9e5 in fbCreatePixmapBpp (pScreen=0x84faa18, width=136,
---Type <return> to continue, or q <return> to quit---
   height=7, depth=can't compute CFA for this frame
) at fbpixmap.c:53
#6  0x0049eaef in fbCreatePixmap (pScreen=0x84faa18, width=can't
compute CFA for this frame
)
   at fbpixmap.c:95
#7  0x081a8b1f in miGlyphs (op=3 '\003', pSrc=can't compute CFA for this frame
) at glyph.c:683
#8  0x08118a8d in damageGlyphs (op=176 '\260', pSrc=can't compute CFA
for this frame
) at damage.c:721
#9  0x081a8fd7 in CompositeGlyphs (op=176 '\260', pSrc=can't compute
CFA for this frame
) at glyph.c:632
#10 0x08112f9f in ProcRenderCompositeGlyphs (client=0x97db6d8)
   at render.c:1415
#11 0x0810eb44 in ProcRenderDispatch (client=0xc1d3b0) at render.c:2041
#12 0x0806ee37 in Dispatch () at dispatch.c:426
#13 0x08063115 in main (argc=6, argv=0xbfb082f4, envp=0xbfb08310)
   at main.c:282
Comment 1 Michel Dänzer 2009-09-03 07:01:54 UTC
> Twice the crashes happend in glibc's memory management, and another time in
> some sha1 assembler.

That probably means there's some kind of memory corruption.

Unfortunately, from your information I have no clue where that could come from: I don't know where sha1_block_data_order is, and neither valgrind nor gdb seemed able to get all the debugging information. Can you get better information?

E.g., as the valgrind output indicates a buffer overrun, maybe running the X server in gdb with something like ElectricFence enabled could produce a nice backtrace of the culprit.
Comment 2 Clemens Eisserer 2009-09-20 19:13:27 UTC
Sorry for the delay.
Would it help if I upload the executable (x86) triggering this issue somewhere?
Comment 3 Peter Hutterer 2009-09-27 18:31:33 UTC
Better than nothing I guess. Do you have valgrind output as well?
Comment 4 Clemens Eisserer 2009-09-28 16:08:29 UTC
The binary can be downloaded from:
http://93.83.133.214/glyphtest.tar.bz2

Sorry far the large size and for the slow download rate - its located on a Nokia770 with a tor-relay eating all its available bandwith.

Simply untar the directory, cd into it, and execute:
bin/java -Dsun.java2d.xrender=True -Xbootclasspath/p:xrender -jar Java2Demo.jar

That should cause Xorg to crash immediatly.
Comment 5 Peter Hutterer 2009-09-29 02:09:44 UTC
Thanks for the app. I am inclined to blame the client here, at least in part. After some (random) interval, a dodgy request is sent by the application. The symptom is always the same, a RenderAddGlyphs request with nglyph has some of the glyphs set to what looks suspiciously like uninitialized data. See the last glyph from this example:

        ............REQUEST: RenderRequest
               RENDERREQUEST: RenderAddGlyphs
             sequence number: 000001ec
              request length: 004e
                    glyphset: GLYPHSET 00200017
                     nglyphs: 00000003
                      glyphs:
                     glyphid: 0000000c
                       width: 0006
                      height: 0009
                           x: 1
                           y: 9
                        xOff: 8
                        yOff: 0
                     glyphid: 0000000c
                       width: 0007
                      height: 0009
                           x: 0
                           y: 9
                        xOff: 8
                        yOff: 0
                     glyphid: 0000000c
                       width: a1a1
                      height: 538c
                           x: 27836
                           y: 11003
                        xOff: 6658
                        yOff: 3303

It's not always the last one, sometimes nglyphs is 5 and only the first one is valid. It is AFAICT always the case that the last (couple of) glyphs, suggesting that nglyph may be passed in wrong.
I put a few debug statements into libXrender and this data is passed in by the application.

Anyway. The server crash as a result of this data is caused by an overflow in the size parameter. Once that is fixed, another crash turns up with zero-width glyphs. Both fixes are in the following patch. Please test this and let me know if it works for you.
Comment 6 Peter Hutterer 2009-09-29 02:10:13 UTC
Created attachment 29933 [details] [review]
0001-render-Fix-crash-in-RenderAddGlyphs-23645.patch
Comment 7 Clemens Eisserer 2009-09-29 05:52:37 UTC
Thanks for the detailed analysis, and for fixing the problem. Works for me.

I'll debug the app and fix it.
Comment 8 Peter Hutterer 2009-09-30 23:10:27 UTC
commit 622fc98fd08aba98369e6933c3ab8c9ff85385d5
Author: Peter Hutterer <peter.hutterer@who-t.net>
Date:   Thu Oct 1 10:03:42 2009 +1000

    render: Fix crash in RenderAddGlyphs (#23645)
    
    This patch fixes two bugs:
    size is calculated as glyph height * padded_width. If the client submits
    garbage, this may get above INT_MAX, resulting in a negative size if size is
    unsigned. The sanity checks don't trigger for negative sizes and the server
    goes and writes into random memory locations.
    
    If the client submits glyphs with a width or height 0, the destination
    pixmap is NULL, causing a null-pointer dereference. Since there's nothing to
    composite if the width/height is 0, we might as well skip the whole thing
    anyway.
    
and

commit 758ab55d2defc78d0169fd61a7036eb9f889e9e7
Author: Peter Hutterer <peter.hutterer@who-t.net>
Date:   Thu Oct 1 15:22:19 2009 +1000

    render: set the glyph picture to NULL by default.
    
    In a follow-up patch we may have glyphs with a NULL picture. To cope with
    that, always set the pictures for glyphs to NULL at creation time and cope
    with cleaning up such glyphs. Also, since compositing a NULL source doesn't
    do a lot anyway, skip trying to do so.
    

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.