Bug 68387

Summary: xinput2 XIQueryDevice reply not read correctly
Product: XCB Reporter: Ran Benita <ran234>
Component: ProtocolAssignee: xcb mailing list dummy <xcb>
Status: RESOLVED FIXED QA Contact: xcb mailing list dummy <xcb>
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 71752    
Attachments: XIQueryDevice test
reset xcb_block_len
xcb_get_setup test
reset xcb_block_len (with commit message)
reset xcb_block_len (with commit message)

Description Ran Benita 2013-08-21 16:46:16 UTC
Created attachment 84398 [details]
XIQueryDevice test

Trying to use the XIQueryDevice from the recently-merged xinput2 support, the reply seems to be read incorrectly starting from the second XIDeviceInfo.
The test code I've used is attached (you can comment the assert's to get a bit further, then it will probably segfault).
Comment 1 Ran Benita 2013-08-21 20:53:53 UTC
After discussing this with Daniel Martin on irc, the problem in this specific instance is the following snippet in the generated xinput.c:

    int
    xcb_input_xi_device_info_sizeof (const void  *_buffer  /**< */)
    {
        char *xcb_tmp = (char *)_buffer;
        [...]
        unsigned int xcb_block_len = 0;
        [...]

        xcb_block_len += sizeof(xcb_input_xi_device_info_t);
        xcb_tmp += xcb_block_len;
        /* name */
        xcb_block_len += (((_aux->name_len + 3) / 4) * 4) * sizeof(char);
        xcb_tmp += xcb_block_len;
        [...]
    }

There's a missing

    xcb_block_len = 0;

above the /* name */ comment, which results in xcb_tmp being incremented by sizeof(xcb_input_xi_device_info_t) twice, which is of course one too many.

This function xcb_input_xi_device_info_sizeof() is used by xcb_input_xi_device_info_next(), which is used in the test program, which therefore doesn't work.

Daniel says the problemtic section in c_client.py is this:
http://cgit.freedesktop.org/xcb/libxcb/tree/src/c_client.py?id=0289348f2c4ed3b1b286c51df19d82c6787c2b36#n1015
But that simply adding a 'xcb_block_len = 0;' there breaks other iterators.

The same issue also affects other requests, like XKB's GetGeometry.
Comment 2 Daniel Martin 2013-08-21 21:45:49 UTC
Created attachment 84415 [details] [review]
reset xcb_block_len
Comment 3 Ran Benita 2013-08-22 09:59:09 UTC
Thanks for the patch Daniel.

So it seems this bug is pretty far-reaching; basically all of the _sizeof() functions have it, though it only manifests on the ones which actually use xcb_tmp for anything. So to see the affected functions, you can grep for the pattern "(xcb_tmp)".

Most of them are _sizeof()'s of requests/events/errors, which no one cares about I think. But some are used in iterators for _next() functions. The most affected extensions are xkb and xinput, but also others. For example, I'm pretty sure xcb_get_setup() is broken when there is more than one Screen (roots_len > 1) - in this case the attached program should show garbage.

So, if you submit the patch with some commit message,
Tested-By: Ran Benita <ran234@gmail.com>
Reviewed-By: Ran Benita <ran234@gmail.com>
As the fix seems pretty obvious, fixes this test and doesn't cause any other problems that I've seen.

And I also think this is ample demonstration that no one is using the _sizeof() functions directly, and that they should therefore be static in the c files and not exposed in the headers; I also think this will reduce the extension library sizes substantially due to dead code elimination.
Comment 4 Ran Benita 2013-08-22 10:00:14 UTC
Created attachment 84441 [details]
xcb_get_setup test
Comment 5 Uli Schlachter 2013-08-22 15:42:32 UTC
(In reply to comment #3)
> For example,
> I'm pretty sure xcb_get_setup() is broken when there is more than one Screen
> (roots_len > 1) - in this case the attached program should show garbage.

Nope, works fine here. I tested this against "Xephyr -screen 1000x500 -screen 500x500 :1" and diff't the output for the two screens. Only IDs were different (e.g. the visual IDs).

What exactly would you expect to go wrong and why does it still work correctly?
Comment 6 Ran Benita 2013-08-22 18:51:32 UTC
(In reply to comment #5)
> (In reply to comment #3)
> > For example,
> > I'm pretty sure xcb_get_setup() is broken when there is more than one Screen
> > (roots_len > 1) - in this case the attached program should show garbage.
> 
> Nope, works fine here. I tested this against "Xephyr -screen 1000x500
> -screen 500x500 :1" and diff't the output for the two screens. Only IDs were
> different (e.g. the visual IDs).
> 
> What exactly would you expect to go wrong and why does it still work
> correctly?

Thanks for trying, I didn't know about this Xephyr trick. And you're right; while xcb_screen_sizeof() uses xcb_tmp, it doesn't in fact increment it twice:

    xcb_block_len += sizeof(xcb_screen_t);
    xcb_tmp += xcb_block_len;
    /* allowed_depths */
    for(i=0; i<_aux->allowed_depths_len; i++) {
        xcb_tmp_len = xcb_depth_sizeof(xcb_tmp);
        xcb_block_len += xcb_tmp_len;
        xcb_tmp += xcb_tmp_len;
    }

i.e in this case the struct and the depths are considered in the same 'block' (intended or not...). By looking, there aren't any iterator bugs in xproto to show off :) like in xinput and xkb.

However to 'compensate', xcb_setup_sizeof() is definitely broken, e.g. on my machine, should be: 5972, reported: 6284. To see this you can add
   printf("offset: %ld\n", (char *) visuals_iter.data - (char *) setup);
right after
   xcb_visualtype_next(&visuals_iter);
and look at the last offset. With the patch the reported sizeof is correct.
Comment 7 Ran Benita 2013-11-18 15:29:49 UTC
Created attachment 89413 [details] [review]
reset xcb_block_len (with commit message)

Attached Daniel's patch with a commit message (adapted from the bug).

The diff on the generated code shows some 364 locations where the two lines are added, all of them _sizeof() functions. Most _sizeof() are not used neither internally for iterators nor externally (I've never seen them used anywhere), so could have been broken silently for years. Out of the ones used for iterators, I believe there are two situations:

- The changes have no semantic change, as in the screens example (see above comment); it just splits a big "block" to two, having the same end result.

- It fixes a real bug rendering the iterators usable, like in xinput2 and xkb (the previously attached test crashes without, works with).

The diff is too large for me to inspect every change manually, but I did look at some random sample. I also tried the patched libxcb with some qt5 applications, and there weren't any problems. So I think the patch should go in.
Comment 8 Daniel Martin 2013-11-20 08:55:54 UTC
Created attachment 89518 [details] [review]
reset xcb_block_len (with commit message)

Thanks Ran for sanitizing the patch.

In this updated version I've added myself as reviewer and removed the preceeding 2 lines:
    # probably not needed
    #_c_serialize_helper_insert_padding(context, code_lines, space, False)

The _c_serialize_helper_insert_padding() had been commented out with:
    22e1013 added accessors for special cases
and thereby partly caused that regression. Partly because, the implicit padding would set xcb_block_len to zero as we want, but the implicit padding itself might lead to other length/offset calculation problems.
Comment 9 Daniel Martin 2013-11-20 09:23:47 UTC
After filtering out _sizeof()s for events, requests and replies the list of affected locations shrinks to 50. These are the structs for the remaining _sizeof()s:
    Input - AddMaster
    Input - ButtonClass
    Input - DeviceClass
    Input - DeviceCtl
    Input - DeviceName
    Input - DeviceResolutionCtl
    Input - DeviceResolutionState
    Input - DeviceState
    Input - EventMask
    Input - FeedbackCtl
    Input - FeedbackState
    Input - HierarchyChange
    Input - InputState
    Input - KeyClass
    Input - StringFeedbackCtl
    Input - StringFeedbackState
    Input - ValuatorInfo
    Input - ValuatorState
    Input - XIDeviceInfo
    RandR - RefreshRates
    Record - ClientInfo
    Render - PICTDEPTH
    Render - PICTSCREEN
    Res - ClientIdValue
    Res - ResourceSizeValue
    SELinux - ListItem
    Sync - SYSTEMCOUNTER
    Xv - AdaptorInfo
    Xv - AttributeInfo
    Xv - EncodingInfo
    Xv - Image
    xkb - CountedString16
    xkb - DeviceLedInfo
    xkb - KeySymMap
    xkb - KeyType
    xkb - Listing
    xkb - Outline
    xkb - Overlay
    xkb - OverlayRow
    xkb - Row
    xkb - SetKeyType
    xkb - Shape
    xproto - DEPTH
    xproto - HOST
    xproto - SCREEN
    xproto - STR
    xproto - Setup
    xproto - SetupAuthenticate
    xproto - SetupFailed
    xproto - SetupRequest

I've thought about filtering even more, i.e. only the structs that are used in a list. But, the effort didn't seemed worth it.

So, if you find a list using such a struct and try to iterate over it, then it's very likely that you hit this bug too.
Comment 10 Daniel Martin 2013-11-21 13:33:49 UTC
(In reply to comment #8)
> Created attachment 89518 [details] [review] [review]
> reset xcb_block_len (with commit message)

With this patch applied the bug can be closed (from my pov).
Comment 11 Uli Schlachter 2013-12-21 14:00:47 UTC
Merged as commit 18f0afab3f0de68114fe185e89d8b25a8c072a2c

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.