Bug 94075 - SIGSEGV in xkb_context_ref from /usr/lib/libxkbcommon.so.0
Status: RESOLVED NOTOURBUG
Alias: None
Product: Wayland
Classification: Unclassified
Component: wayland
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Wayland bug list
Reported: 2016-02-10 11:16 UTC by Robert Folland
Modified: 2016-05-09 21:22 UTC

Description Robert Folland 2016-02-10 11:16:23 UTC
After upgrading to libinput 1.1.6 I am getting SIGSEGV from a program using SDL2 on Wayland/Weston. Downgrading to libinput 1.1.5 solves the problem.

Backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff515b953 in xkb_context_ref () from /usr/lib/libxkbcommon.so.0
(gdb) bt
#0  0x00007ffff515b953 in xkb_context_ref () from /usr/lib/libxkbcommon.so.0
#1  0x00007ffff515d9f6 in ?? () from /usr/lib/libxkbcommon.so.0
#2  0x00007ffff515d3dc in xkb_keymap_new_from_buffer ()
   from /usr/lib/libxkbcommon.so.0
#3  0x00007ffff7b9010e in ?? () from /usr/lib/libSDL2-2.0.so.0
#4  0x00007ffff57911f0 in ffi_call_unix64 () from /usr/lib/libffi.so.6
#5  0x00007ffff5790c58 in ffi_call () from /usr/lib/libffi.so.6
#6  0x00007ffff599c758 in ?? () from /usr/lib/libwayland-client.so.0
#7  0x00007ffff5999a60 in ?? () from /usr/lib/libwayland-client.so.0
#8  0x00007ffff5999adc in ?? () from /usr/lib/libwayland-client.so.0
#9  0x00007ffff599a7af in wl_display_dispatch_queue ()
   from /usr/lib/libwayland-client.so.0
#10 0x00007ffff599aa6f in wl_display_roundtrip_queue ()
   from /usr/lib/libwayland-client.so.0
#11 0x00007ffff7b91160 in ?? () from /usr/lib/libSDL2-2.0.so.0
#12 0x00007ffff7b7c589 in ?? () from /usr/lib/libSDL2-2.0.so.0
#13 0x00007ffff7ae0f47 in ?? () from /usr/lib/libSDL2-2.0.so.0
#14 0x0000000000401e57 in main (argc=4, argv=<optimized out>) at vplay.cpp:321
(gdb)
Comment 1 Peter Hutterer 2016-02-16 07:08:26 UTC
Weird. libinput doesn't use libxkbcommon, and this segfault is on the client side, while libinput runs in the compositor (unless SDL uses libinput?). Are you sure this is fixed by a libinput change?

What's the output of libinput-list-devices before and after the downgrade?
Comment 2 Robert Folland 2016-02-16 07:12:45 UTC
Yes, I think the cause must be something else. After upgrading to libinput 1.1.7 I could not reproduce the crash, and I could not reproduce it after going back to 1.1.6 either. So there must have been some other cause (I usually do a 'pacman -Syu' to upgrade everything).

Since I can no longer reproduce I will close this bug now.
Comment 3 Robert Folland 2016-05-08 20:48:09 UTC
Hi,
I am still getting segfaults when initializing SDL on Wayland. I worked around the problem with a watchdog that kills and restarts my program if it crashes, but I would prefer to get it fixed.
I made a very simple example which calls SDL_Init and SDL_Quit with a pause in between. Running this loop 100 times will result in a segmentation fault. I am running Weston/Wayland on Arch Linux on an Intel NUC.

Here is the example program:

#include <stdlib.h>
#include <SDL2/SDL.h>
#include <unistd.h>
#include <iostream>

int main(int argc, char **argv)
{
    int count = (argc > 1) ? atoi(argv[1]) : 100; // iterations; defaults to 100 when no argument is given

    for (int i = 0; i < count; i++) {
        std::cout << "Init " << i << std::endl;
        if (SDL_Init(SDL_INIT_VIDEO) < 0) {
            SDL_LogError(SDL_LOG_CATEGORY_APPLICATION,
                         "Couldn't initialize SDL: %s\n",
                         SDL_GetError());
            return 1;
        }
        sleep(1);
        std::cout << "Quit" << std::endl;
        SDL_Quit();
        sleep(1);
    }
    return 0;
}

And here is a backtrace from running the program with gdb:


Init 70
Quit
Init 71
Quit
Init 72
Quit
Init 73
Quit
Init 74

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff1cfdcd3 in xkb_context_ref () from /usr/lib/libxkbcommon.so.0
(gdb) bt
#0  0x00007ffff1cfdcd3 in xkb_context_ref () from /usr/lib/libxkbcommon.so.0
#1  0x00007ffff1cffeb9 in ?? () from /usr/lib/libxkbcommon.so.0
#2  0x00007ffff1cff7fc in xkb_keymap_new_from_buffer ()
   from /usr/lib/libxkbcommon.so.0
#3  0x00007ffff710010e in ?? () from /usr/lib/libSDL2-2.0.so.0
#4  0x00007ffff23331f0 in ffi_call_unix64 () from /usr/lib/libffi.so.6
#5  0x00007ffff2332c58 in ffi_call () from /usr/lib/libffi.so.6
#6  0x00007ffff253ee5e in ?? () from /usr/lib/libwayland-client.so.0
#7  0x00007ffff253bc30 in ?? () from /usr/lib/libwayland-client.so.0
#8  0x00007ffff253ce1c in wl_display_dispatch_queue_pending ()
   from /usr/lib/libwayland-client.so.0
#9  0x00007ffff253d12f in wl_display_roundtrip_queue ()
   from /usr/lib/libwayland-client.so.0
#10 0x00007ffff7101160 in ?? () from /usr/lib/libSDL2-2.0.so.0
#11 0x00007ffff70ec589 in ?? () from /usr/lib/libSDL2-2.0.so.0
#12 0x00007ffff7050f47 in ?? () from /usr/lib/libSDL2-2.0.so.0
#13 0x0000000000400bde in main (argc=<optimized out>, argv=<optimized out>)
    at vplay-init.cpp:13
(gdb)
Comment 4 Peter Hutterer 2016-05-09 00:00:00 UTC
fwiw, I tried reproducing this with 0.5.0 and 0.6.1 (on Fedora) and could not get the crash to happen. That's with a count of up to 200 and also with the sleep removed; neither reproduced it here.

Given that the backtrace shows xkb_keymap_new_from_buffer() and the actual failure is in xkb_context_ref(), I would assume this is an SDL bug rather than a libxkbcommon bug. The only thing that can segfault here is the ctx argument, which is provided by SDL. This indicates that SDL may not initialize the keymap, or that some memory corruption happens, or that the keymap failed to initialize but everything continued as normal.

valgrind may tell you more
Comment 5 Pekka Paalanen 2016-05-09 07:39:26 UTC
Would also be much better to have the debug info installed for both SDL and libwayland libs for the backtrace, so we could see some more function names.
Comment 6 Robert Folland 2016-05-09 12:41:24 UTC
More debug info with gdb:

Init 12
Quit
Init 13

Program received signal SIGSEGV, Segmentation fault.
xkb_context_ref (ctx=ctx@entry=0x0) at src/context.c:156
156         ctx->refcnt++;
(gdb) bt
#0  xkb_context_ref (ctx=ctx@entry=0x0) at src/context.c:156
#1  0x00007ffff5e1cd4c in xkb_keymap_new (ctx=0x0, format=XKB_KEYMAP_FORMAT_TEXT_V1, flags=flags@entry=XKB_KEYMAP_COMPILE_NO_FLAGS) at src/keymap-priv.c:65
#2  0x00007ffff5e1c6cc in xkb_keymap_new_from_buffer (ctx=<optimized out>, 
    buffer=0x7ffff7fd5000 "xkb_keymap {\nxkb_keycodes \"(unnamed)\" {\n\tminimum = 8;\n\tmaximum = 255;\n\t<ESC>", ' ' <repeats 16 times>, "= 9;\n\t<AE01>", ' ' <repeats 15 times>, "= 10;\n\t<AE02>", ' ' <repeats 15 times>, "= 11;\n\t<AE03>", ' ' <repeats 15 times>, "= 12;\n\t<AE04>", ' ' <repeats 12 times>..., length=48090, 
    format=<optimized out>, flags=<optimized out>) at src/keymap.c:191
#3  0x00007ffff7b8ea4e in keyboard_handle_keymap (data=0x6169b0, keyboard=<optimized out>, format=<optimized out>, fd=5, size=48091)
    at /home/vlab/abs/sdl2/src/SDL2-2.0.4/src/video/wayland/SDL_waylandevents.c:269
#4  0x00007ffff64501f0 in ffi_call_unix64 () from /usr/lib/libffi.so.6
#5  0x00007ffff644fc58 in ffi_call () from /usr/lib/libffi.so.6
#6  0x00007ffff665be3e in wl_closure_invoke (closure=closure@entry=0x61f000, flags=flags@entry=1, target=<optimized out>, target@entry=0x616d20, 
    opcode=opcode@entry=0, data=<optimized out>) at src/connection.c:949
#7  0x00007ffff6658be0 in dispatch_event (display=<optimized out>, queue=<optimized out>) at src/wayland-client.c:1274
#8  0x00007ffff6659db4 in dispatch_queue (queue=0x617398, display=0x6172d0) at src/wayland-client.c:1420
#9  wl_display_dispatch_queue_pending (display=0x6172d0, queue=0x617398) at src/wayland-client.c:1662
#10 0x00007ffff665a0cf in wl_display_roundtrip_queue (display=0x6172d0, queue=0x617398) at src/wayland-client.c:1085
#11 0x00007ffff7b8faa0 in Wayland_VideoInit (_this=<optimized out>) at /home/vlab/abs/sdl2/src/SDL2-2.0.4/src/video/wayland/SDL_waylandvideo.c:302
#12 0x00007ffff7b7aed6 in SDL_VideoInit_REAL (driver_name=<optimized out>, driver_name@entry=0x0) at /home/vlab/abs/sdl2/src/SDL2-2.0.4/src/video/SDL_video.c:513
#13 0x00007ffff7ae0ee7 in SDL_InitSubSystem_REAL (flags=16416) at /home/vlab/abs/sdl2/src/SDL2-2.0.4/src/SDL.c:173
#14 0x0000000000400b24 in main (argc=2, argv=0x7fffffffebb8) at vplay-init.cpp:13
(gdb)
Comment 7 Robert Folland 2016-05-09 12:43:49 UTC
And here is the output from valgrind. I could not get it to crash under valgrind with a debug version of libxkbcommon installed, only with the debug versions of sdl2 and wayland.

 Init 41
==446== Invalid read of size 4
==446==    at 0x6FE9CD3: xkb_context_ref (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x6FEBEB8: ??? (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x6FEB7FB: xkb_keymap_new_from_buffer (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x4EF7A4D: keyboard_handle_keymap (SDL_waylandevents.c:269)
==446==    by 0x69C21EF: ffi_call_unix64 (in /usr/lib/libffi.so.6.0.4)
==446==    by 0x69C1C57: ffi_call (in /usr/lib/libffi.so.6.0.4)
==446==    by 0x67B5E3D: wl_closure_invoke (connection.c:949)
==446==    by 0x67B2BDF: dispatch_event.isra.4 (wayland-client.c:1274)
==446==    by 0x67B3DB3: dispatch_queue (wayland-client.c:1420)
==446==    by 0x67B3DB3: wl_display_dispatch_queue_pending (wayland-client.c:1662)
==446==    by 0x67B40CE: wl_display_roundtrip_queue (wayland-client.c:1085)
==446==    by 0x4EF8A9F: Wayland_VideoInit (SDL_waylandvideo.c:302)
==446==    by 0x4EE3ED5: SDL_VideoInit_REAL (SDL_video.c:513)
==446==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==446== 
==446== 
==446== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==446==  Access not within mapped region at address 0x0
==446==    at 0x6FE9CD3: xkb_context_ref (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x6FEBEB8: ??? (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x6FEB7FB: xkb_keymap_new_from_buffer (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x4EF7A4D: keyboard_handle_keymap (SDL_waylandevents.c:269)
==446==    by 0x69C21EF: ffi_call_unix64 (in /usr/lib/libffi.so.6.0.4)
==446==    by 0x69C1C57: ffi_call (in /usr/lib/libffi.so.6.0.4)
==446==    by 0x67B5E3D: wl_closure_invoke (connection.c:949)
==446==    by 0x67B2BDF: dispatch_event.isra.4 (wayland-client.c:1274)
==446==    by 0x67B3DB3: dispatch_queue (wayland-client.c:1420)
==446==    by 0x67B3DB3: wl_display_dispatch_queue_pending (wayland-client.c:1662)
==446==    by 0x67B40CE: wl_display_roundtrip_queue (wayland-client.c:1085)
==446==    by 0x4EF8A9F: Wayland_VideoInit (SDL_waylandvideo.c:302)
==446==    by 0x4EE3ED5: SDL_VideoInit_REAL (SDL_video.c:513)
==446==  If you believe this happened as a result of a stack
==446==  overflow in your program's main thread (unlikely but
==446==  possible), you can try to increase the size of the
==446==  main thread stack using the --main-stacksize= flag.
==446==  The main thread stack size used in this run was 8388608.
==446== 
==446== HEAP SUMMARY:
==446==     in use at exit: 104,282 bytes in 103 blocks
==446==   total heap usage: 66,126 allocs, 66,023 frees, 8,770,031 bytes allocated
==446== 
==446== 1,424 bytes in 1 blocks are definitely lost in loss record 38 of 42
==446==    at 0x4C2C947: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==446==    by 0x6FEBE9D: ??? (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x6FEB7FB: xkb_keymap_new_from_buffer (in /usr/lib/libxkbcommon.so.0.0.0)
==446==    by 0x4EF7A4D: keyboard_handle_keymap (SDL_waylandevents.c:269)
==446==    by 0x69C21EF: ffi_call_unix64 (in /usr/lib/libffi.so.6.0.4)
==446==    by 0x69C1C57: ffi_call (in /usr/lib/libffi.so.6.0.4)
==446==    by 0x67B5E3D: wl_closure_invoke (connection.c:949)
==446==    by 0x67B2BDF: dispatch_event.isra.4 (wayland-client.c:1274)
==446==    by 0x67B3DB3: dispatch_queue (wayland-client.c:1420)
==446==    by 0x67B3DB3: wl_display_dispatch_queue_pending (wayland-client.c:1662)
==446==    by 0x67B40CE: wl_display_roundtrip_queue (wayland-client.c:1085)
==446==    by 0x4EF8A9F: Wayland_VideoInit (SDL_waylandvideo.c:302)
==446==    by 0x4EE3ED5: SDL_VideoInit_REAL (SDL_video.c:513)
==446== 
==446== LEAK SUMMARY:
==446==    definitely lost: 1,424 bytes in 1 blocks
==446==    indirectly lost: 0 bytes in 0 blocks
==446==      possibly lost: 0 bytes in 0 blocks
==446==    still reachable: 102,858 bytes in 102 blocks
==446==         suppressed: 0 bytes in 0 blocks
==446== Reachable blocks (those to which a pointer was found) are not shown.
==446== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==446== 
==446== For counts of detected and suppressed errors, rerun with: -v
==446== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
Comment 8 Daniel Stone 2016-05-09 12:46:59 UTC
Yes, passing a NULL ctx is an error, so this is an issue in SDL. Please raise an issue with them to fix their lifecycle handling.
Comment 9 Pekka Paalanen 2016-05-09 13:03:06 UTC
FWIW, it's a race, like you probably guessed.

Wayland_VideoInit() in SDL needs to init xkb_context much earlier. By the time it does the two roundtrips, it may or may not have received the keymap event. The backtrace shows it explodes inside the second roundtrip while processing a keymap event (if the source I'm looking at is similar enough).
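
To make the ordering problem concrete, here is a minimal, self-contained model of the race described above. It is only a sketch and does not use SDL, libwayland or libxkbcommon: `FakeXkbContext`, `FakeVideoData`, `handle_keymap`, `roundtrip` and `video_init` are hypothetical names invented for illustration, and the "event queue" is a plain `std::queue`. The handler reports a missing context as a failure instead of dereferencing it, so the bad ordering can be observed without crashing.

```cpp
#include <cassert>
#include <functional>
#include <queue>

// Hypothetical stand-ins for illustration; not real SDL or libxkbcommon types.
struct FakeXkbContext {};

struct FakeVideoData {
    FakeXkbContext *xkb_context = nullptr;  // must be set before a keymap event is handled
    bool keymap_loaded = false;
};

using EventQueue = std::queue<std::function<bool()>>;

// Models the keymap handler. The real handler hands the context to
// xkb_keymap_new_from_buffer(); here a null context is reported as a
// failure instead of being dereferenced.
bool handle_keymap(FakeVideoData &data) {
    if (data.xkb_context == nullptr) {
        return false;
    }
    data.keymap_loaded = true;
    return true;
}

// Models a roundtrip: dispatch every pending event and report whether
// all handlers succeeded.
bool roundtrip(EventQueue &queue) {
    bool ok = true;
    while (!queue.empty()) {
        ok = queue.front()() && ok;
        queue.pop();
    }
    assert(queue.empty());
    return ok;
}

// context_first: create the context before the roundtrips (the suggested fix).
// keymap_early:  the keymap event is already pending during the roundtrips.
bool video_init(bool context_first, bool keymap_early) {
    FakeXkbContext ctx;
    FakeVideoData data;
    EventQueue queue;
    auto keymap_event = [&data] { return handle_keymap(data); };

    if (context_first) data.xkb_context = &ctx;
    if (keymap_early) queue.push(keymap_event);

    bool ok = roundtrip(queue);   // first roundtrip
    ok = roundtrip(queue) && ok;  // second roundtrip

    if (!context_first) data.xkb_context = &ctx;  // late initialization
    if (!keymap_early) {
        // The keymap only shows up after initialization has finished.
        queue.push(keymap_event);
        ok = roundtrip(queue) && ok;
    }
    return ok;
}
```

In this model only `video_init(false, true)` fails: the context is created late and the keymap event happens to arrive during the roundtrips. With the context created first, the arrival time of the keymap event no longer matters, which is the point of moving the initialization earlier.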
Comment 10 Robert Folland 2016-05-09 21:22:52 UTC
Thank you!

I tried moving the init of xkb_context up a bit in Wayland_VideoInit(), and I do not get the crashes anymore. I will take it to SDL to bring about an official fix.