On ppc64 queue-test fails with:

test "queue": queue-test.c:86: Assertion `counter == 1' failed.
signal 11, fail.
1 tests, 0 pass, 1 fail
FAIL: queue-test

wayland version is 1.0.3
OS: openSUSE Factory
Build log available here: https://build.opensuse.org/package/live_build_log?arch=ppc64&package=wayland&project=openSUSE%3AFactory%3APowerPC&repository=standard

Log with WAYLAND_DEBUG=1:

[3315420.494] -> wl_display@1.get_registry(new id wl_registry@2)
[3315420.589] -> wl_display@1.sync(new id wl_callback@3)
[3315420.751] wl_display@1.get_registry(new id wl_registry@993740600)
queue-test.c:86: Assertion `counter == 1' failed.
test "queue": signal 11, fail.
It looks like wl_display_roundtrip() returns before it receives the done event. As of 1.0.3 the only way this can happen is if an error occurs when dispatching the queue. Could you test adding a check to the return value of wl_display_roundtrip() before the assert that fails and check if it is != -1?
wl_display_roundtrip() returns -1 in my case
So queue dispatching fails for some reason. You should be able to check errno for why something failed. Could be a broken pipe or something because the server part crashed. I don't have access to any ppc64 hardware so I cannot debug this myself.
errno is 32, so it is broken pipe right?
It seems so yes.
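For reference, errno 32 is EPIPE on Linux, which is what a write to a socket reports once the peer has gone away. The sketch below is not taken from the Wayland sources: `errno_after_peer_close` is a hypothetical helper that uses a plain `socketpair()` to stand in for the client/server connection, and it assumes a Linux system where `MSG_NOSIGNAL` is available.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Wayland): sends one byte on a socket
 * whose peer has already been closed and returns the resulting errno,
 * or 0 if the send unexpectedly succeeds. */
int errno_after_peer_close(void)
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    /* Closing one end stands in for the crashed peer process. */
    close(sv[1]);

    /* MSG_NOSIGNAL suppresses SIGPIPE so the failure is reported
     * through the return value and errno instead of killing us. */
    ssize_t n = send(sv[0], "x", 1, MSG_NOSIGNAL);
    int err = (n < 0) ? errno : 0;

    close(sv[0]);
    return err;
}
```

Comparing the returned value against `EPIPE` (or printing it with `strerror()`) is usually more readable than matching the raw number 32.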
len = recvmsg(sockfd, msg, flags | MSG_CMSG_CLOEXEC); returns 0, so the connection is terminated? Any hints on how to debug it further?
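On a stream socket, a return value of 0 from recvmsg() means the peer has closed its end, which is also what the surviving process sees when the other side is killed by a signal. The following is a minimal sketch of that situation, not the queue-test code: `observe_peer_crash` and `struct peer_crash_result` are hypothetical names, and for simplicity the forked child is the process that dies, whichever role it plays in the real test.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
struct peer_crash_result {
    ssize_t recv_len;  /* value returned by recvmsg() in the survivor */
    int term_signal;   /* signal that terminated the peer, or 0 */
};

struct peer_crash_result observe_peer_crash(void)
{
    struct peer_crash_result res = { -1, 0 };
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        /* Child: stands in for the crashing process. */
        close(sv[0]);
        signal(SIGSEGV, SIG_DFL);
        raise(SIGSEGV);
        _exit(1); /* not reached */
    }

    /* Parent: keep only our end, then read from the dead peer. */
    close(sv[1]);

    char buf[16];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof buf };
    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* Blocks until the peer's end is closed, then reports end-of-file. */
    res.recv_len = recvmsg(sv[0], &msg, 0);

    int status = 0;
    if (waitpid(pid, &status, 0) == pid && WIFSIGNALED(status))
        res.term_signal = WTERMSIG(status);

    close(sv[0]);
    return res;
}
```

In this sketch `recv_len` comes back as 0 and `term_signal` as SIGSEGV (signal 11), the same pair of symptoms reported above.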
Looking at the output you provided, the problem seems to be in the server process. Note the "signal 11, fail" i.e. "segmentation fault" message. The queue test consists of two processes: the server process and the forked client process. It looks like the server process is the one that crashes, resulting in no registry objects being transmitted and counter not reaching 2 when it gets the EPIPE error when reading the socket. Debugging the test cases is not very convenient, but what you can try to do is to add a sleep to the server part, start the test case and attach gdb to it before it continues, and then see where it crashes. Be careful not to attach to the client process or the test runner process.
If this only happens on ppc64, it could be an alignment problem, i.e. that we write or read a 32-bit value at an address that's not a multiple of 4 bytes (or a 64-bit value or pointer value at an address that's not a multiple of 8 bytes).
Yes. This is only ppc64 (64bit) issue, ppc passes this test.
Most likely this: http://lists.freedesktop.org/archives/wayland-devel/2013-February/007275.html. Copied for posterity:

Around line 740 of connection.c, demarshalling an object:

    id = (uint32_t **) extra;
    extra += sizeof *id;
    closure->args[i] = id;
    *id = p;

On 64-bit MIPS, the assignment to *id gets turned into a store-double-word instruction (since pointer 'p' is 64 bits wide), which must be to a 8-byte-aligned address. It's possible for 'extra' to not be 8-byte aligned, and hence for the store to not be aligned. In the particular case I'm hitting, 'extra' is not 8-byte-aligned because the message size is 12, but it also looks like alignment could be changed in other ways; e.g. during handling a 'h'-type argument near the bottom of the function, where 'extra' is incremented by the size of an int.
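To make the arithmetic concrete: with a 12-byte message, an offset of 12 is not a multiple of 8, so a pointer-sized store there is unaligned on a 64-bit target. The sketch below shows two generic ways of avoiding that, rounding the offset up or copying the pointer bytewise with memcpy(). It is an illustration only and does not reproduce the attached patch or the later refactoring; `align_up`, `store_ptr` and `load_ptr` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round offset up to the next multiple of align.
 * align must be a non-zero power of two. */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Store a pointer value at an arbitrary byte offset. memcpy() avoids
 * dereferencing a misaligned pointer-to-pointer, so the compiler does
 * not emit a single pointer-sized store that requires alignment. */
void store_ptr(unsigned char *buf, size_t offset, const void *p)
{
    memcpy(buf + offset, &p, sizeof p);
}

/* Read back a pointer value stored with store_ptr(). */
const void *load_ptr(const unsigned char *buf, size_t offset)
{
    const void *p;
    memcpy(&p, buf + offset, sizeof p);
    return p;
}
```

Rounding up keeps direct pointer stores legal at the cost of padding, while the memcpy() form tolerates any offset; which one fits depends on who else reads the buffer.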
Created attachment 74394 [details] [review] FIx alignment This patch should fix the 64-bit alignment problems. Care to give it a try?
(In reply to comment #12) > Created attachment 74394 [details] [review] [review] > FIx alignment > > This patch should fix the 64-bit alignment problems. Care to give it a try? Works great for me (on Linux and MIPS)!
The patch doesn't fix queue-test failure on ppc64
(In reply to comment #14) > The patch doesn't fix queue-test failure on ppc64 Could you try running queue-test under gdb and get a stack trace? If you say $ libtool --mode=execute gdb ./queue-test and then type run, gdb should follow the parent process (the server) which is where the segfault happens. When you get the segfault type bt to get a backtrace and attach that here. Thanks.
Got no stack here:

(gdb) run
Starting program: /home/abuild/rpmbuild/BUILD/wayland-1.0.3/tests/.libs/queue-test
warning: Could not load shared library symbols for linux-vdso64.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 28649.
queue-test: test-helpers.c:39: count_open_fds: Assertion `dir && "opening /proc/self/fd failed."' failed.
test "queue": signal 6, fail.
1 tests, 0 pass, 1 fail
[Inferior 1 (process 28648) exited with code 01]
(gdb) bt
No stack.
(In reply to comment #16)
> queue-test: test-helpers.c:39: count_open_fds: Assertion `dir && "opening /proc/self/fd failed."' failed.
> test "queue": signal 6, fail.

That's a different bug than the segfault (signal 11) above. The test has a built-in check for leaking fds which needs to read /proc and that's failing for some reason... running in a chroot?
Yes, I'm running in chroot, but with /proc mounted

abuild@wolfberry-1:~/rpmbuild/BUILD/wayland-1.0.3/tests> libtool --mode=execute gdb ./queue-test
GNU gdb (GDB) SUSE (7.5.1-1.1)
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "ppc64-suse-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/abuild/rpmbuild/BUILD/wayland-1.0.3/tests/.libs/queue-test...done.
(gdb) run
Starting program: /home/abuild/rpmbuild/BUILD/wayland-1.0.3/tests/.libs/queue-test
Missing separate debuginfo for /lib64/ld64.so.1
Try: zypper install -C "debuginfo(build-id)=c5b01adb2370f144d08c65f2e6f2000a715fe708"
warning: Could not load shared library symbols for linux-vdso64.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Missing separate debuginfo for /lib64/libdl.so.2
Try: zypper install -C "debuginfo(build-id)=318d19287fdb90b171b307d748fe5a366548202d"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=8e29c7c7c3bf9106db1d18677425233bddd086b9"
Missing separate debuginfo for /usr/lib64/libffi.so.4
Try: zypper install -C "debuginfo(build-id)=db9a86960817b058b8d718b25b14798c76c1951a"
Missing separate debuginfo for /lib64/librt.so.1
Try: zypper install -C "debuginfo(build-id)=6596a9d63e16d493af356fa6498322558dfb0b88"
Missing separate debuginfo for /lib64/libpthread.so.0
Try: zypper install -C "debuginfo(build-id)=771909dc5849650e92bb91ec07a494046da52c0c"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Detaching after fork from child process 29463.
queue-test: queue-test.c:273: queue: Assertion `ret == 0' failed.
test "queue": signal 6, fail.
1 tests, 0 pass, 1 fail
[Inferior 1 (process 29459) exited with code 01]
(gdb) bt
No stack.
(In reply to comment #18)
> queue-test: queue-test.c:273: queue: Assertion `ret == 0' failed.
> test "queue": signal 6, fail.

This is a different failure than in comment 16. This time the test is failing to create the server socket, which I suspect is another problem with running the test in chroot. Can you try running the test under strace?

$ libtool --mode=execute strace -olog.txt ./queue-test

and attach the log.txt?
Created attachment 74630 [details] strace output Here is requested info
(In reply to comment #20) > Created attachment 74630 [details] > strace output > > Here is requested info Oh, oops, that's just output for the test runner, which forks to run the actual test case. Try this instead: $ libtool --mode=execute strace -olog.txt ./queue-test queue
Created attachment 74645 [details] new strace
(In reply to comment #22) > Created attachment 74645 [details] > new strace Does the test run outside the chroot for you? I don't see a failure in the strace output, but the queue test itself forks, so it could be the child there failing. Can you try adding -f to the strace arguments?
Created attachment 74656 [details] strace -f in non chroot OK, here is the strace output from libtool --mode=execute strace -f ./queue-test. gdb outside the chroot still gives me no stack. Let me know if you need more information.
I've just committed Jason's cleanup of the connection code and it should remove the source of these alignment problems. Can you try git master again? We'll backport to a 1.0 release if it works out alright.

commit 2fc248dc2c877d02694db40aad52180d71373d5a
Author: Jason Ekstrand <jason@jlekstrand.net>
Date: Tue Feb 26 11:30:51 2013 -0500

    Clean up and refactor wl_closure and associated functions
Yes. Master works for me.
(In reply to comment #26) > Yes. Master works for me. Great, thanks for testing the fix. I'll pull the fix back into the 1.0.6 release.