Bug 106573 - When kwin_wayland tries to start XWayland, XWayland hangs with endless inet6-related errors
Status: RESOLVED MOVED
Alias: None
Product: Wayland
Classification: Unclassified
Component: XWayland
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Wayland bug list
QA Contact: Xorg Project Team
Reported: 2018-05-19 05:23 UTC by Kyle De'Vir
Modified: 2019-05-10 15:53 UTC

Attachments
Log of nested wayland session before getting ctrl-c killed (527.73 KB, text/plain)
2018-05-30 01:43 UTC, Kotori Itsuka

Description Kyle De'Vir 2018-05-19 05:23:41 UTC
This happens when I start KWin from the VT using startplasmawayland, or when I start a nested kwin_wayland with the --xwayland option.

I get this error repeated endlessly:

_XSERVTransSocketOpenCOTSServer: Unable to open socket for inet6
_XSERVTransOpen: transport open failed for inet6/valmar-desktop:15579
_XSERVTransMakeAllCOTSServerListeners: failed to open listener for inet6

I run a custom Arch kernel with IPv6 disabled, because IPv6 has caused mysterious DNS resolution slowdowns for me in the past, but until now this had not caused XWayland any issues.

XWayland should fall back to inet4 if an inet6 socket cannot be opened.
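
A minimal sketch of the requested fallback, written in Python rather than the X server's C. The helper name `open_listener` and its `families` parameter are hypothetical and are not xserver code, and binding to loopback only is a choice made for the sketch. It tries each address family in turn and skips the ones the kernel refuses, instead of retrying the same one:

```python
import socket

def open_listener(port, families=(socket.AF_INET6, socket.AF_INET)):
    """Return (family, listening socket) for the first family that works."""
    errors = []
    for family in families:
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError as exc:
            # Typically EAFNOSUPPORT when the kernel has this family disabled.
            errors.append((family, exc))
            continue
        try:
            # Loopback only: an assumption of this sketch, not xserver behavior.
            host = "::1" if family == socket.AF_INET6 else "127.0.0.1"
            sock.bind((host, port))
            sock.listen(1)
            return family, sock
        except OSError as exc:
            sock.close()
            errors.append((family, exc))
    raise OSError(f"no usable address family: {errors}")
```

With IPv6 disabled in the kernel, the first attempt fails once, is recorded, and the function moves on to inet4.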
Comment 1 Kotori Itsuka 2018-05-29 13:20:32 UTC
This happens to me as well. kwin_wayland fails to start with the ipv6.disable=1 kernel option.

In addition to the user above, I would also argue that using ipv6 or ipv4 for local servers is a bad idea, because such servers can be exploited by scripts on a webpage or by other internet clients.
https://cathyjf.com/articles/local-servers-can-get-you-compromised
https://lists.gnu.org/archive/html/guile-user/2016-10/msg00007.html

I would recommend unix sockets instead, which are not network-exploitable and are faster as well.
https://blog.myhro.info/2017/01/how-fast-are-unix-domain-sockets

Cheers.
Comment 2 Pekka Paalanen 2018-05-29 13:46:07 UTC
(In reply to Kotori Itsuka from comment #1)
> In addition to the user above, I would also argue that using ipv6 or ipv4
> for local servers is a bad idea, because such servers can be exploited by
> scripts on a webpage or by other internet clients.

Are you sure it listens on IP? With Weston looking at 'ss -a -p', Xwayland has only unix stream sockets.

It probably depends on how the Wayland compositor launches Xwayland. If the Wayland compositor actually wants to have Xwayland listen on IP, then it should work, so this bug report seems valid.

However, if the Wayland compositor enables IP, and you think that is a mistake, then you should report that to the Wayland compositor project in question.
Comment 3 Olivier Fourdan 2018-05-29 13:52:14 UTC
Weird, I don't see any internet-domain socket listed by “lsof” for the Xwayland process on either GNOME or weston, and Wayland uses UNIX sockets.

Also I can run both GNOME on Wayland and weston with “ipv6.disable=1” here, no problem at all.

Can you run either “GNOME/Wayland” or “weston --xwayland” on that same setup?
Comment 4 Pekka Paalanen 2018-05-29 14:12:49 UTC
Seeing the command line of the Xwayland process might give a clue.
Comment 5 Kotori Itsuka 2018-05-30 01:41:51 UTC
>Are you sure it listens on IP? With Weston looking at 'ss -a -p', Xwayland has only unix stream sockets.

Oh. Then please excuse the earlier spitballing; the error message (transport open failed for inet6/oniichan:0) looked a lot like a network error.

This morning I tried running a nested wayland session in an X session (a native session locks up and I have to ssh in to recover), and the server does indeed try to open a unix socket but fails.

I will attach a log.

>Can you run either “GNOME/Wayland” or “weston --xwayland” on that same setup?

Yes, running weston --xwayland does work.
Comment 6 Kotori Itsuka 2018-05-30 01:43:25 UTC
Created attachment 139840 [details]
Log of nested wayland session before getting ctrl-c killed
Comment 7 Pekka Paalanen 2018-05-30 11:18:07 UTC
I've never seen this before:

_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running

Is kwin not implementing the Xorg lock file?
For a sample implementation, see:
https://cgit.freedesktop.org/wayland/weston/tree/xwayland/launcher.c#n148
Xorg has fundamentally the same implementation AFAIK.
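
A rough sketch of the lock-file convention, in Python for brevity. It assumes the scheme used by the weston launcher linked above: `/tmp/.X<N>-lock` is created exclusively and holds the owner's PID as a 10-character, space-padded decimal followed by a newline. The function name and the `lock_dir` parameter are hypothetical (added so the sketch can be tried in a scratch directory), and stale-lock handling is left out:

```python
import os

def try_lock_display(display, lock_dir="/tmp"):
    """Try to claim display number `display`; return the lock path or None."""
    path = os.path.join(lock_dir, f".X{display}-lock")
    try:
        # O_EXCL makes creation fail if another server already holds the lock.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o444)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as lock:
        # PID right-aligned in 10 characters, then a newline (11 bytes total).
        lock.write(f"{os.getpid():>10d}\n")
    return path
```

A launcher would call this for :0, :1, ... and use the first display number for which it returns a path.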

Unfortunately, the log does not show the Xwayland command line. I would still like to see whether it explicitly enables TCP/IP support under kwin.
Comment 8 Olivier Fourdan 2018-05-30 11:29:12 UTC
FWIW, I checked kwin's code to launch Xwayland yesterday, from here:

  https://github.com/KDE/kwin/blob/master/main_wayland.cpp#L375

It doesn't seem to implement the lock file (like weston and mutter do) and it doesn't look like it explicitly enables tcp support either (no “-listen tcp” on the command line options).
Comment 9 Olivier Fourdan 2018-05-30 13:37:06 UTC
OK, some interesting findings...

I can get Xwayland to listen on inet/inet6 by explicitly building with “--enable-listen-tcp” (which is *disabled* by default) *and* passing *no* “-listen” option to Xwayland on the command line (that is pretty much exactly what I would expect).

It's also worth noting that kwin does not use any “-listen” in its command line options (https://github.com/KDE/kwin/blob/master/main_wayland.cpp#L382).

So, I reckon this is a case of (1) “--enable-listen-tcp” having been explicitly enabled at build time and (2) kwin not using any “-listen” option, unlike weston or mutter, and I reckon this is the expected behavior.
Comment 10 Pekka Paalanen 2018-05-31 09:13:25 UTC
Right, so regardless of whether it's good or bad to enable listening on TCP/IP, however that was done, there is still the original problem, right?

So this bug report becomes: When TCP/IP is enabled, there is endless log spam about failures if kernel IPv6 is disabled. That seems like a legitimate bug to me, unless Xorg has decided to require IPv6.

The other thing: if not using -listen command line option, should Xwayland find a free display number itself? Does it have a way to communicate it back to the parent process? Or is the parent process required to find a free display number, which means the parent process is always required to handle the lock file itself, and there should be a kwin bug report about it?
Comment 11 Olivier Fourdan 2018-06-01 15:10:03 UTC
(In reply to Pekka Paalanen from comment #10)
> Right, so regardless of whether it's good or bad to enable listening on
> TCP/IP, however that was done, there is still the original problem, right?
> 
> So this bug report becomes: When TCP/IP is enabled, there is endless log
> spam about failures if kernel IPv6 is disabled. That seems like a legitimate
> bug to me, unless Xorg has decided to require IPv6.

Possibly, but that wouldn't be an Xwayland bug, more of a wider, general Xserver bug in xserver/os/ code - Basically, I would expect the same to occur with Xorg (I haven't tried though), unless there is something special about Xwayland.

I think the bug here would be that the messages are repeated.

> The other thing: if not using -listen command line option, should Xwayland
> find a free display number itself? Does it have a way to communicate it back
> to the parent process? Or is the parent process required to find a free

Yes to both questions, this is the "-displayfd" option that kwin uses:

> display number, which means the parent process is always required to handle
> the lock file itself, and there should be a kwin bug report about it?

Xorg/Xwayland will acquire the lock as it always does in the "normal" case, I do not see this as a bug in kwin.
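
To illustrate how a parent process can get the display number back through "-displayfd", here is a hedged Python sketch. The function names are hypothetical, and a tiny stand-in child is used instead of a real Xwayland so that the example runs anywhere; the assumption is that the server writes the display number followed by a newline to the file descriptor it was given:

```python
import os
import subprocess
import sys

def read_display_number(argv_for_fd):
    """Spawn a child that reports its display number on an inherited pipe."""
    read_fd, write_fd = os.pipe()
    # pass_fds keeps write_fd open under the same number in the child.
    child = subprocess.Popen(argv_for_fd(write_fd), pass_fds=(write_fd,))
    os.close(write_fd)  # the parent keeps only the read end
    with os.fdopen(read_fd) as pipe:
        line = pipe.readline()  # blocks until the child reports or exits
    return child, int(line)

# Stand-in child for the sketch; a compositor would build something like
# ["Xwayland", "-displayfd", str(fd), ...] here instead.
def fake_server(fd):
    code = "import os, sys; os.write(int(sys.argv[1]), b'1\\n')"
    return [sys.executable, "-c", code, str(fd)]
```

If the child exits without writing anything, `readline()` returns an empty string and `int()` raises `ValueError`, which a real launcher would need to handle.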
Comment 12 Pekka Paalanen 2018-06-01 15:22:52 UTC
(In reply to Olivier Fourdan from comment #11)
> Xorg/Xwayland will acquire the lock as it always does in the "normal" case, I
> do not see this as a bug in kwin.

So the error about unix listening socket failing is just a red herring and it creates another socket that works? Assuming I guess correctly what the messages pointed to in comment #7 mean.
Comment 13 Olivier Fourdan 2018-06-04 07:46:58 UTC
(In reply to Pekka Paalanen from comment #12)
> (In reply to Olivier Fourdan from comment #11)
> > Xorg/Xwayland will acquire the lock as it always does in the "normal" case, I
> > do not see this as a bug in kwin.
> 
> So the error about unix listening socket failing is just a red herring and
> it creates another socket that works? Assuming I guess correctly what the
> messages pointed to in comment #7 mean.

That's what “-displayfd” does, it “tries” to open the socket and if it fails, tries the next one.

So if you have an Xserver already running on :0, you will get those messages once. If you have 2 xservers on :0 and :1, you'll get those messages twice, so on and so forth...

Example:

$ Xwayland -displayfd 2 &
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
1  ← This is Xwayland writing the display number it found to stderr (fd 2). I already have an Xserver running on :0, so it found :1 available

$ Xwayland -displayfd 2 &
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
2 ← ditto, it found :2 available after trying :0 and :1, hence twice the messages

$ Xwayland -displayfd 2 &
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
3 ← ditto, it found :3 available after trying :0, :1 and :2 which were already used...
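
The probing behavior shown above can be approximated with the following Python sketch. It is a simplification under stated assumptions, not the xserver implementation: it treats a failed `bind()` on a Unix socket named `X<N>` as "display taken" and ignores the lock file entirely. The function name, the `socket_dir` parameter and the `max_display` limit are hypothetical:

```python
import os
import socket

def first_free_display(socket_dir, max_display=64):
    """Bind the first available X<N> socket in socket_dir; return (N, socket)."""
    for display in range(max_display):
        path = os.path.join(socket_dir, f"X{display}")
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.bind(path)
        except OSError:
            # Display taken: roughly where the "server already running"
            # messages from the example above would appear. Try the next one.
            sock.close()
            continue
        sock.listen(1)
        return display, sock
    raise RuntimeError("no free display number found")
```

Each occupied display costs one failed attempt, which matches the one, two and three pairs of messages in the runs above.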
Comment 14 Daniel Stone 2018-06-04 07:56:59 UTC
It's worth noting that adding -nolisten options from the compositor doesn't work, because if that option isn't available (e.g. '-nolisten tcp6' when you've built without IPv6 support), failure to not listen will be a hard error.
Comment 15 Ernest Hurtado 2018-06-14 10:40:37 UTC
I noticed this bug after updating to plasma 5.13

Starting a plasma wayland session from sddm, or a nested session from the console, results in a black screen and endless errors about inet6:

startplasmacompositor: Starting up...
dbus-daemon[1185]: [session uid=1000 pid=1185] Activating service name='org.freedesktop.systemd1' requested by ':1.1' (uid=1000 pid=1213 comm="dbus-update-activation-environment --systemd --all")
dbus-daemon[1185]: [session uid=1000 pid=1185] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1
dbus-update-activation-environment: warning: error sending to systemd: org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
No backend specified through command line argument, trying auto resolution
_XSERVTransSocketOpenCOTSServer: Unable to open socket for inet6
_XSERVTransOpen: transport open failed for inet6/host:0
_XSERVTransMakeAllCOTSServerListeners: failed to open listener for inet6
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
_XSERVTransMakeAllCOTSServerListeners: server already running
_XSERVTransSocketOpenCOTSServer: Unable to open socket for inet6
Comment 16 Ernest Hurtado 2018-06-14 11:31:22 UTC
Rebuilding xorg and xwayland with "ipv6=false" works as a temporary fix for this issue.
Comment 17 Ernest Hurtado 2018-06-15 19:04:23 UTC
I noticed that, since xorg 1.20, Archlinux uses the new meson build system, which doesn't have any 'listen' option stated.

https://git.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/xorg-server#n80

https://cgit.freedesktop.org/xorg/xserver/tree/meson_options.txt

I tested it, and without the '-nolisten tcp' option explicitly stated in the xserver arguments it always listens on a network socket. This looks like a regression in the meson build system.
Comment 18 Ernest Hurtado 2018-06-16 12:28:59 UTC
It seems the meson build has 'LISTEN_TCP', '1' hardcoded, but changing it to '0' has no effect.

https://cgit.freedesktop.org/xorg/xserver/tree/include/meson.build#n157

https://bugs.archlinux.org/task/59025#comment170465
Comment 19 Ernest Hurtado 2018-06-16 13:12:46 UTC
A meson patch was posted on the xorg-devel ML: https://lists.x.org/archives/xorg-devel/2018-June/057142.html
Comment 20 GitLab Migration User 2019-05-10 15:53:48 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/xserver/issues/719.

