Created attachment 19336 [details]
tshark-dump of one keypress with XCB linked in
Some Java applications, such as the trial version at http://www.typingmaster.com/, run unusably slowly when used over a remote X connection.
I'm using Ubuntu Hardy 8.04.1 with LTSP5, Linux kernel 2.6.24-19-server, with 32-bit firefox and 32-bit Java-plugin. The relevant package versions are here:
The issue does not appear to be related to the Java version (it exists in both 1.5 and 1.6). I have not tested the very latest Java versions, though. The reason I'm reporting this as a possible XCB-related issue is that one can work around the problem by using an X11 library that does not link to XCB.
In the current Ubuntu version (Hardy), the /usr/lib32/libX11.so.6.2.0 library links to the /usr/lib32/libxcb-xlib.so.0 and /usr/lib32/libxcb.so.1 libraries; the XCB version is 1.1. In the previous Ubuntu version (Gutsy), the X11 library does not do this. When the new shared object is replaced with the old one, the problem disappears, and the Java applications that had problems run fine.
There may be some other differences between the X11 libraries, but using XCB as the underlying implementation seems to be a major change, or is it? No other applications appear to have these problems; Java applications appear to be the sole source of them.
I'm adding two attachments that show tshark dumps of the network traffic in both cases; perhaps they will help analyze the issue. Each dump covers a single keypress in the Java version of TypingMaster. In the Hardy/XCB case it takes half a minute to process one keypress and switch screens; in the Gutsy/no-XCB case it takes maybe a second.
Created attachment 19337 [details]
tshark-dump of one keypress without XCB
FYI, I have triaged this issue with Ubuntu Launchpad. Please see https://bugs.launchpad.net/libxcb/+bug/277069 for additional submitted comments and information.
Also, see the following threads for more information regarding this issue:
I think this could be related to what I was experiencing for GLX over network.
Running any GLX application over the network was horribly slow. I then found that running a GLX application over the network loopback on the same machine was slow as well. I.e., I tested the following:
DISPLAY=localhost:0.0 LIBGL_ALWAYS_INDIRECT=1 xbmc
DISPLAY=:0.0 LIBGL_ALWAYS_INDIRECT=1 xbmc
where xbmc (XBMC Media Center) is an OpenGL application. The first test gave about 10 fps, while the second gave about 40 fps on my hardware. After some pondering I thought about the Nagle algorithm.
After modifying libxcb to disable the Nagle algorithm, the above two commands rendered at about the same speed, 40 fps.
I'll attach a diff.
Created attachment 26071 [details] [review]
Patch to disable nagle algorithm on XCB network sockets
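For readers without access to the attachment, disabling Nagle on a TCP socket generally comes down to one setsockopt() call with TCP_NODELAY. The following is only a minimal sketch of that technique, not the attached patch; the helper names disable_nagle() and nagle_disabled() are made up for illustration, and where libxcb would call such a helper is not shown here.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not taken from the patch): turn off the Nagle
 * algorithm on a TCP socket so small writes are sent immediately.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
static int disable_nagle(int fd)
{
    int on = 1;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if not set, -1 on error. */
static int nagle_disabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A caller would apply disable_nagle() once, right after the TCP connection to the X server is established; it does not apply to Unix domain sockets, which is consistent with the DISPLAY=:0.0 case not being affected.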
My patch seems to have solved the issue for the people affected by this bug. There might be an alternative approach that would incur less of the overhead caused by TCP_NODELAY.
Instead of having TCP_NODELAY enabled all the time, one could enable it on the socket only during a call to _XFlush() and then disable it again. I'm not sure how the kernel would like this setting being toggled all the time, though.
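A rough sketch of that toggling idea, purely as an illustration: the function names below are hypothetical and nothing like this is in the patch. It assumes that turning TCP_NODELAY on causes the kernel to push out whatever is queued, which is my understanding of Linux behaviour; other systems may behave differently, so treat it as an assumption to verify.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: set or clear TCP_NODELAY on a TCP socket. */
static int set_nodelay(int fd, int enabled)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                      &enabled, sizeof(enabled));
}

/* Hypothetical flush hook: briefly enable TCP_NODELAY so queued data is
 * pushed out (assumed Linux behaviour), then restore Nagle for later
 * writes. Returns 0 on success, -1 on error. */
static int flush_with_nodelay_pulse(int fd)
{
    if (set_nodelay(fd, 1) < 0)
        return -1;
    return set_nodelay(fd, 0);
}
```

This costs two extra system calls per flush, which is the overhead to weigh against simply leaving TCP_NODELAY on.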
Disabling Nagle sounds pretty reasonable to me. It's also what Xtrans (and thus traditional Xlib) does.
commit ee89850e68205a7f8961ace0839b5be86040dade
Author: elupus <email@example.com>
Date: Tue May 26 16:14:48 2009 +0200
Disable Nagle on TCP socket
Signed-off-by: Julien Danjou <firstname.lastname@example.org>
(In reply to comment #8)
> commit ee89850e68205a7f8961ace0839b5be86040dade
> Author: elupus <email@example.com>
> Date: Tue May 26 16:14:48 2009 +0200
> Disable Nagle on TCP socket
> Signed-off-by: Julien Danjou <firstname.lastname@example.org>
I can't believe we had Nagle on. :-) Oops.
Thanks much to all for the diagnosis and fix.
Wouldn't it be a better option to use TCP_CORK and to "pull the cork" in _XFlush()? See http://baus.net/on-tcp_cork for a more elaborate description of this feature.
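To make the suggestion concrete, here is a minimal sketch of corking and uncorking a socket. It is an illustration only: TCP_CORK is Linux-specific, the helper names are hypothetical, and the idea that the uncork would go in _XFlush() is the proposal above, not existing libxcb code.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_CORK is a Linux-specific option */
#include <sys/socket.h>

/* Hypothetical helper: set or clear TCP_CORK on a TCP socket. */
static int set_cork(int fd, int corked)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &corked, sizeof(corked));
}

/* Hold back partial frames while requests are being queued. */
static int begin_batch(int fd)
{
    return set_cork(fd, 1);
}

/* "Pull the cork": let the kernel send everything queued so far. */
static int flush_batch(int fd)
{
    return set_cork(fd, 0);
}
```

Per the Linux tcp(7) man page, corked output is sent anyway after a ceiling of 200 milliseconds, so a missed flush_batch() delays data rather than blocking it forever.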