Running glean --quick against Mesa git master built with --with-driver=xlib fails with:
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":1.0"
after 529 requests (529 known processed) with 3 events remaining.
This happens with both libX11 git master and 1.3.4. Reverting to libX11 1.3.3 allows glean to run successfully.
Created attachment 37594
Testcase to demonstrate the problem
A small testcase, distilled from what seems to be causing the problem in the mesa xlib driver.
Created attachment 37595
Git bisection log
git bisect on libX11 seems to point to commit 933aee1d5c53b0cc7d608011a29188b594c8d70b "Fix Xlib/XCB for multi-threaded applications (with caveats)." as the first revision showing this problem.
From a brief debugging attempt, it seems the XIO error is raised because we end up in the "it's not an error, but we don't have a reply, so it's an I/O error" case in _XReply(), even though xscope shows that the reply was sent.
I strongly believe that this is the use-after-free problem I reported here:
> After updating to libX11 1.3.4, I started seeing window managers or
> toolbar programs exit for no apparent reason when closing windows or pop-ups.
> After a bit of debugging, I figured out that this is caused by
> a use-after-free bug in _XReply. Most people running Linux won't see it
> because the data in the just free()'d memory is still there. But
> using OpenBSD's malloc, which fills free()'d memory with a specific
> pattern, you get a different code path.
> The problem arises in xcb_io.c:582. The 'current' pointer can have
> been free()'d already (by dequeue_pending_request() called at line 562)
> when getting there.
> A simple test program to reproduce the issue is appended below: just
> call XGetWindowProperty on a non-existent window.
> Using one's favourite malloc debugger, one should be able to see the problem
> on Linux too...
> Unfortunately I'm not sure what the fix is...
Thanks for the test case. Jamey and I reproduced this problem on Linux via valgrind, which does indeed show a use-after-free. We've fixed this in current git, commit 4b8ff7db39f2fe7ef12968d462aaf3f9054b6c18:
Fix use-after-free in _XReply on X errors.
_XReply would always call dequeue_pending_request on errors. When it
got an error for the current request, it would call dequeue, then break
out of the loop; then, if it had an error in the event queue, it would
compare it with the sequence number of the now-freed pending request.
_XReply already stored that sequence number in dpy->last_request_read
before freeing it, so look at that instead.
Signed-off-by: Jamey Sharp <firstname.lastname@example.org>
Signed-off-by: Josh Triplett <email@example.com>
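For readers following along, the pattern described in the commit message can be sketched in isolation. This is only an illustration, not the libX11 code: FakeDisplay, PendingRequest, enqueue_pending_request and error_matches_current are hypothetical stand-ins, and only the names dequeue_pending_request and last_request_read are taken from the commit message. The sketch assumes the current request is at the head of the pending list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the libX11 internals; NOT the real definitions. */
typedef struct PendingRequest {
    unsigned long sequence;
    struct PendingRequest *next;
} PendingRequest;

typedef struct {
    PendingRequest *pending_requests;  /* head of the pending list */
    unsigned long last_request_read;   /* sequence number saved before the free */
} FakeDisplay;

/* Hypothetical helper: push a request at the head of the list (kept simple). */
static int enqueue_pending_request(FakeDisplay *dpy, unsigned long sequence)
{
    PendingRequest *req = malloc(sizeof *req);
    if (req == NULL)
        return 0;
    req->sequence = sequence;
    req->next = dpy->pending_requests;
    dpy->pending_requests = req;
    return 1;
}

/* Unlink and free the head request; 'req' is dangling once this returns. */
static void dequeue_pending_request(FakeDisplay *dpy, PendingRequest *req)
{
    assert(req != NULL && req == dpy->pending_requests);
    dpy->pending_requests = req->next;
    free(req);
}

/* Returns 1 if a queued error with 'error_sequence' belongs to the current
 * request, 0 otherwise. */
static int error_matches_current(FakeDisplay *dpy, unsigned long error_sequence)
{
    PendingRequest *current = dpy->pending_requests;
    assert(current != NULL);

    /* Save the sequence number while 'current' is still valid. */
    dpy->last_request_read = current->sequence;
    dequeue_pending_request(dpy, current);

    /* The buggy pattern would read current->sequence here, after the free.
     * The fixed pattern compares against the saved copy instead. */
    return error_sequence == dpy->last_request_read;
}
```

With a zero-initialised FakeDisplay and one request queued, calling error_matches_current() with that request's sequence number returns 1 and never touches the freed node. Putting the commented-out read of current->sequence back in is what valgrind or OpenBSD's malloc would flag.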
This patch on top of 1.3.4 seems to fix the issue I was seeing (XWin would quit suddenly only seconds after launch). Jon?
Could we get this confirmed as the fix, then cherry-pick this commit onto libX11-1.3-branch and release a 1.3.5?
Confirming that this patch fixes my issue.
Thanks for your prompt attention.