| Summary: | Assertion while calling XPending() (xcb_io.c:242) |
|---|---|
| Product: | xorg |
| Component: | Server/General |
| Status: | RESOLVED FIXED |
| Severity: | normal |
| Priority: | medium |
| Version: | 7.4 (2008.09) |
| Hardware: | Other |
| OS: | All |
| Reporter: | Leonardo Chiquitto <leonardo> |
| Assignee: | Xorg Project Team <xorg-team> |
| QA Contact: | Xorg Project Team <xorg-team> |
| CC: | xcb |
Description
Leonardo Chiquitto
2010-02-10 04:40:30 UTC
cc:ing the xcb people in case they can help.
I can't seem to reproduce this between Debian sid powerpc/amd64 machines. What versions of xserver, XCB and libX11 are you using?
> I can't seem to reproduce this between Debian sid powerpc/amd64 machines. What
> versions of xserver, XCB and libX11 are you using?
On the powerpc machine:
xorg-x11-server-7.4-66.15.ppc
xorg-x11-libX11-7.4-15.2.ppc
xorg-x11-libxcb-7.4-15.2.ppc
On the x86_64 machine:
xorg-x11-server-7.4-67.4.x86_64
xorg-x11-libX11-7.4-16.4.x86_64
xorg-x11-libxcb-7.4-15.4.x86_64
Something that might be relevant: here, the x86_64 machine is a laptop. If I'm running on AC power and disconnect the cable, the browsers will die immediately. Although this is a way to trigger the problem "at will", I have to mention that it also happens during regular use and is not dependent on the laptop being on AC or battery.
This assert means that libX11 got responses from the X server for requests that it doesn't believe have been sent. I don't have a hypothesis yet about how that could happen.
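To make the failed invariant concrete, here is a minimal sketch in C. It is not the code from xcb_io.c: the struct and function names are hypothetical stand-ins for the dpy->request and dpy->last_request_read fields mentioned in this report, and 16-bit sequence wraparound is deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two Display fields named in this report. */
struct seq_state {
    unsigned long request;           /* sequence number of the last request sent */
    unsigned long last_request_read; /* sequence number of the last response seen */
};

/* Invariant: a response may only refer to a request that was already sent. */
static bool seq_state_is_consistent(const struct seq_state *s)
{
    return s->last_request_read <= s->request;
}

/* Record a response carrying the given 16-bit wire sequence number.
 * Simplification (assumption): wraparound of the 16-bit counter is ignored. */
static void record_response(struct seq_state *s, uint16_t wire_seq)
{
    s->last_request_read = wire_seq;
}

/* Demonstration: a correct sequence number keeps the invariant, while a
 * byte-swapped one (0x0012 read as 0x1200) violates it. */
static void demo_sequence_invariant(void)
{
    struct seq_state s = { .request = 0x0012, .last_request_read = 0 };

    record_response(&s, 0x0012);
    assert(seq_state_is_consistent(&s));

    record_response(&s, 0x1200);
    assert(!seq_state_is_consistent(&s));
}
```

In this sketch a response whose sequence number exceeds the number of requests sent is the only way to break the check, which is why a corrupted sequence number on the wire is one plausible trigger.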
I don't immediately see how the architecture of either machine could matter here. Can you check whether you can reproduce this bug on either the ppc32 machine or the x86-64 machine alone?
I'll probably need you to print the values of dpy->last_request_read and dpy->request at the point where the assertion fails, and it may help if you could attach a capture of the X network traffic in the same failing session using something like wireshark.
Judging by the assert line number, I think your libX11 must be at least version 1.1.99.2, but no later than 1.3. The only more recent change to xcb_io.c is a Cygwin build fix, which had better not matter.
I hope the two commits in between those versions don't matter on a 32-bit client, but I'm not certain that "Avoid datatype overflow on AMD64 and friends" was correct, so it'd be nice to know if that commit is involved. (I hadn't noticed it before today.)
Is OpenSUSE applying any patches to libX11's xcb_io.c? I'd guess not, but if so that would be important to know.
I suspect the versions of the server and libxcb don't matter, which is fortunate since the OpenSUSE version numbers are meaningless to me.
> Something that might be relevant: here, the x86_64 machine is a laptop. If I'm
> running on AC power and disconnect the cable, the browsers will die
> immediately.
Sounds like the problem occurs when an event arrives, which makes sense. I'm curious what event your desktop environment is triggering on the switch to battery, but it doesn't matter.
I'm also curious how you got a Python traceback in the middle of a gdb stack trace...
(In reply to comment #4)
> This assert means that libX11 got responses from the X server for requests that
> it doesn't believe have been sent. I don't have a hypothesis yet about how that
> could happen.
Hypothesis: Your X server forgot to swap the sequence number in some event or reply. Eg. A commit similar to 3f2e4b9867 may be required to fix the bug (if it hasn't been already).
> I'll probably need you to print the values of dpy->last_request_read and
> dpy->request at the point where the assertion fails, and it may help if you
> could attach a capture of the X network traffic in the same failing session
> using something like wireshark.
Definitely take a Wireshark trace. It will prove or disprove my hypothesis.
> I don't immediately see how the architecture of either machine could matter
> here. Can you check whether you can reproduce this bug on either the ppc32
> machine or the x86-64 machine alone?
No, I can't reproduce the problem on ppc32 or x86_64 when running Epiphany/Firefox locally. I can't say for sure about the possibility of an architecture dependent bug, but for me this looks like the case (different endianness).
> I'll probably need you to print the values of dpy->last_request_read and
> dpy->request at the point where the assertion fails, and it may help if you
> could attach a capture of the X network traffic in the same failing session
> using something like wireshark.
I'll attach the traffic capture. It was collected with:
# tcpdump -s 0 -n -w xorg-epiphany.cap -i eth0 port 6000
After I started tcpdump and Epiphany, it took less than one minute for the assertion failure to happen. Please let me know if the values you mentioned above are not in the capture and I'll patch the library to print them.
> Judging by the assert line number, I think your libX11 must be at least version
> 1.1.99.2, but no later than 1.3. The only more recent change to xcb_io.c is a
> Cygwin build fix, which had better not matter.
It's libX11 1.2.2. Sorry for not providing useful version numbers before.
> Is OpenSUSE applying any patches to libX11's xcb_io.c? I'd guess not, but if so
> that would be important to know.
We have 13 patches on xorg-x11-libX11, but none of them touch xcb files.
> I suspect the versions of the server and libxcb don't matter, which is
> fortunate since the OpenSUSE version numbers are meaningless to me.
Here are the correct version numbers:
libxcb 1.5
xorg-server 1.6.5
Thanks for your prompt responses and attention.
Created attachment 33252 [details]
traffic captured on the ppc32 machine
Created attachment 33253 [details]
traffic captured on the ppc32 machine (v2)
While the first attachment was captured without "external interference" (ie, I just started Epiphany and waited for it to crash, without touching keyboard or mouse), this one was captured during the following sequence of events:
1. Started tcpdump and Epiphany (same use case: running on ppc32 with $DISPLAY
pointing to the x86_64 laptop)
2. Unplugged laptop's power cable
3. Assertion failure happened immediately.
I believe the cause is the same and both traffic captures will be similar, but just in case...
Created attachment 33254 [details] [review]
Proposed xserver patch
Thanks for the wireshark trace. As I suspected, it is a swapping bug in your x server. Please try this patch, and let us know if it fixes the problem (I don't have an MSBFirst machine handy).
Moving this bug to the server per Peter's analysis.
I confirm that the patch in comment #9 resolves this problem. Thanks a lot Peter and everyone involved for the extremely quick response time and fix!
Fixed in xserver master, thanks for the report!
commit 97b03037f4d99fcebc7603011f41c3aff9871ce2
Author: Peter Harris <pharris@opentext.com>
Date: Fri Feb 12 15:36:30 2010 -0500
    Don't double-swap the RandR PropertyNotify event
    The event is already swapped in randr.c/SRROutputPropertyNotifyEvent, so it should not be swapped here.
    X.Org Bugzilla #26511: http://bugs.freedesktop.org/show_bug.cgi?id=26511
    Tested-by: Leonardo Chiquitto <leonardo@ngdn.org>
    Acked-by: Adam Jackson <ajax at redhat.com>
    Reviewed-by: Julien Cristau <jcristau at debian.org>
    Signed-off-by: Peter Harris <pharris@opentext.com>
    Signed-off-by: Keith Packard <keithp@keithp.com>
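For readers unfamiliar with why a double swap is harmful, the following is a small illustrative sketch in C. It is not the xserver code touched by the commit: swap16, struct demo_event, and the two helper functions are hypothetical names, and only the 16-bit sequence number field is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value, as a server must do once for a
 * client whose byte order differs from its own. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Hypothetical event carrying only the field relevant to this bug. */
struct demo_event {
    uint16_t sequence_number;
};

/* Correct path: swap exactly once before sending. */
static void swap_event_once(struct demo_event *ev)
{
    ev->sequence_number = swap16(ev->sequence_number);
}

/* Buggy path: the second swap undoes the first, so the event leaves the
 * server in the server's byte order instead of the client's. */
static void swap_event_twice(struct demo_event *ev)
{
    swap_event_once(ev);
    swap_event_once(ev);
}

/* Demonstration with sequence number 0x0012. */
static void demo_double_swap(void)
{
    struct demo_event once = { 0x0012 };
    struct demo_event twice = { 0x0012 };

    swap_event_once(&once);
    assert(once.sequence_number == 0x1200);  /* converted for the client */

    swap_event_twice(&twice);
    assert(twice.sequence_number == 0x0012); /* back in server byte order */
}
```

Under these assumptions, a client of the opposite byte order would interpret the double-swapped event's sequence number as 0x1200 rather than 0x0012, which is consistent with libX11 seeing a response for a request it has not yet sent.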