Bug 28320 - Xserver hangs in an infinite loop
Status: RESOLVED DUPLICATE of bug 26980
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assigned To: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2010-05-30 04:06 UTC by Pavel S.
Modified: 2010-06-16 12:59 UTC (History)


Attachments
dmesg - nothing special here (52.06 KB, text/plain)
2010-05-30 04:07 UTC, Pavel S.
[mi] overflowing... (40.53 KB, text/x-log)
2010-05-30 04:08 UTC, Pavel S.
dmesg - this looks promising... (89.79 KB, text/plain)
2010-05-30 04:18 UTC, Pavel S.
Gdb backtrace of Xorg (63.68 KB, text/plain)
2010-06-08 07:15 UTC, Pico

Description Pavel S. 2010-05-30 04:06:56 UTC
"[mi] EQ overflowing. The server is probably stuck in an infinite loop."

This is the only meaningful explanation I get in Xorg.0.log.old after restarting the computer (Lenovo ThinkPad T410 with an nVidia NVS 3100M). After searching the Internet, I know that this message can be generated by a variety of bugs, so please tell me what information you need to locate the bug.

I am running X.Org X Server 1.8.1 on x86_64 Gentoo with kernel 2.6.34, but I have experienced this bug since 2.6.33 (maybe earlier, too).

[description]
At first I thought this bug was caused by my compositing manager (xcompmgr), but the problem persisted after I disabled it. The bug happens spontaneously and completely at random. The X server stops redrawing anything, but the mouse pointer still moves (hwcursor). I could log in through ssh from another computer and attach gdb to the server, where I found it inside a tight loop in drmIoctl() (xf86drm.c), calling ioctl() and getting -1 with errno == EINTR every time. After killing and restarting the server I reliably get a hard lock (the ssh server stops answering my requests, the kernel hangs).
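
To illustrate the tight loop described above, here is a minimal Python model of a "retry while interrupted" wrapper of the kind drmIoctl() appears to implement (repeat the ioctl() as long as it returns -1 with errno set to EINTR or EAGAIN). This is only a sketch: `retry_on_eintr`, `make_fake_ioctl` and the `max_attempts` cap are hypothetical names introduced for the illustration and are not part of libdrm. The real loop has no cap, which is why it never terminates if every call is interrupted.

```python
import errno


def retry_on_eintr(call, max_attempts=None):
    """Call `call()` until it stops failing with EINTR/EAGAIN.

    `call` returns a (ret, err) pair, mimicking ioctl()'s return value
    and errno. `max_attempts` is a hypothetical safety cap that exists
    only so this demo terminates. Returns (ret, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        ret, err = call()
        interrupted = ret == -1 and err in (errno.EINTR, errno.EAGAIN)
        if not interrupted:
            return ret, attempts
        if max_attempts is not None and attempts >= max_attempts:
            return ret, attempts


def make_fake_ioctl(interruptions):
    """Hypothetical stand-in for ioctl(): fail with EINTR
    `interruptions` times, then succeed."""
    state = {"left": interruptions}

    def fake_ioctl():
        if state["left"] > 0:
            state["left"] -= 1
            return -1, errno.EINTR
        return 0, 0

    return fake_ioctl


# Healthy case: a few interruptions, then the call completes.
print(retry_on_eintr(make_fake_ioctl(3)))

# Hang case: every call is interrupted, so only the cap stops the loop.
print(retry_on_eintr(lambda: (-1, errno.EINTR), max_attempts=1000))
```

In the healthy case the wrapper returns after the fourth call. In the hang case it spins until the artificial cap is reached, which corresponds to what gdb shows in the stuck server.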

[This bug happens in standalone and dualhead mode...]

So far I have not found any way to reliably reproduce this bug.
(I will post some X server logs and dmesg output soon.)

thanks in advance
Comment 1 Pavel S. 2010-05-30 04:07:48 UTC
Created attachment 35950 [details]
dmesg - nothing special here
Comment 2 Pavel S. 2010-05-30 04:08:30 UTC
Created attachment 35951 [details]
 [mi] overflowing...
Comment 3 Pavel S. 2010-05-30 04:18:53 UTC
Created attachment 35953 [details]
dmesg - this looks promising...

After killing the server I get some messages in dmesg; maybe this helps.
Comment 4 Pico 2010-06-08 07:15:11 UTC
Created attachment 36145 [details]
Gdb backtrace of Xorg

I think I'm having exactly the same issue.
Would this backtrace of Xorg help?
Comment 5 Matej Cepl 2010-06-15 07:09:29 UTC
That " [mi] EQ overflowing" message is completely meaningless. Please, read http://marc.info/?l=fedora-devel-list&m=124101535025331&w=2 and then take a look at the disaster which happens when people start to file this message as a bug at https://bugzilla.redhat.com/show_bug.cgi?id=465884. 

Much more interesting is this backtrace:

Backtrace:
[  5443.624] 0: /usr/bin/X (xorg_backtrace+0x28) [0x46c3c8]
[  5443.624] 1: /usr/bin/X (mieqEnqueue+0x1eb) [0x45b21b]
[  5443.624] 2: /usr/bin/X (xf86PostMotionEventP+0xc8) [0x472828]
[  5443.624] 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f29c6ccc000+0x407f) [0x7f29c6cd007f]
[  5443.625] 4: /usr/bin/X (0x400000+0x6f6d7) [0x46f6d7]
[  5443.625] 5: /usr/bin/X (0x400000+0xf647a) [0x4f647a]
[  5443.625] 6: /lib/libpthread.so.0 (0x7f29ca790000+0xedf0) [0x7f29ca79edf0]
[  5443.625] 7: /lib/libc.so.6 (ioctl+0x7) [0x7f29c8f0bf17]
[  5443.625] 8: /usr/lib/libdrm.so.2 (drmIoctl+0x23) [0x7f29c881eeb3]
[  5443.625] 9: /usr/lib/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f29c881f13b]
[  5443.625] 10: /usr/lib/libdrm_nouveau.so.1 (0x7f29c7741000+0x31ed) [0x7f29c77441ed]
[  5443.625] 11: /usr/lib/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xff) [0x7f29c774439f]
[  5443.625] 12: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f29c7947000+0x5f5d) [0x7f29c794cf5d]
[  5443.625] 13: /usr/lib64/xorg/modules/libexa.so (0x7f29c70de000+0x4637) [0x7f29c70e2637]
[  5443.625] 14: /usr/lib64/xorg/modules/libexa.so (0x7f29c70de000+0x74d2) [0x7f29c70e54d2]
[  5443.625] 15: /usr/lib64/xorg/modules/libexa.so (0x7f29c70de000+0x4817) [0x7f29c70e2817]
[  5443.625] 16: /usr/bin/X (0x400000+0xb6a3a) [0x4b6a3a]
[  5443.625] 17: /usr/bin/X (ValidateGC+0x24) [0x447b54]
[  5443.625] 18: /usr/bin/X (0x400000+0x14758b) [0x54758b]
[  5443.625] 19: /usr/bin/X (0x400000+0x1476b0) [0x5476b0]
[  5443.625] 20: /usr/bin/X (0x400000+0x149d33) [0x549d33]
[  5443.625] 21: /usr/bin/X (ConfigureWindow+0xb22) [0x42a302]
[  5443.625] 22: /usr/bin/X (0x400000+0x378d7) [0x4378d7]
[  5443.625] 23: /usr/bin/X (0x400000+0x3830c) [0x43830c]
[  5443.625] 24: /usr/bin/X (0x400000+0x24de5) [0x424de5]
[  5443.625] 25: /lib/libc.so.6 (__libc_start_main+0xe6) [0x7f29c8e61a26]
[  5443.625] 26: /usr/bin/X (0x400000+0x24999) [0x424999]

Looks like a crash in libdrm (or maybe nouveau driver) to me.
Comment 6 Pavel S. 2010-06-15 07:22:34 UTC
(In reply to comment #5)
> That " [mi] EQ overflowing" message is completely meaningless. Please, read
> http://marc.info/?l=fedora-devel-list&m=124101535025331&w=2 and then take a
> look at the disaster which happens when people start to file this message as a
> bug at https://bugzilla.redhat.com/show_bug.cgi?id=465884. 

Yes, I knew this; that is why I looked in gdb to see where the server was hanging. I am not experienced enough with X debugging to understand this bug, though, so I decided to ask here.

> Much more interesting is this backtrace:

It looks to me as if the server causes a deadlock in the kernel, and after that it hangs hopelessly in that tight loop, trying to get a response from the kernel.
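
For what it is worth, the backtrace also suggests why the "[mi] EQ overflowing" message shows up as a side effect. Frames 3 to 6 look like the evdev input handler running on top of the interrupted ioctl() and calling mieqEnqueue(), while the main loop that would drain the queue never gets to run again. The following is a rough Python model of that situation, not the server's actual code: `EventQueue`, `simulate`, the capacity of 8 and the drain interval are all made-up values for illustration.

```python
from collections import deque


class EventQueue:
    """Hypothetical fixed-size event queue that drops events when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0

    def enqueue(self, event):
        # A full queue cannot accept the event; count it as an overflow.
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def drain(self):
        # What a healthy main loop does: process everything queued so far.
        processed = len(self.events)
        self.events.clear()
        return processed


def simulate(main_loop_stuck, motion_events, capacity=8, drain_every=4):
    """Feed `motion_events` pointer events into the queue.

    A healthy main loop drains the queue every `drain_every` events;
    a stuck one never drains it. Returns the number of dropped events.
    """
    queue = EventQueue(capacity)
    for i in range(motion_events):
        queue.enqueue(("motion", i))
        if not main_loop_stuck and (i + 1) % drain_every == 0:
            queue.drain()
    return queue.dropped


print(simulate(main_loop_stuck=False, motion_events=100))
print(simulate(main_loop_stuck=True, motion_events=100))
```

With a healthy main loop nothing is dropped. With a stuck one, every event after the first `capacity` events overflows, which fits the overflow message being a symptom of the hang rather than its cause.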
Comment 7 Marcin Kościelnicki 2010-06-16 12:59:17 UTC
The famous Mysterious NVA3-NVA8 Hang.

*** This bug has been marked as a duplicate of bug 26980 ***