28320 – Xserver hangs in an infinite loop

Summary: Xserver hangs in an infinite loop

Status:	RESOLVED DUPLICATE of bug 26980

Alias:	None

Product:	xorg
Classification:	Unclassified
Component:	Driver/nouveau
Version:	git
Hardware:	x86-64 (AMD64) Linux (All)

Importance:	high major
Assignee:	Nouveau Project
QA Contact:	Xorg Project Team

Reported:	2010-05-30 04:06 UTC by Pavel S.
Modified:	2010-06-16 12:59 UTC
CC List:	2 users

Attachments
dmesg - nothing special here (52.06 KB, text/plain) 2010-05-30 04:07 UTC, Pavel S.
[mi] overflowing... (40.53 KB, text/x-log) 2010-05-30 04:08 UTC, Pavel S.
dmesg - this looks promising... (89.79 KB, text/plain) 2010-05-30 04:18 UTC, Pavel S.
Gdb backtrace of Xorg (63.68 KB, text/plain) 2010-06-08 07:15 UTC, Pico

Description Pavel S. 2010-05-30 04:06:56 UTC

"[mi] EQ overflowing. The server is probably stuck in an infinite loop."

This is the only meaningful explanation I get in Xorg.0.log.old after restarting the computer (Lenovo ThinkPad T410 with an nVidia NVS 3100M). From searching the Internet I know that this message is generated by a variety of bugs, so please tell me which information you need to locate the bug.

I am running X.Org X Server 1.8.1 on x86_64 Gentoo with kernel 2.6.34, but I have experienced this bug since 2.6.33 (maybe earlier, too).

[description]
At first I thought this bug was caused by my compositing manager (xcompmgr), but the problem persisted after I disabled it. The bug happens spontaneously and completely at random. The X server stops redrawing anything, but the mouse pointer still moves (hwcursor). I was able to log in through ssh from another computer and attach gdb to the server, and found it inside a tight loop in drmIoctl() (xf86drm.c), calling ioctl() and getting -1 with errno == EINTR every time. After killing and restarting the server I reliably get a hard lock (the ssh server stops answering my requests and the kernel hangs).
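The tight loop described above can be sketched as a retry-on-EINTR wrapper. This is a minimal illustration, not the libdrm source: it assumes drmIoctl() simply reissues the call while it fails with EINTR, as the gdb observation suggests. `fake_ioctl`, `struct fake_call`, and the `max_attempts` bound are hypothetical names introduced only so the sketch terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for ioctl(): fails with EINTR while failures_left
 * is non-zero, then succeeds. A negative value means "fail forever",
 * which models the hang reported in this bug. */
struct fake_call {
    int  failures_left;  /* remaining EINTR failures; negative = never succeed */
    long attempts;       /* how many times the call has been issued */
};

int fake_ioctl(struct fake_call *c)
{
    c->attempts++;
    if (c->failures_left != 0) {
        if (c->failures_left > 0)
            c->failures_left--;
        errno = EINTR;
        return -1;
    }
    return 0;
}

/* Reissue the call for as long as it is interrupted. The max_attempts
 * bound is an assumption of this sketch; without it, the loop spins
 * for as long as the call keeps returning -1 with errno == EINTR. */
int retry_on_eintr(struct fake_call *c, long max_attempts)
{
    int ret;

    assert(c != NULL && max_attempts > 0);
    do {
        ret = fake_ioctl(c);
    } while (ret == -1 && errno == EINTR && c->attempts < max_attempts);
    return ret;
}
```

With a call that is interrupted a few times, the wrapper returns normally; with one that is interrupted every time, only the artificial bound ends the loop, which matches the symptom of a server that never returns from the ioctl.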

[This bug happens in standalone and dualhead mode...]

So far I have not found any way to reliably reproduce this bug.
(I will post some Xserver logs + dmesg soon)

thanks in advance

Comment 1 Pavel S. 2010-05-30 04:07:48 UTC

Created attachment 35950 [details]
dmesg - nothing special here

Comment 2 Pavel S. 2010-05-30 04:08:30 UTC

Created attachment 35951 [details]
[mi] overflowing...

Comment 3 Pavel S. 2010-05-30 04:18:53 UTC

Created attachment 35953 [details]
dmesg - this looks promising...

After killing the server I get some messages in dmesg; maybe this helps.

Comment 4 Pico 2010-06-08 07:15:11 UTC

Created attachment 36145 [details]
Gdb backtrace of Xorg

I think I'm having exactly the same issue.
Would this backtrace of Xorg help?

Comment 5 Matej Cepl 2010-06-15 07:09:29 UTC

That "[mi] EQ overflowing" message is completely meaningless. Please read http://marc.info/?l=fedora-devel-list&m=124101535025331&w=2 and then take a look at the disaster that happens when people start to file this message as a bug, at https://bugzilla.redhat.com/show_bug.cgi?id=465884.

Much more interesting is this backtrace:

Backtrace:
[  5443.624] 0: /usr/bin/X (xorg_backtrace+0x28) [0x46c3c8]
[  5443.624] 1: /usr/bin/X (mieqEnqueue+0x1eb) [0x45b21b]
[  5443.624] 2: /usr/bin/X (xf86PostMotionEventP+0xc8) [0x472828]
[  5443.624] 3: /usr/lib64/xorg/modules/input/evdev_drv.so (0x7f29c6ccc000+0x407f) [0x7f29c6cd007f]
[  5443.625] 4: /usr/bin/X (0x400000+0x6f6d7) [0x46f6d7]
[  5443.625] 5: /usr/bin/X (0x400000+0xf647a) [0x4f647a]
[  5443.625] 6: /lib/libpthread.so.0 (0x7f29ca790000+0xedf0) [0x7f29ca79edf0]
[  5443.625] 7: /lib/libc.so.6 (ioctl+0x7) [0x7f29c8f0bf17]
[  5443.625] 8: /usr/lib/libdrm.so.2 (drmIoctl+0x23) [0x7f29c881eeb3]
[  5443.625] 9: /usr/lib/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f29c881f13b]
[  5443.625] 10: /usr/lib/libdrm_nouveau.so.1 (0x7f29c7741000+0x31ed) [0x7f29c77441ed]
[  5443.625] 11: /usr/lib/libdrm_nouveau.so.1 (nouveau_bo_map_range+0xff) [0x7f29c774439f]
[  5443.625] 12: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f29c7947000+0x5f5d) [0x7f29c794cf5d]
[  5443.625] 13: /usr/lib64/xorg/modules/libexa.so (0x7f29c70de000+0x4637) [0x7f29c70e2637]
[  5443.625] 14: /usr/lib64/xorg/modules/libexa.so (0x7f29c70de000+0x74d2) [0x7f29c70e54d2]
[  5443.625] 15: /usr/lib64/xorg/modules/libexa.so (0x7f29c70de000+0x4817) [0x7f29c70e2817]
[  5443.625] 16: /usr/bin/X (0x400000+0xb6a3a) [0x4b6a3a]
[  5443.625] 17: /usr/bin/X (ValidateGC+0x24) [0x447b54]
[  5443.625] 18: /usr/bin/X (0x400000+0x14758b) [0x54758b]
[  5443.625] 19: /usr/bin/X (0x400000+0x1476b0) [0x5476b0]
[  5443.625] 20: /usr/bin/X (0x400000+0x149d33) [0x549d33]
[  5443.625] 21: /usr/bin/X (ConfigureWindow+0xb22) [0x42a302]
[  5443.625] 22: /usr/bin/X (0x400000+0x378d7) [0x4378d7]
[  5443.625] 23: /usr/bin/X (0x400000+0x3830c) [0x43830c]
[  5443.625] 24: /usr/bin/X (0x400000+0x24de5) [0x424de5]
[  5443.625] 25: /lib/libc.so.6 (__libc_start_main+0xe6) [0x7f29c8e61a26]
[  5443.625] 26: /usr/bin/X (0x400000+0x24999) [0x424999]

Looks like a crash in libdrm (or maybe nouveau driver) to me.
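
The backtrace has mieqEnqueue at frame 1 and ioctl at frame 7, which is consistent with input events still being queued while the main loop is stuck in the ioctl and never drains them. The following is a minimal sketch of that situation, assuming a fixed-size ring buffer; `struct event_queue`, `eq_enqueue`, `eq_dequeue`, and `QUEUE_SIZE` are hypothetical names and values, not the X server's actual event queue code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 8  /* illustrative capacity, not the server's real value */

struct event_queue {
    int    events[QUEUE_SIZE];
    size_t head;     /* next slot the consumer reads */
    size_t tail;     /* next slot the producer writes */
    size_t dropped;  /* events rejected because the queue was full */
};

/* Producer side (the input handler in this analogy).
 * Returns false when the queue is full, i.e. on overflow. */
bool eq_enqueue(struct event_queue *q, int ev)
{
    assert(q != NULL);
    size_t next = (q->tail + 1) % QUEUE_SIZE;
    if (next == q->head) {   /* one slot is kept free to mark "full" */
        q->dropped++;
        return false;
    }
    q->events[q->tail] = ev;
    q->tail = next;
    return true;
}

/* Consumer side (the main loop in this analogy).
 * Returns false when there is nothing to read. */
bool eq_dequeue(struct event_queue *q, int *ev)
{
    assert(q != NULL && ev != NULL);
    if (q->head == q->tail)
        return false;
    *ev = q->events[q->head];
    q->head = (q->head + 1) % QUEUE_SIZE;
    return true;
}
```

If the consumer never runs, the producer fills the queue and every later event is rejected. In this reading the overflow is only a symptom: it reports that the main loop is stuck, not where or why.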

Comment 6 Pavel S. 2010-06-15 07:22:34 UTC

(In reply to comment #5)
> That "[mi] EQ overflowing" message is completely meaningless. Please read
> http://marc.info/?l=fedora-devel-list&m=124101535025331&w=2 and then take a
> look at the disaster which happens when people start to file this message as a
> bug at https://bugzilla.redhat.com/show_bug.cgi?id=465884. 

Yes, I knew this; that is why I attached GDB to see where the server was hanging. I am not experienced enough with X debugging to understand this bug, though, so I decided to ask here.

> Much more interesting is this backtrace:

It looks to me as if the server causes a deadlock in the kernel, and after that it hangs hopelessly in that tight loop, trying to get a response from the kernel...

Comment 7 Marcin Kościelnicki 2010-06-16 12:59:17 UTC

The famous Mysterious NVA3-NVA8 Hang.

*** This bug has been marked as a duplicate of bug 26980 ***
