I've had a bug since xf86-video-intel 2.20.18, and I'm not the only one: https://bbs.archlinux.org/viewtopic.php?id=156486
I'm running Arch Linux on a netbook with an Atom N450, using libdrm 2.4.41.
Basically everything works until an intensive video app such as mplayer2 or Skype is launched. After that nothing is rendered properly: characters are drawn on top of each other, half of them are missing, and some sections are just black.
Here are some logs (Xorg and dmesg): http://pastebin.mozilla.org/2080842 http://pastebin.mozilla.org/2080843
You can see the repeating error "(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Resource deadlock avoided."
I have been able to remove this behaviour by downgrading to 2.20.17 (not 2.20.18).
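For anyone who wants to check their own Xorg log for the same message, here is a minimal sketch that counts matching lines. The function name `count_batch_errors` is made up for this example, and the pattern is built only from the two wordings of the message quoted in this report; other driver versions may word it differently.

```python
import re

# Pattern built from the message quoted in this report. The optional
# "or even a frozen display" part covers both wordings seen here.
BATCH_ERROR = re.compile(
    r"\(EE\) intel\(\d+\): Failed to submit batch buffer, "
    r"expect rendering corruption(?: or even a frozen display)?: "
    r"(?P<reason>.+?)\.?$"
)


def count_batch_errors(log_text):
    """Return a dict mapping the error reason to the number of matching lines."""
    counts = {}
    for line in log_text.splitlines():
        match = BATCH_ERROR.search(line.strip())
        if match:
            reason = match.group("reason")
            counts[reason] = counts.get(reason, 0) + 1
    return counts
```

To use it, read the log into a string (the location varies by distribution) and pass it to `count_batch_errors`; a result containing "Resource deadlock avoided" indicates the failure described here.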
*** Bug 59769 has been marked as a duplicate of this bug. ***
Chris Wilson wanted to know the libdrm version on the dup bug 59769. The upgrade that exposed the bug was as follows:
upgraded libdrm (2.4.40-1 -> 2.4.41-1)
Oh, he also asked if it is a "composited wm". I don't know what that is, but I am running LXDE.
I'm using i3wm, and libdrm 2.4.41. Since the combination of libdrm 2.4.41 + xf86-video-intel 2.20.17 is not affected, I don't think libdrm is involved in this bug.
It appears all 3 users reporting the error are on Intel Atom processors. It looks like we are pushing those little things too hard!
(In reply to comment #5)
> It appears all 3 users reporting the error are on Intel Atom processors. It
> looks like we are pushing those little things too hard!
No. Switch to SNA to understand just how wrong that statement is.
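For readers who want to try this, a minimal configuration sketch follows. It assumes a driver build that includes SNA support and uses the `AccelMethod` option described in the intel(4) man page; the `Identifier` string is arbitrary, and where the file goes (for example a snippet under `/etc/X11/xorg.conf.d/`) depends on the distribution.

```
Section "Device"
    Identifier "Intel Graphics"
    Driver     "intel"
    Option     "AccelMethod" "sna"
EndSection
```

After restarting X, the Xorg log should indicate which acceleration method was selected.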
I have tried to reproduce this by using mplayer + lxsession on a pineview machine. It remains happy. Are you able to refine your test case to a single application running under a bare X session?
Try connecting an external screen on VGA. I haven't been able to reproduce the bug on LVDS only.
It took some doing but I reproduced the error. I closed all windows in LXDE, opened a single console session, expanded the window to full size, did an "su root", and ran the command "journalctl --no-pager|grep -i error". By the time I got through the log I had seen some blankouts. I got another console going and looked at the X log and sure enough the errors were there.
It seemed more difficult than last time I ran into this, but last time I had Seamonkey going with multiple tabs and other sorts of things running at the same time, along with multiple console windows.
Oh, by the way, as my machine is an Intel D510MO, VGA is all I have. The monitor is running at 1440x900 pixels.
I tried the latest 2.21.0 driver. On LVDS-1 alone there is no problem, as usual, but after adding VGA-1 (and running a video player) it segfaults instead of just messing up rendering:
-> Back to 2.20.17
Tried with 2.21.2, same result -> segfault and X crash when playing video.
Back to 2.20.17
Can you please do 'addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x22a0a 0x1df1a ; addr2line -e /usr/bin/X 0x8fb5c 0xd9b55 0x37e51 0x2695a'. Making sure you have the debug symbols.
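The same request can be scripted when there are many addresses to resolve. This is only a sketch: the helper name `addr2line_commands` is hypothetical, the paths and addresses are the ones from the request above, and it only builds the command lines rather than running them.

```python
import shlex


def addr2line_commands(frames):
    """Build one addr2line argv per binary from {binary_path: [addresses]}."""
    return [["addr2line", "-e", binary, *addresses]
            for binary, addresses in frames.items()]


# Paths and addresses copied from the request above.
frames = {
    "/usr/lib/xorg/modules/drivers/intel_drv.so": ["0x22a0a", "0x1df1a"],
    "/usr/bin/X": ["0x8fb5c", "0xd9b55", "0x37e51", "0x2695a"],
}

for argv in addr2line_commands(frames):
    # Print a shell-safe command line for copy and paste.
    print(shlex.join(argv))
```

Each argv list can be passed to `subprocess.run` to execute it. Without debug symbols installed, addr2line will typically print `??:0` for every address.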
addr2line -e /usr/lib/xorg/modules/drivers/intel_drv.so 0x22a0a 0x1df1a:
That's all I have for now; I haven't yet succeeded in building X with debug symbols.
intel(0): Failed to submit batch buffer, expect rendering corruption: Resource deadlock avoided.
appears on an Atom N450 netbook here, too, with libdrm2-2.4.42-100, xf86-video-intel-2.21.3-54.1, openSUSE 12.2. It is not necessary to play video to trigger this, and the effect it has in my case is that graphics operations become slow; no corruption, no crashes so far.
*** Bug 61717 has been marked as a duplicate of this bug. ***
Created attachment 75820
Fix up fence counts
My belief is that the error in fence counting is magnified through the clear_relocs() function, and if true this patch should fix up the leak.
Glyph corruption has been absent for the past two days after I applied the patch, even after suspend to RAM. This seems to be fixed, thanks.
Created attachment 79029
Avoid overcounting fences for self-relocs
I think this is the root cause of the miscounting issue. Please test without the other patch applied.
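To illustrate the kind of miscounting being described, here is a toy model. It is not the libdrm code: the class `ToyBo`, its fields, and the `skip_self` flag are invented for this sketch, and the assumption (taken from the patch title) is that a buffer relocating to itself added its own fence tally on emit while the clear path never subtracted it again.

```python
class ToyBo:
    """Toy stand-in for a buffer object; not the real libdrm structure."""

    def __init__(self, needs_fence=False):
        # Fences needed by this buffer plus everything it relocates to.
        self.tree_fences = 1 if needs_fence else 0
        self.targets = []

    def emit_reloc(self, target, skip_self=True):
        """Record a relocation to `target` and add its fences to our tally."""
        self.targets.append(target)
        if target is self and skip_self:
            return  # a self-reloc brings no new fences into the tree
        self.tree_fences += target.tree_fences

    def clear_relocs(self):
        """Drop all relocations, subtracting fences for non-self targets only."""
        for target in self.targets:
            if target is not self:
                self.tree_fences -= target.tree_fences
        self.targets.clear()


def leaked_fences(skip_self):
    """Fence tally left on a batch after one emit/clear cycle (should be 0)."""
    batch = ToyBo()
    tiled = ToyBo(needs_fence=True)
    batch.emit_reloc(tiled, skip_self=skip_self)
    batch.emit_reloc(batch, skip_self=skip_self)  # self-referential reloc
    batch.clear_relocs()
    return batch.tree_fences
```

In this model, `leaked_fences(False)` leaves one fence on the tally after every cycle, whereas `leaked_fences(True)` returns to zero. A tally that only ever grows would eventually exceed the number of available fences, which is consistent with batch submission being refused, though the model does not prove that is what happens in the driver.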
Empty windows and the message
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption: Resource deadlock avoided.
just appeared with the second patch applied.
Second patch didn't fix it for me either.
Desktop: GNOME 3.8 (Fedora 19)
Patch applied on libdrm commit 040f6b015e
Has there been any progress? This corruption is happening very frequently for me now, on a Q35 system running Fedora 19, driver version 2.21.12. libdrm version is 2.4.46.
While I haven't seen a pattern in what triggers the bug, I do have a *lot* of Firefox tabs, over two windows.
The fix in libdrm lies unreviewed. In the meantime the default has changed to SNA which renders this code obsolete.
This bug currently affects Debian sid and probably also jessie, since it has the same versions of libdrm and the intel driver as are now in sid. See:
Like some others reporting problems, my system is an Atom n450 netbook and I'm using an external VGA display with it.
I can confirm that the first patch works for me, for details see https://bugs.freedesktop.org/show_bug.cgi?id=78000
After applying the patch to libdrm 2.4.52-1 (the current Debian unstable package), I get the expected message on stdout immediately after the X server starts:
Fixing up fence counts; was -1, expected 0
Lately I was not experiencing any glyph errors, however, but DRI had completely stopped working. After this patch it continues to work.
*** Bug 80096 has been marked as a duplicate of this bug. ***
*** Bug 86031 has been marked as a duplicate of this bug. ***
I reviewed the fix over a year ago, but somehow forgot to push it. Done now:
commit ec65f8d71eb3eb065c7cadf4153138435ac3b388
Author: Chris Wilson <firstname.lastname@example.org>
Date: Wed May 8 16:30:44 2013 +0100
intel: Avoid overcounting fences when emitting self-referential relocs
(In reply to Daniel Vetter from comment #29)
> I've reviewed the fix over a year ago, but somehow forgotten to push it.
> Done now:
> commit ec65f8d71eb3eb065c7cadf4153138435ac3b388
> Author: Chris Wilson <email@example.com>
> Date: Wed May 8 16:30:44 2013 +0100
> intel: Avoid overcounting fences when emitting self-referential relocs
Please look at this: https://bugs.freedesktop.org/show_bug.cgi?id=86378
Is this the same bug or not?
(In reply to mikhail.v.gavrilov from comment #30)
> Please look this: https://bugs.freedesktop.org/show_bug.cgi?id=86378
> This is same or not?
No, that's a different bug. Most likely the lack of synchronisation between the compositor and X.