Created attachment 40982 [details]
dmesg file

System Environment:
--------------------------
Arch: x86_64
Platform: piketon
Libdrm: (master)2.4.22-21-g537703fd4805e9cd352965fce642670986822d22
Mesa: (master)05e534e6c4395269b1ca3a9694a1f437363dd186
Xserver: (server-1.9-branch)xorg-server-1.9.2.902
Xf86_video_intel: (master)2.13.901-25-g9b967807c2d240488a715509649663aac3583532
Kernel: (drm-intel-fixes) 1b39d6f37622f1da70aa2cfd38bfff9a52c13e05

Bug detailed description:
------------------------
The game hangs when changing the resolution or exiting the game. It is not a GPU hang. The issue does not occur when compiz is disabled. It is a kernel regression.

Backtrace:
#0  0x0000003c102d7dd8 in poll () from /lib64/libc.so.6
#1  0x00007fcd39cdb87a in _xcb_conn_wait (c=0x1a19ea0, cond=<value optimized out>, vector=0x0, count=0x0) at xcb_conn.c:306
#2  0x00007fcd39cdd57a in xcb_wait_for_event (c=0x1a19ea0) at xcb_in.c:437
#3  0x00007fcd3a146bc8 in _XReadEvents (dpy=0x1a18980) at xcb_io.c:342
#4  0x00007fcd3a13420f in XMaskEvent (dpy=0x1a18980, mask=131072, event=0x7fffaf26e170) at MaskEvent.c:75
#5  0x00007fcd3a74d3dc in ?? ()
#6  0x0000000001a25370 in ?? ()
#7  0x00007fcd3a14685f in _XFreeReplyData (dpy=0x1a18390, rep=0x7fffaf26e170, extra=0, discard=1) at xcb_io.c:490
#8  _XReply (dpy=0x1a18390, rep=0x7fffaf26e170, extra=0, discard=1) at xcb_io.c:648
#9  0x00007fcd3a142013 in XSync (dpy=0x1a23dc0, discard=0) at Sync.c:44
#10 0x0000000090000002 in ?? ()
#11 0x0000000001a18790 in ?? ()
#12 0x0000000001a18390 in ?? ()
#13 0x0000000090000002 in ?? ()
#14 0x00007fcd3a74fcb6 in ?? ()
#15 0x0000000000000000 in ?? ()

Bisect shows the first bad commit is 1b39d6f37622f1da70aa2cfd38bfff9a52c13e05.

Author: Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Mon Dec 6 11:20:45 2010 +0000
Commit: Chris Wilson <chris@chris-wilson.co.uk>
CommitDate: Tue Dec 7 22:46:11 2010 +0000

    drm/i915/dp: Only apply the workaround if the select is still active

    As we may try to power down the link at various times, it is not
    necessarily still coupled with an encoder and so we must be careful
    not to depend upon an operation that is only valid when the link is
    still attached to a pipe.

    Fixes regression in 5bddd17.

Reproduce steps:
----------------
1. Run ut2004.
2. Exit the game or change the resolution in the game "settings".
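A note on reading the backtrace: frame #4 shows the client blocked in XMaskEvent with mask=131072. The following is a minimal Python sketch for decoding that value, assuming the standard event-mask bit assignments from Xlib's X.h. The helper name decode_event_mask is hypothetical and exists only for illustration.

```python
# Event-mask names in bit order (bit 0 first), as assigned in Xlib's X.h.
X11_EVENT_MASKS = [
    "KeyPressMask", "KeyReleaseMask", "ButtonPressMask", "ButtonReleaseMask",
    "EnterWindowMask", "LeaveWindowMask", "PointerMotionMask",
    "PointerMotionHintMask", "Button1MotionMask", "Button2MotionMask",
    "Button3MotionMask", "Button4MotionMask", "Button5MotionMask",
    "ButtonMotionMask", "KeymapStateMask", "ExposureMask",
    "VisibilityChangeMask", "StructureNotifyMask", "ResizeRedirectMask",
    "SubstructureNotifyMask", "SubstructureRedirectMask", "FocusChangeMask",
    "PropertyChangeMask", "ColormapChangeMask", "OwnerGrabButtonMask",
]


def decode_event_mask(mask):
    """Return the names of all event-mask bits set in `mask` (hypothetical helper)."""
    return [name for bit, name in enumerate(X11_EVENT_MASKS) if mask & (1 << bit)]


if __name__ == "__main__":
    # Value taken from frame #4 of the backtrace above.
    print(decode_event_mask(131072))  # → ['StructureNotifyMask']
```

If the bit table is right, the client was waiting for a StructureNotify-class event (for example a map, unmap, or configure notification). The trace alone does not show why that event never arrived, and the unresolved ?? frames mean the lower part of the stack should be read with caution.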
You're doing better than me, I'm hitting the glx DrawableGone crash in the xserver first...
Can you please do a 'echo t > /proc/sysrq-trigger' and grab the dmesg? Do you see a similar hang with just changing the resolution using xrandr?
Created attachment 41052 [details]
dmesg with 'echo t > /proc/sysrq-trigger'

I don't see a similar hang with just changing the resolution using xrandr.
It was a long shot. Lots of silly processes adding to the noise; we only managed to capture that compiz was idle (in poll()) and not what X was doing. Though it should be safe to conclude that ut2004 itself had finished.
I think I uncovered a related bug on drm-intel-next, where we are waiting for an IRQ with interrupts disabled during modesetting. The only question is how much of the fix is also applicable to -fixes and how it might relate to this bug/bisection?
How widespread is the regression? Have you seen similar failures on the other stable platforms? (After reverting the PIPE_CONTROL removal) Do you see a similar failure on drm-intel-next (which in theory has the related bug fix)?
Tested on 965gm and capella with the stable kernel (drm-intel-fixes); I don't see similar failures. Tested on piketon with the unstable kernel (drm-intel-next) after reverting the PIPE_CONTROL removal; a similar failure happens. BTW, with drm-intel-next (8d5203ca62539c6ab36a5bc2402c2de1de460e30), i.e. before reverting the PIPE_CONTROL removal, a similar failure also happens.
Thanks, what's the display connected to the piketon? Any DP?
I've not reproduced this so far on any platform, the closest to piketon I have is Arrandale+LVDS.
And I don't see it on SNB either. ;-)
VGA is connected on piketon. I see similar failures on SNB with VGA connected. But with DP connected, it works fine on both piketon and SNB.
* scratches head. I've got a VGA panel hooked up to the SNB as well. The bisection simply makes no sense, can I ask you to double check?
Even just to confirm that a revert of 1b39d6f3 fixes ut2004.
Confirmed: the revert fixes ut2004.
Any chance this is related to:

commit 541cc966915b6756e54c20eebe60ae957afdb537
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Dec 6 11:24:07 2010 +0000

    drm: Don't try and disable an encoder that was never enabled

    Prevents code that assumes that the encoder is active when asked to
    be disabled from dying a horrible death.

    Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

i.e. does reverting that (which has been identified to cause other modesetting failures) help?
After reverting 541cc966915b6756e54c20eebe60ae957afdb537, the failure still happens.
Chris, do you need access to this machine?
I'm still no wiser as to why a change on what should be an unused code path (for this system) would be causing this regression. Nevertheless there have been the usual bug fixes, and in particular the uninterruptible modesetting fix.
Retest with latest code, it still happens on piketon and SNB.
(In reply to comment #19)
> Retest with latest code, it still happens on piketon and SNB.

Do Jesse's modesetting checks detect anything amiss? Can you update the dmesg (with drm.debug=0xe) and include one for the SNB? I'm still at a loss as to the cause here - this code should not even be touched for your non-DP system configuration! :|
Created attachment 44643 [details]
ut2004.txt is dmesg information with drm.debug=0xe on sugarbay
So this is probably the cause of the SNB behaviour:

[drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 94239, at 94239], missed IRQ?

I suspect the SNB has a completely different bug to piketon and should be filed separately.
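To make the reasoning behind "missed IRQ?" concrete: the message reports that the ring is already at the sequence number the waiter is waiting on, so the work appears to have completed without the waiter being woken. Below is a minimal Python sketch that pulls those fields out of such a line. The function name parse_hangcheck and the returned field names are hypothetical, and the wraparound-safe comparison (a signed 32-bit difference) is an assumption about how sequence numbers should be compared, not something stated in this report.

```python
import re

# Matches the hangcheck message format quoted in the comment above.
HANGCHECK_RE = re.compile(
    r"Hangcheck timer elapsed\.\.\. (?P<ring>\w+) ring idle "
    r"\[waiting on (?P<waiting>\d+), at (?P<current>\d+)\]"
)


def seqno_passed(current, waiting):
    """True if `current` is at or past `waiting`, treating the difference
    as a signed 32-bit value (assumed wraparound handling)."""
    return ((current - waiting) & 0xFFFFFFFF) < 0x80000000


def parse_hangcheck(line):
    """Extract ring name and sequence numbers from a hangcheck line.

    Returns None if the line is not a hangcheck "ring idle" message.
    """
    m = HANGCHECK_RE.search(line)
    if m is None:
        return None
    waiting = int(m.group("waiting"))
    current = int(m.group("current"))
    return {
        "ring": m.group("ring"),
        "waiting": waiting,
        "current": current,
        # Ring idle with the awaited seqno already reached suggests the
        # completion interrupt was never seen by the waiter.
        "reached": seqno_passed(current, waiting),
    }


if __name__ == "__main__":
    line = ("[drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... "
            "blt ring idle [waiting on 94239, at 94239], missed IRQ?")
    print(parse_hangcheck(line))
```

For the line quoted above, the waited-on and current values are both 94239 on the blt ring, which is the situation the "missed IRQ?" hint describes.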
Bo, please file a separate bug for SNB.
(In reply to comment #23)
> Bo, please file a separate bug for SNB.

I have reported a new bug. Bug number: 35535
*crosses fingers* Is this fixed by:

commit 31acbcc408f412d1ba73765b846c38642be553c3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Apr 17 06:38:35 2011 +0100

    drm/i915/dp: Be paranoid in case we disable a DP before it is attached

    Given that the hardware may be left in a random condition by the BIOS,
    it is conceivable that we then attempt to clear the DP_PIPEB_SELECT bit
    without us ever enabling/attaching the DP encoder to a pipe. Thus
    causing a NULL deference when we attempt to wait for a vblank on that
    crtc.

    Reported-and-tested-by: Bryan Christ <bryan.christ@gmail.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36314
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36456
    Reported-and-tested-by: Bo Wang <bo.b.wang@intel.com>
    Cc: stable@kernel.org
    Signed-off-by: Keith Packard <keithp@keithp.com>
Tested on piketon and huronriver; the issue no longer occurs. I need more testing to confirm this.
Closing.
Verified with drm-intel-next commit da3cc9202697a44057c1bd3ad685689375f1fe0c and drm-intel-fixes commit 2fb4e61d9471867677c97bf11dba8f1e9dfa7f7c.
Closing old verified+fixed.