Summary: | [Arrandale] Hung WAIT_FOR_EVENT when running rss-glx-skyrocket | ||
---|---|---|---|
Product: | DRI | Reporter: | Philippe Troin <phil> |
Component: | DRM/Intel | Assignee: | Jesse Barnes <jbarnes> |
Status: | CLOSED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | ||
Version: | XOrg git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Description
Philippe Troin
2010-07-25 21:24:18 UTC
WAIT_FOR_EVENT hang. Did you notice anything else happening at the time, like a modeset change, dpms on/off, unplugging a monitor or two?

Meh, I need to also include a full register dump in the error state.

(In reply to comment #1)
> WAIT_FOR_EVENT hang. Did you notice anything else happening at the time, like a
> modeset change, dpms on/off, unplugging a monitor or two?

Yes, during the "run", the DPMS Off kicked in. I don't know if the hang occurred at the time the DPMS went Off, or when the DPMS went On (or in between for that matter).

> Meh, I need to also include a full register dump in the error state.

Do you want the output of intel_reg_dumper next time it happens?

Created attachment 37403 [details] [review]
trigger scanline wait at pipe off time

I wonder if this patch helps? The intent is to trigger any outstanding scanline wait event before shutting off the pipe. When the pipe shuts off, it should end up stopping on the first line of the next frame, so hopefully this register programming is correct.

(In reply to comment #3)
> Created an attachment (id=37403) [details]
> trigger scanline wait at pipe off time
>
> I wonder if this patch helps? The intent is to trigger any outstanding
> scanline wait event before shutting off the pipe. When the pipe shuts off, it
> should end up stopping on the first line of the next frame, so hopefully this
> register programming is correct.

I'm recompiling a kernel right now with this patch. I will report on its effect later. Anything you'd want if I notice a hang again?

Thanks!
Phil.

Created attachment 37613 [details]
intel_reg_dumper after screen saver triggered GPU hang
after running xscreensaver-demo, GPU hangs with,
[drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
[drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -5 (awaiting 802356 at 799091)
and intel_reg_dumper was taken.
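
For anyone collecting several of these reports, the two numbers in the `i915_do_wait_request` line can be pulled out mechanically. The following is a minimal sketch, not part of any i915 tooling: the regular expression is modelled only on the single line quoted above, the helper name `parse_wait_error` is hypothetical, and reading the message as "awaiting sequence number X while the GPU has reached Y" is an interpretation of its wording rather than something stated in this report.

```python
import errno
import re

# Assumption: the pattern mirrors the one log line quoted in this report;
# other kernel versions may word the message differently.
WAIT_RE = re.compile(
    r"i915_do_wait_request returns (-?\d+) \(awaiting (\d+) at (\d+)\)"
)


def parse_wait_error(line):
    """Return the fields of a wait-request error line, or None if no match."""
    match = WAIT_RE.search(line)
    if match is None:
        return None
    ret, awaited, current = (int(group) for group in match.groups())
    return {
        # Kernel functions return negative errno values; map back to a name.
        "errno_name": errno.errorcode.get(-ret, "UNKNOWN"),
        "awaited": awaited,
        "current": current,
        # How far the awaited value is ahead of the current one.
        "outstanding": awaited - current,
    }


line = ("[drm:i915_do_wait_request] *ERROR* i915_do_wait_request "
        "returns -5 (awaiting 802356 at 799091)")
info = parse_wait_error(line)
print(info)
```

The return value -5 corresponds to `EIO`, which fits a request that was abandoned after the hangcheck timer declared the GPU hung.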
(In reply to comment #3)
> Created an attachment (id=37403) [details]
> trigger scanline wait at pipe off time
>
> I wonder if this patch helps? The intent is to trigger any outstanding
> scanline wait event before shutting off the pipe. When the pipe shuts off, it
> should end up stopping on the first line of the next frame, so hopefully this
> register programming is correct.

no, this patch doesn't help. GPU still hangs on xscreensaver-demo.

tested on 2.6.35 kernel, xf86-video-intel-2.12.0, xorg-server-1.8.2

Can you also grab the error state of the hang with the patch applied so we can confirm the bug is identical? It'll be typical if fixing the WAIT_FOR_EVENT hang means we just hit mesa submitting an illegal op...

Created attachment 37632 [details]
i915_error_state without i915-clear-scanline-wait.patch
error state after the GPU hang
Created attachment 37633 [details]
i915_error_state with i915-clear-scanline-wait.patch
error state after the GPU hang, with i915-clear-scanline-wait.patch
Created attachment 37634 [details]
intel_reg_dumper without the i915-clear-scanline-wait.patch
intel_reg_dumper after the GPU hang
Created attachment 37635 [details]
intel_reg_dumper with the i915-clear-scanline-wait.patch
intel_reg_dumper after the GPU hang, with the i915-clear-scanline-wait.patch
Created attachment 37688 [details] [review]
My variant upon Jesse's idea.

(Note this will only apply on top of my for-anholt series of pending patches.)

(In reply to comment #12)
> Created an attachment (id=37688) [details]
> My variant upon Jesse's idea.
>
> (Note this will only apply on top of my for-anholt series of pending patches.)

Would you please give me something which is applicable to the stable 2.6.35 kernel? I do not know how to dig up the so-called -anholt series patches.

Created attachment 37717 [details]
i915_error_state.txt after chris's patch
Created attachment 37718 [details]
intel_reg_dumper.txt after chris's patch
Ok, this looks mighty dubious:

    0x0903c15c: 0x79000002: 3DSTATE_DRAWING_RECTANGLE
    0x0903c160: 0x00000000: top left: 0,0
    0x0903c164: 0x00000000: bottom right: 0,0
    0x0903c168: 0x00000000: origin: 0,0

And the hang is indicative that the batchbuffer is itself the cause. This hang is sufficiently different from the original WAIT_FOR_EVENT hang, and the 0x0 surface could be a vital clue to the original bug.

GPU hang happens more often for 2.6.35 kernel. Basically, machine is unusable with 2.6.35.

commit 85345517fe6d4de27b0d6ca19fef9d28ac947c4a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Nov 13 09:49:11 2010 +0000

    drm/i915: Retire any pending operations on the old scanout when switching

    An old and oft reported bug, is that of the GPU hanging on a
    MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is
    waiting on a scanline counter on an inactive pipe, and so waits for a
    very long time until eventually the user reboots his machine.

    We can prevent this either by moving the WAIT into the kernel and
    thereby incurring considerable cost on every swapbuffers, or by waiting
    for the GPU to retire the last batch that accesses the framebuffer
    before installing a new one. As mode switches are much rarer than swap
    buffers, this looks like an easy choice.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
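
The reasoning in the commit message can be illustrated with a toy model. This is only a sketch of the idea, not the kernel change: `ToyGpu`, `switch_naive`, and `switch_with_retire` are hypothetical names, pipes are plain strings, and a "batch" is just a name plus an optional pipe whose scanline event it waits for.

```python
from collections import deque


class ToyGpu:
    """Toy command queue: batches run in order, some wait on a pipe's scanline."""

    def __init__(self):
        self.active_pipes = {"A"}
        self.ring = deque()

    def submit(self, name, wait_pipe=None):
        """Queue a batch; wait_pipe names the pipe whose scanline it waits on."""
        self.ring.append((name, wait_pipe))

    def is_stalled(self):
        """True if the next batch waits on a pipe that is no longer active."""
        if not self.ring:
            return False
        _, wait_pipe = self.ring[0]
        return wait_pipe is not None and wait_pipe not in self.active_pipes

    def run(self):
        """Retire batches until the queue is empty or the head can never proceed."""
        retired = []
        while self.ring and not self.is_stalled():
            name, _ = self.ring.popleft()
            retired.append(name)
        return retired


def switch_naive(gpu, old_pipe, new_pipe):
    """Switch scanout immediately, ignoring batches still queued for old_pipe."""
    gpu.active_pipes.discard(old_pipe)
    gpu.active_pipes.add(new_pipe)


def switch_with_retire(gpu, old_pipe, new_pipe):
    """Let the GPU finish everything queued so far, then switch scanout."""
    gpu.run()
    switch_naive(gpu, old_pipe, new_pipe)


naive = ToyGpu()
naive.submit("swap-1", wait_pipe="A")
switch_naive(naive, "A", "B")
naive.run()
print("naive stalled:", naive.is_stalled())

fixed = ToyGpu()
fixed.submit("swap-1", wait_pipe="A")
switch_with_retire(fixed, "A", "B")
fixed.run()
print("fixed stalled:", fixed.is_stalled())
```

In the naive case the queued batch is left waiting on pipe "A" after it has been turned off, which is the toy analogue of the hang. Draining the queue first costs a wait on each switch rather than on each buffer swap, which is the trade-off the commit message describes.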