Bug 51761 - [snb regression] Video playback pauses, missed IRQ on BLT!
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other Linux (All)
Importance: high normal
Assignee: Daniel Vetter
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-07-05 14:12 UTC by James
Modified: 2017-07-24 23:01 UTC

See Also:
i915 platform:
i915 features:


Attachments
Xorg log from session (37.47 KB, text/plain)
2012-07-05 14:12 UTC, James
full dmesg, up to "missed IRQ?" (84.31 KB, text/plain)
2012-07-05 14:13 UTC, James
dmesg output from 3.6.2, with [drm:__gen6_gt_force_wake_get] *ERROR* Force wake wait timed out (109.25 KB, text/plain)
2012-10-21 19:05 UTC, James

Description James 2012-07-05 14:12:41 UTC
Created attachment 63870
Xorg log from session

Occasionally while playing video (either in Totem or a Flash player in a web-page), playback and other elements on the screen pause. A message like

  [drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 651163, at 651163], missed IRQ?

appears in dmesg. Full dmesg and Xorg log attached.

Software:
xorg-x11-drv-intel-2.19.0-5.20120612.fc17.x86_64
libdrm-2.4.33-3.fc17.x86_64
libdrm-2.4.33-3.fc17.i686
mesa-dri-drivers-8.0.3-1.fc17.x86_64
Kernel from kernel-3.4.4-3.fc17 source

The only intel-specific xorg.conf.d entry is


Section "Device"
	Identifier "Card0"
	Driver "intel"
	Option "AccelMethod" "sna"
EndSection


although I've seen this happen with UXA acceleration as well.

Hardware is an Intel i7 2760QM processor, HD 3000 graphics.
Comment 1 James 2012-07-05 14:13:23 UTC
Created attachment 63871
full dmesg, up to "missed IRQ?"
Comment 2 Chris Wilson 2012-07-05 14:23:17 UTC
Well, well, well.

3.4 should be up to date with all the known workarounds for the missed IRQs. Did this issue only start occurring recently? Have you used any older kernels without this issue?
Comment 3 James 2012-07-05 14:36:55 UTC
(In reply to comment #2)
> Did this issue only start occurring recently? Have you used any older kernels
> without this issue?

Maybe, but I'm not entirely sure. Trawling back through the logs, the earliest I *noted* this was on 3.4.0. However that's as far back as my logs go. I could reinstall a 3.3-series kernel if that might help.
Comment 4 Chris Wilson 2012-07-05 14:44:24 UTC
Please do, it would be very very helpful to know if your system has always suffered, or if the more recent workarounds are not as effective.
Comment 5 James 2012-07-06 11:47:14 UTC
I've not seen it so far playing videos using kernel 3.3.8...
Comment 6 Chris Wilson 2012-08-11 13:12:53 UTC
From 3.3.8 to 3.4.8, things go south. Daniel, Ben, does this match your expectations?
Comment 7 Daniel Vetter 2012-08-11 19:08:55 UTC
Well, semaphores=1 should have made these less likely. But since this is libva, which (iirc) still syncs in userspace this might not apply.

Can you please test what happens when you boot 3.4 with i915.semaphores=0 ?
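
When testing with a boot parameter such as i915.semaphores=0, it helps to confirm that the running kernel was actually booted with it. The following is a minimal sketch, not part of the original report: the helper names kernel_param and running_kernel_param are hypothetical, and it assumes that when a parameter is repeated the last occurrence is the one that takes effect.

```python
from pathlib import Path
from typing import Optional


def kernel_param(cmdline: str, name: str) -> Optional[str]:
    """Return the value of a `name=value` token on a kernel command line.

    Returns None if the parameter is absent. If it appears more than once,
    the last occurrence is returned (assumed here to take precedence).
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val
    return value


def running_kernel_param(name: str) -> Optional[str]:
    """Look up `name` on the command line the running kernel was booted with."""
    # /proc/cmdline holds the boot command line on Linux.
    return kernel_param(Path("/proc/cmdline").read_text(), name)
```

Calling running_kernel_param("i915.semaphores") after a reboot should return "0" if the option was picked up, or None if it was not passed at all.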
Comment 8 Ben Widawsky 2012-08-12 06:38:37 UTC
Is it always the blt ring?
Comment 9 James 2012-08-12 07:46:25 UTC
Looking back in the logs:

All the i915_hangcheck messages I see in my logs are either of the form

[drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blt ring idle [waiting on 2195097, at 2195097], missed IRQ?

with 3.4-series kernels, or more recently under the 3.5 series:

[drm:i915_hangcheck_ring_idle] *ERROR* Hangcheck timer elapsed... blitter ring idle

I'm now using kernel 3.5.1, I'll try i915.semaphores=0 at the next opportunity.
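
To answer "is it always the blt ring?" from saved logs, the hangcheck lines can be tallied by ring name. This is a rough sketch that assumes the two message forms quoted above are the only ones of interest; the names HANGCHECK_RE and count_idle_rings are made up for illustration.

```python
import re
from collections import Counter

# Matches both quoted forms: "... blt ring idle [waiting on ...]" (3.4 series)
# and "... blitter ring idle" (3.5 series). The ring name is a single word.
HANGCHECK_RE = re.compile(
    r"\[drm:i915_hangcheck_ring_idle\] \*ERROR\* "
    r"Hangcheck timer elapsed\.\.\. (?P<ring>\w+) ring idle"
)


def count_idle_rings(log_text: str) -> Counter:
    """Count hangcheck 'ring idle' messages in log_text, keyed by ring name."""
    return Counter(m.group("ring") for m in HANGCHECK_RE.finditer(log_text))
```

The text can come from a saved dmesg dump or from the system log; any timestamp prefix on each line is ignored because the pattern is searched for anywhere in the text.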
Comment 10 Daniel Vetter 2012-08-22 09:15:02 UTC
3.5 should dump the i915_error_state into debugfs if this happens, please attach that?
Comment 11 James 2012-09-12 20:23:22 UTC
(In reply to comment #10)
> 3.5 should dump the i915_error_state into debugfs if this happens, please
> attach that?

Sorry for the delay. After seeing the associated pause and message in dmesg (kernel 3.5.3), i915_error_state reports no error state collected.
Comment 12 Chris Wilson 2012-10-21 18:14:42 UTC
Hmm, perhaps try again with a more recent kernel? It should be dumping the error-state on the missed IRQ!
Comment 13 James 2012-10-21 18:48:48 UTC
(In reply to comment #12)
> Hmm, perhaps try again with a more recent kernel? It should be dumping the
> error-state on the missed IRQ!

Currently on 3.6.2. All I've seen so far is

  [drm:__gen6_gt_force_wake_get] *ERROR* Force wake wait timed out

Don't know if this is the same issue, though. Still nothing in i915_error_state -- do I have to enable a flag somewhere?
Comment 14 Chris Wilson 2012-10-21 19:02:04 UTC
Can you please attach your current dmesg? If that is a warning from only the first forcewake get, then it is harmless and a patch has already been applied.
Comment 15 James 2012-10-21 19:05:59 UTC
Created attachment 68882
dmesg output from 3.6.2, with [drm:__gen6_gt_force_wake_get] *ERROR* Force wake wait timed out

There are 9 such messages so far.

By the way, I am looking in the right place for i915_error_state (/sys/kernel/debug/dri/*/i915_error_state), yes?
Comment 16 Chris Wilson 2012-10-21 19:53:55 UTC
Ok, that's the more worrying forcewake warning, though I suspect the patch to increase the timeout only landed in 3.7, so maybe not as scary as it appears. Something to keep an eye on though.

However, the good news is that you have yet to see a missed IRQ. /sys/kernel/debug/dri/0/i915_error_state is the right file to read should you see another hang.
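
For checking that file after a hang, something along these lines may be convenient. It is only a sketch: it assumes debugfs is mounted at /sys/kernel/debug, that the script runs with enough privilege to read it, and that an empty state reads as "no error state collected" as reported in comment 11. The function names are hypothetical.

```python
import glob

# Text reported in comment 11 when nothing has been captured.
NO_ERROR_MARKER = "no error state collected"


def has_error_state(contents: str) -> bool:
    """Return True if the file contents look like a captured error state."""
    text = contents.strip()
    return bool(text) and NO_ERROR_MARKER not in text.lower()


def collected_error_states(pattern="/sys/kernel/debug/dri/*/i915_error_state"):
    """Return {path: contents} for every matching file holding an error state."""
    found = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, errors="replace") as fh:
                contents = fh.read()
        except OSError:
            # Unreadable (for example, not running as root): skip this file.
            continue
        if has_error_state(contents):
            found[path] = contents
    return found
```

An empty result means either that no error state has been captured or that the files could not be read, so it is worth running as root before concluding there is nothing to attach.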
Comment 17 Chris Wilson 2012-12-12 16:19:51 UTC
So I think we have the original missed IRQ fixed. Only we now have the rc6 warnings, which are being tracked in bug 50619.

Closing for the original bug, but please do join the bug hunt on #50619.

