Bug 30073

Summary: Jerkiness when using drm-intel-next and drm-intel-fixes
Product: DRI
Reporter: Sitsofe Wheeler <sitsofe>
Component: DRM/Intel
Assignee: Chris Wilson <chris>
Status: CLOSED FIXED
QA Contact:
Severity: normal    
Priority: medium    
Version: XOrg 6.7.0   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description Flags
check the wait request is still pending before generating an error. none

Description Sitsofe Wheeler 2010-09-07 16:16:29 UTC
Description of the problem:
Running ioquake on an EeePC 900 with drm-intel-next shows jerkiness every half a second. With drm-intel-fixes the jerkiness occurs every 10 seconds.

Steps to reproduce:
1. Run ioquake.
2. Skip the intro movie.
3. Press ` to bring down the quake console.

Expected result:
The background of the console moves smoothly.

Actual result:
Background of console jerks a lot (drm-intel-next) or jerks every 10 seconds (drm-intel-fixes). When jerking a lot there is no additional output in dmesg. When jerking every 10 seconds output like the following appears in dmesg:
[  206.025010] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[  216.245009] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.

Version information:
EeePC 900 (915GM)
Ubuntu 10.04
Kernel drm-intel-next: 2.6.36-rc3-00114-g23a0d9e
Kernel drm-intel-fixes: 2.6.36-rc3-00025-g8554048
Comment 1 Sitsofe Wheeler 2010-09-07 16:17:46 UTC
I forgot to mention both trees were from git://anongit.freedesktop.org/~ickle/drm-intel .
Comment 2 Chris Wilson 2010-09-08 00:37:07 UTC
The obvious question is whether hangcheck is doing its job properly and preventing a machine hang?


diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 59457e8..294361b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1353,8 +1353,8 @@ void i915_hangcheck_elapsed(unsigned long data)
                dev_priv->hangcheck_count = 0;
 
                /* Issue a wake-up to catch stuck h/w. */
-               if (dev_priv->render_ring.waiting_gem_seqno |
-                   dev_priv->bsd_ring.waiting_gem_seqno) {
+               if (0&&(dev_priv->render_ring.waiting_gem_seqno |
+                       dev_priv->bsd_ring.waiting_gem_seqno)) {
                        DRM_ERROR("Hangcheck timer elapsed... GPU idle, missed I
                        if (dev_priv->render_ring.waiting_gem_seqno)
                                DRM_WAKEUP(&dev_priv->render_ring.irq_queue);
Comment 3 Sitsofe Wheeler 2010-09-08 13:49:03 UTC
With the patch in comment #2 the pauses every 10 seconds still occur in the latest drm-intel-fixes (2.6.36-rc3-00028-gc3add4b) but no message about the pause is logged in dmesg. Perhaps each pause is ever so slightly shorter than before.
Comment 4 Chris Wilson 2010-09-08 14:35:20 UTC
So the periodic stall is due to "hotplug" polling: https://bugs.freedesktop.org/show_bug.cgi?id=29536

The fact that this now causes an error to be displayed is obviously ++bad.
Comment 5 Sitsofe Wheeler 2010-09-08 15:53:53 UTC
As mentioned in the previous comment, using
echo n >  /sys/module/drm_kms_helper/parameters/poll
resolves the problem seen in drm-intel-fixes. However, the regular half second stutters of drm-intel-next are not resolved by doing this.
Comment 6 Chris Wilson 2010-09-08 16:03:57 UTC
Created attachment 38571
check the wait request is still pending before generating an error.

This should prevent the "missed irq" error when under heavy load (such as the load-detect polling on i8xx/i915).
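
The attachment itself is not reproduced in this thread. As a rough illustration only, the idea of "check the wait request is still pending before generating an error" could be sketched like this. The struct ring_wait type, its waiter_queued field, and both function names are hypothetical stand-ins rather than the driver's real structures, and treating "still pending" as "a waiter is still asleep on the queue" is an assumption about the patch, not a description of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of one ring's wait state. */
struct ring_wait {
    uint32_t waiting_seqno; /* nonzero while a waiter has registered interest */
    bool     waiter_queued; /* assumed flag: waiter is still asleep on the queue */
};

/* A wait counts as pending only if a seqno is recorded AND the waiter
 * has not yet been woken. A recorded seqno alone may be stale, for
 * example when the waiter was woken but has not run yet under load. */
static bool wait_still_pending(const struct ring_wait *ring)
{
    assert(ring != NULL);
    return ring->waiting_seqno != 0 && ring->waiter_queued;
}

/* Decision for the "GPU idle" branch of the hangcheck: report a missed
 * IRQ only if some ring still has a genuinely pending wait. */
static bool gpu_idle_missed_irq(const struct ring_wait *render,
                                const struct ring_wait *bsd)
{
    bool render_pending = wait_still_pending(render);
    bool bsd_pending = wait_still_pending(bsd);

    return render_pending || bsd_pending;
}
```

In this model, a ring with a recorded seqno but no sleeping waiter produces no error, which matches the behaviour reported later in the thread: the stall remains but the hangcheck message no longer appears.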
Comment 7 Chris Wilson 2010-09-08 16:04:46 UTC
[It's in -staging as well.]
Comment 8 Sitsofe Wheeler 2010-09-09 00:50:07 UTC
With the latest -staging (which includes the patch mentioned in comment #6) the 10 second jerkiness is still there when drm_kms_helper's poll parameter is y, but the hangcheck messages no longer appear in dmesg.
Comment 9 Chris Wilson 2010-09-09 01:12:28 UTC
Perfect testing, just the result I was hoping for.

* moves to -fixes.
Comment 10 Chris Wilson 2010-09-09 01:13:08 UTC
Marking this bug as fixed. Sitsofe, please can you file a new bug for the -next stutter.
