Bug 30073

Summary: Jerkiness when using drm-intel-next and drm-intel-fixes
Product: DRI
Reporter: Sitsofe Wheeler <sitsofe>
Component: DRM/Intel
Assignee: Chris Wilson <chris>
Status: CLOSED FIXED
QA Contact:
Severity: normal
Priority: medium
Version: XOrg 6.7.0
Hardware: Other
OS: All
Whiteboard:
i915 platform:
i915 features:

Attachments:

Description	Flags
check the wait request is still pending before generating an error.	none

Description Sitsofe Wheeler 2010-09-07 16:16:29 UTC

Description of the problem:
Running ioquake on an EeePC 900 with drm-intel-next shows jerkiness every half a second. With drm-intel-fixes the jerkiness occurs every 10 seconds.

Steps to reproduce:
1. Run ioquake.
2. Skip the intro movie.
3. Press ` to bring down the quake console.

Expected result:
Background of console to move smoothly.

Actual result:
Background of console jerks a lot (drm-intel-next) or jerks every 10 seconds (drm-intel-fixes). When jerking a lot there is no additional output in dmesg. When jerking every 10 seconds output like the following appears in dmesg:
[  206.025010] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
[  216.245009] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU idle, missed IRQ.
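
The timestamps on these messages can be used to check the roughly 10-second period. The following is a minimal sketch, assuming dmesg-style bracketed timestamps at the start of each line; the function name hangcheck_gaps is made up for illustration and is not part of any tool.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the gap,
# in seconds, between consecutive "Hangcheck timer elapsed" messages.
# Assumes each line starts with a "[  206.025010]" style timestamp.
hangcheck_gaps() {
    # Keep only hangcheck lines and reduce each one to its timestamp.
    sed -n 's/^\[ *\([0-9.]*\)].*Hangcheck timer elapsed.*/\1/p' |
    # Print the difference between each timestamp and the previous one.
    awk '{ if (NR > 1) printf "%.2f\n", $1 - prev; prev = $1 }'
}
```

Run as `dmesg | hangcheck_gaps`. For the two lines quoted above it prints 10.22, which matches the 10-second jerkiness seen with drm-intel-fixes.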

Version information:
EeePC 900 (915GM)
Ubuntu 10.04
Kernel drm-intel-next: 2.6.36-rc3-00114-g23a0d9e
Kernel drm-intel-fixes: 2.6.36-rc3-00025-g8554048

Comment 1 Sitsofe Wheeler 2010-09-07 16:17:46 UTC

I forgot to mention both trees were from git://anongit.freedesktop.org/~ickle/drm-intel .

Comment 2 Chris Wilson 2010-09-08 00:37:07 UTC

The obvious question is whether hangcheck is doing its job properly and preventing a machine hang?


diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 59457e8..294361b 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1353,8 +1353,8 @@ void i915_hangcheck_elapsed(unsigned long data)
                dev_priv->hangcheck_count = 0;
 
                /* Issue a wake-up to catch stuck h/w. */
-               if (dev_priv->render_ring.waiting_gem_seqno |
-                   dev_priv->bsd_ring.waiting_gem_seqno) {
+               if (0&&(dev_priv->render_ring.waiting_gem_seqno |
+                       dev_priv->bsd_ring.waiting_gem_seqno)) {
                        DRM_ERROR("Hangcheck timer elapsed... GPU idle, missed I
                        if (dev_priv->render_ring.waiting_gem_seqno)
                                DRM_WAKEUP(&dev_priv->render_ring.irq_queue);

Comment 3 Sitsofe Wheeler 2010-09-08 13:49:03 UTC

With the patch in comment #2 the pauses every 10 seconds still occur in the latest drm-intel-fixes (2.6.36-rc3-00028-gc3add4b) but no message about the pause is logged in dmesg. Perhaps each pause is ever so slightly shorter than before.

Comment 4 Chris Wilson 2010-09-08 14:35:20 UTC

So the periodic stall is due to "hotplug" polling: https://bugs.freedesktop.org/show_bug.cgi?id=29536

The fact that this now causes an error to be displayed is obviously ++bad.

Comment 5 Sitsofe Wheeler 2010-09-08 15:53:53 UTC

As mentioned in the previous comment, using
echo n >  /sys/module/drm_kms_helper/parameters/poll
resolves the problem seen in drm-intel-fixes. However, the regular half second stutters of drm-intel-next are not resolved by doing this.
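
For anyone repeating this test, the workaround can be wrapped so that the value before and after the change is visible. This is only a sketch: the sysfs path is the one quoted above, while the function name set_kms_poll and the POLL_PARAM override variable are hypothetical names introduced here. Writing to the parameter normally requires root, and the change is not expected to survive a reboot.

```shell
# Path taken from the comment above; POLL_PARAM is a hypothetical override
# so the function can be tried against an ordinary file.
POLL_PARAM="${POLL_PARAM:-/sys/module/drm_kms_helper/parameters/poll}"

# Hypothetical helper: show the current poll setting, write y or n, show it again.
set_kms_poll() {
    case "$1" in
        y|n) ;;
        *) echo "usage: set_kms_poll y|n" >&2; return 2 ;;
    esac
    if [ ! -w "$POLL_PARAM" ]; then
        echo "cannot write $POLL_PARAM (are you root?)" >&2
        return 1
    fi
    echo "before: $(cat "$POLL_PARAM")"
    echo "$1" > "$POLL_PARAM"
    echo "after:  $(cat "$POLL_PARAM")"
}
```

Calling `set_kms_poll n` disables the polling as described above, and `set_kms_poll y` turns it back on to compare behaviour.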

Comment 6 Chris Wilson 2010-09-08 16:03:57 UTC

Created attachment 38571
check the wait request is still pending before generating an error.

This should prevent the "missed irq" error when under heavy load (such as the load-detect polling on i8xx/i915).

Comment 7 Chris Wilson 2010-09-08 16:04:46 UTC

[It's in -staging as well.]

Comment 8 Sitsofe Wheeler 2010-09-09 00:50:07 UTC

With the latest -staging (which includes the patch from comment #6), the 10-second jerkiness is still there when drm_kms_helper's poll is y, but the hangcheck messages no longer appear in dmesg.

Comment 9 Chris Wilson 2010-09-09 01:12:28 UTC

Perfect testing, just the result I was hoping for.

* moves to -fixes.

Comment 10 Chris Wilson 2010-09-09 01:13:08 UTC

Marking this bug as fixed. Sitsofe, please can you file a new bug for the -next stutter.
